Server

The agdb_server is the OpenAPI REST server that provides remote agdb database management. Running the server is trivial as there are no dependencies, no complicated configuration etc. It can be run on any platform supported by Rust. Please follow the guide:

How to run the server?

The server is based on axum and uses OpenAPI to specify its API (via utoipa) and rapidoc for the OpenAPI GUI. To interact with the server you can use the rapidoc GUI, curl or any of the available API clients. Internally it uses the agdb database:

GUI accessible at (run in a browser when the server is running):

http://localhost:3000/api/v1

Configuration

When run, the server creates a default configuration if none exists. It always reads the configuration from the working directory when the file is present:

# agdb_server.yaml
bind: ":::3000" # address to listen at (bind to)
address: "localhost:3000" # address the incoming connections will come from
basepath: "" # base path to append to the address in case the server is to be run behind a reverse proxy
static_roots: [] # list of static folders to serve in format <path>:<dir> (e.g. /static:/home/user/www)
admin: admin # the admin user that will be created automatically for the server, the password will be the same as name (admin by default, recommended to change after startup)
data_dir: agdb_server_data # directory to store user data
log_level: INFO # Options are: OFF, ERROR, WARN, INFO, DEBUG, TRACE
log_body_limit: 10240 # maximum length of the body of the request that will be logged in bytes, default is 10KB
request_body_limit: 10485760 # maximum length of the body of the request that will be accepted in bytes, default is 10MB
pepper_path: "" # Optional path to a runtime secret file containing 16 bytes "pepper" value for additionally "seasoning" (hashing) passwords. If empty a built-in pepper value is used - see "How to run the server?" guide for details
tls_certificate: "" # path to the TLS certificate file
tls_key: "" # path to the TLS key file
tls_root: "" # path to the TLS root CA file
cluster_token: cluster # token used between members of the cluster for authentication, treat this value as secret
cluster_heartbeat_timeout_ms: 1000 # number of milliseconds since last message sent to a node in the cluster before the leader sends a heartbeat message
cluster_term_timeout_ms: 3000 # number of milliseconds without receiving a message from the leader after which the nodes will consider leader to be off and begin a new term
cluster: [] # list of "address" fields of all nodes in the cluster including local node - the order and values must be the same in all members of the cluster

You can prepare it in advance in a file agdb_server.yaml. After the server database is created changes to the admin field will have no effect, but the other settings can be changed later. All config changes require server restart to take effect.

The server is built with a default pepper (can be changed as part of the build) that is used if pepper_path is not specified. A unique pepper ensures that two different agdb_server instances do not produce the same password hashes, which enhances security. Treat the pepper value as a secret.

TLS is turned on by specifying tls_certificate and tls_key; internally the server uses rustls (https://github.com/rustls/rustls). The tls_root is optional (can be empty) and needs to be specified only if you are using self-signed certificates. The certificate used in tls_certificate must be issued for the name (or one of the alternative names) used in the address and cluster fields. To generate a self-signed certificate and root CA (usable in docker compose or K8s deployments) you can use the following:

cargo install rustls-cert-gen
rustls-cert-gen \
    --common-name=agdb \
    --ca-file-name=root_ca \
    --cert-file-name=cert \
    --country-name=CZ \
    --organization-name=Agnesoft \
    --san=localhost \
    --san=agdb0 \
    --san=agdb1 \
    --san=agdb2 \
    --output=.

Users

The server has a single admin account (admin by default; the name is configurable and the initial password is the same as the name) that can perform any regular user action plus all admin actions such as creating users. You can use this account to work with the database locally, but it is advisable to use it only for maintaining the server and to create a regular user for working with the databases:

# log in as admin to produce an admin API token, e.g. "bb2fc207-90d1-45dd-8110-3247c4753cd5"
token=$(curl -X POST -H 'Content-Type: application/json' localhost:3000/api/v1/user/login -d '{"username":"admin","password":"admin"}')
# use the admin token to create a user (the body is JSON, so the content type is set explicitly)
curl -X POST -H "Authorization: Bearer ${token}" -H 'Content-Type: application/json' localhost:3000/api/v1/admin/user/my_db_user/add -d '{"password":"password123"}'
# log in as the new user and produce their token
token=$(curl -X POST -H 'Content-Type: application/json' localhost:3000/api/v1/user/login -d '{"username":"my_db_user","password":"password123"}')

Users are allowed to change their password and to create and manipulate databases.

Available user APIs:

Action	Description
/api/v1/user/change_password	changes the current user’s password
/api/v1/user/login	logs in the user returning an API token
/api/v1/user/logout	logs out the user and invalidates the API token *
/api/v1/user/status	returns the current user’s username and whether they are a server admin *
/api/v1/cluster/user/login	logs in the user cluster wide so that using the same token accessing other nodes will not require new login, this might overwrite existing tokens on other nodes
/api/v1/cluster/user/logout	invalidate tokens on all nodes in the cluster

The login is shared meaning if you log in twice even from different devices you will get the same shared API token of that user. Similarly, when the logout endpoint is used this token is invalidated across all sessions.
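The shared-login behaviour described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the `Sessions` type, its methods and the `token-N` placeholder format are hypothetical and are not the server's implementation. It only mirrors the two rules from the text, that repeated logins return the same token and that logout invalidates it everywhere.

```rust
use std::collections::HashMap;

/// Toy in-memory session store: at most one token per user (hypothetical type).
#[derive(Default)]
struct Sessions {
    tokens: HashMap<String, String>,
    issued: u64,
}

impl Sessions {
    /// Returns the user's existing token, or issues a new one.
    fn login(&mut self, user: &str) -> String {
        if let Some(token) = self.tokens.get(user) {
            return token.clone();
        }
        self.issued += 1;
        // Placeholder token format, chosen only for this illustration.
        let token = format!("token-{}", self.issued);
        self.tokens.insert(user.to_string(), token.clone());
        token
    }

    /// Invalidates the user's token for every session at once.
    fn logout(&mut self, user: &str) {
        self.tokens.remove(user);
    }

    /// Checks whether `token` is the currently valid token of `user`.
    fn is_valid(&self, user: &str, token: &str) -> bool {
        self.tokens.get(user).map(String::as_str) == Some(token)
    }
}

fn main() {
    let mut sessions = Sessions::default();
    let laptop = sessions.login("my_db_user");
    let phone = sessions.login("my_db_user");
    println!("same token on both devices: {}", laptop == phone);
    sessions.logout("my_db_user");
    println!("still valid after logout: {}", sessions.is_valid("my_db_user", &laptop));
}
```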

Databases

Any user can create, remove and manipulate their own databases. To create a database:

curl -X POST -H "Authorization: Bearer ${token}" "localhost:3000/api/v1/db/my_db_owner/my_db/add?db_type=mapped"

Note that a user can only create databases under their own name. The db_type can be one of:

memory # memory only database, basically a cache
mapped # memory mapped database, using memory for reading but persisting changes to the disk
file # file based database only, no memory caching, reading/writing from/to disk
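If you build the request URL in code rather than typing it into curl, a small helper along these lines may be convenient. It is a sketch only: `DbType` and `db_add_url` are hypothetical names introduced here, and the path and the db_type values are taken from the example and the list above.

```rust
/// The three database types listed above (hypothetical enum for illustration).
#[derive(Clone, Copy)]
enum DbType {
    Memory,
    Mapped,
    File,
}

impl DbType {
    /// The value expected in the `db_type` query parameter.
    fn as_str(self) -> &'static str {
        match self {
            DbType::Memory => "memory",
            DbType::Mapped => "mapped",
            DbType::File => "file",
        }
    }
}

/// Builds the URL of the `add` endpoint for the given owner and database name.
fn db_add_url(host: &str, owner: &str, db: &str, db_type: DbType) -> String {
    format!(
        "{}/api/v1/db/{}/{}/add?db_type={}",
        host,
        owner,
        db,
        db_type.as_str()
    )
}

fn main() {
    for db_type in [DbType::Memory, DbType::Mapped, DbType::File] {
        println!("{}", db_add_url("localhost:3000", "my_db_owner", "my_db", db_type));
    }
}
```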

It is possible to add an existing database to the server. Move the db file to the server data folder and run the /api/v1/db/{owner}/{db}/add API as if you were creating a new database with the db’s name. If the file exists it will be added rather than created. Similarly, you can remove a database from the server (instead of deleting it) with the /api/v1/db/{owner}/{db}/remove API, which disassociates the db from the server so that you can move it and use it elsewhere.

Database Users

Each database is scoped to one user (owner) who can exercise full control over it. The owner can add more users (they must exist on the server) to the database including admin level users with one of three roles:

read # can only run immutable exec queries
write # can run mutable and immutable exec queries
admin # same as write but can also admin the database

The admin users can do some (but not all) actions that the owner can:

Database Actions

Action	Permission	Description
/api/v1/db/{owner}/{db}/add	owner	adds (from existing files) or creates a database (memory, memory mapped, file only)
/api/v1/db/{owner}/{db}/audit	read	returns the log of all mutable queries that ran against the database (with user who ran them)
/api/v1/db/{owner}/{db}/backup	admin	creates an automatic backup snapshot of the database (see backup docs below)
/api/v1/db/{owner}/{db}/clear	admin	clears the content of the database (either all, db only, audit only, backup only)
/api/v1/db/{owner}/{db}/convert	admin	converts db between memory/mapped/file
/api/v1/db/{owner}/{db}/copy	read	creates a copy of the database under the current user
/api/v1/db/{owner}/{db}/delete	owner	deletes the database including files on disk
/api/v1/db/{owner}/{db}/exec	read	executes queries against the database (does not allow mutable queries)
/api/v1/db/{owner}/{db}/exec_mut	write	executes queries against the database (allows both mutable and immutable queries)
/api/v1/db/{owner}/{db}/list	read	lists the databases with role of the current user (owned and others’)
/api/v1/db/{owner}/{db}/optimize	write	optimizes the underlying file storage packing the data reclaiming unused regions (defragmenting)
/api/v1/db/{owner}/{db}/remove	owner	removes the database from the server but keeps the files on disk (main, WAL, backup, audit)
/api/v1/db/{owner}/{db}/rename	owner	changes the name of the database (this API can be used to transfer db ownership)
/api/v1/db/{owner}/{db}/restore	admin	restores the database from the automatic backup - the current database will become the backup
/api/v1/db/{owner}/{db}/user/add	admin	adds a user to the database
/api/v1/db/{owner}/{db}/user/list	read	list users of the database with their roles
/api/v1/db/{owner}/{db}/user/remove	admin	removes a user from the database
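One way to read the Permission column above is as a minimum access level. The sketch below assumes the levels are strictly cumulative (read < write < admin < owner), which follows from the role descriptions but is not spelled out as a rule by the server. The `Access` enum and `is_allowed` function are hypothetical names used only for illustration.

```rust
/// Access levels in increasing order of privilege. `Owner` is the database
/// owner; the other three are the roles that can be assigned to users.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Access {
    Read,
    Write,
    Admin,
    Owner,
}

/// Returns true when `actual` satisfies the `required` permission,
/// assuming each level includes everything the lower levels can do.
fn is_allowed(actual: Access, required: Access) -> bool {
    actual >= required
}

fn main() {
    // Required levels taken from the table: exec = read, exec_mut = write,
    // backup = admin, delete = owner.
    println!("read user, exec:      {}", is_allowed(Access::Read, Access::Read));
    println!("read user, exec_mut:  {}", is_allowed(Access::Read, Access::Write));
    println!("admin user, backup:   {}", is_allowed(Access::Admin, Access::Admin));
    println!("admin user, delete:   {}", is_allowed(Access::Admin, Access::Owner));
}
```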

Backups

Each database can be backed up. The backup API /api/v1/db/{owner}/{db}/backup has no parameters and will always back up the database under the same name to the “backups” subfolder in the owner’s data. The database can be restored with /api/v1/db/{owner}/{db}/restore at which point the existing backup will become the main database and the current database will become the backup so it is not possible to “lose” the current state even if accidentally “restoring”. You can revert to the state before backup by running another /api/v1/db/{owner}/{db}/restore. If you need more granular backup or multiple backups you can devise your own scheme using the /api/v1/db/{owner}/{db}/copy, /api/v1/db/{owner}/{db}/rename and possibly /api/v1/db/{owner}/{db}/remove or /api/v1/db/{owner}/{db}/delete APIs.
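The swap semantics of backup and restore can be modelled in a few lines. This is a toy model, not the server's code: `DbSlot` is a hypothetical type and the database content is represented by a plain string.

```rust
/// Toy model of one database and its single backup slot (hypothetical type).
struct DbSlot {
    main: String,
    backup: Option<String>,
}

impl DbSlot {
    /// Overwrites the backup slot with a snapshot of the current state.
    fn backup(&mut self) {
        self.backup = Some(self.main.clone());
    }

    /// Swaps the main database with the backup; returns false if no backup exists.
    fn restore(&mut self) -> bool {
        match self.backup.as_mut() {
            Some(backup) => {
                std::mem::swap(&mut self.main, backup);
                true
            }
            None => false,
        }
    }
}

fn main() {
    let mut db = DbSlot { main: "v1".to_string(), backup: None };
    db.backup();
    db.main = "v2".to_string(); // changes made after the backup
    db.restore();
    println!("after restore: main={} backup={:?}", db.main, db.backup);
    db.restore();
    println!("after second restore: main={} backup={:?}", db.main, db.backup);
}
```

Because restore is a swap rather than an overwrite, running it twice returns the model to the state it had before the first restore.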

Queries

All queries are executed through two endpoints, /api/v1/db/{owner}/{db}/exec (read-only queries) and /api/v1/db/{owner}/{db}/exec_mut (queries that also write to the database), and are exactly the same as in the embedded/application database (see Queries documentation). However, depending on the user’s role the server may refuse to execute the queries (e.g. mutable queries sent by a user with the read role in the database). The endpoints accept a list of queries and the entire list is run as a transaction, meaning either all queries succeed or none of them do. The endpoint returns a list of results, one per executed query.

Queries in the list can reference each other: the server injects the results of a referenced earlier query into the query that refers to it. This is a slight extension to the vanilla agdb queries. It is best illustrated by an example:

let queries = &vec![
    QueryBuilder::insert().nodes().count(1).query().into(), // :0
    QueryBuilder::insert().nodes().count(1).query().into(), // :1
    QueryBuilder::insert().edges().from(":0").to(":1").query().into(), // :2
    QueryBuilder::search().from(":0").to(":1").query().into(), // :3
];

In places where an alias can be used (an ids identifier) you can use an index prefixed by : to inject the result of an earlier query in the list. In the example we insert two separate nodes, then create an edge between them and finally search from one node to the other. An index into the results can be used in most places, including conditions. What is currently not possible is to inject data (i.e. key-value properties) from results into subsequent queries.
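To make the :N convention concrete, here is a toy resolver for such references. It is not how the server implements the feature: `parse_result_ref` and `resolve` are hypothetical helpers, and representing each query result as a list of i64 ids is an assumption made for the illustration.

```rust
/// Parses an id string such as ":1" into the index of an earlier query.
/// Anything without the ':' prefix (e.g. a regular alias) yields None.
fn parse_result_ref(id: &str) -> Option<usize> {
    id.strip_prefix(':')?.parse().ok()
}

/// Looks up the ids produced by the referenced query, if it has already run.
fn resolve<'a>(id: &str, results: &'a [Vec<i64>]) -> Option<&'a [i64]> {
    let index = parse_result_ref(id)?;
    results.get(index).map(Vec::as_slice)
}

fn main() {
    // Pretend queries :0 and :1 each inserted one node.
    let results = vec![vec![1], vec![2]];
    println!("{:?}", resolve(":0", &results)); // ids produced by query :0
    println!("{:?}", resolve(":1", &results)); // ids produced by query :1
    println!("{:?}", resolve(":5", &results)); // no such earlier query
    println!("{:?}", resolve("my_alias", &results)); // not an index reference
}
```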

Transactions

While the server serves each request asynchronously and can serve any number of clients at the same time the queries and individual databases must still follow the basic principles of the agdb that are the same as in the embedded variant (and derived from Rust itself):

There can be either:

an unlimited number of immutable transactions, or
exactly one mutable transaction
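The same rule can be observed with the standard library's read-write lock. The sketch below only illustrates the "many readers or one writer" principle with std::sync::RwLock; it does not claim that agdb uses this type internally, and the `demo` function is a hypothetical name.

```rust
use std::sync::RwLock;

/// Returns (second reader admitted, writer refused while reading, final length).
fn demo() -> (bool, bool, usize) {
    let db = RwLock::new(vec![1, 2, 3]);

    let (second_reader_ok, writer_refused) = {
        // One "immutable transaction" is open...
        let first = db.read().unwrap();
        // ...a second reader is admitted alongside it...
        let second = db.try_read();
        // ...but a writer is refused while any reader is active.
        let writer = db.try_write();
        let outcome = (second.is_ok(), writer.is_err());
        drop(first);
        outcome
    };

    // With all readers gone, exactly one writer can proceed.
    db.write().unwrap().push(4);
    let len = db.read().unwrap().len();
    (second_reader_ok, writer_refused, len)
}

fn main() {
    let (second_reader_ok, writer_refused, len) = demo();
    println!("second reader admitted: {second_reader_ok}");
    println!("writer refused while reading: {writer_refused}");
    println!("items after the write: {len}");
}
```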

However, the agdb is written in such a way that it performs excellently even under heavily contested read/write load. See agdb_benchmarks and performance documentation.

Admin

Each agdb_server has exactly one admin account (admin by default) that acts as a regular user but is additionally allowed to execute APIs under /admin/. These mostly mirror the APIs for regular users, but some of the restrictions are not enforced (i.e. ownership or db role). Furthermore, the admin has access to the following exclusive APIs:

Action	Description
/api/v1/admin/db/*	provides same endpoints as for regular users but without owner/role restrictions
/api/v1/admin/shutdown	gracefully shuts down the server
/api/v1/admin/status	lists extended statistics of the server - uptime, # dbs, # users, # logged users, server data size
/api/v1/admin/user/{username}/add	adds new user to the server
/api/v1/admin/user/{username}/change_password	changes password of a user
/api/v1/admin/user/{username}/logout	force logout of any user
/api/v1/admin/user/logout_all	force logout of all users except admins
/api/v1/admin/user/{username}/delete	deletes user and all their data (databases) from the server
/api/v1/admin/user/list	lists all users on the server
/api/v1/cluster/admin/user/{username}/logout	force logout of any user from all nodes in the cluster
/api/v1/cluster/admin/user/logout_all	force logout of all users from all nodes in the cluster except admins

Shutdown

The server can be gracefully shut down with CTRL+C or programmatically by using the /api/v1/admin/shutdown endpoint, which requires an admin token, e.g.

token=$(curl -X POST -H 'Content-Type: application/json' localhost:3000/api/v1/user/login -d '{"username":"admin","password":"admin"}') #will produce a token, e.g. "bb2fc207-90d1-45dd-8110-3247c4753cd5"
curl -X POST -H "Authorization: Bearer ${token}" localhost:3000/api/v1/admin/shutdown

Cluster

You can, and in most cases should, run the server in a cluster for resiliency and durability of the data. The exception would be if you require speed at the cost of resiliency and durability and agdb is used more as a cache than as the main database.

The agdb_server uses a custom implementation of the Raft consensus algorithm. The benefits of the algorithm are durability, data consistency and speed, as the actions do not need to be persisted to all nodes (only to a majority) before acknowledging the client. The custom implementation in agdb additionally offers the following features and guarantees on top of Raft:

Write action acknowledgement happens only when the respective action was executed fully (not just that the action was received by the majority as is the standard implementation).
Clients can request write operations through any member of the cluster (not only the leader) and the node will forward it to the current leader and act as a proxy.
When forwarding the action the node will only acknowledge the client when both of the following become true:
- The leader committed the operation meaning the majority of the nodes persisted the action.
- The node through which the action was performed executed the action itself and in full.
Even when there is no leader elected, such as when the cluster is being (re)deployed, the read operations are always available.

This means that you can freely choose a node, perform any action through it and observe consistent results at a minor performance penalty due to the forwarding (unless you picked the leader node). It might be useful in situations where reads are more frequent, as they would be more spread out across the cluster.

This however does NOT prevent inconsistencies in situations where every request is sent to a different node (e.g. when accessing the cluster through a single service in Kubernetes). The agdb does not offer consistency across all nodes at all times. It is guaranteed only on the leader, the local node (if different) and as many other (unspecified) nodes as are required to reach a majority.

💡

The pepper value (used for additionally hashing the passwords, see the configuration section) must also be the same across the cluster: for security reasons, password-related actions send only the hashed values between nodes, never the raw passwords.

The nodes use cluster_token from the config (see above) and calculated cluster hash to authenticate to each other. The hash is the calculated value based on the cluster array in the config listing all the nodes (address values) in the cluster. It is required for the cluster array to be the same in the config files of all nodes. Additionally, when the cluster is not empty the local address value must match one of the addresses. The index of such address is the index of the node in the cluster.

Example:

#node0 agdb_server.yaml
address: http://localhost:3000
cluster: ["http://localhost:3000", "http://localhost:3001", "http://localhost:3002"]

#node1 agdb_server.yaml
address: http://localhost:3001
cluster: ["http://localhost:3000", "http://localhost:3001", "http://localhost:3002"]

#node2 agdb_server.yaml
address: http://localhost:3002
cluster: ["http://localhost:3000", "http://localhost:3001", "http://localhost:3002"]
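The index rule described above (a node's index is the position of its own address in the shared cluster array) can be sketched as a simple lookup. `node_index` is a hypothetical helper written for this illustration, not a function of the server.

```rust
/// Returns the index of `address` within the shared `cluster` list,
/// or None if the local address is not one of the cluster members.
fn node_index(address: &str, cluster: &[&str]) -> Option<usize> {
    cluster.iter().position(|node| *node == address)
}

fn main() {
    // The cluster array from the example configs; identical on every node.
    let cluster = [
        "http://localhost:3000",
        "http://localhost:3001",
        "http://localhost:3002",
    ];
    for address in cluster {
        println!("{address} -> {:?}", node_index(address, &cluster));
    }
    // An address that is not listed does not match any node.
    println!("{:?}", node_index("http://localhost:3003", &cluster));
}
```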

💡

You should run a cluster of at least 2 nodes. The recommended number is 3 or any larger odd number. An even number of nodes will still work, but the benefits of the consensus algorithm are somewhat diminished due to the majority rule: the majority of 2 nodes is 2, of 3 nodes is 2, of 4 nodes is 3, of 5 nodes is also 3, etc.
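The majority arithmetic can be written down directly. The sketch below uses two hypothetical helpers: `majority` gives the number of nodes needed for a majority, and `tolerated_failures` shows why an even node count adds little, since 4 nodes tolerate no more failures than 3.

```rust
/// Number of nodes required to form a majority in a cluster of `nodes`.
fn majority(nodes: usize) -> usize {
    nodes / 2 + 1
}

/// Number of nodes that can be lost while a majority is still reachable.
fn tolerated_failures(nodes: usize) -> usize {
    nodes.saturating_sub(majority(nodes))
}

fn main() {
    for nodes in 2..=5 {
        println!(
            "{} nodes: majority {}, tolerated failures {}",
            nodes,
            majority(nodes),
            tolerated_failures(nodes)
        );
    }
}
```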

node0 will have index 0, node1 index 1 and node2 index 2 because their respective addresses (address field) sit at these positions in the cluster array, which is the same in all configs and thus produces the same cluster hash. This mechanism helps protect the cluster during a topology change. However, if you need to change the number of nodes in the cluster, you risk a split brain. For instance, if you changed the number of nodes from 3 to 5 on node0 only, node1 and node2 would still see the old topology of 3 nodes, elect one of themselves as leader and continue as normal, while node0 with the 2 new nodes could do the same (the majority in a cluster of 5 is 3 nodes). If you must for any reason change the topology:

Perform backup of all nodes
Prevent new nodes from starting (do NOT start new nodes under any circumstances yet)
Update the configuration of the OLD (existing) nodes with the new nodes one by one
Wait for the cluster leader to be established (this is only possible if the majority rules allow it: going from 3 to 5 is fine, going from 3 to 7 is not)
Bring the new nodes up and let them synchronize with the leader
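The majority condition in the steps above (the existing nodes must be able to establish a leader under the new topology before the new nodes start) can be checked with a small calculation. This is a sketch under that reading of the steps: `old_nodes_can_elect` is a hypothetical helper, not part of the server.

```rust
/// Number of nodes required to form a majority in a cluster of `nodes`.
fn majority(nodes: usize) -> usize {
    nodes / 2 + 1
}

/// True when the existing nodes alone reach a majority of the new topology,
/// so they can elect a leader while the new nodes are still down.
fn old_nodes_can_elect(old_nodes: usize, new_total: usize) -> bool {
    old_nodes >= majority(new_total)
}

fn main() {
    // The two cases mentioned in the steps above.
    println!("3 -> 5 nodes: {}", old_nodes_can_elect(3, 5));
    println!("3 -> 7 nodes: {}", old_nodes_can_elect(3, 7));
}
```

For larger changes, one option consistent with this check is to grow the cluster in several smaller steps, each of which keeps the existing nodes in the majority.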

⚠️

Make topology changes only when absolutely necessary and exercise great care when doing so.

Misc

Following are the special or miscellaneous endpoints:

Endpoint	Description
/api/v1	serves rapidoc OpenAPI GUI (use this in the browser)
/api/v1/openapi.json	returns the server’s OpenAPI specification as json
/api/v1/status	returns 200 OK if the server is ready (up)
/api/v1/cluster/status	returns the list of cluster nodes indicating which nodes are reachable from the current node and which node is the leader

Queries Studio