Server
The agdb_server
is the OpenAPI REST server that provides remote agdb
database management. Running the server is trivial as there are no dependencies, no complicated configuration etc. It can be run on any platform supported by Rust. Please follow the guide:
The server is based on axum
and uses OpenAPI to specify its API (via utoipa
) and rapidoc
for the OpenAPI GUI. To interact with the server you can use the rapidoc GUI, curl
or any of the available API clients. Internally it uses the agdb
database:
GUI accessible at (run in a browser when the server is running):
http://localhost:3000/api/v1
Configuration
The server will create default configuration when run and always reads it from the working directory if it exists:
# agdb_server.yaml
bind: ":::3000" # address to listen at (bind to)
address: "localhost:3000" # address the incoming connections will come from
basepath: "" # base path to append to the address in case the server is to be run behind a reverse proxy
admin: admin # the admin user that will be created automatically for the server, the password will be the same as name (admin by default, recommended to change after startup)
data_dir: agdb_server_data # directory to store user data
log_level: INFO # Options are: OFF, ERROR, WARN, INFO, DEBUG, TRACE
pepper_path: "" # Optional path to a runtime secret file containing 16 bytes "pepper" value for additionally "seasoning" (hashing) passwords. If empty a built-in pepper value is used - see "How to run the server?" guide for details
cluster_token: cluster # token used between members of the cluster for authentication, treat this value as secret
cluster_heartbeat_timeout_ms: 1000 # number of milliseconds since last message sent to a node in the cluster before the leader sends a heartbeat message
cluster_term_timeout_ms: 3000 # number of milliseconds without receiving a message from the leader after which the nodes will consider leader to be off and begin a new term
cluster: [] # list of "address" fields of all nodes in the cluster including local node - the order and values must be the same in all members of the cluster
You can prepare it in advance in a file agdb_server.yaml
. After the server database is created changes to the admin
field will have no effect, but the other settings can be changed later. All config changes require server restart to take effect.
The server is built with a default pepper
(can be changed as part of the build ) that is used if pepper_path
is not specified. If unique it makes sure the two different agdb_server
instances do not produce the same password hashes and enhances the security. The pepper
value is to be considered a secret.
|
Users
The server has a single admin account (admin
by default, configurable with password being the name) that can perform any regular user action + all admin actions such as creating users. You can use this account for using the database locally, but it would be advisable to use it only for maintaining the server and to create a regular user for use with the databases:
# produce an admin API token, e.g. "bb2fc207-90d1-45dd-8110-3247c4753cd5"
token=$(curl -X POST -H 'Content-Type: application/json' localhost:3000/api/v1/user/login -d '{"username":"admin","password":"admin"}')
# using admin token to create a user
curl -X POST -H "Authorization: Bearer ${token}" localhost:3000/api/v1/admin/user/my_db_user/add -d '{"password":"password123"}'
# login as the new user and producing their token
token=$(curl -X POST -H 'Content-Type: application/json' localhost:3000/api/v1/user/login -d '{"username":"my_db_user","password":"password123"}')
Users are allowed to change their password and to create and manipulate databases.
Available user APIs:
Action | Description |
---|---|
/api/v1/user/change_password | changes the current user’s password |
/api/v1/user/login | logs in the user returning an API token |
/api/v1/user/logout | logs out the user and invalidating the API token * |
/api/v1/user/status | returns current user’s username and whether it is a server admin * |
The login is shared meaning if you log in twice even from different devices you will get the same shared API token of that user. Similarly, when the logout
endpoint is used this token is invalidated across all sessions.
Databases
Any user can create, remove and manipulate their own databases. To create a database:
curl -X POST -H "Authorization: Bearer ${token}" localhost:3000/api/v1/db/my_db_owner/my_db/add?db_type=mapped
Note that a user can only create databases under their own name. The db_type
can be one of:
memory # memory only database, basically a cache
mapped # memory mapped database, using memory for reading but persisting changes to the disk
file # file based database only, no memory caching, reading/writing from/to disk
It is possible to add an existing database to the server. Move the db file to the server data folder and run /api/v1/db/{owner}/{db}/add
API as if you were creating a new database with the db’s name. If the file exists it will be added rather than created. Similarly, you can remove database (instead of deleting it) from the server with /api/v1/db/{owner}/{db}/remove
API that will disassociate the db from the server which you can then move and use elsewhere.
Database Users
Each database is scoped to one user (owner) who can exercise full control over it. The owner can add more users (they must exist on the server) to the database including admin level users with one of three roles:
read # can only run immutable exec queries
write # can run mutable and immutable exec queries
admin # same as write but can also admin the database
The admin users can do some (but not all) actions that the owner can:
Database Actions
Action | Permission | Description |
---|---|---|
/api/v1/db/{owner}/{db}/add | owner | adds (from existing files) or creates a database (memory, memory mapped, file only) |
/api/v1/db/{owner}/{db}/audit | read | returns the log of all mutable queries that ran against the database (with user who ran them) |
/api/v1/db/{owner}/{db}/backup | admin | creates an automatic backup snapshot of the database (see backup docs below) |
/api/v1/db/{owner}/{db}/clear | admin | clears the content of the database (either all, db only, audit only, backup only) |
/api/v1/db/{owner}/{db}/convert | admin | converts db between memory/mapped/file |
/api/v1/db/{owner}/{db}/copy | read | creates a copy of the database under the current user |
/api/v1/db/{owner}/{db}/delete | owner | deletes the database including files on disk |
/api/v1/db/{owner}/{db}/exec | read | executes queries against the database (does not allow mutable queries) |
/api/v1/db/{owner}/{db}/exec_mut | write | executes queries against the database (allows both mutable and immutable queries) |
/api/v1/db/{owner}/{db}/list | read | lists the databases with role of the current user (owned and others’) |
/api/v1/db/{owner}/{db}/optimize | write | optimizes the underlying file storage packing the data reclaiming unused regions (defragmenting) |
/api/v1/db/{owner}/{db}/remove | owner | removes the database from the server but keeps the files on disk (main, WAL, backup, audit) |
/api/v1/db/{owner}/{db}/rename | owner | changes the name of the database (this API can be used to transfer db ownership) |
/api/v1/db/{owner}/{db}/restore | admin | restores the database from the automatic backup - the current database will become the backup |
/api/v1/db/{owner}/{db}/user/add | admin | adds a user to the database |
/api/v1/db/{owner}/{db}/user/list | read | list users of the database with their roles |
/api/v1/db/{owner}/{db}/user/remove | admin | removes a user from the database |
Backups
Each database can be backed up. The backup API /api/v1/db/{owner}/{db}/backup
has no parameters and will always back up the database under the same name to the “backups” subfolder in the owner’s data. The database can be restored with /api/v1/db/{owner}/{db}/restore
at which point the existing backup will become the main database and the current database will become the backup so it is not possible to “lose” the current state even if accidentally “restoring”. You can revert to the state before backup by running another /api/v1/db/{owner}/{db}/restore
. If you need more granular backup or multiple backups you can devise your own scheme using the /api/v1/db/{owner}/{db}/copy
, /api/v1/db/{owner}/{db}/rename
and possibly /api/v1/db/{owner}/{db}/remove
or /api/v1/db/{owner}/{db}/delete
APIs.
Queries
All queries are executed using the single /api/v1/db/{owner}/{db}/exec
(read only queries) and /api/v1/db/{owner}/{db}/exec_mut
(for queries that also write to the database) endpoint and are exactly the same as in the embedded/application database (see Queries documentation). However, depending on the user’s role the server may reject executing the queries (i.e. mutable queries executed by the user with read
role in the database). The endpoints accept a list of queries and the entire list is run as a transaction meaning either all queries succeed or none of them do. The endpoint will return list of results, one per executed query.
It is possible to reference queries from each other in the list and the server will inject results of the referenced queries to the next one. This is slight extension to the vanilla agdb
queries. It is best illustrated by an example:
let queries = &vec![
QueryBuilder::insert().nodes().count(1).query().into(), // :0
QueryBuilder::insert().nodes().count(1).query().into(), // :1
QueryBuilder::insert().edges().from(":0").to(":1").query().into(), // :2
QueryBuilder::search().from(":0").to(":1").query().into(), //:3
];
In places where the alias
can be used (an ids
identifier) you can use an index prefixed by :
to inject result of the previous query in the list. In the example we are inserting two separate nodes and then creating an edge between them and finally searching from one node to the other. An index to the results can be used in most places including conditions. What is currently not possible is to inject data (i.e. key-value properties) from results to subsequent queries.
Transactions
While the server serves each request asynchronously and can serve any number of clients at the same time the queries and individual databases must still follow the basic principles of the agdb
that are the same as in the embedded variant (and derived from Rust itself):
There can be either:
- unlimited amount of immutable transactions
- exactly one mutable transaction
However, the agdb
is written in such a way that it performs excellently even under heavily contested read/write load. See agdb_benchmarks
and performance documentation.
Admin
Each agdb_server
has exactly one admin account (admin
by default) that acts as a regular user but additionally is allowed to execute APIs under /admin/
. These mostly copies the APIs for regular users but some of the restrictions are not enforced (i.e. ownership or db role). Furthermore, the admin has access to the following exclusive APIs:
Action | Description |
---|---|
/api/v1/admin/db/* | provides same endpoints as for regular users but without owner/role restrictions |
/api/v1/admin/shutdown | gracefully shuts down the server |
/api/v1/admin/status | lists extended statistics of the server - uptime, # dbs, # users, # logged users, server data size |
/api/v1/admin/user/{username}/add | adds new user to the server |
/api/v1/admin/user/{username}/change_password | changes password of a user |
/api/v1/admin/user/{username}/logout | force logout of any user |
/api/v1/admin/user/{username}/remove | deletes user and all their data (databases) from the server |
/api/v1/admin/user/list | lists the all users on the server |
Shutdown
The server can be gracefully shutdown with CTRL+C
or programmatically by using the /api/v1/admin/shutdown
endpoint which requires admin token, e.g.
token=$(curl -X POST -H 'Content-Type: application/json' localhost:3000/api/v1/user/login -d '{"username":"admin","password":"admin"}') #will produce a token, e.g. "bb2fc207-90d1-45dd-8110-3247c4753cd5"
curl -X POST -H "Authorization: Bearer ${token}" localhost:3000/api/v1/admin/shutdown
Cluster
You can an in most cases should run the server in a cluster for resiliency and durability of the data. The exception would be if you require speed at the cost of resiliency and durability and the agdb
is used more as a cache rather than main database.
The agdb_server
is using custom implementation of a Raft consensus algorithm. The benefits of the algorithm are durability, data consistency and speed as the actions do not need to be persisted to all nodes (only majority) before acknowledging the client. The custom implementation in agdb
additionally offers the following features and guarantees atop of Raft:
- Write action acknowledgement happens only when the respective action was executed fully (not just that the action was received by the majority as is the standard implementation).
- Clients can request write operations through any member of the cluster (not only the leader) and the node will forward it to the current leader and act as a proxy.
- When forwarding the action the node will only acknowledge the client when both of the following becomes true:
- The leader committed the operation meaning the majority of the nodes persisted the action.
- The node through which the action was performed executed the action itself and in full.
- Even when there is no leader elected such as when the cluster is being (re)deployed the read operations are always available
This means that you can freely choose a node and perform any action through it and observe consistent results at the minor performance penalty due to the forwarding (unless you picked the leader node). It might be useful in situations where reads are more frequent as they would be more spread out across the cluster.
This however does NOT prevent inconsistencies in situations where every request is sent to a different node (e.g. when accessing the cluster through single service in Kubernetes). The agdb
does not offer consistency across all nodes at all times - only leader + local node (if different) + number of other (unspecified) nodes required to reach majority.
The pepper value (used for additionally hashing the passwords, see the configuration section) also needs to be the same across the cluster as when sending actions to do with passwords only the hashed values are sent between nodes rather than raw passwords for security reasons.
The nodes use cluster_token
from the config (see above) and calculated cluster hash
to authenticate to each other. The hash is the calculated value based on the cluster
array in the config listing all the nodes (address
values) in the cluster. It is required for the cluster
array to be the same in the config files of all nodes. Additionally, when the cluster
is not empty the local address
value must match one of the addresses. The index of such address is the index of the node in the cluster.
Example:
#node0 agdb_server.yaml
address: http://localhost:3000
cluster: ["http://localhost:3000", "http://localhost:3001", "http://localhost:3002"]
#node1 abdb_server.yaml
address: http://localhost:3001
cluster: ["http://localhost:3000", "http://localhost:3001", "http://localhost:3002"]
#node2 agdb_server.yaml
address: http://localhost:3002
cluster: ["http://localhost:3000", "http://localhost:3001", "http://localhost:3002"]
You should run a cluster of at least 2 nodes. Recommended number is 3 or any odd number. The even number of nodes will still work but the benefits of the consensus algorithm will be somewhat diminished due to the majority rule. Majority of 2 is 2 nodes, of 3 nodes is 2 nodes, of 4 nodes is 3 nodes, of 5 nodes is also 3 nodes etc.
The node0 will have the index 0
, the node1 will have the index 1
and node2 will have the index 2
as their respective addresses (address
field) match these indexes in the cluster
array which itself is the same in all configs and thus produces the same cluster hash. This mechanism helps protect the cluster during topology change. However, if you need to change number of nodes in the cluster, you risk the split brain issue. For instance if you changed the number of nodes from 3 to 5 on node0 then the node1 and node2 would still see the old topology of 3 nodes, elect either as leader and continued as normal. Whereas the node0 with the 2 new nodes could do the same (as majority in cluster of 5 is 3 nodes). If you must for any reason change the topology:
- Perform backup of all nodes
- Prevent new nodes from starting (do NOT start new nodes under any circumstances yet)
- Update the configuration of the OLD (existing) nodes with the new nodes one by one
- Wait for the cluster leader to be established (this is only possible if the majority rules allow it: going from 3 to 5 is fine, going from 3 to 7 is not)
- Bring the new nodes up and let them synchronize with the leader
Make topology changes only when absolutely necessary and excercise great care when doing it.
Available cluster endpoints:
Action | Description |
---|---|
/api/v1/cluster/status | returns the list of cluster nodes indicating which nodes are reachable from the current node and which node is the leader |
/api/v1/cluster/user/login | logs in the user cluster wide so that using the same token accessing other nodes will not require new login, this might overwrite existing tokens on other nodes |
/api/v1/cluster/user/logout | invalidate tokens on all nodes in the cluster |
Misc
Following are the special or miscellaneous endpoints:
Endpoint | Description |
---|---|
/api/v1 | serves rapidoc OpenAPI GUI (use this in the browser) |
/api/v1/openapi.json | returns the server’s OpenAPI specification as json |
/api/v1/status | returns 200 OK if the server is ready (up) |