Version: 0.3.51
Date: October 10, 2025
GraphLake is a graph database that supports a unified OneGraph model, seamlessly combining property graphs and RDF. This means you do not need to choose between the two graph types; it supports both simultaneously, offering flexibility and versatility for a wide range of use cases.
Designed for scalability and performance, GraphLake supports both large analytical workloads and Online Transaction Processing (OLTP) scenarios. Its architecture draws inspiration from MPP-style systems such as Apache Iceberg and Delta Lake. However, GraphLake introduces automatic data partitioning, with representation and filtering mechanisms specifically optimized for graph data structures.
GraphLake is engineered to work efficiently with files stored locally on high-performance NVMe or SSD drives or in cloud storage solutions such as Amazon S3 (and S3-compatible options like MinIO) or Azure Blob Storage. The system allows independent scaling of compute and storage resources to match workload requirements. For instance, you can store large volumes of data cost-effectively while using a small compute instance, or opt for a configuration with smaller datasets and larger compute resources to support high-performance querying.
Leveraging parallel processing and memory, GraphLake delivers strong performance that scales further as hardware resources are increased.
GraphLake is not offered as a managed service. Instead, it is designed to be deployed within your cloud environment or run on your hardware, giving you complete control over your setup and infrastructure. We provide a developer edition that is free to use indefinitely, making it easy to get started with GraphLake and explore its capabilities.
The GraphLake developer edition is available now; commercial releases are coming soon. If you would like to know more or start using GraphLake commercially, please email contact@dataplatformsolutions.com.
Pull the Docker image with:
docker pull dataplatformsolutions/graphlake:latest
Download the Visual Studio Code extension from https://graphlake.net/downloads/graphlake-1.0.0.vsix
To support deploying GraphLake into your cloud, we provide example Terraform scripts that can be adapted to your specific setup. Deployment of GraphLake through the AWS and Azure marketplaces is coming soon.
AWS Terraform: https://github.com/dataplatformsolutions/graphlake-deploy
Available in the AWS marketplace - coming soon!
Azure Terraform: https://github.com/dataplatformsolutions/graphlake-deploy
Available in the Azure marketplace - coming soon!
DigitalOcean Terraform: https://github.com/dataplatformsolutions/graphlake-deploy
GraphLake supports the following startup config environment variables:
| Environment Variable | Description |
|---|---|
| GRAPHLAKE_BACKEND_TYPE | The storage backend to use (local, s3, azure) - only local is supported at the moment |
| GRAPHLAKE_STORE_PATH | The path to store data when using local backend |
| GRAPHLAKE_PORT | The port to listen on (default 7642) |
| GRAPHLAKE_LOG_LEVEL | The log level to use - debug or info |
| GRAPHLAKE_FILE_CACHE_SIZE | Size of data file cache (defaults to 100 entries per shard) |
| GRAPHLAKE_METADATA_CACHE_SIZE | Size of metadata (.meta) cache (defaults to 100 entries) |
| GRAPHLAKE_SNAPSHOT_CACHE_SIZE | Number of snapshots cached in memory (defaults to 5) |
| GRAPHLAKE_ADMIN_PASSWORD | Password for user admin. Used for Business edition and up to secure GraphLake. |
| GRAPHLAKE_AZURE_ACCOUNT_NAME | Azure storage account name used when backend type is azure |
| GRAPHLAKE_AZURE_ACCOUNT_KEY | Azure storage account key |
| GRAPHLAKE_AWS_ACCESS_KEY | AWS access key for S3 backend |
| GRAPHLAKE_AWS_SECRET_KEY | AWS secret key for S3 backend |
| GRAPHLAKE_AWS_REGION | AWS region for S3 backend |
| GRAPHLAKE_S3_BUCKET | Bucket name used for S3 backend |
| GRAPHLAKE_S3_ENDPOINT | Custom S3 endpoint (optional) |
| GRAPHLAKE_DOCKER_VERSION_FILE | File containing version information when running inside Docker |
| GRAPHLAKE_MULTI_NODE | Set to true to enable multi-node coordination |
To run GraphLake in a Docker container as a detached process, passing the environment variables and mapping a local folder to the configured data path, you would run:
docker run -d \
-e GRAPHLAKE_BACKEND_TYPE=local \
-e GRAPHLAKE_STORE_PATH=/data \
-v /path/to/local/data:/data \
-p 7642:7642 \
dataplatformsolutions/graphlake:latest
GraphLake is available in four editions.
For enterprise editions please reach out to us on email at contact@dataplatformsolutions.com to discuss your requirements.
# Example: Create a store and a graph, then import data.
# 1. Create a new store
curl -X POST http://localhost:7642/stores/myNewStore
# 2. Create a new graph in that store
curl -X POST http://localhost:7642/stores/myNewStore/graphs/myGraph
# 3. Import N-Triples data from a file
curl -X POST http://localhost:7642/stores/myNewStore/graphs/myGraph/import \
-H "Content-Type: text/plain" \
--data-binary @path/to/your/file.nt
# 4. Query the data
curl -X POST http://localhost:7642/stores/myNewStore/query \
-H "Content-Type: text/plain" \
--data "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"
# 5. Update with an invariant
curl -X POST http://localhost:7642/stores/myNewStore/update_with_invariant \
-H "Content-Type: application/json" \
-d '{ "update": "INSERT DATA { GRAPH <http://example.com/graph> { <http://example.com/alice> <http://example.com/name> \"Alice\" . } }", "invariant": "ASK WHERE { <http://example.com/alice> <http://example.com/name> \"Alice\" . }" }'
Data can be loaded into a graph using the /stores/:store/import endpoint. N-Triples, Turtle, and CSV are supported. Provide data in the request body or reference files that have been copied to the server.
If the format query parameter is omitted, the server infers the format from the Content-Type header. Use text/turtle for Turtle data and application/n-triples for N-Triples.
When importing from files, place them under the store’s import directory. For a store named myStore this path is /stores/myStore/import inside the store path (for the Docker image the store path is mounted at /store, so the full path would be /store/stores/myStore/import). The location parameter refers to a file or folder inside this directory.
Example: Import data sent in the request body
POST /stores/myStore/import?graph=myGraph
Content-Type: application/n-triples
<http://example.org/s> <http://example.org/p> "obj" .
Example: Import Turtle data
POST /stores/myStore/import?graph=myGraph
Content-Type: text/turtle
@prefix ex: <http://example.org/> .
ex:s ex:p "obj" .
Example: Import data from a file on the server
Copy data.nt to /store/stores/myStore/import/data.nt and then call:
POST /stores/myStore/import?graph=myGraph&location=data.nt
For CSV imports, include format=csv and specify csvType=nodes or csvType=edges:
POST /stores/myStore/import?graph=myGraph&location=nodes.csv&format=csv&csvType=nodes
CSV files describe either nodes or edges:
Node CSV
- rid (optional): existing node identifier. If omitted, a blank node identifier is generated.
- type, type_2, ... (optional): values become rdf:type triples.
- graph (optional): named graph; defaults to the import graph.
- Other columns become node properties. Identifiers, type values, and property column names are expanded using the idPrefix, typePrefix, and predicatePrefix query parameters.

With predicatePrefix=http://example.com/, typePrefix=http://example.com/, and idPrefix=http://example.com/:
rid,type,graph,name,tags
1,person,people,Alice,"rdf;graph;ai"
Edge CSV
- from and to: subject and object node ids or IRIs.
- pid or predicate: predicate for the edge.
- graph (optional): named graph for the edge.

Values in from and to without a prefix are expanded using idPrefix. The pid/predicate value and any property column names are expanded using predicatePrefix.
Edges must reference existing nodes by rid or full IRI. Nodes imported without a rid receive generated identifiers that cannot be known ahead of time, so provide explicit rid values for nodes that will be linked by edge CSVs.
With idPrefix=http://example.com/ and predicatePrefix=http://example.com/:
from,to,pid,since
1,2,knows,2010
This yields <http://example.com/1> <http://example.com/knows> <http://example.com/2> and a property <http://example.com/since> "2010".
All CSV files require a header row.
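For example, the edge file above could be imported with the following request; the prefix values mirror the node example and are illustrative:
POST /stores/myStore/import?graph=myGraph&location=edges.csv&format=csv&csvType=edges&idPrefix=http://example.com/&predicatePrefix=http://example.com/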
A graph manifest lets you import multiple graphs in one request. Create a folder inside the import directory and add a graphs.json file mapping graph names to subdirectories that hold the data files:
{
"people": "people-data",
"products": "products-data"
}
The folder structure would be:
/store/stores/myStore/import/batch/
graphs.json
people-data/
people.nt
products-data/
products.nt
Import all graphs listed in the manifest with:
POST /stores/myStore/import?location=batch&graphManifest=true
The optional branch parameter imports into a specific branch (default is main).
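For example, to import the manifest above into a branch named dev (an illustrative branch name) instead of main:
POST /stores/myStore/import?location=batch&graphManifest=true&branch=dev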
GraphLake supports a subset of SPARQL features for querying RDF data. This document outlines the supported features, including query forms, patterns, and functions.
Below is a simple example that creates a graph and queries it.
curl -X POST http://localhost:7642/stores/test/graphs/example
curl -X POST http://localhost:7642/stores/test/import?graph=example \
-H "Content-Type: text/plain" \
--data "<s> <p> <o> ."
curl -X POST http://localhost:7642/stores/test/query \
-H "Content-Type: application/sparql" \
--data "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"
SELECT ?subject ?predicate ?object
WHERE {
?subject ?predicate ?object.
}
ASK WHERE {
?subject ?predicate ?object.
}
SELECT ?subject ?object
WHERE {
?subject <http://example.org/predicate> ?object.
}
SELECT ?subject ?object
WHERE {
{
?subject <http://example.org/predicate1> ?object.
}
UNION
{
?subject <http://example.org/predicate2> ?object.
}
}
SELECT ?subject ?object ?optionalObject
WHERE {
?subject <http://example.org/predicate> ?object.
OPTIONAL { ?subject <http://example.org/optionalPredicate> ?optionalObject. }
}
SELECT ?subject ?object
WHERE {
?subject <http://example.org/predicate> ?object.
FILTER (?object > 10)
}
SELECT ?subject
WHERE {
?subject <http://example.org/predicate> ?object.
FILTER (BOUND(?object))
}
SELECT ?subject
WHERE {
?subject ?predicate ?object.
FILTER (isIRI(?subject))
}
SELECT ?subject
WHERE {
?subject ?predicate ?object.
FILTER (isLiteral(?object))
}
SELECT ?subject
WHERE {
?subject ?predicate ?object.
FILTER (isBlank(?subject))
}
SELECT ?subject (STR(?object) AS ?objectStr)
WHERE {
?subject ?predicate ?object.
}
SELECT ?subject (LANG(?object) AS ?lang)
WHERE {
?subject ?predicate ?object.
}
SELECT ?subject (DATATYPE(?object) AS ?datatype)
WHERE {
?subject ?predicate ?object.
}
SELECT ?subject
WHERE {
?subject ?predicate ?object.
FILTER (CONTAINS(STR(?object), "example"))
}
SELECT ?subject
WHERE {
?subject ?predicate ?object.
FILTER (STRSTARTS(STR(?object), "http://"))
}
SELECT ?subject
WHERE {
?subject ?predicate ?object.
FILTER (STRENDS(STR(?object), ".org"))
}
SELECT ?c WHERE {
BIND(TRIPLECOUNT() AS ?c)
}
GraphLake recognises common GeoSPARQL predicates and functions for spatial search. Use
geof:within, geof:intersects, or the Simple Features aliases
geof:sfWithin and geof:sfIntersects inside FILTER
expressions. Geometry literals may include explicit coordinate reference systems either via
the SRID=4326;POLYGON(...) form or by prefixing the geometry with a CRS IRI such as
<http://www.opengis.net/def/crs/EPSG/0/4326>. Both styles are parsed and matched
during query execution, and triple-quoted string literals can be bound with
BIND to hold the WKT typed as geo:wktLiteral when composing complex
shapes directly inside the query text.
Property paths can simplify traversal from features to geometry nodes. The example below
binds a WKT polygon with a CRS IRI, walks from each field feature to its geometry using
geo:hasGeometry/geo:asWKT, and returns the matching field names ordered
alphabetically.
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX : <http://example.org/farm#>
SELECT ?field ?name
WHERE {
BIND("<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((10.004 58.999, 10.016 58.999, 10.016 59.009, 10.004 59.009, 10.004 58.999))"^^geo:wktLiteral AS ?areaWkt)
?field a geo:Feature ; :name ?name ; geo:hasGeometry/geo:asWKT ?wkt .
FILTER( geof:sfIntersects(?wkt, ?areaWkt) )
}
ORDER BY ?name
When a stored geometry omits a CRS identifier, GraphLake assumes EPSG:4326. Mixing literals
that use the SRID= syntax with CRS IRIs will also match as long as they resolve to
the same SRID.
GraphLake's SPARQL support includes essential query forms, patterns, and functions to perform effective graph data queries. Use the provided examples to construct and execute your SPARQL queries.
Quoted triples are supported via RDF*. Use the same syntax as SPARQL-star to insert and select embedded triples.
PREFIX : <http://example.org/>
INSERT DATA {
<< :s :p :o >> :source "example" .
}
PREFIX : <http://example.org/>
SELECT ?src WHERE {
<< :s :p :o >> :source ?src .
}
SemanticCypher is GraphLake's OpenCypher-based query language. Queries use PREFIX statements to declare namespaces. Labels and property names that omit a prefix are expanded using a default namespace; you can declare the default explicitly, otherwise GraphLake falls back to its internal namespace. Every node exposes a reserved rid property holding the RDF subject IRI.
The current implementation supports the following constructs:
- PREFIX – declare a namespace prefix.
- CREATE – create nodes and relationships.
- MERGE – upsert nodes or relationships.
- DELETE and DETACH DELETE – remove nodes or relationships.
- MATCH – pattern matching over multiple hops.
- OPTIONAL MATCH – optional pattern matching.
- WHERE with =, !=, <, <=, >, >=, AND, OR.
- Variable-length relationships with * and range syntax.
- SET – assign properties.
- REMOVE – drop properties or labels from nodes.
- WITH – chain query parts.
- UNWIND – iterate over list expressions.
- ORDER BY, SKIP and LIMIT – sort and paginate results.
- RETURN and RETURN DISTINCT – produce query results.
- UNION and UNION ALL – combine result sets.
- Aggregations COUNT, SUM, AVG, MIN and MAX with grouping.
- Query parameters with the $name syntax in expressions.
- exists() checks.

Example query returning all people and their identifiers:
PREFIX foaf http://xmlns.com/foaf/0.1/
MATCH (p:foaf:Person)
RETURN p.rid AS id
Matching a node by its resource identifier:
MATCH (p {rid:'http://example.com/Alice'})
RETURN p
Creating nodes and relationships with a declared default namespace for unprefixed property names:
PREFIX ex http://example.org/
PREFIX : http://schema.org/
CREATE (a:ex:Person {name:'Alice'})-[:ex:KNOWS]->(b:ex:Person {name:'Bob'})
Here the property name is expanded to http://schema.org/name. If no default is declared, unqualified identifiers are placed in GraphLake's internal namespace.
Matching with filters and property updates:
MATCH (p:ex:Person)-[r:ex:KNOWS]->(f:ex:Person)
WHERE p.name = 'Alice' AND f.age >= 30
SET r.since = 2020
RETURN p.name AS person, f.name AS friend
Using UNWIND and WITH to work with lists:
UNWIND [1,2,3] AS n
WITH n WHERE n > 1
RETURN n
Deleting a relationship:
MATCH (a)-[r:ex:KNOWS]->(b)
DELETE (a)-[r]->(b)
Variable length path query:
MATCH (a {rid:'http://example.com/Alice'})-[:ex:KNOWS*1..2]->(b)
RETURN b
Optional matching:
MATCH (a {rid:'http://example.com/Dave'})
OPTIONAL MATCH (a)-[:ex:KNOWS]->(b)
RETURN a, b
Ordering and pagination:
MATCH (a)-[:ex:KNOWS]->(b)
RETURN b ORDER BY b.rid SKIP 1 LIMIT 1
GraphLake provides a JavaScript-based query language to interact with the graph data. This language allows you to perform various operations such as matching triples, applying updates, and writing query results.
Example script returning all triples from myGraph:
_writer.writeHeader(["s","p","o"]);
let it = _context.matchTriples("","","","",true,["myGraph"]);
while (true) {
let t = it.next();
if (t == null) break;
_writer.writeRow([t.subject,t.predicate,t.object]);
}
GraphLake executes the JavaScript and injects two objects into the runtime: _context and _writer. The context object is used to update and query the store. The writer object allows data to be sent back to the calling application.
_context.matchTriples(subject, predicate, object, datatype, isLiteral, graphs)
Description: Matches triples in the specified graphs.
Parameters:
- subject (string): The subject of the triple.
- predicate (string): The predicate of the triple.
- object (string): The object of the triple.
- datatype (string): The datatype of the object.
- isLiteral (boolean): Whether the object is a literal.
- graphs (array of strings): The graphs to search in.

Returns: An iterator for the matched triples.
_context.assertTriple(subject, predicate, object, datatype, isLiteral, graph)
Description: Asserts a new triple in the specified graph.
Parameters:
- subject (string): The subject of the triple.
- predicate (string): The predicate of the triple.
- object (string): The object of the triple.
- datatype (string): The datatype of the object.
- isLiteral (boolean): Whether the object is a literal.
- graph (string): The graph to insert into.

_context.commit()
Description: Commits the current transaction.
Returns: true if the transaction committed successfully, otherwise false.
_context.deleteTriple(subject, predicate, object, datatype, isLiteral, graph)
Description: Deletes a triple from the specified graph.
Parameters:
- subject (string): The subject of the triple.
- predicate (string): The predicate of the triple.
- object (string): The object of the triple.
- datatype (string): The datatype of the object.
- isLiteral (boolean): Whether the object is a literal.
- graph (string): The graph to delete from.

_writer.writeHeader(headers)
Description: Writes the header for the query result.
Parameters:
- headers (array of strings): The header columns.

Returns: Undefined.
_writer.writeRow(row)
Description: Writes a row to the query result.
Parameters:
- row (array of any): The row data.

Returns: Undefined.
_writer.writeHeader(["Subject", "Predicate", "Object"]);
let triplesIter = _context.matchTriples("http://example.org/subject", "", "", "", true, ["graph1"]);
while (true) {
let triple = triplesIter.next();
if (triple == null) {
break;
}
_writer.writeRow([triple.subject, triple.predicate, triple.object]);
}
Explanation: This script matches all triples with the subject
http://example.org/subject in graph1 and writes the results to the output.
// Delete an existing triple
_context.deleteTriple("http://example.org/subject", "http://example.org/predicate", "http://example.org/object", "", false, "graph1");
// Add two new triples
_context.assertTriple("http://example.org/subject1", "http://example.org/predicate1", "http://example.org/object1", "", false, "graph1");
_context.assertTriple("http://example.org/subject2", "http://example.org/predicate2", "http://example.org/object2", "", false, "graph1");
// Commit the transaction
_context.commit();
Explanation: This script deletes an existing triple, asserts two new triples, and commits the transaction.
Example 3: Writing Headers and Rows
_writer.writeHeader(["Id", "Name"]);
let triplesIter = _context.matchTriples("http://example.org/person", "http://example.org/name", "", "", true, ["graph1"]);
while (true) {
let triple = triplesIter.next();
if (triple == null) {
break;
}
_writer.writeRow([triple.subject, triple.object]);
}
Explanation: This script writes a header with columns "Id" and "Name", matches triples with the
subject http://example.org/person and predicate http://example.org/name in
graph1, and writes the results to the output.
_writer.writeHeader(["Subject", "Predicate", "Object"]);
let graphs = ["graph1", "graph2"];
let triplesIter = _context.matchTriples("http://example.org/subject", "", "", "", true, graphs);
while (true) {
let triple = triplesIter.next();
if (triple == null) {
break;
}
_writer.writeRow([triple.subject, triple.predicate, triple.object]);
}
Explanation: This script matches all triples with the subject
http://example.org/subject in both graph1 and graph2 and writes the results
to the output.
GET /health
Check server status.
| Method | GET |
|---|---|
| Endpoint | /health |
| Description | Returns {"status":"ok"} when the server is running. |
POST /stores/:store/query
Execute a query in the specified :store. GraphLake supports three different query languages: SPARQL
(core subset), SemanticCypher, and GraphLake JavaScript. See the query sections for details on each
language. Use the following content types to specify which query language you are using: application/sparql,
application/x-graphlake-query-semanticcypher, application/x-graphlake-query-javascript.
| Method | POST |
|---|---|
| Endpoint | /stores/:store/query |
| Description | Runs a query on a given store. |
| Query Parameters |
|
| Request Body |
A string containing the query. The format depends on the underlying query engine.
|
| Response |
JSON result of the query. For example:
Note the query result structure can differ based on the query type. |
GET /stores/:store/sparql?query=...
Execute a SPARQL query via a URL-encoded query parameter.
| Method | GET |
|---|---|
| Endpoint | /stores/:store/sparql |
| Query Parameters |
|
| Response | Same as /stores/:store/query. |
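For example, a URL-encoded query can be sent with curl's -G and --data-urlencode options:
curl -G "http://localhost:7642/stores/myStore/sparql" \
--data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o }"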
POST /stores/:store/talk
Generate and run a SPARQL query from a natural language question using an LLM.
| Method | POST |
|---|---|
| Endpoint | /stores/:store/talk |
| Request Body | |
| Response | JSON query results. |
Optional query parameters:
- branch: branch name to query.
- schemaGraph: graph containing SHACL schema to provide context.

POST /stores/:store/update_with_invariant
Execute a SPARQL UPDATE while verifying invariant ASK queries before commit. The transaction aborts if any invariant evaluates to false.
| Method | POST |
|---|---|
| Endpoint | /stores/:store/update_with_invariant |
| Description | Runs a SPARQL UPDATE and validates invariants on the resulting data before committing. |
| Request Body |
JSON object with
|
| Response |
Success response:
If the invariant fails:
|
POST /stores/:store/validate
Validate a data graph against SHACL shapes stored in the specified store.
| Method | POST |
|---|---|
| Endpoint | /stores/:store/validate |
| Request Body | |
| Response | SHACL validation report in JSON. |
POST /stores/:store/update
Execute a SPARQL UPDATE and apply optional metadata and tags to the resulting snapshot. The update may opt out of rollup consolidation.
| Method | POST |
|---|---|
| Endpoint | /stores/:store/update |
| Description | Runs a SPARQL UPDATE and records optional snapshot metadata. |
| Request Body |
Either a raw SPARQL update with
|
| Response |
|
GET /stores/:store/snapshots
Retrieve snapshot metadata for a store branch. Additional query parameters filter snapshots by metadata keys.
| Method | GET |
|---|---|
| Endpoint | /stores/:store/snapshots |
| Description | Lists snapshots and their metadata, tags, and rollup flags. |
| Query Parameters |
|
| Response |
|
Snapshots can be labeled with user-defined tags during import and update operations. Use the tag query parameter on read-only endpoints like /stores/:store/query to target a specific tagged snapshot. Listing snapshots with ?tag=myTag returns only snapshots carrying that tag.
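For example (the tag name nightly is illustrative):
# List only snapshots carrying the tag
curl "http://localhost:7642/stores/myStore/snapshots?tag=nightly"
# Run a read-only query against the tagged snapshot
curl -X POST "http://localhost:7642/stores/myStore/query?tag=nightly" \
-H "Content-Type: application/sparql" \
--data "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"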
POST /stores/:store/export?graph=graph_name&location=folder
Exports one or more graphs from a branch to files on disk. The optional graph query parameters may be repeated. If location is not provided the server writes files to the store's exports directory.
| Method | POST |
|---|---|
| Endpoint | /stores/:store/export |
| Description | Export data from the specified store and branch. |
| Query Parameters |
|
| Response | Returns 200 OK when the export completes. |
POST /stores/:store/import?location=name_of_file_or_folder&graph=name_of_graph
Start an asynchronous import job for a specified graph. The data can be sent in the request body or loaded in based on files or folders located in the store import folder. The call returns a JSON object containing a jobId.
Use GET /jobs to list running jobs and GET /jobs/{id} to view the status of a particular job. Job status includes the number of triples processed and, once finished, the total triple count obtained via the TRIPLECOUNT SPARQL function.
Optional tag and meta-* query parameters label the snapshot created by the import with tags and metadata.
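For example, an import can be started with a tag and metadata and then polled until it completes; the meta-source key and the job id placeholder are illustrative:
# Start the import and note the returned jobId
curl -X POST "http://localhost:7642/stores/myStore/import?graph=myGraph&location=data.nt&tag=nightly&meta-source=etl"
# List running jobs and check a specific job
curl http://localhost:7642/jobs
curl http://localhost:7642/jobs/<jobId>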
| Method | POST |
|---|---|
| Endpoint | /stores/:store/import |
| Description | Imports data into the given graph. The data can be sent via request body or by specifying a location parameter. |
| Query Parameters |
If |
| Request Body |
If
For Turtle data, set the header to
|
| Response |
Returns |
GET /jobs
List currently running background jobs.
| Method | GET |
|---|---|
| Endpoint | /jobs |
| Description | Returns an array of job objects. |
GET /jobs/:id
Retrieve the status of a specific job.
| Method | GET |
|---|---|
| Endpoint | /jobs/:id |
| Description | Returns details for job :id. |
POST /stores/:store/graphs/:graph
Create an empty graph in the specified :store.
| Method | POST |
|---|---|
| Endpoint | /stores/:store/graphs/:graph |
| Description | Creates a new graph with the provided name. |
| Request Body | No body is required. The :graph path parameter is the graph name. |
| Response |
|
GET /stores/:store/graphs
Retrieve a list of available graphs in the specified :store.
| Method | GET |
|---|---|
| Endpoint | /stores/:store/graphs |
| Description | Lists all graph names in the store. |
| Response |
An array of strings, each representing a graph name.
|
DELETE /stores/:store/graphs/:graph
Delete a specific graph within a store.
| Method | DELETE |
|---|---|
| Endpoint | /stores/:store/graphs/:graph |
| Description | Deletes the specified graph. |
| Request Body | No body is required. The :graph path parameter identifies the graph. |
| Response |
|
POST /stores/:store
Create a new store.
| Method | POST |
|---|---|
| Endpoint | /stores/:store |
| Description | Creates a new store using the provided name. |
| Request Body | No request body is required. :store is the store name. |
| Response |
|
GET /stores
Retrieve a list of all stores. If security is enabled, only stores accessible to the current user are returned.
| Method | GET |
|---|---|
| Endpoint | /stores |
| Description | Lists all stores. |
| Response |
An array of strings, each representing a store name.
|
DELETE /stores/:store
Delete a specified store. This operation is irreversible.
| Method | DELETE |
|---|---|
| Endpoint | /stores/:store |
| Description | Deletes the specified store. |
| Response |
|
POST /stores/:store/loadalldata
Load all data files of a store into memory caches.
| Method | POST |
|---|---|
| Endpoint | /stores/:store/loadalldata |
| Description | Preloads all data files for faster access. |
| Response | |
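For example:
curl -X POST http://localhost:7642/stores/myStore/loadalldata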
POST /admin/users
Create a new user in the system.
| Method | POST |
|---|---|
| Endpoint | /admin/users |
| Description | Add a new user with a username, password, and optional public key. |
| Request Body |
JSON object containing user information:
|
| Response |
|
DELETE /admin/users/:username
Delete an existing user by username.
| Method | DELETE |
|---|---|
| Endpoint | /admin/users/:username |
| Description | Deletes the specified user from the system. |
| Response |
|
POST /admin/users/:username/keypair
Generate a new private/public key pair for the specified user.
| Method | POST |
|---|---|
| Endpoint | /admin/users/:username/keypair |
| Description | Generates a new key pair and stores the public key for the user. The private key is returned in the response. |
| Response |
Returns a JSON object containing the newly generated key pair:
|
POST /admin/authenticate/password
Obtain a JWT token by providing username and password.
| Method | POST |
|---|---|
| Endpoint | /admin/authenticate/password |
| Description | Authenticates a user with password and returns a JWT token if successful. |
| Request Body |
JSON object with
|
| Response |
|
POST /admin/authenticate/jwt
Verify an existing JWT and obtain a renewed token.
| Method | POST |
|---|---|
| Endpoint | /admin/authenticate/jwt |
| Description | Verifies a JWT and returns a new JWT if valid. |
| Request Body |
JSON object with a
|
| Response |
|
POST /admin/users/:username/rules
Define or update the security rules for a specified user.
| Method | POST |
|---|---|
| Endpoint | /admin/users/:username/rules |
| Description | Sets security/ACL rules for a user. |
| Request Body |
An array of rules. Each rule typically contains resource patterns and permissions (e.g. "read", "write").
Exact structure may vary based on the
|
| Response |
|
GET /stores/:store/branches
Retrieve a list of branches for the specified store.
POST /stores/:store/branches/:branch
Create a new branch in the specified store. Optionally, specify a source branch using the source query parameter (default is main).
DELETE /stores/:store/branches/:branch
Delete a specific branch in the specified store.
Branches allow concurrent versions of the data. Create a branch from main, import or modify data, and include ?branch=name when querying or importing to use it.
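For example (the branch name dev is illustrative):
# Create a branch from main
curl -X POST "http://localhost:7642/stores/myStore/branches/dev?source=main"
# Import into and query the branch
curl -X POST "http://localhost:7642/stores/myStore/import?graph=myGraph&location=data.nt&branch=dev"
curl -X POST "http://localhost:7642/stores/myStore/query?branch=dev" \
-H "Content-Type: application/sparql" \
--data "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"
# List branches
curl http://localhost:7642/stores/myStore/branches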
POST /applications/:appname
Upload a new application with the specified name. The application definition should be sent in the request body.
GET /applications
List all available applications.
GET /applications/:appname
Retrieve the application with the specified name.
DELETE /applications/:appname
Delete the application with the specified name.
Applications group related event classes and aggregates. Once uploaded they can be referenced by name in the application data endpoints.
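For example, an application definition (such as the exampleApp JSON shown later in this section) can be uploaded and listed like this; the application/json content type is an assumption based on the JSON definition format:
curl -X POST http://localhost:7642/applications/exampleApp \
-H "Content-Type: application/json" \
--data-binary @exampleApp.json
curl http://localhost:7642/applications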
These endpoints allow you to interact with application-specific data classes.
GET /apps/:app/:class
Retrieve a list of items for the specified class in the application.
GET /apps/:app/:class/:id
Retrieve a specific item by ID for the specified class in the application.
POST /apps/:app/:class/:id
Create a new item with the specified ID in the specified class and application. The item data should be sent in the request body.
PUT /apps/:app/:class/:id
Update an existing item with the specified ID in the specified class and application. The updated data should be sent in the request body.
DELETE /apps/:app/:class/:id
Delete the item with the specified ID in the specified class and application.
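Using the exampleApp definition shown below, the people resource could be accessed as follows; the item id and JSON payload shape are illustrative:
# Create an item
curl -X POST http://localhost:7642/apps/exampleApp/people/alice \
-H "Content-Type: application/json" \
-d '{"name": "Alice"}'
# List items and fetch one by id
curl http://localhost:7642/apps/exampleApp/people
curl http://localhost:7642/apps/exampleApp/people/alice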
The following data types are used in the application endpoints:
Represents a simple application configuration that exposes REST CRUD operations for a given graph and set of classes or shapes.
Each entry in classes pairs a type URI with a resource name used in URLs. Below is an example of an ApplicationDefinition:
{
"name": "exampleApp",
"store": "exampleStore",
"graph": "exampleGraph",
"schema": "exampleSchema",
"classes": [
{"type": "Person", "resource": "people"},
{"type": "Organization", "resource": "orgs"}
],
"aggregates": [
{
"name": "exampleAggregate",
"root": {
"class": "Event",
"identity_property": "eventId",
"properties": [
{ "name": "timestamp", "property": "eventTime" },
{ "name": "location", "property": "eventLocation" }
]
},
"classes": [
{
"class": "Transaction",
"identity_property": "transactionId",
"properties": [
{ "name": "amount", "property": "transactionAmount" }
]
}
],
"all_properties": true
}
],
"access": "write"
}
This example defines an application named exampleApp that operates on the exampleGraph in the exampleStore. It includes two classes (Person and Organization) and one aggregate definition (exampleAggregate), which aggregates data from Event and Transaction classes.
Describes how to build an aggregate instance from a set of event classes.
Specifies an event class used to build an aggregate and the property on that event that identifies the aggregate instance it applies to.
Maps an event property to the aggregate property name.
These data types are used in the request and response payloads for the application endpoints.
GET /admin/version
Retrieve the current version of the server.
POST /admin/restart
Restart the server.
POST /admin/upgrade
Upgrade the server to the specified version. Provide the version as a query parameter version.
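For example (the version string is illustrative):
curl http://localhost:7642/admin/version
curl -X POST http://localhost:7642/admin/restart
curl -X POST "http://localhost:7642/admin/upgrade?version=0.3.52"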
Manage agent configuration and access conversation logs.
| Method | Endpoint | Description |
|---|---|---|
| GET | /agents/:id/config | Retrieve the active configuration for agent :id. |
| PUT | /agents/:id/config | Replace the configuration for agent :id. Provide YAML in the request body. |
| GET | /agents/:id/logs | List available log cycles. Optional limit and after query parameters page the results. |
| GET | /agents/:id/logs/:cycle | Retrieve the transcript for the specified cycle. |
The configuration endpoint returns a JSON object describing how the agent runs:
{
"ollama_base_url": "http://localhost:11434",
"model": "qwen2.5-coder:7b",
"tick": "10s",
"max_steps": 4,
"mcp_ws_url": "ws://localhost:8081",
"system_prompt": "You are GraphLake.",
"user_prompt": "Answer questions about the graph."
}
Field meanings:
- ollama_base_url: Base URL for the Ollama server.
- model: Name of the LLM to invoke.
- tick: Interval between agent runs (e.g., 10s).
- max_steps: Maximum number of reasoning steps per run.
- mcp_ws_url: WebSocket URL for the MCP server.
- system_prompt: System instructions prepended to conversations.
- user_prompt: Default user prompt when triggering the agent.

The same fields can be supplied as YAML in the body of the PUT /agents/:id/config request.
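For example, the configuration can be fetched and replaced like this; the agent id my-agent and the application/x-yaml content type are assumptions:
# Fetch the current configuration
curl http://localhost:7642/agents/my-agent/config
# Replace it with YAML from a local file
curl -X PUT http://localhost:7642/agents/my-agent/config \
-H "Content-Type: application/x-yaml" \
--data-binary @agent-config.yaml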
The following request and response payloads are referenced throughout the API documentation.
Represents the request body for executing a query.
Represents the response for a query execution.
Defines the request body or parameters for importing data into a graph.
Represents the response for creating a new store.
Represents the request body for creating a new user.
Defines the request body for password-based authentication.
Defines the request body for JWT-based authentication.
Represents the response containing a JWT token.
Represents a security rule for a user.
GraphLake includes a Talk workflow that lets you describe an information need in plain language and receive the equivalent SPARQL query and result set. The workflow pairs schema knowledge with an LLM so the generated query respects your graph model.
Choose the OpenAI model (for example gpt-4o-mini) that will translate questions into SPARQL, then export the environment variables:
export OPENAI_API_KEY=<your-key>
export OPENAI_MODEL=gpt-4o-mini
export OPENAI_BASE_URL=http://localhost:8080/v1 # optional override for compatible endpoints
GraphLake picks up the OPENAI_API_KEY (and optional OPENAI_MODEL or OPENAI_BASE_URL) environment variables and automatically enables the OpenAI-backed Talk integration. No code changes are required. The default client points at https://api.openai.com/v1; setting OPENAI_BASE_URL lets you route requests to a local or proxy OpenAI-compatible deployment. Restart GraphLake after setting the environment variables. Every /stores/:store/talk request and the UI then send prompts to OpenAI instead of the built-in mock model.
The Talk workflow performs best when it can reference SHACL shapes that describe the vocabulary in your graphs. Import the schema into its own graph so the LLM receives structured context:
curl -X POST \
-H "Content-Type: text/turtle" \
--data-binary @schema.ttl \
"http://localhost:7642/stores/demo/import?graph=https://example.org/shapes"
Schema graphs live alongside regular data graphs, so you can version them with branches or keep multiple shape sets for different applications.
Enriching the SHACL shapes with human-readable annotations dramatically improves the queries generated by the
LLM. Add rdfs:label, rdfs:comment, or sh:description triples that explain how the
classes and properties should be used:
ex:PersonShape a sh:NodeShape ;
sh:targetClass ex:Person ;
rdfs:label "Person" ;
sh:description "Customer or prospect record" ;
sh:property [
sh:path ex:email ;
rdfs:label "Email address" ;
sh:description "Primary contact e-mail for the person" ;
sh:datatype xsd:string ;
] .
These annotations become part of the prompt sent to the model, helping it choose the correct predicates and filters when building SPARQL.
To use the Talk UI:
1. Open /talk.html (or choose Talk in the left navigation of the GraphLake UI). The form automatically reuses the most recent store and branch selections.
2. Optionally enter a schema graph name (for example https://example.org/shapes). Supplying a schema graph makes the LLM prompt more precise.
3. Ask your question; the request is sent to /stores/:store/talk using the OpenAI-backed generator.

Because the Talk UI shares browser state with the rest of the console, any stores, branches, or schema graphs selected on other pages stay pre-filled, streamlining exploratory workflows.
GraphLake can run background agents defined in a configuration file.
The LLM agent watches the graph for trigger events, asks a large language model to reason about next steps, and then persists the resulting plan back into GraphLake. This section walks through a complete example that you can copy and adapt for your own automations.
Create a small Turtle file that models an upcoming task and a deadline resource. The task description gives the LLM enough context to plan additional actions.
@prefix ex: <https://example.org/project/> .
@prefix schema: <https://schema.org/> .
ex:Task123 a schema:Action ;
schema:name "Prepare quarterly review" ;
schema:description "Compile financial metrics, gather customer feedback, and rehearse the executive presentation." ;
ex:deadline ex:Deadline123 .
ex:Deadline123 a ex:TaskDeadline ;
schema:dueDate "2025-03-15"^^schema:Date ;
schema:description "Quarterly review deck must be ready for the March 15 board meeting." .
Load the data into a working graph with the import endpoint:
curl -X POST \
"http://localhost:7642/stores/myStore/import?graph=https://example.org/project" \
-H "Content-Type: text/turtle" \
--data-binary @task.ttl
Define an agent configuration YAML file. The llm agent type listens to a SPARQL
trigger_query. When the query returns bindings, the agent builds a prompt template using
the bindings and calls the configured language model provider. The completion is then written back into
the graph with an update_template.
type: llm
tick: 30s
graph: https://example.org/project
trigger_query: |
PREFIX ex: <https://example.org/project/>
PREFIX schema: <https://schema.org/>
SELECT ?task ?taskName ?taskDescription ?deadline
WHERE {
?task a schema:Action ;
schema:name ?taskName ;
schema:description ?taskDescription ;
ex:deadline ?deadline .
?deadline a ex:TaskDeadline .
FILTER NOT EXISTS { ?task ex:generatedPlan ?plan }
}
llm:
provider: openai
model: gpt-4.1
api_key_env: OPENAI_API_KEY
prompt_template: |
You are a planning assistant. The task "{{taskName}}" has the description:
{{taskDescription}}
The deadline resource is {{deadline}}. Suggest up to three concrete follow-up events the team should
schedule in advance so they are ready by the due date. Respond in JSON with an array named "events" and
include fields "title", "purpose", and "leadTimeDays".
update_template: |
PREFIX ex: <https://example.org/project/>
INSERT {
GRAPH <https://example.org/project> {
{{#each events}}
_:event ex:belongsTo {{task}} ;
ex:title "{{title}}" ;
ex:purpose "{{purpose}}" ;
ex:leadTimeDays {{leadTimeDays}} .
{{/each}}
{{task}} ex:generatedPlan _:plan .
_:plan ex:rawResponse "{{raw_response}}" .
}
}
WHERE {}
The configuration instructs the agent to poll the graph every 30 seconds, collect tasks with deadlines,
and persist the LLM response as a set of event nodes connected to the task. The raw_response
field is optional but useful for auditing.
Save the configuration as agents/task-planner.yaml and ensure the file is available on the server running GraphLake. Export the API key referenced by api_key_env (for example export OPENAI_API_KEY=sk-...). Then register the agent with the store:
curl -X POST \
"http://localhost:7642/stores/myStore/agents" \
-H "Content-Type: application/json" \
-d '{
"name": "task-planner",
"configPath": "agents/task-planner.yaml"
}'
Finally, start the agent via the /stores/:store/agents/:name/start endpoint if it is not already active.
Use the agent endpoints to inspect the current status and recent actions:
# Check agent state and last tick
curl http://localhost:7642/stores/myStore/agents/task-planner
# Retrieve execution history (most recent 10 runs)
curl "http://localhost:7642/stores/myStore/agents/task-planner/history?limit=10"
Logs also include each LLM invocation with the rendered prompt and model response. Tail the GraphLake server logs to follow the agent in real time:
docker logs -f graphlake-server | grep "task-planner"
After the LLM generates a plan, query the graph to review the generated events and raw output:
PREFIX ex: <https://example.org/project/>
PREFIX schema: <https://schema.org/>
SELECT ?taskName ?eventTitle ?purpose ?leadTime
WHERE {
?task schema:name ?taskName ;
ex:generatedPlan ?plan .
?event ex:belongsTo ?task ;
ex:title ?eventTitle ;
ex:purpose ?purpose ;
ex:leadTimeDays ?leadTime .
}
ORDER BY ?taskName ?eventTitle
The query reveals the concrete follow-up events suggested by the LLM. Because the agent records the raw
response on ?plan, you can also retrieve it for audit purposes:
PREFIX ex: <https://example.org/project/>
SELECT ?task ?raw WHERE {
?task ex:generatedPlan ?plan .
?plan ex:rawResponse ?raw .
}
Combine these queries with dashboard visualisations or alerts to build a complete feedback loop between the graph, the agent, and your operational systems.
The sparql agent type periodically executes a SELECT trigger_query and runs a templated
update_query for each result. Placeholders like {{var}}
are replaced with bindings from the trigger.
type: sparql
tick: 10s
graph: urn:graph1
trigger_query: |
SELECT ?s WHERE { ?s <urn:status> "open" }
update_query: |
WITH <urn:graph1>
DELETE { {{s}} <urn:status> "open" . }
INSERT { {{s}} <urn:status> "done" . }
WHERE {}
GraphLake uses signed JWTs to control access to the API. Tokens are obtained from the API by authenticating either with a JWT signed by a local private key or with a username and password. Users are managed on the server and can be granted read, write, and owner access to stores and graphs.
When a security manager is enabled, certain endpoints require a valid JWT token for authorization. The server applies ACL checks to ensure that a user can only access the stores and graphs permitted by their assigned rules. If you receive a 401 (Unauthorized) or 403 (Forbidden) response, check that your token is valid and not expired, and that your user's rules grant access to the requested store or graph.
The developer edition runs unsecured.
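On editions with security enabled, a typical flow is to obtain a token and pass it on subsequent requests. The request body field names and the Bearer header below are assumptions based on common JWT usage:
# Obtain a JWT (assumed body fields: username, password)
curl -X POST http://localhost:7642/admin/authenticate/password \
-H "Content-Type: application/json" \
-d '{"username": "admin", "password": "<GRAPHLAKE_ADMIN_PASSWORD>"}'
# Use the returned token on later requests (assumed Authorization: Bearer scheme)
curl -H "Authorization: Bearer <token>" http://localhost:7642/stores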
GraphLake writes structured logs to stderr, which, when running in Docker, are captured in the Docker logs. We recommend shipping these logs to your preferred log management system and configuring any notifications you require.
To provide a great developer experience, we offer a VS Code extension for managing stores and graphs, running queries, and managing users. All operations can also be performed with curl or programmatically over HTTP in any language.
GraphLake includes native SHACL validation so data can be checked against a schema before use.
SHACL shapes are stored in a regular named graph. Upload the shapes using the import endpoint and supply the graph name where the shapes should reside:
curl -X POST \
"http://localhost:7642/stores/myStore/import?graph=https://example.org/shapes" \
-H "Content-Type: text/turtle" \
--data-binary @shapes.ttl
To validate data, post the schema graph and data graph to the /stores/:store/validate endpoint. Optionally include a branch query parameter to validate a non-main branch.
curl -X POST "http://localhost:7642/stores/myStore/validate" \
-H "Content-Type: application/json" \
-d '{
"schema": "https://example.org/shapes",
"graph": "https://example.org/data"
}'
The response is a SHACL validation report listing conforming status and any violations.
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <https://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:PersonShape
a sh:NodeShape ;
sh:targetClass ex:Person ;
sh:property [
sh:path ex:name ;
sh:datatype xsd:string ;
sh:minCount 1 ;
] ;
sh:property [
sh:path ex:age ;
sh:datatype xsd:integer ;
sh:minInclusive 0 ;
] .
ex:EmailShape
a sh:NodeShape ;
sh:targetClass ex:Person ;
sh:property [
sh:path ex:email ;
sh:pattern "^[^@]+@[^@]+$" ;
] .
ex:KnowsShape
a sh:NodeShape ;
sh:targetClass ex:Person ;
sh:property [
sh:path ex:knows ;
sh:class ex:Person ;
] .
These shapes enforce that every ex:Person has a name string and a non-negative age, that email values match a simple pattern, and that the targets of ex:knows are also ex:Person.

Multi-node operation allows GraphLake to scale horizontally by distributing workloads across multiple nodes. This setup is particularly beneficial for large-scale deployments where high availability, fault tolerance, and load balancing are critical. By leveraging multiple nodes, organizations can ensure that their GraphLake instance remains responsive even under heavy workloads.
Set GRAPHLAKE_MULTI_NODE to true or use the --multinode flag when starting the application. To start a node in multi-node mode using Docker:
docker run -d -p 7642:7642 -v /tmp/graphlake-store:/store \
-e GRAPHLAKE_MULTI_NODE=true \
dataplatformsolutions/graphlake:latest --storepath /store --port 7642
For a full walkthrough, see Multi-Node Tutorial.
The gateway package exposes a lightweight front end capable of starting GraphLake nodes on demand. A Provider interface abstracts the underlying infrastructure so implementations can target Kubernetes, AWS, Azure, or other platforms.
- The AWS provider uses the S3 environment variables (GRAPHLAKE_S3_BUCKET, GRAPHLAKE_AWS_ACCESS_KEY, GRAPHLAKE_AWS_SECRET_KEY, and GRAPHLAKE_AWS_REGION).
- The Kubernetes provider uses KUBECONFIG to connect to the cluster.
- The Azure provider authenticates with DefaultAzureCredential.

The gateway proxies requests to GraphLake nodes and provides admin endpoints for user and token management:
POST /workloads/{name}/start
POST /workloads/{name}/stop
GET|POST|... /workloads/{name}/
POST /admin/authenticate/password
POST /admin/authenticate/jwt
POST /admin/users
DELETE /admin/users/{username}
POST /admin/users/{username}/keypair
POST /admin/users/{username}/rules
The gateway manages workloads and nodes dynamically:
- New nodes are started when request counts exceed GATEWAY_NODE_CAPACITY.
- Idle nodes are stopped after GATEWAY_IDLE_TIMEOUT (default: 10 minutes).
- Setting GATEWAY_KEEP_ONE_NODE to true keeps a single node running even when idle.
- Nodes are scaled down GATEWAY_SCALE_DOWN_DELAY after request counts drop below capacity (default: 1 minute).

Run the gateway service using the cmd/gateway binary. It exposes REST endpoints to start and stop workloads and proxies GraphLake API calls:
POST /workloads/{name}/start
POST /workloads/{name}/stop
GET|POST|... /workloads/{name}/
To list active workloads and their nodes:
GET /workloads # list active workload names
GET /workloads/{name}/nodes # list nodes for a workload with health info
GET /workloads/{name}/nodes/{id}/health
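A minimal sketch of launching the gateway with the scaling settings described above, assuming the cmd/gateway binary reads them from the environment (all values are illustrative):
export GATEWAY_NODE_CAPACITY=100
export GATEWAY_IDLE_TIMEOUT=10m
export GATEWAY_KEEP_ONE_NODE=true
export GATEWAY_SCALE_DOWN_DELAY=1m
./gateway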
If you have issues with GraphLake, please reach out to us on Discord, file an issue on the public GitHub repository, or send an email to contact@dataplatformsolutions.com.