GraphLake Documentation

Version: 0.3.51

Date: October 10, 2025

Introduction

GraphLake is a graph database that supports a unified OneGraph model, seamlessly combining property graphs and RDF. This means you do not need to choose between the two graph types; it supports both simultaneously, offering flexibility and versatility for a wide range of use cases.

Designed for scalability and performance, GraphLake supports both large analytical workloads and Online Transaction Processing (OLTP) scenarios. Its architecture draws inspiration from the MPP style of systems like Apache Iceberg and DeltaLake. However, GraphLake introduces automatic data partitioning, with representation and filtering mechanisms specifically optimized for graph data structures.

GraphLake is engineered to work efficiently with files stored locally on high-performance NVMe or SSD drives or in cloud storage solutions such as Amazon S3 (and S3-compatible options like MinIO) or Azure Blob Storage. The system allows independent scaling of compute and storage resources to match workload requirements. For instance, you can store large volumes of data cost-effectively while using a small compute instance, or opt for a configuration with smaller datasets and larger compute resources to support high-performance querying.

Leveraging parallel processing and memory, GraphLake delivers impressive performance, with the ability to scale hardware resources for even greater efficiency.

GraphLake is not offered as a managed service. Instead, it is designed to be deployed within your cloud environment or run on your hardware, giving you complete control over your setup and infrastructure. We provide a developer edition that is free to use indefinitely, making it easy to get started with GraphLake and explore its capabilities.

Installation

The GraphLake developer edition is available now; commercial releases are coming soon. If you would like to know more or start using GraphLake commercially, please email contact@dataplatformsolutions.com.

Downloads

Pull the docker image with:

docker pull dataplatformsolutions/graphlake:latest

Download the Visual Studio Code extension from https://graphlake.net/downloads/graphlake-1.0.0.vsix

Cloud Deployments

To support deploying GraphLake into your cloud, we provide example Terraform scripts that can be adapted to your specific setup. Deployment of GraphLake through the AWS and Azure marketplaces is coming soon.

AWS

AWS terraform on https://github.com/dataplatformsolutions/graphlake-deploy

Available in the AWS marketplace - coming soon!

Azure

Azure terraform on https://github.com/dataplatformsolutions/graphlake-deploy

Available in the Azure marketplace - coming soon!

Digital Ocean

Digital Ocean terraform on https://github.com/dataplatformsolutions/graphlake-deploy

Config and Startup

GraphLake supports the following startup config environment variables:

Environment Variable Description
GRAPHLAKE_BACKEND_TYPE The storage backend to use (local, s3, azure) - only local is supported at the moment
GRAPHLAKE_STORE_PATH The path to store data when using local backend
GRAPHLAKE_PORT The port to listen on (default 7642)
GRAPHLAKE_LOG_LEVEL The log level to use - debug or info
GRAPHLAKE_FILE_CACHE_SIZE Size of data file cache (defaults to 100 entries per shard)
GRAPHLAKE_METADATA_CACHE_SIZE Size of metadata (.meta) cache (defaults to 100 entries)
GRAPHLAKE_SNAPSHOT_CACHE_SIZE Number of snapshots cached in memory (defaults to 5)
GRAPHLAKE_ADMIN_PASSWORD Password for user admin. Used for Business edition and up to secure GraphLake.
GRAPHLAKE_AZURE_ACCOUNT_NAME Azure storage account name used when backend type is azure
GRAPHLAKE_AZURE_ACCOUNT_KEY Azure storage account key
GRAPHLAKE_AWS_ACCESS_KEY AWS access key for S3 backend
GRAPHLAKE_AWS_SECRET_KEY AWS secret key for S3 backend
GRAPHLAKE_AWS_REGION AWS region for S3 backend
GRAPHLAKE_S3_BUCKET Bucket name used for S3 backend
GRAPHLAKE_S3_ENDPOINT Custom S3 endpoint (optional)
GRAPHLAKE_DOCKER_VERSION_FILE File containing version information when running inside Docker
GRAPHLAKE_MULTI_NODE Set to true to enable multi-node coordination

To run GraphLake in a Docker container as a detached process, passing the environment variables and mapping a local folder to the configured data path, you would run:

docker run -d \
      -e GRAPHLAKE_BACKEND_TYPE=local \
      -e GRAPHLAKE_STORE_PATH=/data \
      -v /path/to/local/data:/data \
      -p 7642:7642 \
      dataplatformsolutions/graphlake:latest

Versions

There are four editions of GraphLake.

For enterprise editions please reach out to us on email at contact@dataplatformsolutions.com to discuss your requirements.

Getting Started Examples


    # Example: Create a store and a graph, then import data.

    # 1. Create a new store
    curl -X POST http://localhost:7642/stores/myNewStore

    # 2. Create a new graph in that store
    curl -X POST http://localhost:7642/stores/myNewStore/graphs/myGraph

    # 3. Import N-Triples data from a file
    curl -X POST "http://localhost:7642/stores/myNewStore/import?graph=myGraph" \
         -H "Content-Type: application/n-triples" \
         --data-binary @path/to/your/file.nt

    # 4. Query the data
    curl -X POST http://localhost:7642/stores/myNewStore/query \
         -H "Content-Type: application/sparql" \
         --data "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"

    # 5. Update with an invariant
    curl -X POST http://localhost:7642/stores/myNewStore/update_with_invariant \
         -H "Content-Type: application/json" \
         -d '{ "update": "INSERT DATA { GRAPH <http://example.com/graph> { <http://example.com/alice> <http://example.com/name> \"Alice\" . } }", "invariant": "ASK WHERE { <http://example.com/alice> <http://example.com/name> \"Alice\" . }" }'

Data Import

Data can be loaded into a graph using the /stores/:store/import endpoint. N-Triples, Turtle, and CSV are supported. Provide data in the request body or reference files that have been copied to the server.

If the format query parameter is omitted, the server infers the format from the Content-Type header. Use text/turtle for Turtle data and application/n-triples for N-Triples.

When importing from files, place them under the store’s import directory. For a store named myStore this path is /stores/myStore/import inside the store path (for the Docker image the store path is mounted at /store, so the full path would be /store/stores/myStore/import). The location parameter refers to a file or folder inside this directory.

Example: Import data sent in the request body

POST /stores/myStore/import?graph=myGraph
Content-Type: application/n-triples

<http://example.org/s> <http://example.org/p> "obj" .
  

Example: Import Turtle data

POST /stores/myStore/import?graph=myGraph
Content-Type: text/turtle

@prefix ex: <http://example.org/> .
ex:s ex:p "obj" .
  

Example: Import data from a file on the server

Copy data.nt to /store/stores/myStore/import/data.nt and then call:

POST /stores/myStore/import?graph=myGraph&location=data.nt

For CSV imports, include format=csv and specify csvType=nodes or csvType=edges:

POST /stores/myStore/import?graph=myGraph&location=nodes.csv&format=csv&csvType=nodes

CSV files describe either nodes or edges:

Node CSV

With predicatePrefix=http://example.com/, typePrefix=http://example.com/, and idPrefix=http://example.com/:

rid,type,graph,name,tags
1,person,people,Alice,"rdf;graph;ai"
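As a sketch, the node file above could be imported with a request along these lines (store name, file name, and port are illustrative; URL-encode the prefix values if your HTTP client requires it):

curl -X POST "http://localhost:7642/stores/myStore/import?graph=people&location=nodes.csv&format=csv&csvType=nodes&idPrefix=http://example.com/&typePrefix=http://example.com/&predicatePrefix=http://example.com/"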

Edge CSV

Values in from and to without a prefix are expanded using idPrefix. The pid/predicate value and any property column names are expanded using predicatePrefix.

Edges must reference existing nodes by rid or full IRI. Nodes imported without a rid receive generated identifiers that cannot be known ahead of time, so provide explicit rid values for nodes that will be linked by edge CSVs.

With idPrefix=http://example.com/ and predicatePrefix=http://example.com/:

from,to,pid,since
1,2,knows,2010

This yields <http://example.com/1> <http://example.com/knows> <http://example.com/2> and a property <http://example.com/since> "2010".

All CSV files require a header row.

Importing with a Graph Manifest

A graph manifest lets you import multiple graphs in one request. Create a folder inside the import directory and add a graphs.json file mapping graph names to subdirectories that hold the data files:

{
  "people": "people-data",
  "products": "products-data"
}

The folder structure would be:

/store/stores/myStore/import/batch/
  graphs.json
  people-data/
    people.nt
  products-data/
    products.nt

Import all graphs listed in the manifest with:

POST /stores/myStore/import?location=batch&graphManifest=true

The optional branch parameter imports into a specific branch (default is main).
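For example, assuming the batch folder shown above and a dev branch (create the branch first via the branch endpoint if it does not exist), the manifest import could be triggered with:

curl -X POST "http://localhost:7642/stores/myStore/import?location=batch&graphManifest=true&branch=dev"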

Query

SPARQL

GraphLake supports a subset of SPARQL features for querying RDF data. This document outlines the supported features, including query forms, patterns, and functions.

Below is a simple example that creates a store and a graph, imports a triple, and queries it.

curl -X POST http://localhost:7642/stores/test
curl -X POST http://localhost:7642/stores/test/graphs/example
curl -X POST "http://localhost:7642/stores/test/import?graph=example" \
     -H "Content-Type: application/n-triples" \
     --data "<http://example.org/s> <http://example.org/p> <http://example.org/o> ."
curl -X POST http://localhost:7642/stores/test/query \
     -H "Content-Type: application/sparql" \
     --data "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"

Query Forms

  1. SELECT
  2. ASK

Patterns

  1. Triple Patterns
  2. Group Graph Patterns
  3. Optional Patterns
  4. Filter Patterns

Functions

  1. Bound
  2. isIRI
  3. isLiteral
  4. isBlank
  5. STR
  6. LANG
  7. DATATYPE
  8. CONTAINS
  9. STRSTARTS
  10. STRENDS
  11. TRIPLECOUNT

Geospatial Filters

GraphLake recognises common GeoSPARQL predicates and functions for spatial search. Use geof:within, geof:intersects, or the Simple Features aliases geof:sfWithin and geof:sfIntersects inside FILTER expressions. Geometry literals may include an explicit coordinate reference system, either via the SRID=4326;POLYGON(...) form or by prefixing the geometry with a CRS IRI such as <http://www.opengis.net/def/crs/EPSG/0/4326>. Both styles are parsed and matched during query execution. When composing complex shapes directly inside the query text, a string literal typed as geo:wktLiteral can be bound with BIND to hold the WKT.

Property paths can simplify traversal from features to geometry nodes. The example below binds a WKT polygon with a CRS IRI, walks from each field feature to its geometry using geo:hasGeometry/geo:asWKT, and returns the matching field names ordered alphabetically.


PREFIX geo:   <http://www.opengis.net/ont/geosparql#>
PREFIX geof:  <http://www.opengis.net/def/function/geosparql/>
PREFIX :      <http://example.org/farm#>

SELECT ?field ?name
WHERE {
  BIND("<http://www.opengis.net/def/crs/EPSG/0/4326> POLYGON((10.004 58.999, 10.016 58.999, 10.016 59.009, 10.004 59.009, 10.004 58.999))"^^geo:wktLiteral AS ?areaWkt)

  ?field a geo:Feature ; :name ?name ; geo:hasGeometry/geo:asWKT ?wkt .

  FILTER( geof:sfIntersects(?wkt, ?areaWkt) )
}
ORDER BY ?name
    

When a stored geometry omits a CRS identifier, GraphLake assumes EPSG:4326. Mixing literals that use the SRID= syntax with CRS IRIs will also match as long as they resolve to the same SRID.

GraphLake's SPARQL support includes essential query forms, patterns, and functions to perform effective graph data queries. Use the provided examples to construct and execute your SPARQL queries.

RDF*

Quoted triples are supported via RDF*. Use the same syntax as SPARQL-star to insert and select embedded triples.

PREFIX : <http://example.org/>
INSERT DATA {
  << :s :p :o >> :source "example" .
}

PREFIX : <http://example.org/>
SELECT ?src WHERE {
  << :s :p :o >> :source ?src .
}

SemanticCypher

SemanticCypher is GraphLake's OpenCypher-based language. Queries use a PREFIX statement to declare namespaces. Labels and property names that omit a prefix are expanded using a default namespace. You can declare the default explicitly; otherwise GraphLake falls back to its internal namespace. Every node exposes a reserved rid property holding the RDF subject IRI.

The current implementation supports the constructs shown in the examples below.

Example query returning all people and their identifiers:

PREFIX foaf http://xmlns.com/foaf/0.1/
MATCH (p:foaf:Person)
RETURN p.rid AS id

Matching a node by its resource identifier:

MATCH (p {rid:'http://example.com/Alice'})
RETURN p

Creating nodes and relationships with a declared default namespace for unprefixed property names:

PREFIX ex http://example.org/
PREFIX : http://schema.org/
CREATE (a:ex:Person {name:'Alice'})-[:ex:KNOWS]->(b:ex:Person {name:'Bob'})

Here the property name is expanded to http://schema.org/name. If no default is declared, unqualified identifiers are placed in GraphLake's internal namespace.

Matching with filters and property updates:

MATCH (p:ex:Person)-[r:ex:KNOWS]->(f:ex:Person)
WHERE p.name = 'Alice' AND f.age >= 30
SET r.since = 2020
RETURN p.name AS person, f.name AS friend

Using UNWIND and WITH to work with lists:

UNWIND [1,2,3] AS n
WITH n WHERE n > 1
RETURN n

Deleting a relationship:

MATCH (a)-[r:ex:KNOWS]->(b)
DELETE (a)-[r]->(b)

Variable length path query:

MATCH (a {rid:'http://example.com/Alice'})-[:ex:KNOWS*1..2]->(b)
RETURN b

Optional matching:

MATCH (a {rid:'http://example.com/Dave'})
OPTIONAL MATCH (a)-[:ex:KNOWS]->(b)
RETURN a, b

Ordering and pagination:

MATCH (a)-[:ex:KNOWS]->(b)
RETURN b ORDER BY b.rid SKIP 1 LIMIT 1

GraphLake JS

GraphLake provides a JavaScript-based query language to interact with the graph data. This language allows you to perform various operations such as matching triples, applying updates, and writing query results.

Example script returning all triples from myGraph:

_writer.writeHeader(["s","p","o"]);
let it = _context.matchTriples("","","","",true,["myGraph"]);
while (true) {
  let t = it.next();
  if (t == null) break;
  _writer.writeRow([t.subject,t.predicate,t.object]);
}

GraphLake executes the script with two objects injected into the runtime: _context and _writer. The context object is used to update and query the store; the writer object allows data to be sent back to the calling application.

Match Triples

_context.matchTriples(subject, predicate, object, datatype, isLiteral, graphs)

Description: Matches triples in the specified graphs.

Parameters:

Returns: An iterator for the matched triples.

Assert Triple

_context.assertTriple(subject, predicate, object, datatype, isLiteral, graph)

Description: Asserts a new triple in the specified graph.

Parameters:

Commit Transaction

_context.commit()

Description: Commits the current transaction.

Returns: true if the transaction was committed successfully, otherwise false.

Delete Triple

_context.deleteTriple(subject, predicate, object, datatype, isLiteral, graph)

Description: Deletes a triple from the specified graph.

Parameters:

Write Header

_writer.writeHeader(headers)

Description: Writes the header for the query result.

Parameters:

Returns: Undefined.

Write Row

_writer.writeRow(row)

Description: Writes a row to the query result.

Parameters:

Returns: Undefined.

Examples

Example 1: Matching Triples

    _writer.writeHeader(["Subject", "Predicate", "Object"]);
    
    let triplesIter = _context.matchTriples("http://example.org/subject", "", "", "", true, ["graph1"]);
    while (true) {
        let triple = triplesIter.next();
        if (triple == null) {
            break;
        }
        _writer.writeRow([triple.subject, triple.predicate, triple.object]);
    }
      

Explanation: This script matches all triples with the subject http://example.org/subject in graph1 and writes the results to the output.

Example 2: Simple Transaction

  // Delete an existing triple
  _context.deleteTriple("http://example.org/subject", "http://example.org/predicate", "http://example.org/object", "", false, "graph1");

  // Add two new triples
  _context.assertTriple("http://example.org/subject1", "http://example.org/predicate1", "http://example.org/object1", "", false, "graph1");
  _context.assertTriple("http://example.org/subject2", "http://example.org/predicate2", "http://example.org/object2", "", false, "graph1");

  // Commit the transaction
  _context.commit();
      

Explanation: This script deletes an existing triple, asserts two new triples, and commits the changes as a single transaction.

Example 3: Writing Headers and Rows

  _writer.writeHeader(["Id", "Name"]);
  
  let triplesIter = _context.matchTriples("http://example.org/person", "http://example.org/name", "", "", true, ["graph1"]);
  while (true) {
      let triple = triplesIter.next();
      if (triple == null) {
          break;
      }
      _writer.writeRow([triple.subject, triple.object]);
  }
      

Explanation: This script writes a header with columns "Id" and "Name", matches triples with the subject http://example.org/person and predicate http://example.org/name in graph1, and writes the results to the output.

Example 4: Matching Triples with Different Graphs

  _writer.writeHeader(["Subject", "Predicate", "Object"]);
  
  let graphs = ["graph1", "graph2"];
  let triplesIter = _context.matchTriples("http://example.org/subject", "", "", "", true, graphs);
  while (true) {
      let triple = triplesIter.next();
      if (triple == null) {
          break;
      }
      _writer.writeRow([triple.subject, triple.predicate, triple.object]);
  }
      

Explanation: This script matches all triples with the subject http://example.org/subject in both graph1 and graph2 and writes the results to the output.

API

Health

GET /health

Check server status.

Method GET
Endpoint /health
Description Returns {"status":"ok"} when the server is running.
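For example, assuming the default port:

curl http://localhost:7642/health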

Query

POST /stores/:store/query

Execute a query in the specified :store. GraphLake supports three query languages: SPARQL (core subset), SemanticCypher, and GraphLake JavaScript. See the Query section for details on each language. Use the following content types to indicate which language you are using: application/sparql, application/x-graphlake-query-semanticcypher, application/x-graphlake-query-javascript.

Method POST
Endpoint /stores/:store/query
Description Runs a query on a given store.
Query Parameters
  • branch (optional): Branch to query. Defaults to main.
  • time (optional): Snapshot timestamp in milliseconds since epoch.
  • tag (optional): Snapshot tag to query; takes precedence over time.
Request Body

A string containing the query. The format depends on the underlying query engine.


POST /stores/myStore/query
Content-Type: application/sparql

SELECT ?s ?p ?o WHERE { ?s ?p ?o }
            
Response

JSON result of the query. For example:


HTTP/1.1 200 OK
Content-Type: application/json

{
  "results": [
    {
      "s": "http://example.org#subject1",
      "p": "http://example.org#predicate1",
      "o": "http://example.org#object1"
    },
    ...
  ]
}
            

Note the query result structure can differ based on the query type.
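As a sketch of the other content types, a SemanticCypher query (reusing the FOAF example from the SemanticCypher section; store name and data are illustrative) could be submitted like this:

curl -X POST http://localhost:7642/stores/myStore/query \
     -H "Content-Type: application/x-graphlake-query-semanticcypher" \
     --data "PREFIX foaf http://xmlns.com/foaf/0.1/
MATCH (p:foaf:Person)
RETURN p.rid AS id"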


SPARQL GET

GET /stores/:store/sparql?query=...

Execute a SPARQL query via a URL-encoded query parameter.

Method GET
Endpoint /stores/:store/sparql
Query Parameters
  • query (required): URL-encoded SPARQL query string.
  • branch (optional): Branch to query. Defaults to main.
  • time (optional): Snapshot timestamp in milliseconds since epoch.
  • tag (optional): Snapshot tag to query; takes precedence over time.
  • default-graph-uri (optional): Default graph for the query.
Response Same as /stores/:store/query.
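For example, curl can URL-encode the query string for you (store name is illustrative):

curl -G "http://localhost:7642/stores/myStore/sparql" \
     --data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o }"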

Talk

POST /stores/:store/talk

Generate and run a SPARQL query from a natural language question using an LLM.

Method POST
Endpoint /stores/:store/talk
Request Body
{"question": "How many people are there?"}
Response JSON query results.
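A minimal request, assuming the OpenAI integration described under Talk to Your Data is configured, could look like this (store name is illustrative):

curl -X POST http://localhost:7642/stores/myStore/talk \
     -H "Content-Type: application/json" \
     -d '{"question": "How many people are there?"}'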

Optional query parameters:


Update with Invariant

POST /stores/:store/update_with_invariant

Execute a SPARQL UPDATE while verifying invariant ASK queries before commit. The transaction aborts if any invariant evaluates to false.

Method POST
Endpoint /stores/:store/update_with_invariant
Description Runs a SPARQL UPDATE and validates invariants on the resulting data before committing.
Request Body

JSON object with update containing the SPARQL update and invariant containing an ASK query:


POST /stores/myStore/update_with_invariant
Content-Type: application/json

{
  "update": "INSERT DATA { GRAPH <http://example.com/graph> { <http://example.com/alice> <http://example.com/name> "Alice" . } }",
  "invariant": "ASK WHERE { <http://example.com/alice> <http://example.com/name> "Alice" . }"
}
        
Response

Success response:


HTTP/1.1 200 OK
{ "message": "SPARQL UPDATE executed" }
        

If the invariant fails:


HTTP/1.1 500 Internal Server Error
{ "error": "invariant failed" }
        

Validate Data

POST /stores/:store/validate

Validate a data graph against SHACL shapes stored in the specified store.

Method POST
Endpoint /stores/:store/validate
Request Body
{"schema": "graph_of_shapes", "graph": "data_graph"}
Response SHACL validation report in JSON.

Update

POST /stores/:store/update

Execute a SPARQL UPDATE and apply optional metadata and tags to the resulting snapshot. The update may opt out of rollup consolidation.

Method POST
Endpoint /stores/:store/update
Description Runs a SPARQL UPDATE and records optional snapshot metadata.
Request Body

Either a raw SPARQL update with Content-Type: application/sparql-update or a JSON object containing the update and optional metadata fields:


POST /stores/myStore/update
Content-Type: application/json

{
  "update": "WITH <http://g> INSERT { <http://s> <http://p> <http://o> . }",
  "metadata": {"source": "batch1"},
  "tags": ["daily"],
  "skip_rollup": true
}
        
Response
HTTP/1.1 200 OK
{ "message": "update committed" }

List Snapshots

GET /stores/:store/snapshots

Retrieve snapshot metadata for a store branch. Additional query parameters filter snapshots by metadata keys.

Method GET
Endpoint /stores/:store/snapshots
Description Lists snapshots and their metadata, tags, and rollup flags.
Query Parameters
  • branch (optional): Branch name (defaults to main).
  • tag (optional): Return only snapshots containing the specified tag.
  • Any other parameter filters snapshots whose metadata contains the matching key/value.
Response
[
  { "version": 1, "tags": ["daily"], "metadata": {"source": "batch1"}, "skip_rollup": true }
]
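For example, to list snapshots on main whose metadata records source=batch1 (values taken from the example above):

curl "http://localhost:7642/stores/myStore/snapshots?branch=main&source=batch1"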

Tagged Snapshots

Snapshots can be labeled with user-defined tags during import and update operations. Use the tag query parameter on read-only endpoints like /stores/:store/query to target a specific tagged snapshot. Listing snapshots with ?tag=myTag returns only snapshots carrying that tag.
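For example, to run a query against the snapshot tagged daily (tag name taken from the examples above):

curl -X POST "http://localhost:7642/stores/myStore/query?tag=daily" \
     -H "Content-Type: application/sparql" \
     --data "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"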


Export Data

POST /stores/:store/export?graph=graph_name&location=folder

Exports one or more graphs from a branch to files on disk. The optional graph query parameter may be repeated. If location is not provided the server writes files to the store's exports directory.

Method POST
Endpoint /stores/:store/export
Description Export data from the specified store and branch.
Query Parameters
  • graph (optional): Graph name to export. Can be specified multiple times.
  • branch (optional): Branch to export from. Defaults to main.
  • location (optional): Folder path under the store to write files.
Response Returns 200 OK when the export completes.
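For example, to export two graphs into a backup folder under the store (graph and folder names are illustrative):

curl -X POST "http://localhost:7642/stores/myStore/export?graph=people&graph=products&location=backup"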

Import Data into a Graph

POST /stores/:store/import?location=name_of_file_or_folder&graph=name_of_graph

Start an asynchronous import job for a specified graph. The data can be sent in the request body or loaded from files or folders located in the store's import directory. The call returns a JSON object containing a jobId.

Use GET /jobs to list running jobs and GET /jobs/{id} to view the status of a particular job. Job status includes the number of triples processed and, once finished, the total triple count obtained via the TRIPLECOUNT SPARQL function.

Optional tag and meta-* query parameters label the snapshot created by the import with tags and metadata.

Method POST
Endpoint /stores/:store/import
Description Imports data into the given graph. The data can be sent via request body or by specifying a location parameter.
Query Parameters
  • location (optional): The name of the file or folder (located in the /stores/storename/import directory). If not provided, the server expects data in the request body.
  • graph (required): The name of the graph that the data is loaded into
  • format (optional): ntriples, turtle, or csv. Defaults to N-Triples.
  • csvType (optional): nodes or edges when format=csv.
  • idPrefix (optional): prefix for rid, from, and to values.
  • typePrefix (optional): prefix for values in type columns.
  • predicatePrefix (optional): prefix for column names and predicate values.
  • tag (optional): Apply a tag to the resulting snapshot. Repeat for multiple tags.
  • meta-key (optional): Metadata key/value pairs for the snapshot, e.g., meta-source=batch1.

If format is not specified, the server detects the data format from the Content-Type header. Use text/turtle for Turtle data and application/n-triples for N-Triples.

Request Body

If location query param is not provided, send the data in the request body:


POST /stores/myStore/import?graph=myGraph
Content-Type: application/n-triples

<http://example.org#subject> <http://example.org#predicate> <http://example.org#object> .
            

For Turtle data, set the header to text/turtle:


POST /stores/myStore/import?graph=myGraph
Content-Type: text/turtle

@prefix ex: <http://example.org/> .
ex:subject ex:predicate "object" .
        
Response

Returns 200 OK with a JSON body containing the jobId of the import job.
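A sketch of a complete round trip, tagging the snapshot, attaching metadata, and then polling the job (the jobId value is a placeholder to substitute from the response):

# Start an asynchronous import from a file in the store's import directory
curl -X POST "http://localhost:7642/stores/myStore/import?graph=myGraph&location=data.nt&tag=daily&meta-source=batch1"
# Response: {"jobId": "..."}

# Poll the job until it finishes
curl http://localhost:7642/jobs/<jobId>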


Jobs

GET /jobs

List currently running background jobs.

Method GET
Endpoint /jobs
Description Returns an array of job objects.

GET /jobs/:id

Retrieve the status of a specific job.

Method GET
Endpoint /jobs/:id
Description Returns details for job :id.

Create Graph

POST /stores/:store/graphs/:graph

Create an empty graph in the specified :store.

Method POST
Endpoint /stores/:store/graphs/:graph
Description Creates a new graph with the provided name.
Request Body No body is required. The :graph path parameter is the graph name.
Response

HTTP/1.1 200 OK
{
  "message": "Graph created"
}
            

List Graphs

GET /stores/:store/graphs

Retrieve a list of available graphs in the specified :store.

Method GET
Endpoint /stores/:store/graphs
Description Lists all graph names in the store.
Response

An array of strings, each representing a graph name.


HTTP/1.1 200 OK
Content-Type: application/json

[
  "graph1",
  "graph2",
  ...
]
            

Delete Graph

DELETE /stores/:store/graphs/:graph

Delete a specific graph within a store.

Method DELETE
Endpoint /stores/:store/graphs/:graph
Description Deletes the specified graph.
Request Body No body is required. The :graph path parameter identifies the graph.
Response

HTTP/1.1 200 OK
{
  "message": "Graph deleted"
}
            

Create Store

POST /stores/:store

Create a new store.

Method POST
Endpoint /stores/:store
Description Creates a new store using the provided name.
Request Body No request body is required. :store is the store name.
Response

HTTP/1.1 200 OK
{
  "message": "Store created"
}
            

List Stores

GET /stores

Retrieve a list of all stores. If security is enabled, only stores accessible to the current user are returned.

Method GET
Endpoint /stores
Description Lists all stores.
Response

An array of strings, each representing a store name.


HTTP/1.1 200 OK
[
  "store1",
  "store2",
  ...
]
            

Delete Store

DELETE /stores/:store

Delete a specified store. This operation is irreversible.

Method DELETE
Endpoint /stores/:store
Description Deletes the specified store.
Response

HTTP/1.1 200 OK
{
  "message": "Store deleted"
}
            

Load All Data

POST /stores/:store/loadalldata

Load all data files of a store into memory caches.

Method POST
Endpoint /stores/:store/loadalldata
Description Preloads all data files for faster access.
Response
{"message": "all data files loaded"}

Add User

POST /admin/users

Create a new user in the system.

Method POST
Endpoint /admin/users
Description Add a new user with a username, password, and optional public key.
Request Body

JSON object containing user information:


{
  "username": "testuser",
  "password": "secretpassword",
  "public_key": "-----BEGIN PUBLIC KEY-----..."
}
            
Response

HTTP/1.1 200 OK
{
  "message": "User added"
}
            

Delete User

DELETE /admin/users/:username

Delete an existing user by username.

Method DELETE
Endpoint /admin/users/:username
Description Deletes the specified user from the system.
Response

HTTP/1.1 200 OK
{
  "message": "User deleted"
}
            

Generate Key Pair for a User

POST /admin/users/:username/keypair

Generate a new private/public key pair for the specified user.

Method POST
Endpoint /admin/users/:username/keypair
Description Generates a new key pair and stores the public key for the user. The private key is returned in the response.
Response

Returns a JSON object containing the newly generated key pair:


HTTP/1.1 200 OK
{
  "private_key": "-----BEGIN PRIVATE KEY-----...",
  "public_key": "-----BEGIN PUBLIC KEY-----..."
}
            

Authenticate with Password

POST /admin/authenticate/password

Obtain a JWT token by providing username and password.

Method POST
Endpoint /admin/authenticate/password
Description Authenticates a user with password and returns a JWT token if successful.
Request Body

JSON object with username and password fields:


{
  "username": "testuser",
  "password": "secretpassword"
}
            
Response

HTTP/1.1 200 OK
{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
            

Authenticate with JWT

POST /admin/authenticate/jwt

Verify an existing JWT and obtain a renewed token.

Method POST
Endpoint /admin/authenticate/jwt
Description Verifies a JWT and returns a new JWT if valid.
Request Body

JSON object with a token field:


{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
            
Response

HTTP/1.1 200 OK
{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9... (new token)"
}
            

Set User Rules

POST /admin/users/:username/rules

Define or update the security rules for a specified user.

Method POST
Endpoint /admin/users/:username/rules
Description Sets security/ACL rules for a user.
Request Body

An array of rules. Each rule typically contains resource patterns and permissions (e.g. "read", "write"). Exact structure may vary based on the security.Rule definition:


[
  {
    "resource": "/stores/myStore/graphs/graph1",
    "permission": "read"
  },
  {
    "resource": "/stores/myStore",
    "permission": "write"
  }
]
            
Response

HTTP/1.1 200 OK
{
  "message": "Rule added"
}
            

Branch Management

GET /stores/:store/branches

Retrieve a list of branches for the specified store.

POST /stores/:store/branches/:branch

Create a new branch in the specified store. Optionally, specify a source branch using the source query parameter (default is main).

DELETE /stores/:store/branches/:branch

Delete a specific branch in the specified store.

Branches allow concurrent versions of the data. Create a branch from main, import or modify data, and include ?branch=name when querying or importing to use it.
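A typical workflow, with illustrative names, might look like this:

# Create a feature branch from main
curl -X POST "http://localhost:7642/stores/myStore/branches/feature-x?source=main"

# Import data into the branch
curl -X POST "http://localhost:7642/stores/myStore/import?graph=myGraph&branch=feature-x" \
     -H "Content-Type: application/n-triples" \
     --data-binary @data.nt

# Query the branch
curl -X POST "http://localhost:7642/stores/myStore/query?branch=feature-x" \
     -H "Content-Type: application/sparql" \
     --data "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"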


Application Management

POST /applications/:appname

Upload a new application with the specified name. The application definition should be sent in the request body.

GET /applications

List all available applications.

GET /applications/:appname

Retrieve the application with the specified name.

DELETE /applications/:appname

Delete the application with the specified name.

Applications group related event classes and aggregates. Once uploaded they can be referenced by name in the application data endpoints.


Application Data Endpoints

These endpoints allow you to interact with application-specific data classes.

GET /apps/:app/:class

Retrieve a list of items for the specified class in the application.

GET /apps/:app/:class/:id

Retrieve a specific item by ID for the specified class in the application.

POST /apps/:app/:class/:id

Create a new item with the specified ID in the specified class and application. The item data should be sent in the request body.

PUT /apps/:app/:class/:id

Update an existing item with the specified ID in the specified class and application. The updated data should be sent in the request body.

DELETE /apps/:app/:class/:id

Delete the item with the specified ID in the specified class and application.
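As a sketch, using the exampleApp definition shown in the next section (the JSON payload shape is an assumption and depends on your class or shape definitions):

# Create an item with id alice in the people resource
curl -X POST http://localhost:7642/apps/exampleApp/people/alice \
     -H "Content-Type: application/json" \
     -d '{"name": "Alice"}'

# Retrieve it again
curl http://localhost:7642/apps/exampleApp/people/alice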

Application Data Types

The following data types are used in the application endpoints:

ApplicationDefinition

Represents a simple application configuration that exposes REST CRUD operations for a given graph and set of classes or shapes.

Example

Below is an example of an ApplicationDefinition:

{
  "name": "exampleApp",
  "store": "exampleStore",
  "graph": "exampleGraph",
  "schema": "exampleSchema",
  "classes": [
    {"type": "Person", "resource": "people"},
    {"type": "Organization", "resource": "orgs"}
  ],
  "aggregates": [
    {
      "name": "exampleAggregate",
      "root": {
        "class": "Event",
        "identity_property": "eventId",
        "properties": [
          { "name": "timestamp", "property": "eventTime" },
          { "name": "location", "property": "eventLocation" }
        ]
      },
      "classes": [
        {
          "class": "Transaction",
          "identity_property": "transactionId",
          "properties": [
            { "name": "amount", "property": "transactionAmount" }
          ]
        }
      ],
      "all_properties": true
    }
  ],
  "access": "write"
}

This example defines an application named exampleApp that operates on the exampleGraph in the exampleStore. It includes two classes (Person and Organization) and one aggregate definition (exampleAggregate), which aggregates data from Event and Transaction classes.

AggregateDefinition

Describes how to build an aggregate instance from a set of event classes.

AggregateClass

Specifies an event class used to build an aggregate and the property on that event that identifies the aggregate instance it applies to.

AggregateProperty

Maps an event property to the aggregate property name.

These data types are used in the request and response payloads for the application endpoints.


Admin Endpoints

GET /admin/version

Retrieve the current version of the server.

POST /admin/restart

Restart the server.

POST /admin/upgrade

Upgrade the server to the specified version. Provide the version as a query parameter version.
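For example (the version number is illustrative):

curl -X POST "http://localhost:7642/admin/upgrade?version=0.3.51"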

Agent Endpoints

Manage agent configuration and access conversation logs.

  • GET /agents/:id/config: Retrieve the active configuration for agent :id.
  • PUT /agents/:id/config: Replace the configuration for agent :id. Provide YAML in the request body.
  • GET /agents/:id/logs: List available log cycles. Optional limit and after query parameters page the results.
  • GET /agents/:id/logs/:cycle: Retrieve the transcript for the specified cycle.

The configuration endpoint returns a JSON object describing how the agent runs:

{
    "ollama_base_url": "http://localhost:11434",
    "model": "qwen2.5-coder:7b",
    "tick": "10s",
    "max_steps": 4,
    "mcp_ws_url": "ws://localhost:8081",
    "system_prompt": "You are GraphLake.",
    "user_prompt": "Answer questions about the graph."
  }

Field meanings:

The same fields can be supplied as YAML in the body of the PUT /agents/:id/config request.
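As a sketch, the same configuration expressed as YAML and a PUT request to apply it (the agent name and the application/yaml content type are assumptions; adjust to your deployment):

# agent-config.yaml
ollama_base_url: http://localhost:11434
model: qwen2.5-coder:7b
tick: 10s
max_steps: 4
mcp_ws_url: ws://localhost:8081
system_prompt: "You are GraphLake."
user_prompt: "Answer questions about the graph."

curl -X PUT http://localhost:7642/agents/myAgent/config \
     -H "Content-Type: application/yaml" \
     --data-binary @agent-config.yaml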

API Data Types

The following request and response payloads are referenced throughout the API documentation.

QueryRequest

Represents the request body for executing a query.

QueryResponse

Represents the response for a query execution.

GraphImportRequest

Defines the request body or parameters for importing data into a graph.

StoreCreationResponse

Represents the response for creating a new store.

UserCreationRequest

Represents the request body for creating a new user.

PasswordAuthRequest

Defines the request body for password-based authentication.

JWTAuthRequest

Defines the request body for JWT-based authentication.

TokenResponse

Represents the response containing a JWT token.

Rule

Represents a security rule for a user.

Talk to Your Data

GraphLake includes a Talk workflow that lets you describe an information need in plain language and receive the equivalent SPARQL query and result set. The workflow pairs schema knowledge with an LLM so the generated query respects your graph model.

Configure the OpenAI Endpoint

  1. Gather credentials. Create an API key in the OpenAI dashboard and choose the model (for example gpt-4o-mini) that will translate questions into SPARQL.
  2. Inject the key at startup. Before starting GraphLake, export the credentials so they are available to the process:
    export OPENAI_API_KEY=<your-key>
    export OPENAI_MODEL=gpt-4o-mini
    export OPENAI_BASE_URL=http://localhost:8080/v1 # optional override for compatible endpoints
  3. Start GraphLake. When the server starts, it checks for the OPENAI_API_KEY (and optional OPENAI_MODEL or OPENAI_BASE_URL) environment variables and automatically enables the OpenAI-backed Talk integration. No code changes are required. The default client points at https://api.openai.com/v1; setting OPENAI_BASE_URL lets you route requests to a local or proxy OpenAI-compatible deployment.

Restart GraphLake after setting the environment variables. Every /stores/:store/talk request and the UI now send prompts to OpenAI instead of the built-in mock model.

Upload a Schema

The Talk workflow performs best when it can reference SHACL shapes that describe the vocabulary in your graphs. Import the schema into its own graph so the LLM receives structured context:

curl -X POST \
  -H "Content-Type: text/turtle" \
  --data-binary @schema.ttl \
  "http://localhost:7642/stores/demo/import?graph=https://example.org/shapes"

Schema graphs live alongside regular data graphs, so you can version them with branches or keep multiple shape sets for different applications.

Annotate the Schema

Enriching the SHACL shapes with human-readable annotations dramatically improves the queries generated by the LLM. Add rdfs:label, rdfs:comment, or sh:description triples that explain how the classes and properties should be used:

ex:PersonShape a sh:NodeShape ;
  sh:targetClass ex:Person ;
  rdfs:label "Person" ;
  sh:description "Customer or prospect record" ;
  sh:property [
    sh:path ex:email ;
    rdfs:label "Email address" ;
    sh:description "Primary contact e-mail for the person" ;
    sh:datatype xsd:string ;
  ] .

These annotations become part of the prompt sent to the model, helping it choose the correct predicates and filters when building SPARQL.

Use the Talk UI

  1. Open the page. Browse to /talk.html (or choose Talk in the left navigation of the GraphLake UI). The form automatically reuses the most recent store and branch selections.
  2. Choose context. Enter the store name, optional branch, and the schema graph you imported earlier (for example https://example.org/shapes). Supplying a schema graph makes the LLM prompt more precise.
  3. Ask a question. Type a natural-language request such as “List active customers created this month” and press Ask. The UI posts to /stores/:store/talk using the OpenAI-backed generator.
  4. Review the output. The generated SPARQL appears in the Query panel and the execution results display beneath it. Errors returned by the API (for example schema misconfiguration) are rendered in the results panel so you can adjust the prompt or fix the data.
  5. Iterate. Refine the question, adjust annotations, or point at a different branch or schema graph until the query matches your intent. Once you are satisfied you can copy the SPARQL statement into scripts or stored queries.

Because the Talk UI shares browser state with the rest of the console, any stores, branches, or schema graphs selected on other pages stay pre-filled, streamlining exploratory workflows.

Agents

GraphLake can run background agents defined in a configuration file.

LLM Agent

The LLM agent watches the graph for trigger events, asks a large language model to reason about next steps, and then persists the resulting plan back into GraphLake. This section walks through a complete example that you can copy and adapt for your own automations.

Sample data to load

Create a small Turtle file that models an upcoming task and a deadline resource. The task description gives the LLM enough context to plan additional actions.

@prefix ex: <https://example.org/project/> .
@prefix schema: <https://schema.org/> .

ex:Task123 a schema:Action ;
  schema:name "Prepare quarterly review" ;
  schema:description "Compile financial metrics, gather customer feedback, and rehearse the executive presentation." ;
  ex:deadline ex:Deadline123 .

ex:Deadline123 a ex:TaskDeadline ;
  schema:dueDate "2025-03-15"^^schema:Date ;
  schema:description "Quarterly review deck must be ready for the March 15 board meeting." .

Load the data into a working graph with the import endpoint:

curl -X POST \
  "http://localhost:7642/stores/myStore/import?graph=https://example.org/project" \
  -H "Content-Type: text/turtle" \
  --data-binary @task.ttl

Configure the LLM agent

Define an agent configuration YAML file. The llm agent type listens to a SPARQL trigger_query. When the query returns bindings, the agent builds a prompt template using the bindings and calls the configured language model provider. The completion is then written back into the graph with an update_template.

type: llm
tick: 30s
graph: https://example.org/project
trigger_query: |
  PREFIX ex: <https://example.org/project/>
  PREFIX schema: <https://schema.org/>
  SELECT ?task ?taskName ?taskDescription ?deadline
  WHERE {
    ?task a schema:Action ;
          schema:name ?taskName ;
          schema:description ?taskDescription ;
          ex:deadline ?deadline .
    ?deadline a ex:TaskDeadline .
    FILTER NOT EXISTS { ?task ex:generatedPlan ?plan }
  }
llm:
  provider: openai
  model: gpt-4.1
  api_key_env: OPENAI_API_KEY
prompt_template: |
  You are a planning assistant. The task "{{taskName}}" has the description:
  {{taskDescription}}

  The deadline resource is {{deadline}}. Suggest up to three concrete follow-up events the team should
  schedule in advance so they are ready by the due date. Respond in JSON with an array named "events" and
  include fields "title", "purpose", and "leadTimeDays".
update_template: |
  PREFIX ex: <https://example.org/project/>
  INSERT {
    GRAPH <https://example.org/project> {
      {{#each events}}
        _:event ex:belongsTo {{task}} ;
                ex:title "{{title}}" ;
                ex:purpose "{{purpose}}" ;
                ex:leadTimeDays {{leadTimeDays}} .
      {{/each}}
      {{task}} ex:generatedPlan _:plan .
      _:plan ex:rawResponse "{{raw_response}}" .
    }
  }
  WHERE {}

The configuration instructs the agent to poll the graph every 30 seconds, collect tasks with deadlines, and persist the LLM response as a set of event nodes connected to the task. The raw_response field is optional but useful for auditing.

Initialise and run the agent

  1. Export the configuration. Save the YAML to agents/task-planner.yaml and ensure the file is available on the server running GraphLake.
  2. Configure credentials. Set the environment variable referenced in api_key_env (for example export OPENAI_API_KEY=sk-...).
  3. Register the agent. Call the agent API to add or reload the configuration.
    curl -X POST \
      "http://localhost:7642/stores/myStore/agents" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "task-planner",
        "configPath": "agents/task-planner.yaml"
      }'
  4. Start processing. Agents run inside the GraphLake application process. Restart the service or call the /stores/:store/agents/:name/start endpoint if the agent is not already active.

Monitor agent activity

Use the agent endpoints to inspect the current status and recent actions:

# Check agent state and last tick
curl http://localhost:7642/stores/myStore/agents/task-planner

# Retrieve execution history (most recent 10 runs)
curl "http://localhost:7642/stores/myStore/agents/task-planner/history?limit=10"

Logs also include each LLM invocation with the rendered prompt and model response. Tail the GraphLake server logs to follow the agent in real time:

docker logs -f graphlake-server | grep "task-planner"

View graph updates

After the LLM generates a plan, query the graph to review the generated events and raw output:

PREFIX ex: <https://example.org/project/>
PREFIX schema: <https://schema.org/>

SELECT ?taskName ?eventTitle ?purpose ?leadTime
WHERE {
  ?task schema:name ?taskName ;
        ex:generatedPlan ?plan .
  ?event ex:belongsTo ?task ;
         ex:title ?eventTitle ;
         ex:purpose ?purpose ;
         ex:leadTimeDays ?leadTime .
}
ORDER BY ?taskName ?eventTitle

The query reveals the concrete follow-up events suggested by the LLM. Because the agent records the raw response on ?plan, you can also retrieve it for audit purposes:

PREFIX ex: <https://example.org/project/>
SELECT ?task ?raw WHERE {
  ?task ex:generatedPlan ?plan .
  ?plan ex:rawResponse ?raw .
}

Combine these queries with dashboard visualisations or alerts to build a complete feedback loop between the graph, the agent, and your operational systems.

SPARQL Agent

Periodically executes a SELECT trigger_query and runs a templated update_query for each result. Placeholders like {{var}} are replaced with bindings from the trigger.

type: sparql
tick: 10s
graph: urn:graph1
trigger_query: |
  SELECT ?s WHERE { ?s <urn:status> "open" }
update_query: |
  WITH <urn:graph1>
  DELETE { {{s}} <urn:status> "open" . }
  INSERT { {{s}} <urn:status> "done" . }
  WHERE {}

Security

GraphLake uses signed JWTs to control access to the API. Tokens are obtained from the API by authenticating either with a JWT signed by a local private key or with a username and password. Users are managed on the server and granted read, write, and owner access to stores and graphs.

When a security manager is enabled, certain endpoints require a valid JWT token for authorization. The server applies ACL checks to ensure that a user can only access the stores and graphs permitted by their assigned rules. If you receive a 401 (Unauthorized) or 403 (Forbidden) response, check that your token is valid and has not expired, and that your user's rules grant access to the store or graph you are requesting.

The developer edition runs unsecured.

Monitoring

GraphLake writes structured logs to stderr; when running in Docker these are captured in the Docker logs. We recommend shipping these logs to your preferred log management system and configuring any notifications you require.

VSCode client

To enable a great developer experience we provide a VS Code extension for managing stores and graphs, running queries, and managing users. All operations can also be performed with curl or programmatically over HTTP in any language.

Installation

Download the extension package from https://graphlake.net/downloads/graphlake-1.0.0.vsix (see Downloads above) and install it in VS Code via the Extensions view using the "Install from VSIX..." command.

SHACL Support

GraphLake includes native SHACL validation so data can be checked against a schema before use.

Uploading Shapes to a Graph

SHACL shapes are stored in a regular named graph. Upload the shapes using the import endpoint and supply the graph name where the shapes should reside:

curl -X POST \
  "http://localhost:7642/stores/myStore/import?graph=https://example.org/shapes" \
  -H "Content-Type: text/turtle" \
  --data-binary @shapes.ttl

Invoking Validation

To validate data, post the schema graph and data graph to the /stores/:store/validate endpoint. Optionally include a branch query parameter to validate a non-main branch.

curl -X POST "http://localhost:7642/stores/myStore/validate" \
  -H "Content-Type: application/json" \
  -d '{
    "schema": "https://example.org/shapes",
    "graph": "https://example.org/data"
  }'

The response is a SHACL validation report listing conforming status and any violations.

Common Shape Examples

@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <https://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:PersonShape
  a sh:NodeShape ;
  sh:targetClass ex:Person ;
  sh:property [
    sh:path ex:name ;
    sh:datatype xsd:string ;
    sh:minCount 1 ;
  ] ;
  sh:property [
    sh:path ex:age ;
    sh:datatype xsd:integer ;
    sh:minInclusive 0 ;
  ] .

ex:EmailShape
  a sh:NodeShape ;
  sh:targetClass ex:Person ;
  sh:property [
    sh:path ex:email ;
    sh:pattern "^[^@]+@[^@]+$" ;
  ] .

ex:KnowsShape
  a sh:NodeShape ;
  sh:targetClass ex:Person ;
  sh:property [
    sh:path ex:knows ;
    sh:class ex:Person ;
  ] .

Multi-Node Operation

Multi-node operation allows GraphLake to scale horizontally by distributing workloads across multiple nodes. This setup is particularly beneficial for large-scale deployments where high availability, fault tolerance, and load balancing are critical. By leveraging multiple nodes, organizations can ensure that their GraphLake instance remains responsive even under heavy workloads.

Benefits of Multi-Node Operation

Configuration Steps

  1. Enable Multi-Node Mode: Set the environment variable GRAPHLAKE_MULTI_NODE to true or use the --multinode flag when starting the application.
  2. Configure Backend Storage: Ensure that all nodes have access to a shared storage backend (e.g., S3 or Azure Blob Storage) to maintain data consistency.
  3. Synchronize Configuration: Use the same configuration file or environment variables across all nodes to ensure uniform behavior.
  4. Set Up a Load Balancer: Deploy a load balancer in front of the nodes to distribute incoming requests evenly.
  5. Monitor and Scale: Use monitoring tools to track node performance and add or remove nodes as needed to meet demand.

Example Command

To start a node in multi-node mode using Docker:

docker run -d -p 7642:7642 -v /tmp/graphlake-store:/store \
  -e GRAPHLAKE_MULTI_NODE=true \
  dataplatformsolutions/graphlake:latest --storepath /store --port 7642

For a full walkthrough, see Multi-Node Tutorial.

Gateway for Elastic Nodes

The gateway package exposes a lightweight front end capable of starting GraphLake nodes on demand. A Provider interface abstracts the underlying infrastructure so implementations can target Kubernetes, AWS, Azure, or other platforms.

Provider Configuration

Gateway Endpoints

The gateway proxies requests to GraphLake nodes and provides admin endpoints for user and token management:

POST /workloads/{name}/start
POST /workloads/{name}/stop
GET|POST|... /workloads/{name}/

POST /admin/authenticate/password
POST /admin/authenticate/jwt
POST /admin/users
DELETE /admin/users/{username}
POST /admin/users/{username}/keypair
POST /admin/users/{username}/rules

Monitoring and Scaling

The gateway manages workloads and nodes dynamically.

Example Commands

Run the gateway service using the cmd/gateway binary. It exposes REST endpoints to start and stop workloads and proxies GraphLake API calls:

POST /workloads/{name}/start
POST /workloads/{name}/stop
GET|POST|... /workloads/{name}/

To list active workloads and their nodes:

GET /workloads                     # list active workload names
GET /workloads/{name}/nodes        # list nodes for a workload with health info
GET /workloads/{name}/nodes/{id}/health
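As an illustrative sketch (the gateway host and port depend on your deployment; 8080 is an assumption, and the workload name is illustrative):

# Start the workload "analytics"
curl -X POST http://localhost:8080/workloads/analytics/start

# List its nodes with health information
curl http://localhost:8080/workloads/analytics/nodes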

Support

If you have issues with GraphLake, please reach out to us on Discord, file an issue on the public GitHub repository, or send an email to contact@dataplatformsolutions.com.