https://github.com/dgraph-io/dgraph
# Dgraph

Dgraph is a horizontally scalable and distributed GraphQL database with a graph backend. It provides ACID transactions, consistent replication, and linearizable reads, and is built from the ground up for high-performance graph queries. As a native GraphQL database, Dgraph tightly controls how data is arranged on disk to optimize query performance and throughput, reducing disk seeks and network calls in clustered deployments.

Dgraph supports both GraphQL and DQL (Dgraph Query Language) query syntaxes, responds in JSON and Protocol Buffers, and communicates over gRPC and HTTP. It is designed to provide Google production-level scale and throughput with low latency for real-time user queries over terabytes of structured data. The architecture consists of Alpha nodes (data storage and serving) and Zero nodes (cluster coordination), enabling horizontal scaling across multiple machines.

## Running Dgraph with Docker

Start a standalone Dgraph cluster for development and testing.

```bash
# Pull the standalone image (bundles Zero and Alpha for local development)
docker pull dgraph/standalone:latest

# Run a standalone cluster with persistent storage
docker run -it -p 8080:8080 -p 9080:9080 -v ~/dgraph:/dgraph dgraph/standalone:latest
```

## HTTP Query API

Execute DQL queries against Dgraph via the HTTP `/query` endpoint. It supports JSON and DQL content types with optional read-only and best-effort modes.
```bash
# Simple query: find all people who have friends
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{ "query": "{ people(func: has(friend)) { uid name friend { name } } }" }'

# Query with variables
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{ "query": "query findPerson($name: string) { person(func: eq(name, $name)) { uid name age friend { name } } }", "variables": {"$name": "Alice"} }'

# Read-only query (no transaction overhead)
curl -X POST "http://localhost:8080/query?ro=true" \
  -H "Content-Type: application/dql" \
  -d '{ movies(func: has(director)) @filter(gt(release_date, "2020-01-01")) { uid title director { name } } }'

# Best-effort query (eventually consistent, fastest)
curl -X POST "http://localhost:8080/query?be=true" \
  -H "Content-Type: application/dql" \
  -d '{ q(func: type(Person)) { count(uid) } }'
```

## HTTP Mutation API

Perform data mutations via the `/mutate` endpoint. It supports the JSON format for set/delete operations, RDF N-Quads, and optional conditional mutations.
```bash
# Insert data using JSON format
curl -X POST "http://localhost:8080/mutate?commitNow=true" \
  -H "Content-Type: application/json" \
  -d '{
    "set": [
      {
        "uid": "_:alice",
        "dgraph.type": "Person",
        "name": "Alice",
        "age": 30,
        "friend": [
          {
            "uid": "_:bob",
            "dgraph.type": "Person",
            "name": "Bob",
            "age": 25
          }
        ]
      }
    ]
  }'

# Delete data
curl -X POST "http://localhost:8080/mutate?commitNow=true" \
  -H "Content-Type: application/json" \
  -d '{ "delete": [ { "uid": "0x1", "friend": { "uid": "0x2" } } ] }'

# Conditional mutation (upsert)
curl -X POST "http://localhost:8080/mutate?commitNow=true" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "{ v as var(func: eq(email, \"alice@example.com\")) }",
    "cond": "@if(eq(len(v), 0))",
    "set": [
      { "dgraph.type": "Person", "name": "Alice", "email": "alice@example.com" }
    ]
  }'

# RDF N-Quad format mutation
curl -X POST "http://localhost:8080/mutate?commitNow=true" \
  -H "Content-Type: application/rdf" \
  -d '
    _:alice <name> "Alice" .
    _:alice <dgraph.type> "Person" .
    _:alice <age> "30"^^<xs:int> .
    _:alice <friend> _:bob .
    _:bob <name> "Bob" .
    _:bob <dgraph.type> "Person" .
  '
```

## Schema Alter API

Modify the database schema using the `/alter` endpoint. Define predicates with types, indexes, and directives.

```bash
# Define schema with indexes (/alter takes the raw schema text as the request body)
curl -X POST http://localhost:8080/alter -d '
name: string @index(exact, term, fulltext) .
age: int @index(int) .
email: string @index(exact) @unique .
friend: [uid] @reverse @count .
created_at: datetime @index(hour) .
location: geo @index(geo) .
embedding: float32vector @index(hnsw(metric:"cosine", exponent:"4")) .

type Person {
  name
  age
  email
  friend
  created_at
  location
  embedding
}
'

# Drop all data (preserves schema)
curl -X POST http://localhost:8080/alter \
  -H "Content-Type: application/json" \
  -d '{"drop_op": "DATA"}'

# Drop a specific predicate
curl -X POST http://localhost:8080/alter \
  -H "Content-Type: application/json" \
  -d '{"drop_attr": "name"}'

# Drop a specific type
curl -X POST http://localhost:8080/alter \
  -H "Content-Type: application/json" \
  -d '{"drop_op": "TYPE", "drop_value": "Person"}'
```

## Transaction API

Manage transactions manually, using the `/commit` endpoint for multi-statement transactions.

```bash
# Start a transaction and capture its start timestamp
RESPONSE=$(curl -s -X POST http://localhost:8080/mutate \
  -H "Content-Type: application/json" \
  -d '{ "set": [{"uid": "_:temp", "name": "Temp"}] }')
START_TS=$(echo "$RESPONSE" | jq -r '.extensions.txn.start_ts')

# Perform an additional mutation in the same transaction
curl -X POST "http://localhost:8080/mutate?startTs=$START_TS" \
  -H "Content-Type: application/json" \
  -d '{ "set": [{"uid": "_:temp2", "name": "Temp2"}] }'

# Commit the transaction
curl -X POST "http://localhost:8080/commit?startTs=$START_TS" \
  -H "Content-Type: application/json" \
  -d '{"keys": [], "preds": []}'

# Or abort it
curl -X POST "http://localhost:8080/commit?startTs=$START_TS&abort=true" \
  -H "Content-Type: application/json" \
  -d '{}'
```

## GraphQL API

Dgraph provides a native GraphQL endpoint at `/graphql` once a GraphQL schema has been uploaded via the admin endpoint.

```bash
# Upload a GraphQL schema via the admin endpoint
curl -X POST http://localhost:8080/admin/schema \
  -H "Content-Type: application/graphql" \
  -d '
type Person {
  id: ID!
  name: String! @search(by: [hash, term])
  age: Int @search
  email: String! @id
  friends: [Person] @hasInverse(field: friends)
}
'

# Execute a GraphQL query
curl -X POST http://localhost:8080/graphql \
  -H "Content-Type: application/json" \
  -d '{ "query": "query { queryPerson(filter: {name: {anyofterms: \"Alice\"}}) { id name age friends { name } } }" }'

# GraphQL mutation
curl -X POST http://localhost:8080/graphql \
  -H "Content-Type: application/json" \
  -d '{ "query": "mutation { addPerson(input: [{ name: \"Alice\", age: 30, email: \"alice@example.com\" }]) { person { id name } } }" }'
```

## Admin API

The admin GraphQL API at `/admin` provides cluster management operations, including health checks, backups, and configuration.

```bash
# Check cluster health
curl -X POST http://localhost:8080/admin \
  -H "Content-Type: application/json" \
  -d '{ "query": "{ health { instance address status version uptime } }" }'

# Get cluster state
curl -X POST http://localhost:8080/admin \
  -H "Content-Type: application/json" \
  -d '{ "query": "{ state { groups { id members { id addr leader } tablets { predicate groupId } } zeros { id addr leader } } }" }'

# Trigger a backup to S3
curl -X POST http://localhost:8080/admin \
  -H "Content-Type: application/json" \
  -d '{ "query": "mutation { backup(input: { destination: \"s3://my-bucket/dgraph-backup\", accessKey: \"ACCESS_KEY\", secretKey: \"SECRET_KEY\" }) { response { code message } taskId } }" }'

# Export data
curl -X POST http://localhost:8080/admin \
  -H "Content-Type: application/json" \
  -d '{ "query": "mutation { export(input: {format: \"json\", destination: \"/dgraph/export\"}) { response { code message } taskId } }" }'

# Update configuration
curl -X POST http://localhost:8080/admin \
  -H "Content-Type: application/json" \
  -d '{ "query": "mutation { config(input: {cacheMb: 2048, logDQLRequest: true}) { response { code message } } }" }'
```

## ACL and Authentication

When ACL is enabled, log in and manage access control using JWT-based authentication.
```bash
# Log in to obtain JWT tokens
curl -X POST http://localhost:8080/login \
  -H "Content-Type: application/json" \
  -d '{ "userid": "groot", "password": "password" }'
# Response: {"data":{"accessJWT":"...", "refreshJWT":"..."}}

# Use the access token for authenticated requests
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -H "X-Dgraph-AccessToken: <ACCESS_JWT>" \
  -d '{ "query": "{ users(func: type(dgraph.type.User)) { dgraph.xid } }" }'

# Create a user via the ACL command
dgraph acl add -a localhost:9080 -u groot -p password --new-user alice --new-password secret

# Create a group
dgraph acl add -a localhost:9080 -u groot -p password --new-group developers

# Grant permissions (7 = read + write + modify)
dgraph acl mod -a localhost:9080 -u groot -p password \
  --group developers --pred name --perm 7
```

## Bulk Loader

Load large datasets offline using the bulk loader for initial data population. It generates p directories that Alpha nodes can use directly.

```bash
# Bulk load RDF data
dgraph bulk \
  -f data.rdf.gz \
  -s schema.txt \
  --map_shards=4 \
  --reduce_shards=2 \
  --zero=localhost:5080 \
  --out=./out

# Bulk load JSON data with a GraphQL schema
dgraph bulk \
  -f data.json.gz \
  -s schema.txt \
  -g graphql-schema.graphql \
  --format=json \
  --zero=localhost:5080 \
  --out=./out \
  --replace_out

# Start Alpha with the bulk-loaded data
dgraph alpha --postings=./out/0/p --zero=localhost:5080
```

## Live Loader

Stream data into a running Dgraph cluster for incremental loading with upsert support.
```bash
# Live load RDF data
dgraph live \
  -f data.rdf.gz \
  -s schema.txt \
  --alpha=localhost:9080 \
  --batch=1000 \
  --conc=10

# Live load with upsert on a specific predicate
dgraph live \
  -f data.json.gz \
  --format=json \
  --alpha=localhost:9080 \
  --upsertPredicate=email \
  --batch=500

# Live load with authentication
dgraph live \
  -f data.rdf.gz \
  -s schema.txt \
  --alpha=localhost:9080 \
  --auth_token=<API_TOKEN>
```

## gRPC Client (Go)

Use the official Go client library (dgo) for programmatic access with full transaction support.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"

	"github.com/dgraph-io/dgo/v250"
	"github.com/dgraph-io/dgo/v250/protos/api"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

type Person struct {
	UID     string   `json:"uid,omitempty"`
	Name    string   `json:"name,omitempty"`
	Age     int      `json:"age,omitempty"`
	Friends []Person `json:"friend,omitempty"`
	DType   []string `json:"dgraph.type,omitempty"`
}

func main() {
	// Connect to Dgraph
	conn, err := grpc.NewClient("localhost:9080",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	dgraphClient := dgo.NewDgraphClient(api.NewDgraphClient(conn))

	// Set schema
	op := &api.Operation{
		Schema: `
			name: string @index(exact) .
			age: int .
			friend: [uid] @reverse .

			type Person {
				name
				age
				friend
			}
		`,
	}
	if err := dgraphClient.Alter(context.Background(), op); err != nil {
		log.Fatal(err)
	}

	// Create a transaction
	txn := dgraphClient.NewTxn()
	defer txn.Discard(context.Background())

	// Mutation
	p := Person{
		Name:  "Alice",
		Age:   30,
		DType: []string{"Person"},
		Friends: []Person{
			{Name: "Bob", Age: 25, DType: []string{"Person"}},
		},
	}
	pb, _ := json.Marshal(p)
	mu := &api.Mutation{SetJson: pb, CommitNow: true}
	assigned, err := txn.Mutate(context.Background(), mu)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Created UIDs: %+v\n", assigned.Uids)

	// Query
	txn = dgraphClient.NewReadOnlyTxn()
	q := `{
		people(func: type(Person)) {
			uid
			name
			age
			friend { name }
		}
	}`
	resp, err := txn.Query(context.Background(), q)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Response: %s\n", resp.Json)
}
```

## Vector Similarity Search

Dgraph supports HNSW-based vector similarity search for AI/ML applications with embeddings.

```bash
# Define a vector schema
curl -X POST http://localhost:8080/alter -d '
embedding: float32vector @index(hnsw(metric:"cosine", exponent:"4")) .
title: string @index(term) .

type Document {
  title
  embedding
}
'

# Insert a document with an embedding
curl -X POST "http://localhost:8080/mutate?commitNow=true" \
  -H "Content-Type: application/json" \
  -d '{ "set": [{ "dgraph.type": "Document", "title": "Machine Learning Basics", "embedding": [0.1, 0.2, 0.3, 0.4, 0.5] }] }'

# Query for similar documents
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{ "query": "{ similar(func: similar_to(embedding, 5, \"[0.1, 0.2, 0.3, 0.4, 0.5]\")) { uid title vector_distance } }" }'
```

## Cluster Deployment

Deploy a production Dgraph cluster with Zero and Alpha nodes using Docker Compose.
```yaml
# docker-compose.yml
version: "3.8"
services:
  zero1:
    image: dgraph/dgraph:latest
    command: dgraph zero --my=zero1:5080 --replicas=3 --raft="idx=1"
    ports:
      - "5080:5080"
      - "6080:6080"
    volumes:
      - zero1-data:/dgraph
  alpha1:
    image: dgraph/dgraph:latest
    command: dgraph alpha --my=alpha1:7080 --zero=zero1:5080 --security="whitelist=0.0.0.0/0"
    ports:
      - "8080:8080"
      - "9080:9080"
    volumes:
      - alpha1-data:/dgraph
    depends_on:
      - zero1
  alpha2:
    image: dgraph/dgraph:latest
    command: dgraph alpha --my=alpha2:7080 --zero=zero1:5080 --port_offset=1
    ports:
      - "8081:8081"
      - "9081:9081"
    volumes:
      - alpha2-data:/dgraph
    depends_on:
      - zero1
volumes:
  zero1-data:
  alpha1-data:
  alpha2-data:
```

Dgraph is a strong foundation for applications built on complex graph relationships: social networks, recommendation engines, knowledge graphs, fraud detection systems, and any use case involving interconnected data with real-time query requirements. Its native GraphQL support makes it particularly suitable for modern web and mobile applications that already use GraphQL on the frontend.

Integration patterns typically involve direct HTTP/gRPC connections from application servers, with the bulk loader used for initial data migration and the live loader for ongoing ETL. For production deployments, multi-node clusters provide high availability and horizontal scalability, with data automatically sharded across Alpha groups under the coordination of Zero nodes. The ACL system enables fine-grained access control for multi-tenant applications, while the backup and restore capabilities ensure data durability and disaster recovery.