Knowledge Graphs and AI: Structured Reasoning with LLMs and Graph Databases

Knowledge graphs give LLMs access to structured, verifiable facts, which helps reduce hallucinations, enables multi-hop reasoning, and makes AI answers auditable. This guide covers how to combine graph databases with LLMs in production.

Why knowledge graphs complement LLMs

LLMs alone	LLMs + knowledge graphs
May hallucinate facts	Facts grounded in verified graph data
Static knowledge cutoff	Real-time graph queries
Weak multi-hop reasoning	Graph traversal handles relationships
Opaque provenance	Every fact has a source node
Poor entity disambiguation	Unique node IDs resolve ambiguity

Graph database options

Database	Query language	Best for
Neo4j	Cypher	General-purpose, excellent tooling
Amazon Neptune	Gremlin / openCypher / SPARQL	AWS-native; property graphs and RDF
ArangoDB	AQL	Multi-model (graph + document)
TigerGraph	GSQL	Large-scale analytics
Memgraph	Cypher	Real-time, in-memory

Building a knowledge graph with LLM extraction

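// Step 1: Extract entities and relationships from text
import OpenAI from 'openai';

// The client reads OPENAI_API_KEY from the environment by default
const openai = new OpenAI();
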
interface GraphExtractionResult {
  entities: Array<{ id: string; type: string; name: string; properties: Record<string, string> }>;
  relationships: Array<{ source: string; target: string; type: string; properties?: Record<string, string> }>;
}

async function extractGraph(text: string): Promise<GraphExtractionResult> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'system',
        content: `Extract entities and relationships from text for a knowledge graph.
Entity types: Person, Company, Product, Technology, Location, Event
Relationship types: WORKS_AT, FOUNDED, USES, PART_OF, LOCATED_IN, ACQUIRED`,
      },
      {
        role: 'user',
        content: `Extract from: "${text}"`,
      },
    ],
    response_format: {
      type: 'json_schema',
      json_schema: {
        name: 'GraphExtraction',
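        // Strict mode requires additionalProperties: false on every object and every
        // field listed in `required`, so the free-form `properties` maps below need
        // non-strict mode. Validate the parsed result before using it.
        strict: false,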
        schema: {
          type: 'object',
          properties: {
            entities: {
              type: 'array',
              items: {
                type: 'object',
                properties: {
                  id:         { type: 'string' },
                  type:       { type: 'string' },
                  name:       { type: 'string' },
                  properties: { type: 'object', additionalProperties: { type: 'string' } },
                },
                required: ['id', 'type', 'name', 'properties'],
                additionalProperties: false,
              },
            },
            relationships: {
              type: 'array',
              items: {
                type: 'object',
                properties: {
                  source:     { type: 'string' },
                  target:     { type: 'string' },
                  type:       { type: 'string' },
                  properties: { type: 'object', additionalProperties: { type: 'string' } },
                },
                required: ['source', 'target', 'type'],
                additionalProperties: false,
              },
            },
          },
          required: ['entities', 'relationships'],
          additionalProperties: false,
        },
      },
    },
  });

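  // content is null when the model refuses or returns nothing
  const content = response.choices[0].message.content;
  if (!content) throw new Error('Model returned no content (possible refusal)');
  return JSON.parse(content) as GraphExtractionResult;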
}
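
Model output can contain duplicate ids, types outside the prompt's lists, or relationships that point at entities which were never extracted. One option is to clean the result before loading it. The sketch below is a hypothetical helper (`sanitizeExtraction` and the two allowlist constants are not part of the original code) and assumes the entity and relationship types named in the system prompt are the only ones you want to keep.

```typescript
// Hypothetical helper: cleans an extraction result before it is loaded.
// The interfaces mirror GraphExtractionResult so the block stands alone.
interface ExtractedEntity { id: string; type: string; name: string; properties: Record<string, string> }
interface ExtractedRelationship { source: string; target: string; type: string; properties?: Record<string, string> }
interface Extraction { entities: ExtractedEntity[]; relationships: ExtractedRelationship[] }

// Assumption: these lists match the types named in the extraction prompt.
const ALLOWED_ENTITY_TYPES = new Set(['Person', 'Company', 'Product', 'Technology', 'Location', 'Event']);
const ALLOWED_RELATIONSHIP_TYPES = new Set(['WORKS_AT', 'FOUNDED', 'USES', 'PART_OF', 'LOCATED_IN', 'ACQUIRED']);

function sanitizeExtraction(raw: Extraction): Extraction {
  // Keep the first entity seen for each id, and only entities with an allowed type.
  const byId = new Map<string, ExtractedEntity>();
  for (const entity of raw.entities) {
    if (ALLOWED_ENTITY_TYPES.has(entity.type) && !byId.has(entity.id)) {
      byId.set(entity.id, entity);
    }
  }

  // Keep relationships whose type is allowed and whose endpoints both survived.
  const relationships = raw.relationships.filter(
    rel =>
      ALLOWED_RELATIONSHIP_TYPES.has(rel.type) &&
      byId.has(rel.source) &&
      byId.has(rel.target)
  );

  return { entities: Array.from(byId.values()), relationships };
}
```

Dropping invalid items silently is a design choice; logging or counting what was removed may be more useful when tuning the extraction prompt.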

// Step 2: Load into Neo4j
import neo4j from 'neo4j-driver';

const driver = neo4j.driver(
  process.env.NEO4J_URI!,
  neo4j.auth.basic(process.env.NEO4J_USER!, process.env.NEO4J_PASSWORD!)
);

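// Labels and relationship types cannot be passed as Cypher parameters, so they are
// interpolated into the query text. They come from model output, so check them
// against an allowlist before they reach the query.
const ENTITY_LABELS = new Set(['Person', 'Company', 'Product', 'Technology', 'Location', 'Event']);
const RELATIONSHIP_TYPES = new Set(['WORKS_AT', 'FOUNDED', 'USES', 'PART_OF', 'LOCATED_IN', 'ACQUIRED']);

async function loadGraphToNeo4j(extracted: GraphExtractionResult) {
  const session = driver.session();
  try {
    // Upsert entities
    for (const entity of extracted.entities) {
      if (!ENTITY_LABELS.has(entity.type)) continue; // skip unknown labels
      await session.run(
        // id and name are set last so extracted properties cannot overwrite them
        `MERGE (n:${entity.type} {id: $id})
         SET n += $properties, n.id = $id, n.name = $name`,
        { id: entity.id, name: entity.name, properties: entity.properties }
      );
    }

    // Upsert relationships (only created when both endpoints already exist)
    for (const rel of extracted.relationships) {
      if (!RELATIONSHIP_TYPES.has(rel.type)) continue; // skip unknown relationship types
      await session.run(
        `MATCH (a {id: $source}), (b {id: $target})
         MERGE (a)-[r:${rel.type}]->(b)
         SET r += $properties`,
        { source: rel.source, target: rel.target, properties: rel.properties ?? {} }
      );
    }
  } finally {
    await session.close();
  }
}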
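
Running one query per entity costs a round trip each time. For larger extractions, a common alternative is to group entities by label and send each group as a single `UNWIND` statement. The sketch below only builds the statements; `buildEntityBatches`, `EntityRow`, and `KNOWN_LABELS` are hypothetical names, and the label list is assumed to match the extraction prompt.

```typescript
// Hypothetical helper: builds one parameterized UNWIND statement per entity label.
interface EntityInput { id: string; type: string; name: string; properties: Record<string, string> }
interface EntityRow { id: string; name: string; properties: Record<string, string> }
interface BatchStatement { cypher: string; params: { rows: EntityRow[] } }

// Assumption: the same labels the extraction prompt allows.
const KNOWN_LABELS = new Set(['Person', 'Company', 'Product', 'Technology', 'Location', 'Event']);

function buildEntityBatches(entities: EntityInput[]): BatchStatement[] {
  const rowsByLabel = new Map<string, EntityRow[]>();

  for (const entity of entities) {
    // Labels are interpolated into the query text, so only known labels pass.
    if (!KNOWN_LABELS.has(entity.type)) continue;
    const rows = rowsByLabel.get(entity.type) ?? [];
    rows.push({ id: entity.id, name: entity.name, properties: entity.properties });
    rowsByLabel.set(entity.type, rows);
  }

  // One statement per label; the rows themselves travel as a query parameter.
  return Array.from(rowsByLabel.entries()).map(([label, rows]) => ({
    cypher: `UNWIND $rows AS row
MERGE (n:${label} {id: row.id})
SET n += row.properties, n.id = row.id, n.name = row.name`,
    params: { rows },
  }));
}
```

Each statement can then be passed to `session.run(statement.cypher, statement.params)`. Very large groups may still need to be split into chunks to keep transactions small.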

Text-to-Cypher: natural language → graph queries

// Convert natural language questions to Cypher queries
async function textToCypher(question: string, schema: string): Promise<string> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'system',
        content: `You are a Cypher query generator for Neo4j.
Convert natural language questions into valid Cypher queries.
Return only the Cypher query, no explanation.

Graph schema:
${schema}`,
      },
      { role: 'user', content: question },
    ],
    max_tokens: 300,
    temperature: 0,
  });

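  // Models sometimes wrap the query in a Markdown code fence; strip it if present
  const raw = (response.choices[0].message.content ?? '').trim();
  return raw.replace(/^```(?:cypher)?\s*/i, '').replace(/\s*```$/, '').trim();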
}

// Execute the generated query
async function graphQuery(question: string) {
  const schema = await getGraphSchema();
  const cypherQuery = await textToCypher(question, schema);

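  // Validate: only allow read queries. A prefix check alone is not enough, because
  // "MATCH (n) DETACH DELETE n" also starts with MATCH, so write and procedure
  // clauses are rejected as well. This keyword filter is conservative and can reject
  // a valid query that uses one of these words as a name. The READ access mode on the
  // session below adds a further safeguard.
  const startsWithRead = /^\s*(MATCH|OPTIONAL\s+MATCH|WITH|UNWIND)\b/i;
  const forbidden = /\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|FOREACH|CALL|LOAD\s+CSV)\b/i;
  if (!startsWithRead.test(cypherQuery) || forbidden.test(cypherQuery)) {
    throw new Error('Only read-only queries are permitted');
  }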

  const session = driver.session({ defaultAccessMode: neo4j.session.READ });
  try {
    const result = await session.run(cypherQuery);
    return result.records.map(r => r.toObject());
  } finally {
    await session.close();
  }
}

// Example:
// Q: "Who founded companies that use Kubernetes?"
// Generated: MATCH (p:Person)-[:FOUNDED]->(c:Company)-[:USES]->(t:Technology {name: "Kubernetes"})
//            RETURN p.name, c.name

GraphRAG: graph-enhanced retrieval

// Combine vector search (semantic) with graph traversal (relational)
async function graphRAG(question: string): Promise<string> {
  // 1. Semantic search: find relevant seed entities
  const queryEmbedding = await embedText(question);
  const seedNodes = await vectorStore.query({ vector: queryEmbedding, topK: 3 });

  // 2. Graph traversal: expand context through relationships
  const nodeIds = seedNodes.map(n => n.metadata.nodeId);
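  const session = driver.session({ defaultAccessMode: neo4j.session.READ });

  let records;
  try {
    // Two separate OPTIONAL MATCH clauses, so a seed that has one-hop neighbors
    // but no two-hop neighbors still returns its first hop
    ({ records } = await session.run(
      `MATCH (seed)
       WHERE seed.id IN $ids
       OPTIONAL MATCH (seed)-[r1]->(related)
       OPTIONAL MATCH (related)-[r2]->(deeper)
       RETURN seed, r1, related, r2, deeper
       LIMIT 50`,
      { ids: nodeIds }
    ));
  } finally {
    await session.close();
  }

  // 3. Format graph context (one line per path, including the second hop if present)
  const graphContext = records.map(r => {
    const seed    = r.get('seed')?.properties;
    const related = r.get('related')?.properties;
    const deeper  = r.get('deeper')?.properties;
    const rel1    = r.get('r1')?.type;
    const rel2    = r.get('r2')?.type;
    if (!related) return seed?.name;
    const firstHop = `${seed?.name} --[${rel1}]--> ${related?.name}`;
    return deeper ? `${firstHop} --[${rel2}]--> ${deeper?.name}` : firstHop;
  }).filter(Boolean).join('\n');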

  // 4. Generate answer grounded in graph facts
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'system',
        content: 'Answer the question using only the graph facts below. Cite entity names.',
      },
      {
        role: 'user',
        content: `Graph facts:
${graphContext}

Question: ${question}`,
      },
    ],
  });

  return response.choices[0].message.content!;
}
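
A two-hop expansion returns one row per path, so the same first-hop edge can appear many times in the context. One way to keep the prompt compact is to flatten the rows into individual facts, drop duplicates, and cap the total. The sketch below works on plain name strings rather than driver records; `PathRow` and `toUniqueFacts` are hypothetical names, and the default cap of 30 is an arbitrary assumption.

```typescript
// Hypothetical helper: turns path rows into a de-duplicated, capped list of facts.
interface PathRow {
  seed: string;
  r1?: string;      // type of the first-hop relationship
  related?: string; // name of the one-hop neighbor
  r2?: string;      // type of the second-hop relationship
  deeper?: string;  // name of the two-hop neighbor
}

function toUniqueFacts(rows: PathRow[], maxFacts = 30): string[] {
  // A Set keeps insertion order, so earlier rows stay first.
  const facts = new Set<string>();

  for (const row of rows) {
    if (row.r1 && row.related) {
      facts.add(`${row.seed} --[${row.r1}]--> ${row.related}`);
      if (row.r2 && row.deeper) {
        facts.add(`${row.related} --[${row.r2}]--> ${row.deeper}`);
      }
    } else {
      // A seed with no outgoing relationships is still listed by name.
      facts.add(row.seed);
    }
  }

  return Array.from(facts).slice(0, maxFacts);
}
```

Joining the result with newlines gives a `graphContext` string in the same arrow notation the prompt already uses.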

Hallucination reduction with graph grounding

// Verify LLM claims against the knowledge graph
async function verifyFact(claim: string): Promise<{ verified: boolean; source?: string }> {
  // Extract entities from the claim
  const entities = await extractEntitiesFromText(claim);
  if (entities.length < 2) return { verified: false };

  const [entity1, entity2] = entities;

  // Check whether any relationship links the two entities. This confirms that a
  // connection exists, not that it is the specific relationship the claim asserts.
  const session = driver.session({ defaultAccessMode: neo4j.session.READ });
  try {
    const result = await session.run(
      `MATCH (a {name: $e1})-[r]-(b {name: $e2})
       RETURN type(r) AS relationship, a.id AS sourceId
       LIMIT 1`,
      { e1: entity1, e2: entity2 }
    );

    if (result.records.length > 0) {
      const rel = result.records[0].get('relationship');
      const src = result.records[0].get('sourceId');
      // The match is undirected, so the source string does not imply a direction
      return { verified: true, source: `Graph: ${entity1} -[${rel}]- ${entity2} (id: ${src})` };
    }

    return { verified: false };
  } finally {
    await session.close();
  }
}
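
`verifyFact` checks a single claim. To judge a whole answer, one approach is to split it into claims, verify each, and summarize how much of the answer the graph supports. The sketch below covers only the summary step; `ClaimCheck` and `summarizeChecks` are hypothetical names, and the claim splitting is assumed to happen elsewhere.

```typescript
// Hypothetical helper: aggregates per-claim verification results for one answer.
interface ClaimCheck { claim: string; verified: boolean; source?: string }

interface VerificationSummary {
  supportRatio: number;  // share of claims backed by the graph, from 0 to 1
  sources: string[];     // source strings for the verified claims
  unsupported: string[]; // claims with no matching graph relationship
}

function summarizeChecks(checks: ClaimCheck[]): VerificationSummary {
  const supported = checks.filter(c => c.verified);
  const unsupported = checks.filter(c => !c.verified).map(c => c.claim);

  return {
    // An answer with no checkable claims counts as unsupported, not fully supported.
    supportRatio: checks.length === 0 ? 0 : supported.length / checks.length,
    sources: supported.map(c => c.source ?? 'unknown'),
    unsupported,
  };
}
```

What to do with a low ratio is an application decision: the unsupported claims could be removed, flagged to the user, or sent back to the model for revision.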

Takeaway

Knowledge graphs are most valuable in domains with dense relationships: enterprise org charts, product catalogs, compliance rules, medical ontologies. Use LLM extraction to build the graph from unstructured text, text-to-Cypher to query it in natural language, and GraphRAG to combine semantic search with relational traversal, so that answers rest on facts you can trace back to the graph.