Knowledge Graphs and AI: Structured Reasoning with LLMs and Graph Databases
Knowledge graphs give LLMs access to structured, verifiable facts, which reduces hallucinations, enables multi-hop reasoning, and makes AI answers auditable. This guide covers how to combine graph databases with LLMs in production.
Why knowledge graphs complement LLMs
| LLMs alone | LLMs + knowledge graphs |
|---|---|
| May hallucinate facts | Facts grounded in verified graph data |
| Static knowledge cutoff | Real-time graph queries |
| Weak multi-hop reasoning | Graph traversal handles relationships |
| Opaque provenance | Every fact has a source node |
| Poor entity disambiguation | Unique node IDs resolve ambiguity |
Graph database options
| Database | Query language | Best for |
|---|---|---|
| Neo4j | Cypher | General-purpose, excellent tooling |
| Amazon Neptune | Gremlin / openCypher / SPARQL | AWS-native, property graphs and RDF |
| ArangoDB | AQL | Multi-model (graph + document) |
| TigerGraph | GSQL | Large-scale analytics |
| Memgraph | Cypher | Real-time, in-memory |
Building a knowledge graph with LLM extraction
// Step 1: Extract entities and relationships from text
interface GraphExtractionResult {
entities: Array<{ id: string; type: string; name: string; properties: Record<string, string> }>;
relationships: Array<{ source: string; target: string; type: string; properties?: Record<string, string> }>;
}
async function extractGraph(text: string): Promise<GraphExtractionResult> {
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: `Extract entities and relationships from text for a knowledge graph.
Entity types: Person, Company, Product, Technology, Location, Event
Relationship types: WORKS_AT, FOUNDED, USES, PART_OF, LOCATED_IN, ACQUIRED`,
},
{
role: 'user',
content: `Extract from: "${text}"`,
},
],
response_format: {
type: 'json_schema',
json_schema: {
name: 'GraphExtraction',
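// Strict mode requires additionalProperties: false on every object and every key
// listed in `required`, which the free-form `properties` maps below do not satisfy.
strict: false,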
schema: {
type: 'object',
properties: {
entities: {
type: 'array',
items: {
type: 'object',
properties: {
id: { type: 'string' },
type: { type: 'string' },
name: { type: 'string' },
properties: { type: 'object', additionalProperties: { type: 'string' } },
},
required: ['id', 'type', 'name', 'properties'],
additionalProperties: false,
},
},
relationships: {
type: 'array',
items: {
type: 'object',
properties: {
source: { type: 'string' },
target: { type: 'string' },
type: { type: 'string' },
properties: { type: 'object', additionalProperties: { type: 'string' } },
},
required: ['source', 'target', 'type'],
additionalProperties: false,
},
},
},
required: ['entities', 'relationships'],
additionalProperties: false,
},
},
},
});
return JSON.parse(response.choices[0].message.content!) as GraphExtractionResult;
}
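Model output is not guaranteed to respect the type lists in the prompt, so it can help to filter the extraction before it reaches the database. The sketch below is one possible approach, not part of the original pipeline: `sanitizeExtraction` and the two allowlists are hypothetical names, and the allowlists simply mirror the types named in the system prompt above. It drops entities with unknown types and relationships that point at entities which were not kept.

```typescript
// Restated from above so the snippet stands alone.
interface GraphExtractionResult {
  entities: Array<{ id: string; type: string; name: string; properties: Record<string, string> }>;
  relationships: Array<{ source: string; target: string; type: string; properties?: Record<string, string> }>;
}

// Hypothetical allowlists mirroring the types listed in the extraction prompt.
const ENTITY_TYPES = new Set(['Person', 'Company', 'Product', 'Technology', 'Location', 'Event']);
const RELATIONSHIP_TYPES = new Set(['WORKS_AT', 'FOUNDED', 'USES', 'PART_OF', 'LOCATED_IN', 'ACQUIRED']);

function sanitizeExtraction(raw: GraphExtractionResult): GraphExtractionResult {
  // Keep only entities with a known type and a non-empty id.
  const entities = raw.entities.filter(
    e => ENTITY_TYPES.has(e.type) && e.id.trim() !== ''
  );
  const keptIds = new Set(entities.map(e => e.id));

  // Keep only relationships with a known type whose endpoints both survived.
  const relationships = raw.relationships.filter(
    r => RELATIONSHIP_TYPES.has(r.type) && keptIds.has(r.source) && keptIds.has(r.target)
  );
  return { entities, relationships };
}
```

One way to wire it in would be `loadGraphToNeo4j(sanitizeExtraction(await extractGraph(text)))`. Because entity and relationship types end up interpolated into Cypher as labels, an allowlist like this also narrows what can reach the query string.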
// Step 2: Load into Neo4j
import neo4j from 'neo4j-driver';
const driver = neo4j.driver(
process.env.NEO4J_URI!,
neo4j.auth.basic(process.env.NEO4J_USER!, process.env.NEO4J_PASSWORD!)
);
async function loadGraphToNeo4j(extracted: GraphExtractionResult) {
const session = driver.session();
try {
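// Labels and relationship types cannot be passed as query parameters,
// so validate them before interpolating into the Cypher string
const SAFE_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;
// Upsert entities
for (const entity of extracted.entities) {
if (!SAFE_IDENTIFIER.test(entity.type)) {
throw new Error(`Invalid entity type: ${entity.type}`);
}
await session.run(
`MERGE (n:${entity.type} {id: $id})
SET n.name = $name, n += $properties`,
{ id: entity.id, name: entity.name, properties: entity.properties }
);
}
// Upsert relationships
for (const rel of extracted.relationships) {
if (!SAFE_IDENTIFIER.test(rel.type)) {
throw new Error(`Invalid relationship type: ${rel.type}`);
}
await session.run(
`MATCH (a {id: $source}), (b {id: $target})
MERGE (a)-[r:${rel.type}]->(b)
SET r += $properties`,
{ source: rel.source, target: rel.target, properties: rel.properties ?? {} }
);
}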
} finally {
await session.close();
}
}
Text-to-Cypher: natural language → graph queries
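The `textToCypher` function in this section takes a `schema` string, and `graphQuery` gets it from a `getGraphSchema()` helper that the article does not define. As a rough illustration of what such a string might look like, the sketch below formats a hand-written schema description into prompt text. The `NodeSchema` and `RelationshipSchema` shapes and the `formatSchema` name are assumptions made for this example, not a Neo4j API; in practice the inputs would come from introspecting the database.

```typescript
// Hypothetical shapes for describing the graph; not part of any driver API.
interface NodeSchema { label: string; properties: string[] }
interface RelationshipSchema { from: string; type: string; to: string }

function formatSchema(nodes: NodeSchema[], relationships: RelationshipSchema[]): string {
  // One line per node label, listing its property keys in Cypher-like notation.
  const nodeLines = nodes.map(n => `(:${n.label} {${n.properties.join(', ')}})`);
  // One line per relationship pattern, so the model sees valid directions.
  const relLines = relationships.map(r => `(:${r.from})-[:${r.type}]->(:${r.to})`);
  return ['Nodes:', ...nodeLines, 'Relationships:', ...relLines].join('\n');
}

const exampleSchema = formatSchema(
  [
    { label: 'Person', properties: ['id', 'name'] },
    { label: 'Company', properties: ['id', 'name'] },
    { label: 'Technology', properties: ['id', 'name'] },
  ],
  [
    { from: 'Person', type: 'FOUNDED', to: 'Company' },
    { from: 'Company', type: 'USES', to: 'Technology' },
  ]
);
console.log(exampleSchema);
```

Spelling out relationship directions in the schema text gives the model less room to generate patterns that match nothing.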
// Convert natural language questions to Cypher queries
async function textToCypher(question: string, schema: string): Promise<string> {
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: `You are a Cypher query generator for Neo4j.
Convert natural language questions into valid Cypher queries.
Return only the Cypher query, no explanation.
Graph schema:
${schema}`,
},
{ role: 'user', content: question },
],
max_tokens: 300,
temperature: 0,
});
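// Models sometimes wrap the query in Markdown code fences despite the instruction; strip them
return response.choices[0].message.content!.replace(/`{3}(?:cypher)?/gi, '').trim();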
}
// Execute the generated query
async function graphQuery(question: string) {
const schema = await getGraphSchema();
const cypherQuery = await textToCypher(question, schema);
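// Validate: only allow read queries. A prefix check alone is not enough,
// because "MATCH (n) DETACH DELETE n" also starts with MATCH.
// The keyword list is deliberately conservative and may reject some valid reads.
const writeClause = /\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|FOREACH|CALL|LOAD\s+CSV)\b/i;
if (!/^\s*MATCH\b/i.test(cypherQuery) || writeClause.test(cypherQuery)) {
throw new Error('Only read-only MATCH queries are permitted');
}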
const session = driver.session({ defaultAccessMode: neo4j.session.READ });
try {
const result = await session.run(cypherQuery);
return result.records.map(r => r.toObject());
} finally {
await session.close();
}
}
// Example:
// Q: "Who founded companies that use Kubernetes?"
// Generated: MATCH (p:Person)-[:FOUNDED]->(c:Company)-[:USES]->(t:Technology {name: "Kubernetes"})
// RETURN p.name, c.name
GraphRAG: graph-enhanced retrieval
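The `graphRAG` function in this section relies on `embedText` and `vectorStore` helpers that the article leaves undefined. To make the seeding step concrete, here is a minimal in-memory stand-in for the vector lookup, assuming embeddings are plain number arrays. `IndexedNode`, `cosineSimilarity`, and `topKSeeds` are hypothetical names for this sketch; a production system would more likely use a dedicated vector index.

```typescript
// Hypothetical record linking a graph node id to its embedding.
interface IndexedNode { nodeId: string; embedding: number[] }

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Embedding dimensions differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat zero vectors as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Return the ids of the k nodes whose embeddings are closest to the query.
function topKSeeds(query: number[], index: IndexedNode[], k: number): string[] {
  return index
    .map(n => ({ nodeId: n.nodeId, score: cosineSimilarity(query, n.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(n => n.nodeId);
}
```

The returned ids play the role of `nodeIds` in the traversal step: semantic search picks the entry points, and the graph query expands from them.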
// Combine vector search (semantic) with graph traversal (relational)
async function graphRAG(question: string): Promise<string> {
// 1. Semantic search: find relevant seed entities
const queryEmbedding = await embedText(question);
const seedNodes = await vectorStore.query({ vector: queryEmbedding, topK: 3 });
// 2. Graph traversal: expand context through relationships
const nodeIds = seedNodes.map(n => n.metadata.nodeId);
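const session = driver.session({ defaultAccessMode: neo4j.session.READ });
let records;
try {
// Separate OPTIONAL MATCH clauses keep one-hop neighbours that have no second hop;
// a single two-hop pattern would return nulls for them
({ records } = await session.run(
`MATCH (seed)
WHERE seed.id IN $ids
OPTIONAL MATCH (seed)-[r1]->(related)
OPTIONAL MATCH (related)-[r2]->(deeper)
RETURN seed, r1, related, r2, deeper
LIMIT 50`,
{ ids: nodeIds }
));
} finally {
// Close the session even if the query throws
await session.close();
}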
// 3. Format graph context
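const graphContext = records.map(r => {
const seed = r.get('seed')?.properties;
const related = r.get('related')?.properties;
const deeper = r.get('deeper')?.properties;
const rel1 = r.get('r1')?.type;
const rel2 = r.get('r2')?.type;
if (!related) return seed?.name;
const firstHop = `${seed?.name} --[${rel1}]--> ${related.name}`;
// Include the second hop when the traversal found one
return deeper ? `${firstHop} --[${rel2}]--> ${deeper.name}` : firstHop;
}).filter(Boolean).join('\n');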
// 4. Generate answer grounded in graph facts
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: 'Answer the question using only the graph facts below. Cite entity names.',
},
{
role: 'user',
content: `Graph facts:
${graphContext}
Question: ${question}`,
},
],
});
return response.choices[0].message.content!;
}
Hallucination reduction with graph grounding
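The `verifyFact` function in this section checks a single claim, while a generated answer usually contains several. One possible way to apply a per-claim checker across a whole answer is sketched below. `checkClaims`, `Verifier`, and the report shape are hypothetical names for this example; the verifier is passed in as a parameter so the snippet runs without a database, and how an answer gets split into claims is left open.

```typescript
type VerificationResult = { verified: boolean; source?: string };
// Any async per-claim checker with this shape can be plugged in.
type Verifier = (claim: string) => Promise<VerificationResult>;

interface ClaimReport {
  supported: Array<{ claim: string; source?: string }>;
  unsupported: string[];
}

async function checkClaims(claims: string[], verify: Verifier): Promise<ClaimReport> {
  const report: ClaimReport = { supported: [], unsupported: [] };
  // Check claims one at a time to avoid opening many sessions at once.
  for (const claim of claims) {
    const result = await verify(claim);
    if (result.verified) {
      report.supported.push({ claim, source: result.source });
    } else {
      report.unsupported.push(claim);
    }
  }
  return report;
}
```

Unsupported claims could then be removed, flagged to the user, or sent back to the model for revision, depending on how strict the application needs to be.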
// Verify LLM claims against the knowledge graph
async function verifyFact(claim: string): Promise<{ verified: boolean; source?: string }> {
// Extract entities from the claim
const entities = await extractEntitiesFromText(claim);
if (entities.length < 2) return { verified: false };
const [entity1, entity2] = entities;
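// Check if any relationship exists between the two entities (either direction).
// Note: this confirms the entities are linked, not that the link matches the claim's meaning.
const session = driver.session({ defaultAccessMode: neo4j.session.READ });
let result;
try {
result = await session.run(
`MATCH (a {name: $e1})-[r]-(b {name: $e2})
RETURN type(r) AS relationship, a.id AS sourceId`,
{ e1: entity1, e2: entity2 }
);
} finally {
// Close the session even if the query throws
await session.close();
}
if (result.records.length > 0) {
const rel = result.records[0].get('relationship');
const src = result.records[0].get('sourceId');
// The match ignores direction, so the citation does not draw an arrow
return { verified: true, source: `Graph: ${entity1} -[${rel}]- ${entity2} (id: ${src})` };
}
return { verified: false };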
}
Takeaway
Knowledge graphs are most valuable for domains with dense relationships: enterprise org charts, product catalogs, compliance rules, medical ontologies. Use LLM extraction to build the graph from unstructured text, text-to-Cypher to query it in natural language, and GraphRAG to combine semantic search with relational traversal, so retrieval draws on both similarity and explicit relationships.