Knowledge Graph

The knowledge graph is Cograph's core data structure — a continuously updated map of who knows what, who works with whom, and which knowledge is at risk.

How It Works

Every day, Cograph ingests signals from your connected integrations and updates the graph. The process runs in three stages:

1. Signal Collection

New commits, document edits, and Slack messages are fetched incrementally from connected integrations. Only metadata is collected; no message content or code is stored.

2. Topic Extraction

Signals are analyzed with NLP to extract topic labels. For example, a commit to files in src/auth/ contributes to "Authentication", and a doc titled "Incident Response Runbook" contributes to "Incident Response".

3. Graph Update

Employee–topic expertise scores are recalculated from the weighted signal composite and written to the graph. Collaboration edges are updated from co-authorship and communication patterns.
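The three stages can be pictured with a small sketch. Everything in it is illustrative rather than Cograph's actual implementation: the `Signal` record, the keyword table that stands in for the NLP step, the file name `src/auth/token.py`, and the simple per-pair tally are all assumptions made for the example.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """Metadata-only record for one commit, document edit, or Slack message."""
    employee_id: str
    source: str  # assumed values such as "github", "google_docs", "slack"
    label: str   # a file path or document title, never message content or code


# Toy keyword table standing in for the NLP topic extractor (hypothetical).
KEYWORD_TOPICS = {
    "auth": "Authentication",
    "incident response": "Incident Response",
}


def extract_topics(signal: Signal) -> set[str]:
    """Stage 2: map a signal's metadata to zero or more topic labels."""
    text = signal.label.lower()
    return {topic for keyword, topic in KEYWORD_TOPICS.items() if keyword in text}


def count_contributions(signals: list[Signal]) -> Counter:
    """Input to stage 3: tally signals per (employee, topic) pair."""
    tally: Counter = Counter()
    for signal in signals:
        for topic in extract_topics(signal):
            tally[(signal.employee_id, topic)] += 1
    return tally


signals = [
    Signal("emp-1", "github", "src/auth/token.py"),
    Signal("emp-2", "google_docs", "Incident Response Runbook"),
]
contributions = count_contributions(signals)
```

A real extractor would be far less literal than a substring match; the point is only that topic labels can be derived from paths and titles without storing message content or code.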

Graph Schema

The graph uses four primary node types and four relationship types.

Node Types

Employee

Represents a person in the organization. Linked to their expertise areas, collaborators, and the documents/code they own.

Properties: id, name, email, department, organizationId

Topic

A knowledge area derived from the content of documents, code, and conversations. Examples: "API Authentication", "PostgreSQL", "Customer Onboarding".

Properties: name, category, organizationId

Document

A Google Doc, Notion page, or similar artifact. Linked to its owners and the topics it covers.

Properties: externalId, name, url, source

Repository

A GitHub repository. Employees are connected via OWNS_CODE relationships weighted by commit history.

Properties: name, fullName, language, defaultBranch

Relationship Types

Employee -[HAS_EXPERTISE]-> Topic

Expertise score (0–1), a weighted composite of all signal sources. Updated daily.

Properties: score, updatedAt

Employee -[OWNS]-> Document

Document ownership score based on authorship percentage and recent edits.

Properties: ownershipScore, commitCount

Employee -[OWNS_CODE]-> Repository / File

Code ownership derived from git blame, commit count, and lines authored.

Properties: ownershipScore, commitCount, linesAuthored

Employee -[COLLABORATES_WITH]-> Employee

Collaboration strength from co-editing documents, PR reviews, and Slack threads.

Properties: strength, contexts[]
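For client code that reads the graph, the schema could be mirrored as typed records. The sketch below is one possible shape, not an official client library: the property names come from the schema above, while the Python types, the endpoint fields (`employeeId`, `topicName`, `otherEmployeeId`), and the sample values are assumptions. OWNS and OWNS_CODE are omitted for brevity and would follow the same pattern.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Employee:
    id: str
    name: str
    email: str
    department: str
    organizationId: str


@dataclass
class Topic:
    name: str
    category: str
    organizationId: str


@dataclass
class Document:
    externalId: str
    name: str
    url: str
    source: str  # assumed values such as "google_docs" or "notion"


@dataclass
class Repository:
    name: str
    fullName: str
    language: str
    defaultBranch: str


@dataclass
class HasExpertise:
    employeeId: str   # assumed way of referencing the start node
    topicName: str    # assumed way of referencing the end node
    score: float      # documented range: 0 to 1
    updatedAt: datetime

    def __post_init__(self) -> None:
        # Reject scores outside the documented 0 to 1 range.
        if not 0.0 <= self.score <= 1.0:
            raise ValueError(f"score must be between 0 and 1, got {self.score}")


@dataclass
class CollaboratesWith:
    employeeId: str
    otherEmployeeId: str
    strength: float
    contexts: list[str] = field(default_factory=list)


edge = HasExpertise("emp-1", "API Authentication", 0.82, datetime(2024, 1, 15))
```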

Expertise Scoring

Each employee–topic relationship has an expertise score between 0 and 1, calculated as a weighted composite of four signal types.

Code Ownership (weight: 40%, source: GitHub)
Proportion of commits and files last touched in a repository or topic area

Document Authorship (weight: 30%, source: Google Docs / Notion)
Pages created, heavily edited, or regularly referenced by the employee

Communication Patterns (weight: 20%, source: Slack)
Slack threads where the employee is the primary or frequent responder

Peer Recognition (weight: 10%, source: all sources)
Direct mentions, responses to their questions, people tagging them as the expert
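The weighted composite can be written out as a short calculation. This is a minimal sketch that uses the four documented weights; it assumes each signal has already been normalized to the 0 to 1 range and that a missing signal counts as 0, neither of which is specified here. The key names are hypothetical.

```python
from __future__ import annotations

# Documented weights for the four signal types.
WEIGHTS = {
    "code_ownership": 0.40,
    "document_authorship": 0.30,
    "communication_patterns": 0.20,
    "peer_recognition": 0.10,
}


def expertise_score(signals: dict[str, float]) -> float:
    """Combine per-signal values (each assumed normalized to 0..1) into one score."""
    for name, value in signals.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown signal type: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Assumption: a signal type with no data contributes 0.
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())


score = expertise_score({
    "code_ownership": 0.9,
    "document_authorship": 0.5,
    "communication_patterns": 0.25,
    "peer_recognition": 0.0,
})
# 0.40*0.9 + 0.30*0.5 + 0.20*0.25 + 0.10*0.0, which is approximately 0.56
```

Because the weights sum to 1, the composite stays within 0 to 1 whenever every input does.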

Score decay: Expertise scores decay over time if no new signals arrive. An employee who last touched a topic 2 years ago will have a lower score than one who touched it last month, even if their historical contribution was larger.
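The exact decay function is not documented here. As an illustration only, the sketch below assumes exponential decay with a hypothetical half-life of 180 days and shows how a larger but stale contribution can end up below a smaller, recent one.

```python
import math

HALF_LIFE_DAYS = 180.0  # hypothetical; the real decay rate is not documented


def decayed_score(score: float, days_since_last_signal: float) -> float:
    """Halve the score for every HALF_LIFE_DAYS that pass without a new signal."""
    if days_since_last_signal < 0:
        raise ValueError("days_since_last_signal must be non-negative")
    return score * math.pow(0.5, days_since_last_signal / HALF_LIFE_DAYS)


# Large historical contribution, last touched about two years ago.
veteran = decayed_score(0.9, days_since_last_signal=730)
# Smaller contribution, last touched about a month ago.
recent = decayed_score(0.6, days_since_last_signal=30)
```

Under these assumptions `recent` comes out higher than `veteran`, matching the behavior described above.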

cypher
// Find all employees who are experts on "API Authentication"
MATCH (e:Employee {organizationId: $orgId})
  -[r:HAS_EXPERTISE]->(t:Topic {name: "API Authentication"})
WHERE r.score > 0.6
RETURN e.name as name, e.email as email, r.score as score
ORDER BY r.score DESC

Example: Find all experts on a specific topic

Collaboration Network

The collaboration network captures who works with whom and in what contexts. It's built from three signal sources:

Code Co-authorship: Employees who frequently modify the same files or review each other's pull requests
Document Co-editing: People who co-edit the same Google Docs or Notion pages
Slack Conversations: Frequent back-and-forth exchanges and mentions in the same threads

Collaboration strength is expressed as a percentage (0–100%). It feeds into successor matching — candidates who already collaborate closely with the departing employee are weighted higher.
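How collaboration strength is combined with expertise during successor matching is not specified. One way to picture "weighted higher" is a multiplicative boost, sketched below. The `Candidate` record, the `COLLABORATION_BOOST` constant, and the formula itself are assumptions made for illustration, not the actual matching algorithm.

```python
from __future__ import annotations

from dataclasses import dataclass

COLLABORATION_BOOST = 0.5  # hypothetical: at 100% strength, expertise counts 1.5x


@dataclass
class Candidate:
    name: str
    expertise: float              # 0..1 score on the topic being handed over
    collaboration_percent: float  # 0..100 strength with the departing employee


def successor_weight(candidate: Candidate) -> float:
    """Scale expertise up in proportion to collaboration strength."""
    strength = candidate.collaboration_percent / 100.0
    return candidate.expertise * (1.0 + COLLABORATION_BOOST * strength)


def rank_successors(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from strongest to weakest match."""
    return sorted(candidates, key=successor_weight, reverse=True)


ranked = rank_successors([
    Candidate("Avery", expertise=0.70, collaboration_percent=10),
    Candidate("Blake", expertise=0.65, collaboration_percent=80),
])
```

With these sample values, Blake ranks ahead of Avery despite slightly lower expertise, because of the much stronger working relationship with the departing employee.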

Bus Factor Analysis

The bus factor (also called truck factor) measures how many people need to leave before a knowledge area becomes at risk. Cograph continuously monitors for single points of failure — topics where only one employee has high expertise.
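The idea can be sketched over an in-memory list of expertise edges. This is a simplified illustration, not the monitoring job itself: the `(employee, topic, score)` tuple layout and the sample data are made up, and the 0.7 cutoff for "high expertise" is borrowed from the example query in this section.

```python
from collections import defaultdict

EXPERT_THRESHOLD = 0.7  # assumed cutoff for "high expertise"


def bus_factor_by_topic(edges, threshold=EXPERT_THRESHOLD):
    """Count, per topic, the employees whose expertise score exceeds the threshold."""
    experts = defaultdict(set)
    for employee, topic, score in edges:
        if score > threshold:
            experts[topic].add(employee)
    return {topic: len(names) for topic, names in experts.items()}


def single_points_of_failure(edges, threshold=EXPERT_THRESHOLD):
    """Return the topics that have exactly one high-expertise employee."""
    factors = bus_factor_by_topic(edges, threshold)
    return sorted(topic for topic, count in factors.items() if count == 1)


edges = [
    ("alice", "PostgreSQL", 0.91),
    ("bob", "PostgreSQL", 0.75),
    ("alice", "API Authentication", 0.88),
    ("bob", "API Authentication", 0.40),
]
at_risk = single_points_of_failure(edges)
```

Topics with no employee above the threshold do not appear in the result at all, so they would need separate handling.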

The Bus Factor Analysis report (available via the MCP resource cograph://org/bus-factor-analysis) categorizes risk as:

Critical: 10+ topics with a single point of failure
High: 5–9 single-point topics
Medium: 2–4 single-point topics
Low: 0–1 single-point topics
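The risk bands translate directly into a lookup. The sketch below is a hypothetical helper, not part of the MCP resource: it only encodes the four documented thresholds.

```python
def risk_level(single_point_topics: int) -> str:
    """Map a count of single-point-of-failure topics to the documented risk band."""
    if single_point_topics < 0:
        raise ValueError("single_point_topics must be non-negative")
    if single_point_topics >= 10:
        return "Critical"
    if single_point_topics >= 5:
        return "High"
    if single_point_topics >= 2:
        return "Medium"
    return "Low"


levels = [risk_level(n) for n in (0, 3, 7, 12)]
```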
cypher
// Find topics where only one employee has expertise above 0.7
MATCH (t:Topic {organizationId: $orgId})<-[r:HAS_EXPERTISE]-(e:Employee)
WHERE r.score > 0.7
WITH t, COUNT(e) as expertCount, COLLECT({name: e.name, score: r.score}) as experts
WHERE expertCount = 1
RETURN t.name as topic, experts[0].name as soleExpert, experts[0].score as score
ORDER BY score DESC

Example: Find all topics with a single expert above 0.7 expertise score

Pro tip: Use the MCP tool find_knowledge_gaps with includeSinglePointsOfFailure: true to get an up-to-date list without writing Cypher.