Onboard your engineering org

We’ll start with a few basics so we can discover your repos, services, and environments automatically.

1. Organization basics

Just enough info to bootstrap the knowledge graph and connect to your code.

Code hosting provider(s)

Where do your production services run?

Environments

Customer deployments

NOTE — Agent after Org Basics:
- Runs lightweight research using org name + domain.
- Infers SaaS vs on-prem tendencies.
- Extracts tech stack signals from public info.
- Writes {suspected_stack, suspected_business_model, confidence}.
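A minimal sketch of the record this agent might write. Only the three field names come from the note above; the types, example values, and class name are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class OrgResearchResult:
    """Output of the lightweight research pass run after Org Basics."""
    suspected_stack: list[str]      # e.g. ["go", "terraform"]
    suspected_business_model: str   # e.g. "saas" or "on-prem"
    confidence: float               # 0-1, how sure the agent is

result = OrgResearchResult(
    suspected_stack=["go", "terraform"],
    suspected_business_model="saas",
    confidence=0.6,
)
```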

2. Connect to your repositories

Authorize access (READ-ONLY)

PERMISSIONS — What access we request:
- Read-only access to:
  * Repos
  * Branch names
  * CI configuration
  * Workflow files
  * Basic metadata
- We NEVER request:
  * Secrets
  * Write access
  * Deploy permissions
  * Admin permissions
- GitHub scopes:
  * repo:read (or equivalents)
  * workflow:read
- Everything is logged with full provenance.
NOTE — Repo Classifier Agent (after OAuth):
- Fetches repo list + top-level files.
- Detects:
  * programming languages
  * presence of CI configs
  * possible deploy files
  * IaC repos (terraform, pulumi, helm, k8s)
- Writes to the STAGING GRAPH:
  * candidate repos
  * classification tags
  * suspected components
  * provenance: {repo, file_path, extractor}
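One way such a classifier could tag a repo from its top-level file listing. This is a hypothetical sketch: the marker filenames and tag vocabulary are assumptions, not the agent's actual rules:

```python
# Assumed marker files for CI and IaC detection (illustrative, not exhaustive).
CI_FILES = {".gitlab-ci.yml", "Jenkinsfile", ".circleci"}
IAC_MARKERS = {"main.tf": "terraform", "Pulumi.yaml": "pulumi",
               "Chart.yaml": "helm", "kustomization.yaml": "k8s"}

def classify_repo(top_level_files: list[str]) -> dict:
    """Return classification tags and detected IaC tools for one repo."""
    tags = []
    if any(f in CI_FILES for f in top_level_files):
        tags.append("ci")
    iac = [tool for f, tool in IAC_MARKERS.items() if f in top_level_files]
    if iac:
        tags.append("iac")
    return {"tags": tags, "iac_tools": iac}

classify_repo(["main.tf", "Jenkinsfile"])
# -> {"tags": ["ci", "iac"], "iac_tools": ["terraform"]}
```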
NOTE — IaC Detection & Environment Inference:
- Terraform detection:
  * providers → cloud
  * workspaces → environments
- Helm/k8s detection:
  * namespaces → environments
  * cluster names → infra graph
- Writes candidate Environment nodes to the STAGING GRAPH with confidence scores.
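The "providers → cloud" step could be sketched like this. Note this is an assumption about the approach: a real implementation would parse HCL properly rather than regex-match, and the provider-to-cloud table here is illustrative:

```python
import re

# Illustrative provider -> cloud mapping (assumed, not the agent's actual table).
PROVIDER_TO_CLOUD = {"aws": "AWS", "google": "GCP", "azurerm": "Azure"}

def infer_clouds(tf_source: str) -> list[str]:
    """Infer candidate clouds from provider blocks in Terraform source."""
    providers = re.findall(r'provider\s+"(\w+)"', tf_source)
    return sorted({PROVIDER_TO_CLOUD[p] for p in providers if p in PROVIDER_TO_CLOUD})

infer_clouds('provider "aws" {}\nprovider "google" {}')  # -> ["AWS", "GCP"]
```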

3. Team structure & ownership

You can upload or link any document. We normalize it.

Upload CSV / Excel

Google Sheets

Confluence / Wiki pages

Upload PDF

NOTE — Team Extraction Agent:
- Maps columns to canonical fields (team_name, service_name, repo_url).
- Extracts ownership from Confluence/Wiki tables.
- Extracts team structure from PDF text.
- Writes all results to the STAGING GRAPH, not the real graph.
- Everything has:
  * confidence scores
  * source pointers
  * error flags for ambiguous mappings
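The column-mapping step might look like the sketch below. The canonical field names come from the note; the alias lists and matching logic are assumptions for illustration:

```python
# Assumed header aliases; a real agent would likely use fuzzier matching.
CANONICAL = {
    "team_name": {"team", "team name", "squad"},
    "service_name": {"service", "service name", "component"},
    "repo_url": {"repo", "repository", "repo url", "git url"},
}

def map_columns(headers: list[str]) -> dict:
    """Map spreadsheet headers to canonical fields; None flags ambiguity."""
    mapping = {}
    for field, aliases in CANONICAL.items():
        match = next(
            (h for h in headers if h.strip().lower() in aliases | {field}), None
        )
        mapping[field] = match  # None -> error-flagged for human review
    return mapping

map_columns(["Squad", "Service Name", "Git URL"])
# -> {"team_name": "Squad", "service_name": "Service Name", "repo_url": "Git URL"}
```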

4. Finish initial setup

We will run automated discovery jobs and propose your engineering graph.

NOTE — Staging Graph Architecture:
- All extracted / inferred / LLM-generated data first goes to:
  * candidate_nodes
  * candidate_edges
  * candidate_properties
- Each candidate entry includes:
  * confidence (0–1)
  * source_of_truth (repo path, sheet row, wiki URL)
  * extraction_agent_version
- NOTHING goes into the canonical graph until you:
  * review
  * accept or edit
  * confirm mappings
- Once validated:
  * canonical_graph ← staging_graph (promoted)
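The review gate above can be sketched as a simple filter over staged candidates. Field names for confidence, source, and agent version come from the note; the `reviewed` flag, class name, and `promote` helper are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class CandidateNode:
    kind: str                      # e.g. "Environment", "Repo"
    name: str
    confidence: float              # 0-1
    source_of_truth: str           # repo path, sheet row, or wiki URL
    extraction_agent_version: str
    reviewed: bool = False         # flipped when a human accepts the mapping

def promote(staging: list[CandidateNode]) -> list[CandidateNode]:
    """Only human-reviewed candidates reach the canonical graph."""
    return [n for n in staging if n.reviewed]
```

The key property is that `promote` is the only path into the canonical graph, so unreviewed LLM output can never leak in.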
PERMISSIONS — Final note:
- All processing is local to our backend.
- No data is shared externally.
- No persistent credentials: OAuth tokens are stored encrypted and revocable.
- You can disconnect at any time.