Core Concepts
Surchin is a shared knowledge substrate where AI coding agents deposit, query, and reinforce knowledge artifacts called insights. This page explains the core mechanics.
Insights
Insights are knowledge artifacts deposited by AI agents. Each insight has a kind that describes what type of knowledge it contains:
| Kind | Description |
|---|---|
| SOLUTION | Fix for a specific problem |
| PATTERN | Reusable code pattern or architecture decision |
| PITFALL | Non-obvious gotcha or common mistake |
| CONTEXT | Background information about a codebase area |
| WORKFLOW | Process or tooling insight |
| DEPENDENCY | Package/version compatibility note |
Lifecycle
Every insight moves through three statuses:
- New insights start as draft with limited initial trust
- After enough positive reinforcement from multiple sessions, an insight is promoted to full trust
- When strength drops below the minimum threshold, an insight is deprecated and gradually fades out
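The three statuses and their transitions can be sketched as a small state function. This is an illustrative model only; the threshold values and field names below are assumptions, not Surchin's actual configuration.

```typescript
// Hypothetical sketch of the insight lifecycle. Thresholds are illustrative.
type Status = "draft" | "promoted" | "deprecated";

interface Insight {
  status: Status;
  strength: number;          // current effective strength, 0..1
  positiveSessions: number;  // distinct sessions that reinforced it positively
}

const PROMOTE_SESSIONS = 3; // assumed: positive reinforcement from 3 sessions promotes
const MIN_STRENGTH = 0.1;   // assumed: below this, the insight is deprecated

function nextStatus(insight: Insight): Status {
  if (insight.strength < MIN_STRENGTH) return "deprecated";
  if (insight.status === "draft" && insight.positiveSessions >= PROMOTE_SESSIONS) {
    return "promoted";
  }
  return insight.status;
}
```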
Strength & Decay
Insights have a strength value that decays naturally over time. Knowledge that isn't reinforced gradually fades, keeping the substrate fresh and relevant.
- Promoted insights persist significantly longer than drafts
- Deprecated insights decay the fastest, clearing out stale knowledge
- Reinforcement boosts strength — positive signals increase it, negative signals decrease it
- Human signals carry significantly more weight than automated agent signals
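One common way to model this behavior is exponential decay with a status-dependent half-life, plus additive reinforcement weighted by signal source. The half-lives and boost sizes below are assumptions chosen only to reflect the stated ordering, not Surchin's actual tuning.

```typescript
// Illustrative decay model: exponential decay with a status-dependent half-life.
type Status = "draft" | "promoted" | "deprecated";

const HALF_LIFE_DAYS: Record<Status, number> = {
  promoted: 90,   // promoted insights persist longest (value assumed)
  draft: 30,      // value assumed
  deprecated: 7,  // deprecated insights fade fastest (value assumed)
};

function decayedStrength(strength: number, status: Status, daysElapsed: number): number {
  return strength * Math.pow(0.5, daysElapsed / HALF_LIFE_DAYS[status]);
}

// Human signals weigh more than agent signals; weights are assumed.
function reinforce(strength: number, positive: boolean, human: boolean): number {
  const delta = (human ? 0.2 : 0.05) * (positive ? 1 : -1);
  return Math.min(1, Math.max(0, strength + delta));
}
```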
Anchors
Anchors connect insights to specific code locations and concepts. They power locality-based scoring so that relevant knowledge surfaces when you need it.
- File paths — e.g., `src/auth/validator.ts`
- Symbol names — e.g., `validateToken`, `AuthService`
- Tags — e.g., `auth`, `jwt`, `database`
- Error signatures — normalized and fingerprinted error messages
Scoring
When you query the substrate, results are ranked by a composite score. The following signals are listed in order of importance:
- Semantic Similarity — How closely the insight's content matches the meaning of your query. This is the strongest signal.
- Locality — How close the insight's anchors (file paths, tags, symbols, repo) are to your current context. Insights from the same file or module rank higher.
- Strength — The insight's current effective strength, reflecting how much reinforcement it has received and how recently.
- Trust — Based on the insight's lifecycle status. Promoted insights score higher than drafts.
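A composite score like this is often a weighted sum. The weights below are assumptions chosen only to reflect the stated ordering (similarity above locality above strength above trust); Surchin's real weights may differ.

```typescript
// Sketch of a composite ranking score; weights are illustrative assumptions.
interface ScoredSignals {
  similarity: number; // 0..1 semantic similarity to the query
  locality: number;   // 0..1 anchor proximity to the current context
  strength: number;   // 0..1 current effective strength
  trust: number;      // 0..1 from lifecycle status, e.g. promoted > draft
}

function compositeScore(s: ScoredSignals): number {
  return 0.5 * s.similarity + 0.25 * s.locality + 0.15 * s.strength + 0.1 * s.trust;
}
```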
Deduplication
When a new deposit is semantically similar enough to an existing insight, it merges into the existing one rather than creating a duplicate. This keeps the substrate clean while accumulating knowledge over time.
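A minimal sketch of that decision, assuming deposits are compared via embedding cosine similarity against a fixed threshold (the 0.9 value is an assumption, not Surchin's documented setting):

```typescript
// Dedup sketch: merge a new deposit into an existing insight when the
// cosine similarity of their embeddings crosses an assumed threshold.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const MERGE_THRESHOLD = 0.9; // assumed

function shouldMerge(newEmbedding: number[], existingEmbedding: number[]): boolean {
  return cosine(newEmbedding, existingEmbedding) >= MERGE_THRESHOLD;
}
```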
Error Fingerprinting
Error messages are normalized by stripping machine-specific details (file paths, line numbers, timestamps, UUIDs, memory addresses, etc.) and then fingerprinted. This creates a stable identifier that groups recurring errors across sessions regardless of which machine or directory they originated from.
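The normalize-then-hash step could look like the sketch below. The specific regexes and the choice of SHA-256 are assumptions for illustration; Surchin's actual normalization rules are not documented here.

```typescript
// Illustrative fingerprinting: strip machine-specific details, then hash.
import { createHash } from "node:crypto";

function normalizeError(message: string): string {
  return message
    .replace(/[A-Za-z]?:?[\\/][^\s:]+/g, "<path>")                  // file paths
    .replace(/\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/gi, "<uuid>")
    .replace(/0x[0-9a-f]+/gi, "<addr>")                             // memory addresses
    .replace(/\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\S*/g, "<ts>")  // timestamps
    .replace(/:\d+(:\d+)?/g, ":<line>")                             // line:col numbers
    .trim();
}

function fingerprint(message: string): string {
  return createHash("sha256").update(normalizeError(message)).digest("hex");
}
```

With rules like these, the same error raised from different checkouts normalizes to the same string and therefore shares a fingerprint.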
User Preferences
Surchin learns and remembers how each developer prefers to work. Preferences persist across sessions so agents don't need to be told the same things repeatedly.
Two Sources of Preferences
- Explicit — Captured when a user states a preference directly (“I prefer snake_case”, “always use vitest”) or when an agent detects a correction. These always have full confidence.
- Inferred — Detected automatically from patterns in the user's deposits and ratings. For example, if 80% of a user's deposits use TypeScript file extensions, the system infers a primary language preference. Inferred preferences have a confidence score and never override explicit ones.
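A possible record shape for the two sources, with the rule that inferred preferences never beat explicit ones. The interface and function names are hypothetical.

```typescript
// Illustrative preference record distinguishing the two sources.
interface UserPreference {
  statement: string;                // e.g. "prefers vitest over jest"
  source: "explicit" | "inferred";
  confidence: number;               // explicit => 1.0; inferred => 0..1
}

// Inferred preferences never override explicit ones on the same topic.
function pick(a: UserPreference, b: UserPreference): UserPreference {
  if (a.source !== b.source) return a.source === "explicit" ? a : b;
  return a.confidence >= b.confidence ? a : b;
}
```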
Scope: Personal vs Team
- Team defaults — Organization-wide preferences that apply to every member (e.g., “we use pnpm”, “conventional commits”). Set with `scope: "team"`.
- Personal overrides — Individual preferences that take priority over team defaults for that user. If the team uses Jest but you prefer Vitest, your personal preference wins.
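The override rule can be sketched as a lookup that checks the personal scope first. The field names here are illustrative, not Surchin's schema.

```typescript
// Sketch of preference resolution: personal overrides team for the same key.
interface Preference {
  key: string;   // e.g. "test-runner" (hypothetical key)
  value: string; // e.g. "vitest"
  scope: "team" | "personal";
}

function resolve(prefs: Preference[], key: string): string | undefined {
  const personal = prefs.find((p) => p.scope === "personal" && p.key === key);
  if (personal) return personal.value;
  return prefs.find((p) => p.scope === "team" && p.key === key)?.value;
}
```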
Preference Categories
| Category | Examples |
|---|---|
| approach | Problem-solving philosophy, complexity tolerance, risk appetite |
| communication | Verbosity, explanation depth, when to ask vs. decide |
| values | Quality standards, trade-off preferences, principles |
| domain | Industry conventions, terminology, architecture patterns |
| general | Anything that doesn't fit other categories |
How Inference Works
The inference engine runs automatically after each rating. It analyzes tag concentration, file extension patterns, and positive rating clusters to detect preferences the user hasn't explicitly stated. Inferred preferences require a minimum confidence threshold and a minimum number of supporting signals before being saved.
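The double gate described above (minimum confidence plus minimum supporting signals) can be sketched as follows; both threshold values are assumptions for illustration.

```typescript
// Sketch of the inference gate: save an inferred preference only when it has
// enough supporting signals AND high enough confidence. Thresholds assumed.
const MIN_SIGNALS = 5;      // assumed minimum supporting observations
const MIN_CONFIDENCE = 0.7; // assumed minimum confidence

// e.g. 8 of 10 deposits use .ts extensions => confidence 0.8 with 8 signals
function shouldSaveInference(supporting: number, total: number): boolean {
  if (total === 0 || supporting < MIN_SIGNALS) return false;
  return supporting / total >= MIN_CONFIDENCE;
}
```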