Deep Dive
The NORDON Intelligence Engine
16 ML modules. Zero cloud dependencies. Built in Rust. Here's exactly how the engine works under the hood.
Architecture
The ML Pipeline
Ingestion Pipeline
Retrieval Pipeline
16 Modules
Every module, explained
Each module handles a specific aspect of the intelligence pipeline. All written in Rust. All run locally.
Importance Scoring
The core scoring engine. Scans every event for 40+ signals across 4 categories — architecture decisions, performance insights, security concerns, and dependency changes. A weighted algorithm rates each event from 0 to 1, with configurable boosts for user-flagged items (0.4), decisions (0.3), and failures (0.25). Only events scoring above the 0.4 threshold become memories.
40+ signal detectors, weighted scoring, configurable thresholds
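The weighted scoring described above can be sketched as follows. The boost values (0.4 / 0.3 / 0.25) and the 0.4 threshold come from the text; the base-score formula and field names are illustrative assumptions, not NORDON's actual implementation.

```rust
struct Event {
    signal_hits: u32,   // how many of the 40+ detectors fired
    signal_total: u32,  // total detectors scanned
    user_flagged: bool,
    is_decision: bool,
    is_failure: bool,
}

fn importance(e: &Event) -> f64 {
    // Base score: fraction of detectors that fired (assumed formula).
    let base = e.signal_hits as f64 / e.signal_total.max(1) as f64;
    let mut score = base;
    if e.user_flagged { score += 0.4; }  // boost values from the text
    if e.is_decision  { score += 0.3; }
    if e.is_failure   { score += 0.25; }
    score.clamp(0.0, 1.0)
}

fn becomes_memory(e: &Event) -> bool {
    importance(e) > 0.4 // only events above the threshold are stored
}
```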
Retrieval Ranking
A 6-signal ranking formula that finds the most relevant memories for any given context. Weighs keyword match (30%), importance score (20%), recency with 14-day half-life decay (15%), acceptance feedback (15%), scope match (10%), and frequency (10%). Handles token budget allocation to maximize context quality within LLM limits.
6-signal ranking, 14-day half-life decay, token budgeting
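In code, the 6-signal formula looks roughly like this. The weights and the 14-day half-life are taken from the text; the field names and normalization are assumptions.

```rust
struct Memory {
    keyword_match: f64, // 0..1 overlap with the query
    importance: f64,    // 0..1 from the scoring module
    age_days: f64,
    acceptance: f64,    // 0..1 historical acceptance rate
    scope_match: f64,   // 0..1 (same file / dir / repo)
    frequency: f64,     // 0..1 normalized access count
}

// Exponential recency decay with a 14-day half-life.
fn recency(age_days: f64) -> f64 {
    0.5f64.powf(age_days / 14.0)
}

fn rank_score(m: &Memory) -> f64 {
    0.30 * m.keyword_match
        + 0.20 * m.importance
        + 0.15 * recency(m.age_days)
        + 0.15 * m.acceptance
        + 0.10 * m.scope_match
        + 0.10 * m.frequency
}
```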
Pattern Detection
Detects recurring patterns in your codebase activity. Identifies file groups that are frequently modified together, failure loops where the same error recurs, and workflow patterns that indicate established team practices. Surfaces these patterns as memories so your AI can anticipate needs.
File group detection, failure loop detection, workflow patterns
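File-group detection can be sketched as pair counting over sessions: count how often two files change together and flag pairs above a threshold. The threshold and session representation are assumptions.

```rust
use std::collections::HashMap;

// Returns file pairs that were modified together at least `min_count` times.
fn co_changed_groups(sessions: &[Vec<&str>], min_count: usize) -> Vec<(String, String)> {
    let mut counts: HashMap<(String, String), usize> = HashMap::new();
    for files in sessions {
        let mut sorted = files.clone();
        sorted.sort();
        sorted.dedup();
        // Count every unordered pair within one session.
        for i in 0..sorted.len() {
            for j in i + 1..sorted.len() {
                *counts
                    .entry((sorted[i].to_string(), sorted[j].to_string()))
                    .or_insert(0) += 1;
            }
        }
    }
    counts.into_iter()
        .filter(|(_, c)| *c >= min_count)
        .map(|(pair, _)| pair)
        .collect()
}
```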
Drift Detection
Monitors for architecture drift and stale memories. Detects when current coding activity contradicts stored decisions or patterns. Flags memories that haven't been accessed or validated in a configurable window. Sends drift alerts so your team stays aligned with documented decisions.
Architecture drift alerts, stale memory detection
Semantic Search
Hash-based embeddings for fast semantic similarity search with zero external dependencies. Generates compact vector representations of memory content and computes cosine similarity for retrieval. No API calls, no Python, no heavy ML frameworks — just pure Rust math.
Hash-based embeddings, cosine similarity, zero dependencies
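A minimal feature-hashing embedding plus cosine similarity, in the spirit of the approach above and using only the Rust standard library. The dimension and hashing scheme are assumptions, not NORDON's actual implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const DIM: usize = 64; // embedding width (assumed)

// Hash each token into one of DIM buckets and count hits.
fn embed(text: &str) -> [f64; DIM] {
    let mut v = [0.0; DIM];
    for token in text.split_whitespace() {
        let mut h = DefaultHasher::new();
        token.to_lowercase().hash(&mut h);
        v[(h.finish() as usize) % DIM] += 1.0;
    }
    v
}

fn cosine(a: &[f64; DIM], b: &[f64; DIM]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}
```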
Auto-Extraction
NLP-powered extraction that automatically identifies and structures important information from raw tool results and conversation context. Classifies extracted content into 7 memory types — decisions, failures, procedures, constraints, patterns, facts, and context snapshots — using pattern matching and heuristic analysis.
NLP extraction, 7-type classification, structured output
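The seven memory types can be modeled as an enum. The keyword heuristic below is a crude stand-in for the real pattern-matching classifier; the trigger phrases are assumptions.

```rust
#[derive(Debug, PartialEq)]
enum MemoryType {
    Decision, Failure, Procedure, Constraint, Pattern, Fact, ContextSnapshot,
}

// Naive first-match classification over lowercase text (assumed heuristics).
fn classify(text: &str) -> MemoryType {
    let t = text.to_lowercase();
    if t.contains("decided") || t.contains("we chose") { MemoryType::Decision }
    else if t.contains("failed") || t.contains("error") { MemoryType::Failure }
    else if t.contains("step 1") || t.contains("how to") { MemoryType::Procedure }
    else if t.contains("must") || t.contains("cannot") { MemoryType::Constraint }
    else if t.contains("always") || t.contains("whenever") { MemoryType::Pattern }
    else if t.contains(" is ") || t.contains(" uses ") { MemoryType::Fact }
    else { MemoryType::ContextSnapshot }
}
```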
Dependency Tracking
An if-X-then-Y rule engine that tracks relationships between memories. When a decision depends on a constraint, or a procedure relies on a specific tool version, the dependency tracker maintains those links. Surfaces dependent memories together and warns when dependencies change.
If-X-then-Y rules, dependency graphs, change warnings
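The link-maintenance behavior can be sketched as a small tracker; the types and method names are illustrative, not NORDON's schema.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Dependency {
    source: String, // the memory that depends on something ("then Y")
    target: String, // the memory it depends on ("if X")
}

struct DependencyTracker {
    links: Vec<Dependency>,
}

impl DependencyTracker {
    fn new() -> Self { Self { links: Vec::new() } }

    fn link(&mut self, source: &str, target: &str) {
        self.links.push(Dependency { source: source.into(), target: target.into() });
    }

    // Memories that should be surfaced together with `id`.
    fn dependents_of(&self, id: &str) -> Vec<&str> {
        self.links.iter()
            .filter(|d| d.target == id)
            .map(|d| d.source.as_str())
            .collect()
    }

    // Called when a memory changes: everything that may now be stale.
    fn warn_on_change(&self, id: &str) -> Vec<&str> {
        self.dependents_of(id)
    }
}
```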
Memory Deduplication
Embedding-based duplicate detection that prevents memory bloat. Compares new memories against existing ones using semantic similarity, not just exact string matching. Merges near-duplicates intelligently, keeping the most complete and recent version while preserving unique details from each source.
Semantic dedup, intelligent merging, bloat prevention
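The merge decision might look like this: treat two memories as duplicates above a similarity threshold, then keep the more complete one, breaking ties toward the more recent. The 0.9 threshold and the length-based completeness proxy are assumptions.

```rust
struct Mem {
    content: String,
    created_at: u64, // unix seconds
}

fn merge_if_duplicate(existing: Mem, incoming: Mem, similarity: f64) -> Vec<Mem> {
    if similarity < 0.9 {
        return vec![existing, incoming]; // distinct enough: keep both
    }
    // Near-duplicates: keep the longer (more complete) memory,
    // preferring the more recent one on ties.
    let keep = if incoming.content.len() > existing.content.len()
        || (incoming.content.len() == existing.content.len()
            && incoming.created_at >= existing.created_at)
    {
        incoming
    } else {
        existing
    };
    vec![keep]
}
```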
Policy Engine
The security backbone. Scans all content for 31 distinct secret patterns (AWS keys, GitHub tokens, Stripe keys, JWTs, database strings, and more) and redacts them before storage. Blocks 28 sensitive file types from ever being read. Enforces team-defined retention and access policies.
31 secret patterns, 28 blocked file types, policy enforcement
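One of the 31 patterns, sketched without external crates: AWS access key IDs start with "AKIA" followed by 16 uppercase alphanumerics. This ASCII-only sketch shows the redact-before-storage shape only; the real engine covers many more patterns.

```rust
// Replaces AWS access key IDs with a redaction marker.
// Treats content as ASCII for brevity (an assumption of this sketch).
fn redact_aws_keys(content: &str) -> String {
    let bytes = content.as_bytes();
    let mut out = String::new();
    let mut i = 0;
    while i < bytes.len() {
        if content[i..].starts_with("AKIA") && content.len() - i >= 20 {
            let tail = &content[i + 4..i + 20];
            if tail.bytes().all(|b| b.is_ascii_uppercase() || b.is_ascii_digit()) {
                out.push_str("[REDACTED:aws-access-key]");
                i += 20;
                continue;
            }
        }
        out.push(bytes[i] as char); // advance one byte (ASCII assumption)
        i += 1;
    }
    out
}
```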
Memory Decay
Four configurable decay profiles that control how memories age over time. Memories can decay linearly, exponentially, step-wise, or not at all. The decay function integrates with retrieval ranking so that aging memories naturally lose priority unless they continue to be accessed or validated.
4 decay profiles, configurable aging, access-aware
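The four profiles can be expressed as a single weight function. The profile shapes come from the text; the exact parameterization (rates, step boundary) is an assumption.

```rust
enum DecayProfile {
    None,
    Linear { days_to_zero: f64 },
    Exponential { half_life_days: f64 },
    Step { fresh_days: f64, decayed_weight: f64 },
}

// Weight in 0..1 applied to a memory of the given age.
fn decay_weight(profile: &DecayProfile, age_days: f64) -> f64 {
    match profile {
        DecayProfile::None => 1.0,
        DecayProfile::Linear { days_to_zero } =>
            (1.0 - age_days / days_to_zero).max(0.0),
        DecayProfile::Exponential { half_life_days } =>
            0.5f64.powf(age_days / half_life_days),
        DecayProfile::Step { fresh_days, decayed_weight } =>
            if age_days <= *fresh_days { 1.0 } else { *decayed_weight },
    }
}
```

This weight can multiply into the retrieval ranking score, which is how aging memories naturally lose priority.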
Feedback Learning
Closes the learning loop. Every time you approve or reject a surfaced memory, the feedback signal adjusts future scoring weights. Acceptance rates accumulate over time, making importance scoring more accurate for your specific workflow. The system gets smarter the more you use it.
Acceptance tracking, weight adjustment, continuous learning
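A plausible shape for the loop is an exponential moving average of accept/reject signals. The neutral prior and learning rate are assumptions.

```rust
struct FeedbackWeight {
    acceptance: f64, // running acceptance rate, 0..1
}

impl FeedbackWeight {
    fn new() -> Self { Self { acceptance: 0.5 } } // neutral prior (assumed)

    // Nudge the running rate toward 1.0 on accept, 0.0 on reject.
    fn record(&mut self, accepted: bool) {
        let signal = if accepted { 1.0 } else { 0.0 };
        let lr = 0.2; // learning rate (assumed)
        self.acceptance += lr * (signal - self.acceptance);
    }
}
```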
Confidence System
Memories get stronger or weaker over time based on usage. When a memory is accessed and leads to a good outcome, its confidence increases. Memories that go unused gradually lose confidence. This creates a natural selection process where the most useful knowledge stays prominent.
Confidence boost on use, decay on neglect, natural selection
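Sketched as a boost-and-decay pair; the magnitudes (0.1 boost, 1% decay per idle day) are assumptions.

```rust
struct Confidence(f64); // 0..1

impl Confidence {
    // Called after a memory is accessed and leads to a good outcome.
    fn on_useful_access(&mut self) {
        self.0 = (self.0 + 0.1).min(1.0);
    }
    // Called for stretches where the memory goes unused.
    fn on_idle_days(&mut self, days: f64) {
        self.0 *= 0.99f64.powf(days);
    }
}
```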
Session Summaries
Automatically generates a structured summary when a coding session ends. Captures what was accomplished, what decisions were made, what failed, and what's left to do. The next session starts with full context of where you left off.
Auto-generated summaries, session continuity, context snapshots
Conversation Extraction
Extracts structured knowledge from natural conversation. When you discuss architecture, debug a bug, or explain a constraint to your AI, the conversation extractor identifies decisions, failures, constraints, and patterns from the dialogue and creates typed memories.
Chat parsing, decision extraction, constraint detection
Memory Compression
Token-efficient memory compression that reduces injection payload size without losing critical information. Intelligently summarizes verbose memories, strips boilerplate, and packs the maximum amount of context into the available token budget.
Token optimization, critical-detail preservation, budget efficiency
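The budget-packing step can be sketched as a greedy fill: take the highest-ranked memories that still fit. Counting tokens by whitespace split is a stand-in for a real tokenizer.

```rust
// Packs ranked memories (score, text) into a token budget, best first.
fn pack(memories: &[(f64, &str)], budget: usize) -> Vec<String> {
    let mut ranked: Vec<_> = memories.to_vec();
    ranked.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    let mut used = 0;
    let mut out = Vec::new();
    for (_, text) in ranked {
        let tokens = text.split_whitespace().count(); // crude token count
        if used + tokens <= budget {
            used += tokens;
            out.push(text.to_string());
        }
    }
    out
}
```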
Memory Quality Scoring
Grades every memory from A to F based on completeness, specificity, and actionability. Generates actionable suggestions for improving low-quality memories. Calculates an overall repo health score so you know the state of your project's knowledge base at a glance.
A-F grading, quality suggestions, repo health score
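The grade itself is a score-to-letter mapping; the equal component weights and grade boundaries below are assumptions.

```rust
// Average the three quality dimensions (assumed equal weighting).
fn quality_score(completeness: f64, specificity: f64, actionability: f64) -> f64 {
    (completeness + specificity + actionability) / 3.0
}

// Map a 0..1 score to a letter grade (assumed boundaries).
fn grade(score: f64) -> char {
    match score {
        s if s >= 0.9 => 'A',
        s if s >= 0.75 => 'B',
        s if s >= 0.6 => 'C',
        s if s >= 0.4 => 'D',
        _ => 'F',
    }
}
```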
Performance
All ML runs locally — zero API calls
Engineering
Why Rust?
We chose Rust deliberately. Not because it's trendy, but because every alternative had a dealbreaker for local-first ML.
Memory safety without garbage collection
Rust's ownership model prevents memory leaks, use-after-free, and data races at compile time. No GC pauses means predictable performance — critical when you're scoring events in a hot path.
Cross-platform native binaries
A single codebase compiles to native binaries for macOS (ARM + Intel), Windows, and Linux. No JVM, no Python runtime, no Node.js. Just a binary that works.
Predictable performance
No JIT warmup, no garbage collection pauses, no interpreter overhead. The first event scores as fast as the millionth. Sub-5ms scoring is consistent, not a best-case number.
Single binary deployment
The entire ML engine ships as a single binary with zero runtime dependencies. No pip install, no npm install, no Docker required. Copy the binary, run it, done.
By the Numbers
The full codebase
Visualization
Knowledge Graph
The Knowledge Graph is built from memories and their relationships. Each memory becomes a node, and dependency links, co-occurrence patterns, and explicit references become edges. The result is an interactive force-directed visualization of your project's entire knowledge structure.
How the graph is built
Every memory creates a node. Dependency tracking (Module 07) creates edges between related memories. Pattern detection (Module 03) identifies clusters. Conversation extraction (Module 14) adds relationship metadata. The graph grows organically as your project evolves.
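Graph assembly can be sketched as follows: memory IDs become nodes, dependency links become index-pair edges. The types are illustrative, not NORDON's schema.

```rust
use std::collections::HashMap;

struct KnowledgeGraph {
    nodes: Vec<String>,         // memory ids
    edges: Vec<(usize, usize)>, // index pairs into `nodes`
}

fn build_graph(memories: &[&str], links: &[(&str, &str)]) -> KnowledgeGraph {
    // Map each memory id to its node index.
    let index: HashMap<&str, usize> =
        memories.iter().enumerate().map(|(i, m)| (*m, i)).collect();
    // Keep only links whose endpoints both exist.
    let edges = links.iter()
        .filter_map(|(a, b)| Some((*index.get(a)?, *index.get(b)?)))
        .collect();
    KnowledgeGraph {
        nodes: memories.iter().map(|m| m.to_string()).collect(),
        edges,
    }
}
```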
What you can do with it
Explore your project's knowledge visually. Filter by memory type to see only decisions, or only failures. Click a node to see its full content and connections. Identify knowledge gaps: areas of your codebase with few memories. Share graph snapshots with your team.
Get Started
See the engine in action
Install NORDON and watch the ML engine learn from your first coding session. Free tier includes full access to all 16 modules.
npm install -g @sodasoft/nordon-cli