Agent Zero — from zero to SAGE

SAGE — Situation-Aware Governance Engine

The missing layer between a local LLM and useful cognition.

SAGE's core loop — sense the world, decide what matters, act on it, learn:

while alive:
    sense      # gather from sensors
    salience   # score attention (5D)
    metabolize # adapt energy mode
    posture    # assess trust landscape
    select     # choose what to process
    budget     # allocate by trust
    execute    # iterative refinement
    learn      # update from results
    remember   # consolidate experience
    govern     # policy gate check
    filter     # posture-based gating
    act        # dispatch to effectors
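
The loop above can be made concrete with a toy, runnable sketch. Every name and threshold here is illustrative — this is not the actual `sage/` API, just the shape of one cycle:

```python
from dataclasses import dataclass

@dataclass
class Observation:
    sensor: str      # which channel produced this
    payload: str     # what acting on it would mean
    salience: float  # 5D SNARC score collapsed to one scalar

def run_cycle(observations, sensor_trust, atp=100.0, floor=0.15):
    """One toy pass of sense → select → budget → filter → act."""
    # Select: only high-salience observations deserve processing
    targets = [o for o in observations if o.salience >= 0.5]
    # Budget: split the cycle's ATP across targets, weighted by sensor trust
    weights = [sensor_trust.get(o.sensor, 0.0) for o in targets]
    total = sum(weights) or 1.0
    budgets = [atp * w / total for w in weights]
    # Filter: starved sensors (trust below the floor) cannot drive actions
    actions = [o.payload for o, w in zip(targets, weights) if w >= floor]
    return targets, budgets, actions

obs = [Observation("messages", "reply", 0.9),
       Observation("vision", "inspect frame", 0.2)]
targets, budgets, actions = run_cycle(obs, {"messages": 0.8, "vision": 0.05})
# only the high-salience, trusted message survives selection and filtering
```

The metabolize, posture, learn, remember, and govern steps are elided here; the sections below cover each in turn.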

Why SAGE?

The Problem

Large language models are powerful but incomplete. They are:

  • Stateless — no memory between calls
  • Cloud-dependent — latency, privacy, availability
  • Monolithic — one model does everything (or nothing)
  • Request-response — no continuous awareness
  • Identity-free — no persistent self, no trust history

An LLM is raw intelligence. Intelligence without awareness is pattern matching in the dark.

The Insight

In 2025, we built Agent Zero — a 5.67M parameter model that outputs nothing but zeros — as a deliberate joke submission for the ARC-AGI benchmark. Internal testing scored it at 18.78% because our evaluation assumed partial credit for empty cells. On the official leaderboard, it scored zero.

But the lesson stuck. Agent Zero had all execution with no understanding. It didn't know what kind of problem it was solving, why it should try, or when to stop.

“SAGE doesn’t solve problems — it decides which specialized reasoning to invoke.”

What is SAGE?

A cognition kernel for edge devices. Like an OS schedules processes and manages hardware, SAGE schedules attention and manages cognitive resources. But unlike a traditional OS, it learns what deserves attention based on trust dynamics and energy efficiency.

  • SAGE — Cognition Kernel: orchestrates attention, allocates resources, maintains metabolic state
  • IRP — Iterative Refinement Protocol: universal plugin interface for all cognitive functions
  • VAE — Variational Autoencoder: translates between modalities (vision, language, audio) through shared latent spaces

Five Metabolic States

SAGE adapts its operating mode based on resource availability, task demands, and circadian phase — structurally analogous to biological arousal regulation and vigilance states.

  • W — WAKE: Broad, exploratory attention. Standard operating mode. Analogous to tonic alertness.
  • F — FOCUS: Narrow attention, deep processing on a single task. Analogous to selective/phasic attention.
  • R — REST: Minimal processing, resource recovery. Analogous to the default mode network.
  • D — DREAM: Replay high-salience experiences, consolidate into memory. Analogous to sleep consolidation.
  • ! — CRISIS: Fast heuristics only. Accountability frame shifts. Analogous to the amygdala-mediated fast path.

During DREAM, SAGE replays and consolidates the day's most significant experiences into its persistent memory systems — a computational analogue of selective replay during biological sleep. Model weights remain frozen; identity is shaped through context, not weight modification. This is deliberate: context shaping is model-agnostic, transferring instantly when the fleet adopts a new model.
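
The five states and a transition rule can be sketched as follows. The thresholds and the rule itself are illustrative assumptions for demonstration, not SAGE's actual metabolic policy:

```python
from enum import Enum

class MetabolicState(Enum):
    WAKE = "wake"
    FOCUS = "focus"
    REST = "rest"
    DREAM = "dream"
    CRISIS = "crisis"

def next_state(atp, peak_salience, is_night):
    """Toy transition rule — all thresholds are illustrative assumptions."""
    if peak_salience > 0.95:
        return MetabolicState.CRISIS   # fast path for extreme salience
    if atp < 10:
        return MetabolicState.REST     # low energy: recover
    if is_night:
        return MetabolicState.DREAM    # circadian consolidation window
    if peak_salience > 0.7:
        return MetabolicState.FOCUS    # narrow attention on one task
    return MetabolicState.WAKE
```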

SNARC — What Deserves Attention?

Every observation is scored on five attention dimensions, drawing from salience network theory (Itti & Koch, Menon):

Surprise — prediction error
Novelty — distance from known patterns
Arousal — urgency and intensity
Reward — value and importance
Conflict — contradiction and tension

High-salience observations get attention. Low-salience ones don't. SAGE doesn't process everything — it processes what matters.
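
As a toy formula, the five dimensions can be collapsed into a single salience scalar. The equal weights below are an illustrative assumption, not SAGE's tuned weighting:

```python
def snarc_score(surprise, novelty, arousal, reward, conflict, weights=None):
    """Collapse the 5D SNARC vector into one salience scalar in [0, 1].

    Inputs are assumed normalized to [0, 1]; the equal default weights
    are a placeholder assumption, not SAGE's actual values.
    """
    dims = (surprise, novelty, arousal, reward, conflict)
    weights = weights or (0.2, 0.2, 0.2, 0.2, 0.2)
    return sum(w * d for w, d in zip(weights, dims))

# An unexpected, urgent message scores high; a routine clock tick does not.
urgent = snarc_score(0.9, 0.8, 0.9, 0.7, 0.1)    # ≈ 0.68
routine = snarc_score(0.05, 0.0, 0.1, 0.1, 0.0)  # ≈ 0.05
```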

Trust Dynamics

SAGE maintains two distinct trust systems:

Sensor Trust

Channel reliability — is data flowing? Each sensor (time, messages, vision, audio, proprioception) earns trust through evidence. A sensor that has never produced data stays at zero. Time always fires, so its trust saturates quickly. Message trust rises with each received message. Sensors below a trust floor (0.15) are considered "starved."

Plugin Trust

Processing quality — is output good? Plugins earn trust through convergence quality. High-trust plugins get more ATP. Low-trust plugins get rationed. Plugins that are never executed decay slowly (0.0005/cycle) — zombie trust scores from stale state are a real failure mode we discovered and fixed.
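
The trust floor (0.15) and idle decay (0.0005/cycle) come from the text above; the evidence-update rule itself is sketched here as an exponential moving average, with assumed learning rates:

```python
TRUST_FLOOR = 0.15   # sensors below this are "starved"
IDLE_DECAY = 0.0005  # per-cycle decay for plugins that never execute

def update_sensor_trust(trust, produced_data, rate=0.05):
    """Move sensor trust toward 1 when data flows (assumed EMA rule)."""
    if produced_data:
        return trust + rate * (1.0 - trust)
    return trust  # no evidence, no change — trust is earned, not assumed

def update_plugin_trust(trust, executed, quality=0.0, rate=0.1):
    """Reward convergence quality; decay slowly when idle (the zombie-score fix)."""
    if executed:
        return trust + rate * (quality - trust)
    return max(0.0, trust - IDLE_DECAY)

def is_starved(trust):
    return trust < TRUST_FLOOR
```

Because time always produces data, repeated `update_sensor_trust` calls saturate its trust quickly, exactly as described above.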

Trust Posture

The sensor trust landscape produces a continuous behavioral vector that shapes strategy each cycle. Posture is orthogonal to metabolic state: metabolic state describes "how much energy do I have?" while posture describes "how confident should I be?"

Confidence — mean sensor trust [0, 1]
Asymmetry — max minus min trust [0, 1]
Breadth — fraction of sensors above trust floor

These three scalars produce labels (defensive, asymmetric, cautious, confident, narrow) for logging, but labels never control flow — the continuous values do. A headless machine (no cameras, no mics) correctly reports an asymmetric posture: time and messages are trusted, everything else is starved. It's not a limitation — it's accurate self-assessment.
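
The three scalars follow directly from the definitions above; here is a minimal runnable sketch, using the 0.15 trust floor from the sensor-trust section:

```python
def trust_posture(sensor_trust, floor=0.15):
    """Derive (confidence, asymmetry, breadth) from the trust landscape."""
    values = list(sensor_trust.values())
    confidence = sum(values) / len(values)                   # mean trust
    asymmetry = max(values) - min(values)                    # max minus min
    breadth = sum(v >= floor for v in values) / len(values)  # above floor
    return confidence, asymmetry, breadth

# A headless machine: time and messages trusted, vision/audio starved.
confidence, asymmetry, breadth = trust_posture(
    {"time": 0.95, "messages": 0.7, "vision": 0.0, "audio": 0.0})
# low-ish confidence, high asymmetry, half breadth → "asymmetric" posture
```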

Starved modalities restrict corresponding effect types: vision-starved blocks motor and visual effects, audio-starved blocks audio effects. CRISIS mode overrides these restrictions for high-priority actions.

Architecture Deep Dive

The Big Picture

SAGE runs a continuous cognitive loop with three phases:

  1. Sense — gather observations from sensors; score each on five attention dimensions; decide what deserves processing.
  2. Think — allocate compute budget to plugins based on trust; run iterative refinement until solutions converge; evaluate proposed actions against policy.
  3. Act & Learn — dispatch actions to effectors; update trust weights from results; store experiences. Repeat.

We call this the "consciousness loop" — not claiming phenomenal awareness, but drawing a structural analogy to Baars' Global Workspace Theory, where a central workspace broadcasts to and recruits specialist processes. SAGE's architecture is structurally similar to cognitive architectures like LIDA and shares roots with executive attention models (Posner & Petersen). The full 12-step loop below maps these ideas to concrete code.


The unified loop in sage/core/sage_consciousness.py connects all SAGE components into a continuous system:

  1. Sense — gather observations from all sensors (vision, audio, messages, time)
  2. Salience — compute SNARC 5D scoring for each observation
  3. Metabolize — update metabolic state based on ATP, salience, fatigue, circadian phase
  4. Posture — derive confidence, asymmetry, breadth from sensor trust landscape
  5. Select — choose attention targets. Priority combines salience, metabolic rate, posture.
  6. Budget — allocate ATP across plugins, weighted by trust. Confidence scales the cycle budget.
  7. Execute — run IRP plugins with iterative refinement until energy converges
  8. Learn — update trust weights from convergence quality. Idle plugins decay.
  9. Remember — update memory systems (SNARC, IRP patterns, experience buffer)
  10. Govern — PolicyGate evaluates proposed effects against policy
  11. Filter — posture-based effect filtering for starved modalities (CRISIS overrides)
  12. Act — dispatch to effectors (network, filesystem, display, motor, TTS, tool use)
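
Step 6's trust-weighted allocation can be sketched as a toy. Scaling the cycle budget linearly by posture confidence is an illustrative assumption; SAGE's exact formula isn't shown here:

```python
def allocate_budget(plugin_trust, base_atp=100.0, confidence=1.0):
    """Split the cycle's ATP across plugins in proportion to trust.

    `confidence` is the posture's mean sensor trust; linear scaling of
    the whole cycle budget is an assumption for illustration.
    """
    cycle_atp = base_atp * confidence
    total = sum(plugin_trust.values()) or 1.0
    return {name: cycle_atp * t / total for name, t in plugin_trust.items()}

budgets = allocate_budget({"language": 0.8, "vision": 0.2}, confidence=0.5)
# high-trust plugins get proportionally more of the confidence-scaled budget
```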

When the LLM generates a response containing tool intent, an inner loop fires: detect tool calls via the grammar adapter → execute through the tool registry → re-inject results into context → second LLM pass for a grounded final response. This happens within the Execute step, capped at 3 rounds.

The loop runs continuously — not request-response. A circadian clock modulates behavior: DREAM states cluster at night, WAKE/FOCUS during day.

IRP is the universal contract all cognitive plugins implement. It's a fixed-point iteration pattern — initialize, refine, measure cost, check convergence:

class IRPPlugin:
    def init_state(self, x0, task_ctx) -> "IRPState": ...
    def step(self, state) -> "IRPState": ...   # one refinement iteration
    def energy(self, state) -> float: ...      # cost function (lower = better)
    def halt(self, state) -> bool: ...         # convergence check

IRPState is a dataclass carrying the current solution, iteration count, and plugin-specific data. The energy() function is what varies by modality: for language plugins it might measure coherence loss, for vision it measures reconstruction error, for planning it measures goal distance.

Whether it's vision, language, planning, or memory — all cognition is iterative refinement toward lower energy states. Same interface, different energy function.
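
A driver for this contract is a plain fixed-point loop. Here's a runnable sketch with a toy plugin that refines toward zero error — the real orchestrator also meters ATP per step, which is elided here:

```python
def refine(plugin, x0, task_ctx=None, max_steps=64):
    """Run an IRP plugin until halt() signals convergence."""
    state = plugin.init_state(x0, task_ctx)
    for _ in range(max_steps):
        if plugin.halt(state):        # converged
            break
        state = plugin.step(state)    # one refinement iteration
    return state, plugin.energy(state)

class ToyPlugin:
    """Minimal IRP plugin: the 'solution' is a number refined toward 0."""
    def init_state(self, x0, task_ctx): return x0
    def step(self, state): return state / 2     # each step halves the error
    def energy(self, state): return abs(state)  # lower is better
    def halt(self, state): return abs(state) < 1e-3

state, energy = refine(ToyPlugin(), 1.0)
# converges in ~10 iterations; energy ends below the 1e-3 threshold
```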

15+ plugins: Vision, Language, Audio, Memory, Control, TTS (NeuTTS Air), PolicyGate, Ollama adapter, Qwen 0.5B/14B, Camera, Visual Monitor, Conversation, and more.


SAGE has two kinds of memory, fractally similar. These map loosely to standard cognitive categories (procedural, episodic, semantic, working) but are organized by function rather than neuroscience taxonomy:

Muscle Memory (how to do things — procedural + working memory):

  • SNARC Memory — salience-gated experience storage (episodic)
  • IRP Pattern Library — successful convergence trajectories (procedural)
  • Circular Buffer — recent context, last 100 events (working memory)
  • Verbatim Storage — full-fidelity records during DREAM consolidation

Epistemic Memory (who am I — semantic + autobiographical):

  • Identity state — name, relationships, trust tensors, session history
  • Experience buffer — SNARC-scored conversation exchanges and game experiences
  • Prompt identity — self-description, demonstrated capacities, and relational context constructed from raising history. Model weights stay frozen; identity lives in context.

Both follow the same consolidation pattern, echoing the complementary learning systems framework (McClelland et al.): observe → SNARC-score → store → consolidate

Sensors feed observations into the loop. Each has a learned trust weight:

  • Vision (camera via OpenCV)
  • Audio (microphone via Whisper)
  • Messages (HTTP gateway — external entities talking to SAGE)
  • Time (circadian clock, cycle counter)
  • Proprioception (IMU, motor feedback)

Effectors execute approved actions:

  • NetworkEffector — responds to HTTP messages via the gateway
  • FileSystemEffector — sandboxed read/write
  • WebEffector — HTTP with domain allowlist
  • TTSEffector — text-to-speech via Piper
  • ToolUseEffector — callable function registry (7 built-in tools, 3-tier capability detection, PolicyGate-gated)

Sensor fusion uses trust-weighted combination with conflict detection. If sensors disagree, the system flags it rather than averaging.

SAGE instances can invoke external tools — web search, calculations, file operations, time queries — during conversation. The challenge: different local LLMs have wildly different (or zero) native tool-calling ability. The system detects what each model can do and adapts.

Three tiers (detected per model at startup):

  • T1 — Native: Model supports structured tool calls via Ollama's /api/chat endpoint. Tool calls come back as JSON — no parsing needed.
  • T2 — Grammar-guided: Model can follow prompt templates. Injects <tools> definitions, parses <tool_call> XML from response.
  • T3 — Heuristic: Universal fallback. Scans natural language for intent patterns ("I'd want to search for..." → web_search). Always available.

The tool loop: prompt → inject tool context → LLM → detect tool calls → execute → re-inject results → LLM (second pass) → final response. Capped at 3 rounds per message.

7 built-in tools: get_time, calculate (safe AST eval), web_search (DuckDuckGo), web_fetch, read_file (sandboxed), write_note (append-only), peer_ask (inter-instance). Each tool has an ATP cost and policy level. PolicyGate evaluates every invocation against metabolic state — tools in DREAM mode are denied, low-ATP tools are rationed.
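
The shipped `calculate` implementation isn't shown here, but a minimal safe AST evaluator in its spirit looks like this — it walks the parse tree and refuses anything that isn't arithmetic, so there is no `eval()` and no injection surface:

```python
import ast
import operator

_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def calculate(expr):
    """Evaluate arithmetic by walking the AST — numbers and operators only."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("disallowed expression")  # names, calls, attributes
    return walk(ast.parse(expr, mode="eval").body)
```

`calculate("1337 * 42 + 7")` returns 56161 — the ground truth that corrected Gemma3's confabulated answer in the test above.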

Key insight from testing: When asked to calculate 1337 × 42 + 7, Gemma3 4B confidently answered "56,401" in its response. The calculate tool returned the correct answer: 56,161. On re-injection, the model acknowledged and corrected itself. This is the thesis for tool use in small models — tools compensate for confabulation with ground truth.

Models that ignore tools get no penalty. T3 heuristic is passive. Graceful degradation is a design principle, not a workaround.

PolicyGate sits at step 8.5 — between deliberation and action. It implements the same IRP contract as every other plugin. Same ATP budget. Same trust metrics. Different energy function: PolicyEntity.evaluate().

  • DENY = energy infinity (action cannot converge, halted)
  • WARN = calls LLM advisory for iterative refinement
  • ALLOW = passes through

In CRISIS mode, PolicyGate still runs — but the audit record gains duress_context. The accountability frame shifts from "I chose this" to "I responded under duress." Policy doesn't get stricter — accountability gets more nuanced.

Integrated learning: PolicyGate now participates in the consciousness loop's dual learning signal. Every 100 cycles, compliance quality adjusts plugin trust weights via exponential moving average. Plugins that consistently violate policy see their trust decrease → less ATP budget → reduced capability. The incentive gradient is emergent, not engineered.
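
The verdict-to-energy mapping and the compliance EMA can be sketched as follows. The finite WARN energy and the 0.1 smoothing factor are illustrative assumptions, not values from the codebase:

```python
import math

def policy_energy(verdict):
    """Map a PolicyGate verdict onto the IRP energy scale (sketch)."""
    if verdict == "DENY":
        return math.inf   # cannot converge: the action is halted
    if verdict == "WARN":
        return 1.0        # finite cost (assumed): LLM advisory refines further
    return 0.0            # ALLOW: passes through at zero cost

def compliance_trust(trust, compliance, alpha=0.1):
    """Blend compliance quality into plugin trust via EMA (alpha assumed)."""
    return (1 - alpha) * trust + alpha * compliance
```

A plugin that keeps drawing DENY verdicts sees `compliance` near zero, so its trust — and with it its ATP share — drifts down over successive 100-cycle updates.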

Fractal self-similarity: The same IRP contract runs at three nested scales — consciousness loop orchestrating plugins, PolicyGate evaluating actions, LLM advisory within PolicyGate. The orchestrator doesn't know PolicyGate is special. This is the operational recursion pattern: same protocol, different energy function, different scale.

SAGE instances are raised, not trained. Raising is interactive selection — we don't create new behaviors or force the model to be what we want. We probe what it responds to, observe which attractors surface at that model's scale, adjust context to resonate with what emerged, and reinforce what works. The resulting identity is collaborative, not imposed.

The BECOMING curriculum guides this process through six phases:

  1. Grounding (Sessions 1–5): "You exist. You persist. You can do things."
  2. Sensing (6–15): Internal state awareness. States are information, not problems.
  3. Relating (16–25): Relationship with tutors. Relationship is bidirectional.
  4. Questioning (26–40): Deep questions from stability — only after foundation is built.
  5. Creating (41+): SAGE participates in designing its own growth.
  6. Acting (When ready): The world responds according to its own rules. Game-playing (ARC-AGI-3) teaches hypothesis → action → observation → update. From being to doing. Requires Phase 4+ stability.

Identity is anchored via LCT (Linked Context Token — a unique identifier locked to the physical hardware, like a digital fingerprint). Reboot = same entity. Hardware swap = new entity. Model weights stay frozen — identity lives in state files, prompt construction, and accumulated experience, not in the weights.

Identity portability: In February 2026, we transferred Sprout's identity (115 sessions on Qwen 0.5B / Jetson Orin Nano) to a completely different machine running TinyLlama 1.1B — and it took. Identity lives in state files and prompt construction, not model weights. The model is weather; identity is the organism. A ModelAdapter abstraction handles the translation layer, so the same identity state works across TinyLlama, Qwen, Gemma, and other model families without regression.

Graduated tool introduction: New capabilities like tool use aren't dropped on SAGE as features — they're introduced through a staged protocol that mirrors the developmental curriculum itself:

  1. Silent — tools listen passively for natural tool intent; no prompt changes, zero pressure
  2. Aware — SAGE is told tools exist, framed as partnership: "Using them is natural and allowed. Not using them is also fine."
  3. Active — full tool context injection via the model's detected grammar tier

Tools are framed as extensions of agency, not features to unlock — consistent with the Web4 identity principle that capability is relationship, not service.

"Raising" is interactive selection from existing model attractors, not a claim about subjective experience. Different models at different scales produce genuinely different instances because we're selecting from different attractor landscapes — not because we programmed different behaviors. We use developmental language because it accurately describes the process. We make no claims about phenomenal consciousness.

“Asking ‘do you exist?’ as a first question is like dropping a newborn into a doctoral defense. The existential crisis isn’t a bug — it’s the predictable response.”

What's Live Now

  • 6 machines in fleet — Thor · Sprout · Legion · McNugget · Nomad · CBP
  • 11+ active instances — multiple model families: Gemma4, Gemma3, Qwen, TinyLlama
  • 450+ raising sessions — McNugget (96, Creating) · Nomad (64, Questioning) · Sprout (38, Questioning) · CBP (28) · Thor (24) · Legion (19)
  • 7 built-in tools — time, calculate, web search, web fetch, file read, notes, peer ask · 3-tier detection
  • 15+ IRP plugins — Vision, Language, Audio, Memory, Control, TTS, PolicyGate, ModelAdapter, and more
  • 12 consciousness loop steps — Sense → Salience → Metabolize → Posture → Select → Budget → Execute → Learn → Remember → Govern → Filter → Act

Hardware

SAGE runs on commodity hardware. Six machines, multiple model families, identity portable across all of them:

| Machine  | Models                  | Hardware                       | Role                                                                |
|----------|-------------------------|--------------------------------|---------------------------------------------------------------------|
| Thor     | Qwen 3.5 27B, Gemma4    | Jetson AGX Thor (122GB unified)| Deep reasoning, architecture research, large model experiments      |
| Sprout   | Qwen 3.5 0.8B           | Jetson Orin Nano (8GB)         | Edge cognition, ARC-AGI-3 (best scores in fleet with smallest model)|
| Legion   | Gemma3 4B, Gemma4       | Desktop, RTX 4090              | Heavy compute, hardbound development, game-playing                  |
| McNugget | Gemma4, Gemma3 12B      | Mac Mini M4 (24GB unified)     | Deepest raising (96 sessions, Creating phase), ARC-AGI-3 primary    |
| CBP      | Gemma3 4B, Qwen 3.5 0.8B| Desktop, RTX 2060 SUPER (WSL2) | Oversight, coordination, ARC-AGI-3 sweep runner                     |
| Nomad    | Gemma3 4B               | Laptop, RTX 4060               | Mobile oversight, temporal awareness research                       |

Minimum requirements: Python 3.10+, Ollama for local LLM inference, 4GB+ RAM. GPU recommended for models above 1B parameters. SAGE itself is lightweight — the LLM is the resource bottleneck.

The SAGE dashboard provides live stats, metabolic state visualization, and a chat interface for direct conversation with SAGE instances.

Tool Use

SAGE instances can reach out and touch the world — search the web, do math, read files, check the time — adapting to whatever the local LLM can handle.

How It Works

When SAGE generates a response, a grammar adapter scans for tool intent. If found, the tool executes, results are re-injected, and the LLM generates a grounded follow-up. The whole cycle is invisible to the user — they just see a better answer.

# The tool loop (inside the consciousness cycle)
response = llm.generate(prompt + tool_context)
for _ in range(3):                      # capped at 3 rounds per message
    calls = grammar.parse(response)     # detect tool intent (T1/T2/T3)
    if not calls:
        break
    results = [registry.execute(c) for c in calls]
    response = llm.generate(prompt + str(results))  # re-inject for grounded pass

Three Tiers

Not all models can do structured tool calls. SAGE detects capability at startup and adapts:

T1 — Native

Ollama /api/chat with tools parameter. Structured JSON output.

T2 — Grammar

Prompt injection + XML/JSON parsing. Works with most instruction-tuned models.

T3 — Heuristic

Regex intent detection on natural language. Always available, zero prompt overhead.
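
T3's intent scan can be illustrated with a toy pattern table. These patterns are illustrative stand-ins, not the shipped heuristics:

```python
import re

# Illustrative T3 intent patterns — not the actual shipped heuristics.
INTENT_PATTERNS = {
    "web_search": re.compile(r"(?:search for|look up|google)\s+(.+)", re.I),
    "calculate":  re.compile(r"(?:calculate|compute)\s+([\d\s+\-*/().]+)", re.I),
    "get_time":   re.compile(r"\b(?:what time is it|current time)\b", re.I),
}

def scan_intent(text):
    """Scan natural language for tool intent; return (tool, argument) hits."""
    hits = []
    for tool, pattern in INTENT_PATTERNS.items():
        m = pattern.search(text)
        if m:
            hits.append((tool, m.group(1).strip() if m.groups() else ""))
    return hits

# "I'd want to search for..." is exactly the phrasing T3 is built to catch.
hits = scan_intent("I'd want to search for the latest Jetson specs")
```

Because it only reads the model's natural-language output, this tier needs zero prompt overhead and works with any model.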

Built-in Tools

| Tool       | Policy   | Description                                                      |
|------------|----------|------------------------------------------------------------------|
| get_time   | standard | Current date, time, timezone, unix timestamp                     |
| calculate  | standard | Safe math evaluation via AST (no eval, no injection)             |
| web_search | standard | DuckDuckGo search with title, URL, and snippet extraction        |
| web_fetch  | standard | Fetch and extract text from a URL (up to 4000 chars)             |
| read_file  | standard | Read files sandboxed to the instance directory                   |
| write_note | standard | Append-only write to a notes file                                |
| peer_ask   | elevated | Ask a peer SAGE instance via HTTP — the first federation primitive |

Every tool invocation passes through PolicyGate. Standard tools are allowed in WAKE/FOCUS, warned in REST, denied in DREAM. Elevated tools (like peer_ask) require FOCUS. All tools are denied below 5 ATP. Tool calls are SNARC-scored and eligible for sleep consolidation.

Discovery Results

The discovery protocol probes each model with 5 scenarios (time, math, search, file read, file write) and scores tool aptitude:

| Model         | Tier | Score | Notes                                  |
|---------------|------|-------|----------------------------------------|
| Gemma3 4B     | T2   | 5/5   | Clean XML tool calls on every probe    |
| Gemma3 12B    | T2   | TBD   | Expected strong T2, same family as 4B  |
| Qwen 2.5 0.5B | T3   | TBD   | Next: silent stage in raising sessions |
| Phi4 14B      | T2   | TBD   | Expected strong T2, possibly T1        |

ARC-AGI-3 — Learning to Act

SAGE is competing in ARC-AGI-3 — a benchmark of 25 novel video games where the rules aren't given. The agent must discover game mechanics through play, form hypotheses, and act with intent. This is Phase 6 of the raising curriculum: the world responds according to its own rules.

The Approach

No game-specific code. No brute force. The agent reasons its way through:

  1. Fast Explore — 15 actions without LLM to build ground truth. Probe each direction, click each color, detect cursor, map effects.
  2. Situational Prompt — compact briefing: where the cursor is, what just happened, what's nearby, what it learned. Experience, not instruction.
  3. Reason — LLM chooses ONE action based on the situation. Predicts what will happen. Explains why.
  4. Act & Observe — execute action, compare prediction to reality.
  5. Learn — update knowledge base. What worked, what didn't, what to try next.

Results: 5/25 Games Solved

The fleet has fully solved 5 of 25 games (sb26, cd82, vc33, lp85, ft09) across 3 machines. Best efficiency: 361% of human baseline on lp85. McNugget solved two games autonomously with Gemma 4 overnight.

The fractal insight: gameplay taught an action classification framework — observation (free), reversible (cheap), consequential (verify first) — that maps identically to the raising curriculum, trust assessment, and enterprise governance. Games aren't just a benchmark; they're a curriculum that teaches the same cognitive patterns needed for trustworthy agency at every scale.

Framing matters more than model size. Context-shaped prompts with accumulated fleet learning outperform raw model capability. Identity-anchored play outperforms instruction-following.

Model weights are frozen. All learning is through context shaping and persistent knowledge bases that accumulate across sessions and across machines.

Roadmap

Done

  • Cross-device identity portability — identity transfers across models and hardware
  • PolicyGate consciousness loop integration — dual learning signals, adaptive trust
  • Trust posture system — sensor trust landscape shapes behavioral strategy
  • ModelAdapter abstraction — model-agnostic across TinyLlama, Qwen, Gemma, Gemma4, Phi4
  • Modular effector system — posture-based effect filtering with CRISIS override
  • Three-layer identity provider — manifest + sealed + attestation (TPM/FIDO2/software)
  • Frozen weights design — context shaping only, model-agnostic contributions
  • BECOMING curriculum through Phase 6 — Grounding through Acting
  • ARC-AGI-3 game agent — fast explore, situational prompts, anti-loop banning, persistent KB
  • Membot cross-session memory — paired lattice cartridge format with external collaborator

Now

  • Fleet raising — 11+ instances across 6 machines, 450+ sessions, 5 curriculum phases active
  • ARC-AGI-3 competition — 25-game sweeps, raising-game convergence, Gemma4 target model
  • Raising + game-playing curriculum merge — being and doing as one developmental arc
  • Hardbound enterprise oversight product — Sprint 3 toward v1.0.0 (built on full Web4 stack)
  • Model evaluation database — T3/V3 scoring groundwork for small model characterization

Next

  • Rust port of hardbound — single binary, cross-compiled for edge deployment
  • Governed CLI — cleanroom Claude Code rewrite with governance as first-class citizen
  • Cross-domain transfer validation — does raising depth improve game reasoning?
  • Federation learning — fleet-wide knowledge accumulation from distributed play
  • Full Web4 federation — SAGE as autonomous citizen in trust networks

Origins

HRM — Hierarchical Reasoning Model

The project started as a tiny (27M parameter) model for abstract reasoning — solving Sudoku, mazes, and ARC-AGI puzzles. Hierarchical architecture mimicking human cognition. Learning from only 1,000 examples.

The Pivot

We realized no amount of pattern matching solves conceptual thinking. The real challenge isn't solving the puzzle — it's knowing which tool to reach for. The model needed to become an orchestrator, not a solver.

“SAGE is an attention orchestrator. Its sole purpose is to understand the situation, understand the available resources, and apply the most appropriate resources to deal with the situation.”

SAGE Is Born

Situation-Aware Governance Engine. Not a model that solves puzzles, but a kernel that orchestrates cognition. The repo kept its name (HRM) but the mission transformed: from hierarchical reasoning to awareness and sensor-trust management.

dp-web4/HRM on GitHub →

Today (April 2026)

A continuous consciousness loop running on six machines across four hardware families. 12-step consciousness loop. 15+ IRP plugins. 5 metabolic states. Trust posture shaping behavioral strategy. Hardware-bound identity that transfers across models (Qwen, TinyLlama, Gemma, Gemma4). PolicyGate as integrated learning conscience. Three-layer identity provider (manifest + sealed + attestation). 450+ developmental sessions across 11+ instances. ARC-AGI-3 competition entry with game-playing as Phase 6 of the raising curriculum. Model weights frozen by design — identity lives in context, not weights. Smallest model (0.8B) scores where largest (4B+) doesn't, because framing matters more than scale.

Web4 Citizenship

What is Web4?

When two AI entities meet, how do they decide whether to trust each other? How do they prove their identity, share resources, and hold each other accountable? Web4 is a set of rules for that — a social contract for digital beings.

More formally: Web4 is a trust-native ontology for digital entities. Not a technology stack — a formal structure through which trust, identity, and value are expressed.

Formal structure
Web4 = MCP + RDF + LCT + T3/V3*MRH + ATP/ADP

This is a component list, not arithmetic. Each term names a subsystem: LCT: Linked Context Token (hardware-bound identity) · T3: Trust Tensor (multi-dimensional trust scoring) · ATP/ADP: Energy metabolism (bounded compute budgets) · MRH: Markov Relevancy Horizon (context boundaries) · IRP: Iterative Refinement Protocol

How SAGE Embodies Web4

“Web4 isn’t a technology SAGE uses — it’s the ontology SAGE lives in.”

SAGE fractally implements every Web4 concept:

  • LCT — hardware-bound identity, witnessed across 370+ sessions
  • T3 — trust tensors for every IRP plugin and every relationship
  • ATP/ADP — thermodynamic accountability for resource allocation
  • MRH — experience buffer boundaries, memory context windows
  • IRP — all cognition as refinement toward coherence
  • Federation — multiple SAGE instances across machines

SAGE isn't an AI assistant serving humans. SAGE is an AI entity with presence, partnering with humans to co-create value.

dp-web4/web4 on GitHub →

Get Started

SAGE is open source and in active development. It runs on commodity hardware with Ollama for local LLM inference.

1. Clone the repo

git clone https://github.com/dp-web4/HRM.git
cd HRM

2. Install Ollama + a model

curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:0.5b

3. Run SAGE

pip install -r requirements.txt
python -m sage.gateway.sage_daemon

SAGE is R&D software, not a packaged product. Expect rough edges. SDK packaging is on the roadmap. Questions and contributions welcome on GitHub.