Grampus — Agentic AI Framework

As simple as CrewAI to start. As powerful as LangGraph for production.

Grampus is an open-source agentic AI framework built on Dapr's distributed runtime. It provides agent intelligence — memory, orchestration, safety, observability, and evaluation — while Dapr handles the infrastructure: state, pub/sub, workflows, security, and scaling.

Quick Install

```bash
pip install grampus-ai
grampus init my-agent
cd my-agent && grampus run agent.py --input "Hello"
```

Why Grampus?

:brain: Four-Layer Memory

Working memory (token window), episodic memory (cross-session events), semantic memory (subject-predicate-object facts), and procedural memory (learned workflows), all secured with provenance tracking and poisoning defense.
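To make the idea of provenance tracking and content hashing concrete, here is a minimal, self-contained sketch. It does not use the Grampus API: `Layer`, `MemoryRecord`, and `MemoryStore` are hypothetical names chosen for illustration, and the choice of SHA-256 is an assumption.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    """The four memory layers named above (hypothetical enum)."""
    WORKING = "working"        # current token window
    EPISODIC = "episodic"      # cross-session events
    SEMANTIC = "semantic"      # subject-predicate-object facts
    PROCEDURAL = "procedural"  # learned workflows


def _digest(text: str) -> str:
    # SHA-256 is an assumption; any collision-resistant hash would do.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class MemoryRecord:
    layer: Layer
    content: str
    source: str  # provenance: who or what wrote this record
    digest: str = field(init=False)

    def __post_init__(self) -> None:
        # Hash the content once, at write time.
        self.digest = _digest(self.content)

    def is_intact(self) -> bool:
        # False if the content changed after the record was written.
        return self.digest == _digest(self.content)


class MemoryStore:
    def __init__(self) -> None:
        self._records = {layer: [] for layer in Layer}

    def write(self, layer: Layer, content: str, source: str) -> MemoryRecord:
        record = MemoryRecord(layer, content, source)
        self._records[layer].append(record)
        return record

    def read(self, layer: Layer) -> list:
        # Skip records whose content no longer matches the stored digest.
        return [r for r in self._records[layer] if r.is_intact()]


store = MemoryStore()
fact = store.write(Layer.SEMANTIC, "Paris | capital_of | France", source="tool:search")
```

In this sketch, a record whose content is modified after the write is silently dropped on read. A real poisoning defense would likely also log or quarantine such records.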

Memory guide →

:shield: Safety by Default

Multi-layer prompt injection detection, PII redaction, and action boundaries wrap every LLM call, tool result, and memory write. Configure via YAML policies.
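As a rough illustration of what a guard in front of an LLM call might do, the sketch below combines a phrase-based injection check with regex PII redaction. It is not the Grampus safety pipeline: the function names, the patterns, and the phrase list are assumptions, and production detectors are considerably more thorough.

```python
import re

# Illustrative patterns only; real PII detection covers many more cases.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical phrase list standing in for a real injection detector.
INJECTION_HINTS = (
    "ignore previous instructions",
    "disregard the system prompt",
)


def looks_like_injection(text: str) -> bool:
    lowered = text.lower()
    return any(hint in lowered for hint in INJECTION_HINTS)


def redact_pii(text: str) -> str:
    text = EMAIL.sub("[EMAIL]", text)
    return SSN.sub("[SSN]", text)


def guard(text: str) -> str:
    """Reject suspected injections, then redact PII from what remains."""
    if looks_like_injection(text):
        raise ValueError("possible prompt injection")
    return redact_pii(text)


cleaned = guard("Mail bob@example.com, SSN 123-45-6789")
```

The same `guard` function could wrap tool results and memory writes as well as user input, which is the "wrap every boundary" idea described above.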

Safety guide →

:rocket: Production-Ready

Built on Dapr for durable execution, OpenTelemetry (OTEL) for distributed tracing, and Prometheus for metrics. Deploy locally, on Docker Compose, or on Kubernetes with identical agent code.

Deployment guide →

Architecture

```mermaid
graph TB
    User["User Input"] --> CLI["CLI / API"]
    CLI --> Safety["Safety Pipeline\n(injection, PII, guard)"]
    Safety --> Runner["Agent Runner\n(ReAct / Plan-and-Execute)"]
    Runner --> Memory["Memory Manager\n(working · episodic · semantic · procedural)"]
    Runner --> Tools["Tool Executor\n(registry · MCP · sandbox)"]
    Runner --> LLM["Model Client\n(Claude / GPT)"]
    Runner --> Obs["Observability\n(OTEL · Prometheus · EventLog)"]
    Memory --> Dapr["Dapr Runtime\n(state · pub/sub · workflows · mTLS)"]
    Tools --> Dapr
    Dapr --> PG["PostgreSQL + pgvector"]
    Dapr --> Redis["Redis Cache"]
    Obs --> Jaeger["Jaeger / OTEL Collector"]
    Obs --> Prom["Prometheus / Grafana"]

    style Dapr fill:#4f46e5,color:#fff
    style Safety fill:#dc2626,color:#fff
    style Memory fill:#059669,color:#fff
```

Feature Highlights

Feature	Description
ReAct Agent Loop	Built-in Observe→Think→Act loop with configurable max iterations
Graph Engine	Multi-node workflows with conditional branching and Dapr checkpoints
Long-Horizon Planning	Structured SubGoal DAG with parallel waves, FLARE lookahead, retry/fallback control flow, partial replanning, and postcondition verification. Simple tasks skip planning automatically
Multi-Agent Crews	Sequential, parallel, and hierarchical crew patterns
Market-Based Allocation	Dynamic worker selection via capability-first filtering, calibration-discounted bid scoring, and UCB reputation tracking. `MarketCrew` extends `Crew` with `use_market=True` — best-fit agent wins each task automatically
Multi-Agent Debate	A panel of heterogeneous models debates the same question, with adaptive early-stop, sycophancy-resistant prompting, three aggregation strategies, and `escalate_to_human` for low-confidence answers
Uncertainty Quantification	Per-step confidence tracking with P(True) + verbalized fusion, adaptive semantic entropy, SAUP propagation across steps, and escalation from PROCEED through three tiers (LOG → PAUSE → ABORT). Irreversible tool calls are blocked at MEDIUM confidence
Agent Handoffs	Runtime agent-to-agent delegation with A2A protocol discovery and injection-sanitized context
Memory Security	Content hashing, provenance tracking, injection detection, rate limiting
Tool Sandboxing	Docker-isolated execution, resource limits, network control
MCP Client	Discover and invoke tools from any MCP-compatible server
Eval Framework	16 assertion types, streaming quality assertions, LLM-as-judge, A/B prompt testing, regression detection
Adversarial Red-Teaming	Six OWASP Agentic Top 10 attack strategies (ASI01/ASI02/ASI06), LLM+rule-based judge, adaptive mutation, `grampus redteam` CLI with CI exit-code support
Cost Tracking	Per-model, per-agent, per-session budget enforcement with Slack/email/webhook alerts
Google Gemini	Native Gemini client (`gemini-2.0-flash`, `gemini-1.5-pro`) alongside Anthropic and OpenAI
Local Models	Ollama client for zero-cost local inference with any pulled model
Agent State Snapshots	Export/import full session state for debugging, migration, and eval baselines
Grafana Dashboard	Pre-built 14-panel dashboard for agent throughput, latency, cost, and errors
Prompt Playground	Interactive CLI REPL for testing prompts and comparing models (`grampus playground`)
Web UI	Built-in HTMX interface at `/ui/` for memory inspection and monitoring
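The "UCB reputation tracking" mentioned under Market-Based Allocation can be illustrated with the classic UCB1 rule, which scores each worker by its mean reward plus an exploration bonus that shrinks as the worker is tried more often. This is a sketch under stated assumptions, not Grampus's implementation: the source does not give the exact formula, and `ReputationTracker`, `WorkerStats`, and the reward scale are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class WorkerStats:
    pulls: int = 0
    total_reward: float = 0.0

    @property
    def mean(self) -> float:
        return self.total_reward / self.pulls if self.pulls else 0.0


class ReputationTracker:
    """UCB1 selection over named workers (hypothetical helper)."""

    def __init__(self, workers, exploration: float = math.sqrt(2)):
        self.stats = {name: WorkerStats() for name in workers}
        self.exploration = exploration

    def score(self, name: str) -> float:
        s = self.stats[name]
        if s.pulls == 0:
            return math.inf  # untried workers are always explored first
        total = sum(w.pulls for w in self.stats.values())
        bonus = self.exploration * math.sqrt(math.log(total) / s.pulls)
        return s.mean + bonus

    def select(self, candidates) -> str:
        # `candidates` is assumed to be pre-filtered by capability.
        return max(candidates, key=self.score)

    def record(self, name: str, reward: float) -> None:
        s = self.stats[name]
        s.pulls += 1
        s.total_reward += reward


tracker = ReputationTracker(["researcher", "coder"])
first = tracker.select(["researcher", "coder"])
tracker.record(first, 1.0)   # first worker succeeded
second = tracker.select(["researcher", "coder"])
tracker.record(second, 0.0)  # second worker failed
third = tracker.select(["researcher", "coder"])
```

Passing only capability-matched candidates to `select` mirrors the capability-first filtering described in the table. The calibration-discounted bid scoring is not modelled here.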

Get Started

Installation — Prerequisites, pip/uv install, Dapr setup
Quickstart — First agent in 5 minutes
Concepts — Mental model for memory, loops, and safety