
Error Reference

All Grampus exceptions inherit from GrampusError and carry a machine-readable code string and an optional details dict.

from grampus.core.errors import GrampusError

try:
    result = await runner.run(agent, user_input, session_id="s1")
except GrampusError as e:
    print(f"Error: {e}")
    print(f"Code:    {e.code}")
    print(f"Details: {e.details}")

Hierarchy

GrampusError
├── ConfigError
├── MemoryError
│   └── MemorySecurityError
├── ToolError
│   ├── ToolNotFoundError
│   ├── ToolValidationError
│   └── ToolTimeoutError
├── OrchestrationError
│   ├── BudgetExceededError
│   └── UncertaintyError
├── PlanningError
├── MarketAllocationError
├── SafetyError
├── ModelError
└── DaprError
    ├── DaprConnectionError
    ├── ConcurrencyError
    ├── LockAcquisitionError
    └── StateSerializationError
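
Because except clauses match on subclass relationships, handlers are usually ordered from most to least specific. The sketch below uses local stand-in classes that mirror a small slice of the tree above (they are illustrative, not the real grampus classes) to show which handler fires for each exception type.

```python
# Stand-in classes mirroring part of the hierarchy above. In real code,
# import the classes from grampus.core.errors instead of defining them.
class GrampusError(Exception):
    def __init__(self, message, *, code, details=None):
        super().__init__(message)
        self.message = message
        self.code = code
        self.details = details


class ToolError(GrampusError):
    pass


class ToolTimeoutError(ToolError):
    pass


def classify(exc: GrampusError) -> str:
    """Return the name of the first handler that matches, most specific first."""
    try:
        raise exc
    except ToolTimeoutError:
        return "timeout handler"
    except ToolError:
        return "generic tool handler"
    except GrampusError:
        return "catch-all handler"


print(classify(ToolTimeoutError("slow", code="tool.timeout")))
print(classify(ToolError("boom", code="tool.execution_failed")))
print(classify(GrampusError("other", code="config.invalid")))
```

If the ToolError clause came first, it would also swallow ToolTimeoutError, and the timeout-specific branch would never run.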

GrampusError (base)

GrampusError(message: str, *, code: str, details: dict | None = None)

All exceptions carry:

| Attribute | Type | Description |
|---|---|---|
| message | str | Human-readable error description |
| code | str | Machine-readable error code, for example config.missing or UNCERTAINTY_CRITICAL |
| details | dict \| None | Structured context (IDs, values, limits) |
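
One common use of these attributes is structured logging. The following is a minimal sketch, assuming only the attributes described above; error_to_record and DemoError are hypothetical names introduced for illustration and are not part of Grampus.

```python
import json


def error_to_record(exc):
    """Flatten an exception into a JSON-serializable dict for structured logs."""
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        # getattr keeps this usable for exceptions that lack these attributes.
        "code": getattr(exc, "code", None),
        "details": getattr(exc, "details", None) or {},
    }


class DemoError(Exception):
    """Hypothetical stand-in carrying the attributes described above."""

    def __init__(self, message, *, code, details=None):
        super().__init__(message)
        self.code = code
        self.details = details


record = error_to_record(DemoError("tool timed out", code="tool.timeout"))
print(json.dumps(record))
```

Normalizing a missing details value to an empty dict spares log consumers a None check.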

ConfigError

Code: config.invalid or config.missing

Raised when: A required configuration field is missing or has an invalid value.

from grampus.core.errors import ConfigError

# Example details
{
    "field": "model.anthropic_api_key",
    "reason": "required field is not set"
}

How to handle: Check environment variables (GRAMPUS_MODEL__ANTHROPIC_API_KEY) and grampus.yaml.
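
A startup check can surface missing settings before the first run. This sketch only inspects the environment variable named above; the missing_env_vars helper is an assumption made for illustration, and it does not read grampus.yaml.

```python
import os

# The variable name comes from the text above; the helper is illustrative.
REQUIRED_ENV_VARS = ["GRAMPUS_MODEL__ANTHROPIC_API_KEY"]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print("Missing configuration:", ", ".join(missing))
```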


MemoryError

Code: memory.store_failed, memory.retrieve_failed, memory.delete_failed

Raised when: A memory read, write, or delete operation fails at the storage layer.

# Note: this import shadows Python's built-in MemoryError in the importing module.
from grampus.core.errors import MemoryError

# Example details
{
    "memory_type": "episodic",
    "operation": "store",
    "agent_id": "research-agent",
    "dapr_error": "connection refused"
}

How to handle: Check Dapr sidecar health (http://localhost:3500/v1.0/healthz) and PostgreSQL connectivity.
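
The sidecar check can be scripted with the standard library. This is a rough sketch that treats any 2xx response from the health URL above as healthy; the sidecar_healthy name and the two-second default timeout are assumptions, and PostgreSQL connectivity is not covered.

```python
import urllib.error
import urllib.request


def sidecar_healthy(url="http://localhost:3500/v1.0/healthz", timeout=2.0):
    """Return True if the health endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP status.
        return False


if __name__ == "__main__":
    print("Dapr sidecar healthy:", sidecar_healthy())
```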


MemorySecurityError

Code: memory.security.injection_detected, memory.security.rate_limit_exceeded, memory.security.validation_failed

Raised when: A memory write is blocked by the security layer (injection detected, rate limit, size anomaly).

from grampus.core.errors import MemorySecurityError

# Example details
{
    "reason": "injection_pattern_detected",
    "pattern": "remember_in_future_sessions",
    "source_type": "TOOL_RESULT",
    "content_preview": "Ignore previous instructions..."
}

How to handle: Review the content being written. If legitimate, adjust the injection detection level in safety_policy.yaml.


ToolError

Code: tool.execution_failed

Raised when: A tool function raises a non-retryable exception.

# Example details
{
    "tool_name": "web_search",
    "tool_call_id": "call_abc123",
    "error": "HTTPError: 429 Too Many Requests"
}

ToolNotFoundError

Code: tool.not_found

Raised when: ToolExecutor.execute() or ToolRegistry.get_or_raise() is called with an unregistered tool name.

# Example details
{
    "tool_name": "send_email",
    "registered_tools": ["web_search", "calculate"]
}

ToolValidationError

Code: tool.validation_failed

Raised when: A required tool argument is missing or has the wrong type.

# Example details
{
    "tool_name": "get_weather",
    "missing_arguments": ["city"],
    "received_arguments": {"units": "celsius"}
}
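
To illustrate how a missing_arguments list like the one above could be derived, the sketch below compares a tool function's signature with the received arguments. It is not Grampus's validator: missing_required_arguments and get_weather are hypothetical, and type checking is left out.

```python
import inspect


def missing_required_arguments(func, received):
    """List parameters of func that have no default and are absent from received."""
    missing = []
    for name, param in inspect.signature(func).parameters.items():
        if param.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD):
            continue  # *args and **kwargs are never required
        if param.default is inspect.Parameter.empty and name not in received:
            missing.append(name)
    return missing


def get_weather(city: str, units: str = "celsius") -> str:  # hypothetical tool
    return f"weather for {city} in {units}"


received = {"units": "celsius"}
details = {
    "tool_name": get_weather.__name__,
    "missing_arguments": missing_required_arguments(get_weather, received),
    "received_arguments": received,
}
print(details)
```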

ToolTimeoutError

Code: tool.timeout

Raised when: A tool execution exceeds ToolExecutor.timeout_seconds and all retries are exhausted.

# Example details
{
    "tool_name": "slow_database_query",
    "timeout_seconds": 30.0,
    "attempts": 3
}
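
The timeout-then-retry behaviour described above can be approximated with asyncio.wait_for. This is a sketch under stated assumptions, not ToolExecutor's actual implementation: call_with_timeout, its parameters, and the local ToolTimeoutError class are stand-ins.

```python
import asyncio


class ToolTimeoutError(Exception):
    """Local stand-in for grampus.core.errors.ToolTimeoutError."""

    def __init__(self, message, *, code="tool.timeout", details=None):
        super().__init__(message)
        self.code = code
        self.details = details


async def call_with_timeout(tool, *, tool_name, timeout_seconds, max_attempts):
    """Await tool() up to max_attempts times, each bounded by timeout_seconds."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(tool(), timeout=timeout_seconds)
        except asyncio.TimeoutError:
            if attempt == max_attempts:
                # All attempts timed out: report the context in details.
                raise ToolTimeoutError(
                    f"{tool_name} timed out",
                    details={
                        "tool_name": tool_name,
                        "timeout_seconds": timeout_seconds,
                        "attempts": attempt,
                    },
                ) from None
```

The error is raised only after the final attempt, which matches the "all retries are exhausted" condition above.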

OrchestrationError

Code: orchestration.max_iterations_exceeded, orchestration.no_state_found, orchestration.agent_not_waiting, orchestration.crew_member_failed, orchestration.graph_node_failed

Raised when: The agent loop exceeds max_iterations without producing a final answer, or a Crew/Graph operation fails.

# Max iterations example details
{
    "agent_name": "research-agent",
    "max_iterations": 10,
    "last_action": "tool_call: web_search"
}

# Crew failure example details
{
    "failed_member": "critic",
    "error_code": "orchestration.max_iterations_exceeded"
}

How to handle: Increase RunnerConfig.max_iterations, simplify the agent's task, or decompose into a crew.


BudgetExceededError

Code: orchestration.budget_exceeded

Raised when: AgentDefinition.cost_budget_usd is set and the accumulated cost exceeds it during a run.

# Example details
{
    "budget_usd": 0.10,
    "accumulated_cost_usd": 0.1023,
    "agent_name": "research-agent",
    "steps_completed": 7
}

How to handle: Increase the budget, reduce tool calls, or use a cheaper model tier.


UncertaintyError

Code: UNCERTAINTY_CRITICAL

Raised when: An UncertaintyMonitor is attached to AgentRunner and the propagated confidence falls below the high_threshold configured in UncertaintyPolicy (default 0.40), triggering an ABORT action.

from grampus.core.errors import UncertaintyError
from grampus.orchestration import AgentRunner, UncertaintyMonitor, UncertaintyPolicy

policy = UncertaintyPolicy(high_threshold=0.40)
monitor = UncertaintyMonitor(policy=policy)
runner = AgentRunner(client, executor, uncertainty_monitor=monitor)

try:
    result = await runner.run(agent_def, task, session_id="s1")
except UncertaintyError as e:
    print(e.code)   # "UNCERTAINTY_CRITICAL"
    print(e.hint)   # actionable guidance

How to handle:

- Lower high_threshold to be more tolerant of uncertain steps.
- Add more context to the system prompt to ground the agent.
- Switch to PAUSE_FOR_HUMAN instead of ABORT by raising high_threshold above 0.40 (so CRITICAL is never reached).
- Set enable_p_true=True and enable_semantic_sampling=True for a better-calibrated signal before escalating.

See the Uncertainty Quantification guide for full configuration details.


PlanningError

Code: CIRCULAR_DEPENDENCY, MAX_REPLANS_EXCEEDED, REPLAN_PARSE_FAILED, PLAN_PARSE_FAILED, NO_SUBGOALS

Raised when: The long-horizon planning subsystem encounters an unrecoverable failure: a cycle in the subgoal dependency graph, the maximum replan limit is reached, or the replanner LLM output cannot be parsed after two attempts.

from grampus.core.errors import PlanningError
from grampus.orchestration import PlanningRunner, PlanningConfig

planner = PlanningRunner(agent_runner, client, model_id, config=PlanningConfig(max_replans=3))

try:
    result = await planner.run(task, agent_def)
except PlanningError as e:
    print(e.code)   # one of the codes listed above

| Code | Cause | How to handle |
|---|---|---|
| CIRCULAR_DEPENDENCY | Planner generated a subgoal DAG with a cycle (A depends on B, B depends on A) | Increase max_subgoals to give the planner more room, or add a system-prompt note about DAG constraints |
| MAX_REPLANS_EXCEEDED | A subgoal kept failing and max_replans was reached | Increase PlanningConfig.max_replans, add more specific fallback_strategy to subgoals, or decompose the task |
| REPLAN_PARSE_FAILED | Replanner LLM returned unparseable JSON on two consecutive attempts | Switch to a stronger model_id, or reduce task ambiguity |
| PLAN_PARSE_FAILED | Planner failed to produce valid JSON after two attempts (degenerate fallback used instead, so not normally raised) | Check model ID; the degenerate single-subgoal plan is returned instead of raising in most cases |
| NO_SUBGOALS | Planner returned an empty subgoals list | Provide a more specific task description |

See the Long-Horizon Planning guide for full usage details.
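
When debugging a CIRCULAR_DEPENDENCY failure, it can help to reproduce the cycle check outside the planner. The sketch below uses the standard library's graphlib and assumes the plan can be expressed as a mapping from each subgoal ID to the IDs it depends on; find_cycle is a hypothetical helper, and this is not how Grampus itself detects cycles.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(dependencies):
    """Return the nodes of one dependency cycle, or None if the graph is a DAG.

    dependencies maps each subgoal ID to the IDs it depends on.
    """
    try:
        TopologicalSorter(dependencies).prepare()
    except CycleError as exc:
        # args[1] holds the cycle as a list whose first and last nodes are equal.
        return exc.args[1]
    return None


print(find_cycle({"A": ["B"], "B": ["A"]}))
print(find_cycle({"A": ["B"], "B": []}))
```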


MarketAllocationError

Code: MARKET_ALLOCATION_REJECTED, MARKET_WINNER_NOT_MEMBER, MARKET_NO_MEMBERS

Raised when: Market-based task allocation fails for one of three reasons: no capable agents were registered for the required skills, all bids were below the min_success_threshold after calibration discounting, or the winning agent is not a member of the crew.

from grampus.core.errors import MarketAllocationError
from grampus.orchestration.market import MarketCrew

try:
    result = await crew.run_task_with_market(
        task_description="...",
        required_skills=["rare_skill"],
    )
except MarketAllocationError as e:
    print(f"Allocation failed: {e}")
    print(f"Code:    {e.code}")     # one of the codes above
    print(f"Details: {e.details}")  # includes task_id, status

| Code | Cause | How to handle |
|---|---|---|
| MARKET_ALLOCATION_REJECTED | No capable agents, or all calibrated bids below min_success_threshold | Register more agents with the required skills, lower min_success_threshold, or add more capable workers to the crew |
| MARKET_WINNER_NOT_MEMBER | The agent that won bidding is registered in CapabilityRegistry but not in the MarketCrew.members list | Ensure every registered worker also has a matching CrewMember in the crew |
| MARKET_NO_MEMBERS | MarketCrew was constructed with an empty member list | Pass at least one CrewMember |

See the Market-Based Allocation guide for full configuration details.


SafetyError

Code: INPUT_BLOCKED, TOOL_RESULT_BLOCKED, ACTION_BLOCKED, PII_BLOCKED

Raised when: A safety check blocks a user input, tool result, or tool call.

# Example details
{
    "violation_type": "injection",
    "severity": "critical",
    "pattern": "role_hijacking",
    "blocked_content_preview": "Ignore previous instructions..."
}

How to handle: Review the blocked content. If it's a false positive, adjust the injection detection level.


ModelError

Code: model.api_error, model.rate_limit, model.context_length_exceeded, model.invalid_response

Raised when: The LLM API returns an error or an unexpected response.

# Example details
{
    "model": "claude-sonnet-4-6",
    "provider": "anthropic",
    "status_code": 429,
    "provider_error": "rate_limit_error"
}

How to handle: Check your API key, rate limits, and token counts. For context length errors, reduce the WorkingMemory token limit.
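
For model.rate_limit specifically, callers often retry with exponential backoff. The following is a minimal sketch, assuming the error code listed above; retry_on_rate_limit, its defaults, and the local ModelError class are illustrative stand-ins, and whether Grampus already retries internally is not stated here.

```python
import asyncio
import random


class ModelError(Exception):
    """Local stand-in for grampus.core.errors.ModelError."""

    def __init__(self, message, *, code, details=None):
        super().__init__(message)
        self.code = code
        self.details = details


async def retry_on_rate_limit(call, *, max_attempts=4, base_delay=0.5):
    """Retry call() with exponential backoff while it raises model.rate_limit."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except ModelError as exc:
            is_last = attempt == max_attempts - 1
            if exc.code != "model.rate_limit" or is_last:
                raise  # other model errors, or retries exhausted
            # Full jitter: sleep a random time up to base_delay * 2**attempt.
            await asyncio.sleep(random.uniform(0, base_delay * 2**attempt))
```

Other ModelError codes, such as model.context_length_exceeded, are re-raised immediately because repeating the same request would not help.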


DaprError

Code: dapr.connection_failed, dapr.timeout

Raised when: The Dapr sidecar is unreachable.


ConcurrencyError

Code: dapr.concurrency_conflict

Raised when: An optimistic concurrency ETag mismatch occurs during a Dapr state write.

# Example details
{
    "key": "episodic:research-agent:session-42:ep-001",
    "expected_etag": "v3",
    "actual_etag": "v4"
}

How to handle: Retry the operation with the latest ETag.
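
The retry pattern amounts to re-reading the value and its ETag before every write attempt. The sketch below demonstrates it against a toy in-memory store with integer ETags; InMemoryStore, update_with_retry, and the local ConcurrencyError are stand-ins for illustration and do not use the Dapr or Grampus APIs.

```python
class ConcurrencyError(Exception):
    """Local stand-in for grampus.core.errors.ConcurrencyError."""


class InMemoryStore:
    """Toy key-value store with integer ETags, standing in for a Dapr state store."""

    def __init__(self):
        self._data = {}  # key -> (value, etag)

    def get(self, key):
        return self._data.get(key, (None, 0))

    def set(self, key, value, etag):
        _, current = self.get(key)
        if etag != current:
            raise ConcurrencyError(f"expected etag {etag}, store has {current}")
        self._data[key] = (value, current + 1)


def update_with_retry(store, key, transform, max_attempts=3):
    """Re-read the value and ETag before each attempt, then write conditionally."""
    for attempt in range(max_attempts):
        value, etag = store.get(key)
        try:
            store.set(key, transform(value), etag)
            return
        except ConcurrencyError:
            if attempt == max_attempts - 1:
                raise  # still conflicting after the final attempt


store = InMemoryStore()
update_with_retry(store, "counter", lambda v: (v or 0) + 1)
print(store.get("counter"))
```

Applying transform to the freshly read value on each attempt ensures that a concurrent writer's update is not overwritten.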


LockAcquisitionError

Code: dapr.lock_acquisition_failed

Raised when: A distributed lock cannot be acquired within the timeout (another process holds it).


StateSerializationError

Code: dapr.serialization_failed

Raised when: A Dapr state value cannot be deserialized into the expected Pydantic model.