Safety API Reference¶

SafetyPipeline¶

The main safety middleware. Compose injection detection, PII detection, and action guard into a unified check pipeline.

`grampus.safety.pipeline.SafetyPipeline` ¶

Middleware that wraps agent actions with safety checks.

All checks are applied in order. A BLOCK-level result raises SafetyError. Non-blocking detections are logged and returned as SafetyViolation records.

Parameters:

Name	Type	Description	Default
`injection_detector`	`object \| None`	Optional injection detector. Skipped if None.	`None`
`pii_detector`	`object \| None`	Optional PII detector. Skipped if None.	`None`
`action_guard`	`object \| None`	Optional action guard. Skipped if None.	`None`
`config`	`SafetyPipelineConfig \| None`	Which check categories to enable.	`None`

`check_input(text)` `async` ¶

Check user input for injection then PII.

Parameters:

Name	Type	Description	Default
`text`	`str`	Raw user input.	required

Returns:

Type	Description
`tuple[str, list[SafetyViolation]]`	Tuple of (possibly-redacted text, new violations from this call).

Raises:

Type	Description
`SafetyError`	code="INPUT_BLOCKED" if injection is detected at threshold.

`check_tool_result(result)` `async` ¶

Check tool result output for injection and PII.

Parameters:

Name	Type	Description	Default
`result`	`ToolResult`	The ToolResult from tool execution.	required

Returns:

Type	Description
`tuple[ToolResult, list[SafetyViolation]]`	Tuple of (possibly-redacted result, new violations).

Raises:

Type	Description
`SafetyError`	code="TOOL_RESULT_BLOCKED" if injection is detected.

`check_llm_output(text)` `async` ¶

Check LLM response for PII only (injection not blocked to avoid loops).

Parameters:

Name	Type	Description	Default
`text`	`str`	LLM-generated text to check.	required

Returns:

Type	Description
`tuple[str, list[SafetyViolation]]`	Tuple of (possibly-redacted text, new violations).

`check_tool_call(tool_call, *, calls_this_turn=0, consecutive_calls=0)` `async` ¶

Check a tool call against action guard policy.

Parameters:

Name	Type	Description	Default
`tool_call`	`ToolCall`	The tool call to evaluate.	required
`calls_this_turn`	`int`	Calls already made this turn.	`0`
`consecutive_calls`	`int`	Consecutive calls without a non-tool step.	`0`

Returns:

Type	Description
`tuple[ToolCall, list[SafetyViolation]]`	Tuple of (tool_call unchanged, new violations).

Raises:

Type	Description
`SafetyError`	code="ACTION_BLOCKED" if the action guard denies the call.

`get_violations()` ¶

Return all violations recorded in this pipeline instance's lifetime.

SafetyPipelineConfig¶

from grampus.safety.pipeline import SafetyPipelineConfig

config = SafetyPipelineConfig(
    check_user_input=True,       # run injection + PII on user input
    check_tool_results=True,     # run injection + PII on tool output
    check_llm_output=True,       # run PII on LLM response (injection not blocked)
    check_memory_writes=True,    # run injection on memory write content
    log_violations=True,         # emit structlog events for violations
)

Injection detector¶

`grampus.safety.injection.PromptInjectionDetector` ¶

Multi-layer prompt injection detector.

Three detection layers applied in order: 1. Regex — known attack signatures (fast, zero false-negatives on known patterns) 2. Heuristic — structural signals (role override attempts, instruction boundaries) 3. Keyword — semantic markers without full NLP

Parameters:

Name	Type	Description	Default
`level`	`DetectionLevel`	Detection strictness. Controls the confidence threshold above which `detected=True` is returned. STRICT >= 0.3, BALANCED >= 0.5, PERMISSIVE >= 0.8	`BALANCED`

`check(text)` ¶

Synchronous check — no I/O. Returns InjectionResult.

Parameters:

Name	Type	Description	Default
`text`	`str`	The text to inspect for injection attempts.	required

Returns:

Type	Description
`InjectionResult`	InjectionResult with confidence score and matched patterns.

InjectionCheckResult¶

@dataclass
class InjectionCheckResult:
    detected: bool
    pattern: str | None        # matched pattern name
    confidence: float          # 0.0–1.0
    blocked: bool              # True if level blocks this confidence

PII detector¶

`grampus.safety.pii.PIIDetector` ¶

Regex-based PII detector with configurable action per PII type.

Parameters:

Name	Type	Description	Default
`actions`	`dict[PIIType, PIIAction] \| None`	Map of PIIType -> PIIAction. Defaults to REDACT for all types. If a type is not in the map, defaults to LOG.	`None`

PIICheckResult¶

@dataclass
class PIICheckResult:
    detected: bool
    types_found: list[str]        # ["email", "phone", ...]
    redacted_text: str            # original if action="log", else redacted
    blocked: bool                 # True if action="block" and PII found

Action guard¶

`grampus.safety.action_guard.SafetyActionGuard` ¶

Orchestration-level action guard enforcing policy rules.

Can be constructed either with an ActionPolicy object or with inline keyword arguments (allowed_tools, denied_tools, max_tool_calls_per_turn) for quick one-off usage.

Parameters:

Name	Type	Description	Default
`policy`	`ActionPolicy \| None`	Full ActionPolicy for this agent. When provided, all keyword arguments are ignored.	`None`
`allowed_tools`	`list[str] \| None`	Optional allowlist — when set only listed tools are permitted.	`None`
`denied_tools`	`list[str] \| None`	Explicit denylist of tool names.	`None`
`max_tool_calls_per_turn`	`int`	Hard cap on tool calls per agent turn.	`20`

ActionPolicy¶

from grampus.safety.action_guard import ActionPolicy

policy = ActionPolicy(
    allowed_tools=["web_search", "calculate"],  # explicit allowlist (None = allow all)
    denied_tools=[],                             # explicit denylist
    max_tool_calls_per_turn=20,                  # across all tools per turn
    max_consecutive_tool_calls=8,                # before requiring LLM step
    max_cost_per_action_usd=0.05,               # per-tool-call cost cap
)

SafetyViolation¶

Structured record emitted for every detected issue:

@dataclass
class SafetyViolation:
    violation_type: str    # "injection" | "pii" | "action_blocked"
    severity: str          # "critical" | "high" | "medium" | "low"
    detail: str            # human-readable description
    blocked: bool          # True = request was blocked, False = logged only
    timestamp: datetime

Policy loader¶

`grampus.safety.policies.PolicyLoader` ¶

Loads GrampusSafetyPolicy from a YAML file or dict.

`load(path=None)` `staticmethod` ¶

Load and validate policy. Returns default policy if path is None.

Parameters:

Name	Type	Description	Default
`path`	`str \| None`	Filesystem path to a YAML policy file.	`None`

Returns:

Type	Description
`GrampusSafetyPolicy`	A validated GrampusSafetyPolicy instance.

Raises:

Type	Description
`ConfigError`	If the file is missing or contains invalid YAML/schema.

Example policy YAML¶

# safety_policy.yaml
injection:
  level: balanced

pii:
  action: redact
  types:
    - email
    - phone
    - ssn
    - credit_card

action_guard:
  allowed_tools:
    - web_search
    - calculate
  max_tool_calls_per_turn: 20
  max_consecutive_tool_calls: 8
  max_cost_per_action_usd: 0.05

pipeline:
  check_user_input: true
  check_tool_results: true
  check_llm_output: true
  check_memory_writes: true
  log_violations: true

Loading:

from grampus.safety.policies import load_safety_policy
from grampus.safety.pipeline import SafetyPipeline

safety_config = load_safety_policy("safety_policy.yaml")
pipeline = SafetyPipeline.from_config(safety_config)

Safety API Reference¶

SafetyPipeline¶

grampus.safety.pipeline.SafetyPipeline ¶

check_input(text) async ¶

check_tool_result(result) async ¶

check_llm_output(text) async ¶

check_tool_call(tool_call, *, calls_this_turn=0, consecutive_calls=0) async ¶

get_violations() ¶

SafetyPipelineConfig¶

Injection detector¶

grampus.safety.injection.PromptInjectionDetector ¶

check(text) ¶

InjectionCheckResult¶

PII detector¶

grampus.safety.pii.PIIDetector ¶

PIICheckResult¶

Action guard¶

grampus.safety.action_guard.SafetyActionGuard ¶

ActionPolicy¶

SafetyViolation¶

Policy loader¶

grampus.safety.policies.PolicyLoader ¶

load(path=None) staticmethod ¶

Example policy YAML¶

`grampus.safety.pipeline.SafetyPipeline` ¶

`check_input(text)` `async` ¶

`check_tool_result(result)` `async` ¶

`check_llm_output(text)` `async` ¶

`check_tool_call(tool_call, *, calls_this_turn=0, consecutive_calls=0)` `async` ¶

`get_violations()` ¶

`grampus.safety.injection.PromptInjectionDetector` ¶

`check(text)` ¶

`grampus.safety.pii.PIIDetector` ¶

`grampus.safety.action_guard.SafetyActionGuard` ¶

`grampus.safety.policies.PolicyLoader` ¶

`load(path=None)` `staticmethod` ¶