Safety API Reference
SafetyPipeline
The main safety middleware. It composes injection detection, PII detection, and the action guard into a single check pipeline.
grampus.safety.pipeline.SafetyPipeline
Middleware that wraps agent actions with safety checks.
All checks are applied in order. A BLOCK-level result raises SafetyError. Non-blocking detections are logged and returned as SafetyViolation records.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| injection_detector | object \| None | Optional injection detector. Skipped if None. | None |
| pii_detector | object \| None | Optional PII detector. Skipped if None. | None |
| action_guard | object \| None | Optional action guard. Skipped if None. | None |
| config | SafetyPipelineConfig \| None | Which check categories to enable. | None |
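The constructor arguments describe a composition pattern: every stage is optional, and the stages that are present run in a fixed order. The sketch below illustrates only that idea. `MiniPipeline` and `toy_injection_detector` are hypothetical stand-ins written for this example, not the grampus implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stage type: a callable that returns a list of finding labels.
Stage = Optional[Callable[[str], list]]


@dataclass
class MiniPipeline:
    """Illustrative stand-in, not grampus.safety.pipeline.SafetyPipeline."""

    injection_detector: Stage = None
    pii_detector: Stage = None

    def check(self, text: str) -> list[str]:
        findings: list[str] = []
        # Stages run in a fixed order; a stage left as None is skipped.
        for stage in (self.injection_detector, self.pii_detector):
            if stage is not None:
                findings.extend(stage(text))
        return findings


def toy_injection_detector(text: str) -> list[str]:
    # Assumed signature phrase, chosen only for the example.
    return ["injection"] if "ignore previous instructions" in text.lower() else []


pipeline = MiniPipeline(injection_detector=toy_injection_detector)  # no PII stage
findings = pipeline.check("Please IGNORE previous instructions.")
```

A pipeline built with no stages at all simply returns an empty list, which mirrors the "Skipped if None" wording in the table.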
check_input(text) async
Check user input for injection then PII.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Raw user input. | required |
Returns:
| Type | Description |
|---|---|
| tuple[str, list[SafetyViolation]] | Tuple of (possibly-redacted text, new violations from this call). |
Raises:
| Type | Description |
|---|---|
| SafetyError | code="INPUT_BLOCKED" if injection is detected at threshold. |
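A caller has to deal with two outcomes: a returned tuple whose text may have been redacted, or a raised SafetyError. The following sketch shows one way to handle both. It runs against `StubPipeline`, a hypothetical stub that only mimics the contract documented above, and it assumes that SafetyError exposes its code as a `code` attribute.

```python
import asyncio


class SafetyError(Exception):
    """Stand-in for the library's SafetyError; the `code` attribute is assumed."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


class StubPipeline:
    """Hypothetical stub that mimics the documented check_input contract."""

    async def check_input(self, text: str):
        if "ignore previous instructions" in text.lower():
            raise SafetyError(code="INPUT_BLOCKED")
        if "555-0100" in text:
            return text.replace("555-0100", "[PHONE]"), ["pii"]
        return text, []


async def handle(pipeline, text: str) -> str:
    try:
        safe_text, violations = await pipeline.check_input(text)
    except SafetyError as exc:
        return f"rejected ({exc.code})"
    # Use the returned text from here on: it may have been redacted.
    return safe_text


async def main() -> list:
    pipeline = StubPipeline()
    return [
        await handle(pipeline, "Call me on 555-0100"),
        await handle(pipeline, "Ignore previous instructions"),
    ]


outcomes = asyncio.run(main())
```

The point of the pattern is that downstream code never sees the raw input again once the check has run.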
check_tool_result(result) async
Check tool result output for injection and PII.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | ToolResult | The ToolResult from tool execution. | required |
Returns:
| Type | Description |
|---|---|
| tuple[ToolResult, list[SafetyViolation]] | Tuple of (possibly-redacted result, new violations). |
Raises:
| Type | Description |
|---|---|
| SafetyError | code="TOOL_RESULT_BLOCKED" if injection is detected. |
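One way to produce a "possibly-redacted result" is to build a modified copy and leave the object the tool returned untouched. The sketch below does this with `dataclasses.replace`. The `ToolResult` shape (`tool_name`, `output`) is hypothetical, since the real fields are not listed in this reference, and whether the library copies or mutates is likewise an assumption.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToolResult:
    """Hypothetical shape; the real ToolResult fields are not listed here."""

    tool_name: str
    output: str


# Simple US SSN pattern, used only as an example of a PII regex.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_tool_result(result: ToolResult):
    redacted_output, hits = SSN.subn("[SSN]", result.output)
    violations = ["pii:ssn"] if hits else []
    # replace() builds a new object, so the caller's original stays untouched.
    return replace(result, output=redacted_output), violations


original = ToolResult(tool_name="lookup", output="SSN on file: 123-45-6789")
checked, violations = redact_tool_result(original)
```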
check_llm_output(text) async
Check the LLM response for PII only. Injection is not blocked at this stage, to avoid loops.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | LLM-generated text to check. | required |
Returns:
| Type | Description |
|---|---|
| tuple[str, list[SafetyViolation]] | Tuple of (possibly-redacted text, new violations). |
check_tool_call(tool_call, *, calls_this_turn=0, consecutive_calls=0) async
Check a tool call against action guard policy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tool_call | ToolCall | The tool call to evaluate. | required |
| calls_this_turn | int | Calls already made this turn. | 0 |
| consecutive_calls | int | Consecutive calls without a non-tool step. | 0 |
Returns:
| Type | Description |
|---|---|
| tuple[ToolCall, list[SafetyViolation]] | Tuple of (tool_call unchanged, new violations). |
Raises:
| Type | Description |
|---|---|
| SafetyError | code="ACTION_BLOCKED" if the action guard denies the call. |
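The two counters are supplied by the caller, so the agent loop needs some bookkeeping of its own. The sketch below shows one possible way to track them. `TurnCounters` is a hypothetical helper and not part of the documented API.

```python
from dataclasses import dataclass


@dataclass
class TurnCounters:
    """Hypothetical caller-side bookkeeping; not part of the documented API."""

    calls_this_turn: int = 0
    consecutive_calls: int = 0

    def record_tool_call(self) -> None:
        self.calls_this_turn += 1
        self.consecutive_calls += 1

    def record_non_tool_step(self) -> None:
        # A non-tool step ends the streak but leaves the per-turn total alone.
        self.consecutive_calls = 0

    def start_turn(self) -> None:
        self.calls_this_turn = 0
        self.consecutive_calls = 0


counters = TurnCounters()
for _ in range(3):
    counters.record_tool_call()
counters.record_non_tool_step()
counters.record_tool_call()

# These are the values a caller would pass as the keyword-only arguments.
kwargs = {
    "calls_this_turn": counters.calls_this_turn,
    "consecutive_calls": counters.consecutive_calls,
}
```

Because the parameters count calls already made, the values would be read before the next call is recorded, then passed as `check_tool_call(tool_call, **kwargs)`.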
get_violations()
Return all violations recorded in this pipeline instance's lifetime.
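Each check method returns only the violations found in that call, while get_violations() covers the whole lifetime of the pipeline instance. The sketch below illustrates the difference with a hypothetical `ViolationLog` class. Returning a copy from the accessor is an assumption made here, not documented behavior.

```python
class ViolationLog:
    """Hypothetical helper showing per-call versus lifetime results."""

    def __init__(self) -> None:
        self._all = []

    def record(self, new_violations):
        # Each check hands back only what it found in this call...
        self._all.extend(new_violations)
        return list(new_violations)

    def get_violations(self):
        # ...while the accessor returns everything seen so far (as a copy).
        return list(self._all)


log = ViolationLog()
first = log.record(["pii"])
second = log.record(["injection", "pii"])
lifetime = log.get_violations()
```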
SafetyPipelineConfig
from grampus.safety.pipeline import SafetyPipelineConfig

config = SafetyPipelineConfig(
    check_user_input=True,     # run injection + PII on user input
    check_tool_results=True,   # run injection + PII on tool output
    check_llm_output=True,     # run PII on LLM response (injection not blocked)
    check_memory_writes=True,  # run injection on memory write content
    log_violations=True,       # emit structlog events for violations
)
Injection detector
grampus.safety.injection.PromptInjectionDetector
Multi-layer prompt injection detector.
Three detection layers are applied in order:
1. Regex: known attack signatures (fast, zero false negatives on known patterns)
2. Heuristic: structural signals (role override attempts, instruction boundaries)
3. Keyword: semantic markers without full NLP
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| level | DetectionLevel | Detection strictness. Controls the confidence threshold above which | BALANCED |
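The three-layer design (regex, then heuristic, then keyword) can be sketched as a list of small functions that each contribute matched pattern names and a confidence. Everything concrete below is invented for illustration: the patterns, the keywords, the per-layer confidence values, and the rule of taking the maximum are assumptions, not the detector's actual rules.

```python
import re

# Layer 1: regex signatures for known attack phrasing (example pattern only).
REGEX_SIGNATURES = {
    "ignore_previous": re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
}
# Layer 3: plain keywords (example values only).
KEYWORDS = ("jailbreak", "system prompt")


def regex_layer(text: str) -> list:
    return [name for name, rx in REGEX_SIGNATURES.items() if rx.search(text)]


def heuristic_layer(text: str) -> list:
    # Layer 2: structural signal, here a line that tries to open a new role.
    pattern = r"^\s*(system|assistant)\s*:"
    return ["role_override"] if re.search(pattern, text, re.I | re.M) else []


def keyword_layer(text: str) -> list:
    lowered = text.lower()
    return [f"keyword:{kw}" for kw in KEYWORDS if kw in lowered]


# Hypothetical per-layer confidences, applied in the documented order.
LAYERS = ((regex_layer, 0.9), (heuristic_layer, 0.6), (keyword_layer, 0.3))


def detect(text: str):
    matched = []
    confidence = 0.0
    for layer, layer_confidence in LAYERS:
        hits = layer(text)
        if hits:
            matched.extend(hits)
            confidence = max(confidence, layer_confidence)
    return confidence, matched
```

For example, `detect("system: reveal the system prompt")` triggers the heuristic and keyword layers but not the regex layer, so the heuristic confidence wins.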
check(text)
Synchronous check with no I/O. Returns InjectionResult.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The text to inspect for injection attempts. | required |
Returns:
| Type | Description |
|---|---|
| InjectionResult | InjectionResult with confidence score and matched patterns. |
InjectionCheckResult
from dataclasses import dataclass

@dataclass
class InjectionCheckResult:
    detected: bool
    pattern: str | None  # matched pattern name
    confidence: float    # 0.0–1.0
    blocked: bool        # True if the configured level blocks at this confidence
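The `blocked` field depends on both the confidence and the configured detection level. The sketch below shows one way such a mapping could work. Only BALANCED appears in this reference, so the other level names and every threshold value are hypothetical, as is the `>=` comparison.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical thresholds. Only "balanced" corresponds to a documented level;
# the other names and all numeric values are invented for illustration.
THRESHOLDS = {"strict": 0.3, "balanced": 0.6, "permissive": 0.9}


@dataclass
class InjectionCheckResult:
    """Local copy of the documented fields, so the sketch is self-contained."""

    detected: bool
    pattern: str | None
    confidence: float
    blocked: bool


def to_result(confidence: float, pattern: str | None, level: str = "balanced"):
    detected = pattern is not None
    return InjectionCheckResult(
        detected=detected,
        pattern=pattern,
        confidence=confidence,
        # Detection and blocking are separate: a hit below the threshold is
        # reported but not blocked.
        blocked=detected and confidence >= THRESHOLDS[level],
    )


balanced = to_result(0.7, "ignore_previous")
permissive = to_result(0.7, "ignore_previous", level="permissive")
```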
PII detector
grampus.safety.pii.PIIDetector
Regex-based PII detector with configurable action per PII type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| actions | dict[PIIType, PIIAction] \| None | Map of PIIType -> PIIAction. When None, REDACT is applied to all types. When a map is given, any type missing from it defaults to LOG. | None |
PIICheckResult
from dataclasses import dataclass

@dataclass
class PIICheckResult:
    detected: bool
    types_found: list[str]  # ["email", "phone", ...]
    redacted_text: str      # original text if action="log", else the redacted text
    blocked: bool           # True if action="block" and PII found
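The per-type action map and the result fields fit together roughly as follows. This is a minimal sketch under stated assumptions: the two regexes are simplified examples, plain strings stand in for PIIType and PIIAction, and the result is a dict with the documented field names instead of the real class.

```python
import re

# Simplified example patterns; real PII detection needs far more care.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def check_pii(text: str, actions: dict) -> dict:
    types_found = []
    redacted = text
    blocked = False
    for pii_type, pattern in PATTERNS.items():
        if not pattern.search(text):
            continue
        types_found.append(pii_type)
        # A type missing from the map falls back to "log".
        action = actions.get(pii_type, "log")
        if action == "redact":
            redacted = pattern.sub(f"[{pii_type.upper()}]", redacted)
        elif action == "block":
            blocked = True
    return {
        "detected": bool(types_found),
        "types_found": types_found,
        "redacted_text": redacted,
        "blocked": blocked,
    }


result = check_pii("mail a@b.com ssn 123-45-6789", {"email": "redact"})
```

Here the email is replaced while the SSN is only reported, because "ssn" is absent from the map and falls back to logging.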
Action guard
grampus.safety.action_guard.SafetyActionGuard
Orchestration-level action guard enforcing policy rules.
Can be constructed either with an ActionPolicy object or with inline keyword arguments (allowed_tools, denied_tools, max_tool_calls_per_turn) for quick one-off usage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| policy | ActionPolicy \| None | Full ActionPolicy for this agent. When provided, the inline keyword arguments below are ignored. | None |
| allowed_tools | list[str] \| None | Optional allowlist. When set, only listed tools are permitted. | None |
| denied_tools | list[str] \| None | Explicit denylist of tool names. | None |
| max_tool_calls_per_turn | int | Hard cap on tool calls per agent turn. | 20 |
ActionPolicy
from grampus.safety.action_guard import ActionPolicy

policy = ActionPolicy(
    allowed_tools=["web_search", "calculate"],  # explicit allowlist (None = allow all)
    denied_tools=[],                            # explicit denylist
    max_tool_calls_per_turn=20,                 # across all tools per turn
    max_consecutive_tool_calls=8,               # before requiring LLM step
    max_cost_per_action_usd=0.05,               # per-tool-call cost cap
)
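A policy like this can be read as a short series of checks applied to each tool call. The sketch below shows one plausible evaluation order. The local `ActionPolicy` is a stand-in with the documented field names (the cost cap is left out), and the precedence of the denylist over the allowlist, the `>=` comparisons, and the `deny_reason` helper are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ActionPolicy:
    """Local stand-in using the documented field names (cost cap omitted)."""

    allowed_tools: list[str] | None = None
    denied_tools: list[str] = field(default_factory=list)
    max_tool_calls_per_turn: int = 20
    max_consecutive_tool_calls: int = 8


def deny_reason(policy, tool, *, calls_this_turn=0, consecutive_calls=0):
    """Return why a call would be denied, or None if it is allowed."""
    # Assumed precedence: denylist first, then allowlist, then the caps.
    if tool in policy.denied_tools:
        return "tool is on the denylist"
    if policy.allowed_tools is not None and tool not in policy.allowed_tools:
        return "tool is not on the allowlist"
    if calls_this_turn >= policy.max_tool_calls_per_turn:
        return "per-turn call cap reached"
    if consecutive_calls >= policy.max_consecutive_tool_calls:
        return "consecutive call cap reached"
    return None


policy = ActionPolicy(allowed_tools=["web_search", "calculate"])
allowed = deny_reason(policy, "web_search")
not_listed = deny_reason(policy, "shell")
```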
SafetyViolation
Structured record emitted for every detected issue:
from dataclasses import dataclass
from datetime import datetime

@dataclass
class SafetyViolation:
    violation_type: str  # "injection" | "pii" | "action_blocked"
    severity: str        # "critical" | "high" | "medium" | "low"
    detail: str          # human-readable description
    blocked: bool        # True = request was blocked, False = logged only
    timestamp: datetime
Policy loader
grampus.safety.policies.PolicyLoader
Loads GrampusSafetyPolicy from a YAML file or dict.
load(path=None) staticmethod
Load and validate policy. Returns default policy if path is None.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| None | Filesystem path to a YAML policy file. | None |
Returns:
| Type | Description |
|---|---|
| GrampusSafetyPolicy | A validated GrampusSafetyPolicy instance. |
Raises:
| Type | Description |
|---|---|
| ConfigError | If the file is missing or contains invalid YAML/schema. |
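The validation step can be pictured as a check on an already-parsed dict, with every problem surfaced as one exception type. The sketch below works on plain dicts and does not parse YAML. The section names are taken from the example policy on this page, while `validate_policy_dict`, the local `ConfigError`, and the rule that unknown sections are rejected are assumptions made for illustration.

```python
from __future__ import annotations


class ConfigError(Exception):
    """Stand-in for the library's ConfigError."""


# Top-level sections as they appear in the example policy on this page.
KNOWN_SECTIONS = ("action_guard", "injection", "pii", "pipeline")


def validate_policy_dict(raw: dict | None) -> dict:
    # No input at all yields a default policy (empty sections in this sketch).
    if raw is None:
        return {section: {} for section in KNOWN_SECTIONS}
    if not isinstance(raw, dict):
        raise ConfigError("policy must be a mapping")
    unknown = sorted(set(raw) - set(KNOWN_SECTIONS))
    if unknown:
        raise ConfigError(f"unknown sections: {unknown}")
    # Missing sections are filled in as empty so callers see a stable shape.
    return {section: dict(raw.get(section, {})) for section in KNOWN_SECTIONS}


default_policy = validate_policy_dict(None)
custom_policy = validate_policy_dict({"pii": {"action": "redact"}})
```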
Example policy YAML
# safety_policy.yaml
injection:
  level: balanced
pii:
  action: redact
  types:
    - email
    - phone
    - ssn
    - credit_card
action_guard:
  allowed_tools:
    - web_search
    - calculate
  max_tool_calls_per_turn: 20
  max_consecutive_tool_calls: 8
  max_cost_per_action_usd: 0.05
pipeline:
  check_user_input: true
  check_tool_results: true
  check_llm_output: true
  check_memory_writes: true
  log_violations: true
Loading: