LLM Gateway Policies

Igris policies are ordered rule lists evaluated on every LLM request. The first matching rule wins. Rules can allow, deny, alert, rate-limit, guard token counts, and filter content. LLM policy rules share the same PolicyRule shape as MCP governance rules. The target.kind field is the discriminator — llm_model and llm_endpoint are the LLM-specific variants.

PolicyRule shape

interface PolicyRule {
	target: PolicyRuleTarget;
	action: "allow" | "deny" | "alert";
	conditions?: Record<string, unknown>;
	limit?: PolicyRuleLimit;
	tokenGuard?: {
		maxInputTokens?: number;       // reject if estimated input tokens exceed this
		maxOutputTokens?: number;      // cap the max_tokens field in the request
		maxRequestMaxTokens?: number;  // reject if the request's max_tokens exceeds this
	};
	contentGuard?: {
		piiPatterns?: string[];        // regex patterns — match = PII detected
		keywordBlocklist?: string[];   // literal or glob strings
		denyOnMatch: boolean;          // true = deny, false = alert only
	};
	logContent?: boolean;             // persist full prompt + completion to llm_call_bodies
}

Target variants

type PolicyRuleTarget =
	// Match a specific LLM model name or glob (e.g. "gpt-4*", "claude-*")
	| { kind: "llm_model"; model: string }

	// Match a specific LLM endpoint type
	| {
			kind: "llm_endpoint";
			endpoint:
				| "chat.completions"
				| "embeddings"
				| "images.generate"
				| "audio.transcriptions"
				| "audio.speech"
				| "passthrough";
	  }

	// Match an MCP tool call (for MCP governance rules in the same policy)
	| { kind: "mcp_tool"; tool: string };

Model glob matching

The model field supports glob patterns with * as a wildcard:
{ "kind": "llm_model", "model": "gpt-4*" }       // matches gpt-4o, gpt-4-turbo, gpt-4o-mini ...
{ "kind": "llm_model", "model": "*" }            // matches any model (catch-all)
{ "kind": "llm_model", "model": "claude-3-5-*" } // matches claude-3-5-sonnet, claude-3-5-haiku ...
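A minimal sketch of how `*` globs can be compiled to anchored regexes. The helper names are illustrative, not the gateway's actual internals:

```typescript
// Compile a glob pattern ("gpt-4*") into an anchored RegExp.
// Escape regex metacharacters first, then turn "*" into ".*".
function globToRegExp(glob: string): RegExp {
	const escaped = glob
		.replace(/[.+?^${}()|[\]\\]/g, "\\$&")
		.replace(/\*/g, ".*");
	return new RegExp(`^${escaped}$`);
}

// Check whether a concrete model name matches a rule's model pattern.
function matchesModel(pattern: string, model: string): boolean {
	return globToRegExp(pattern).test(model);
}
```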

Action

"allow"  // permit the request — stop evaluating further rules
"deny"   // block the request with HTTP 403 and a structured error body
"alert"  // record an alert event, continue processing (request is not blocked)
A policy without a catch-all allow rule at the end is deny-by-default — any model not matched by an explicit allow rule will be blocked.
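The first-match semantics above can be sketched as a simple loop. This is one plausible reading: an alert rule records an event and lets the request through, while no match at all falls through to deny-by-default. Conditions, limits, and guards are omitted for brevity:

```typescript
type Action = "allow" | "deny" | "alert";
interface SimpleRule { model: string; action: Action }

// Glob match helper (escape regex metacharacters, "*" becomes ".*").
const matches = (glob: string, model: string): boolean =>
	new RegExp(
		"^" + glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*") + "$"
	).test(model);

// First matching rule decides. "alert" records an event but does not
// block; an unmatched request is denied by default.
function evaluate(
	rules: SimpleRule[],
	model: string
): { decision: "allow" | "deny"; alerted: boolean } {
	for (const rule of rules) {
		if (!matches(rule.model, model)) continue;
		if (rule.action === "alert") return { decision: "allow", alerted: true };
		return { decision: rule.action, alerted: false };
	}
	return { decision: "deny", alerted: false }; // deny-by-default
}
```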

Conditions

Conditions gate a rule on request metadata. The gateway evaluates conditions AFTER matching the target. If conditions are present and don’t match, the rule is skipped and evaluation continues.
{
	"conditions": {
		"user": "alice@corp.com",
		"metadata.role": { "in": ["developer", "admin"] },
		"metadata.team": { "neq": "interns" }
	}
}

Condition operators

Operator                Description
"value" (direct)        Exact equality
{ "eq": "value" }       Explicit equality
{ "neq": "value" }      Not equal
{ "in": ["a", "b"] }    Value is in the list
{ "nin": ["a", "b"] }   Value is not in the list
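The operator table can be read as a small evaluation function. A sketch (illustrative, not the gateway's exact code):

```typescript
// A condition is either a bare value (exact equality) or an operator object.
type ConditionValue =
	| string
	| { eq?: unknown; neq?: unknown; in?: unknown[]; nin?: unknown[] };

// Evaluate one condition against the actual request metadata value.
function checkCondition(cond: ConditionValue, actual: unknown): boolean {
	if (cond !== null && typeof cond === "object") {
		if ("eq" in cond) return actual === cond.eq;
		if ("neq" in cond) return actual !== cond.neq;
		if ("in" in cond) return (cond.in as unknown[]).includes(actual);
		if ("nin" in cond) return !(cond.nin as unknown[]).includes(actual);
		return false; // unknown operator
	}
	return actual === cond; // direct value = exact equality
}
```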

Condition key paths

Path              Source
user              X-Igris-User header or body.user
traceId           X-Igris-Trace-Id header
metadata.<key>    X-Igris-Metadata-<key> header or JSON blob
virtualKeySlug    The virtual key slug being used

Limit (rate limiting)

The limit field adds a rate limit dimension to the rule. Requests that exceed the limit are blocked with HTTP 429. Limits are tracked per virtual key (or per user, if the rule's conditions include a user match).
interface PolicyRuleLimit {
	requests?: number;    // max requests in the window
	tokens?: number;      // max total tokens (input + output) in the window
	dollars?: number;     // max spend in USD in the window
	per: "minute" | "hour" | "day";
}
Example: cap GPT-4o calls for contractors at 10 requests/hour:
{
	"target": { "kind": "llm_model", "model": "gpt-4o" },
	"action": "allow",
	"conditions": { "metadata.role": "contractor" },
	"limit": { "requests": 10, "per": "hour" }
}
Limits use Redis-backed sliding window counters. Multiple dimensions can be set simultaneously — the first exceeded dimension triggers the 429.
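An in-memory stand-in for the Redis-backed sliding window counter described above, tracking a single requests-per-window dimension (the class name and shape are illustrative):

```typescript
// Sliding window rate limiter: keeps per-key hit timestamps and admits a
// request only if fewer than maxRequests hits fall inside the window.
class SlidingWindowLimiter {
	private hits = new Map<string, number[]>();

	constructor(private maxRequests: number, private windowMs: number) {}

	// Returns true if the request is admitted, false if it would exceed
	// the limit (the gateway would answer HTTP 429 in that case).
	tryAcquire(key: string, nowMs: number): boolean {
		// Drop hits that have aged out of the window.
		const recent = (this.hits.get(key) ?? []).filter(t => nowMs - t < this.windowMs);
		if (recent.length >= this.maxRequests) {
			this.hits.set(key, recent);
			return false;
		}
		recent.push(nowMs);
		this.hits.set(key, recent);
		return true;
	}
}
```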

Token guards

Token guards operate at request time, before the request is forwarded to the provider. They use a character-count heuristic (chars / 4) for input estimation, and the actual usage field from the response for output recording.
{
	"target": { "kind": "llm_model", "model": "*" },
	"action": "allow",
	"tokenGuard": {
		"maxInputTokens": 8000,
		"maxRequestMaxTokens": 2000
	}
}
  • maxInputTokens — deny if estimated prompt tokens exceed this value
  • maxOutputTokens — silently cap max_tokens in the forwarded request to this value
  • maxRequestMaxTokens — deny if the request’s max_tokens field exceeds this value
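The three checks can be sketched as a single pre-forwarding function, using the documented chars / 4 heuristic for input estimation (the function and type names are illustrative):

```typescript
interface TokenGuard {
	maxInputTokens?: number;
	maxOutputTokens?: number;
	maxRequestMaxTokens?: number;
}
interface ChatRequest {
	messages: { role: string; content: string }[];
	max_tokens?: number;
}

// chars / 4 heuristic from the text above — a rough estimate, not a tokenizer.
const estimateInputTokens = (req: ChatRequest): number =>
	Math.ceil(req.messages.reduce((n, m) => n + m.content.length, 0) / 4);

// Two checks can reject; maxOutputTokens silently caps max_tokens instead.
function applyTokenGuard(
	guard: TokenGuard,
	req: ChatRequest
): { ok: boolean; req: ChatRequest } {
	if (guard.maxInputTokens !== undefined && estimateInputTokens(req) > guard.maxInputTokens)
		return { ok: false, req };
	if (guard.maxRequestMaxTokens !== undefined && (req.max_tokens ?? 0) > guard.maxRequestMaxTokens)
		return { ok: false, req };
	if (guard.maxOutputTokens !== undefined && (req.max_tokens ?? Infinity) > guard.maxOutputTokens)
		return { ok: true, req: { ...req, max_tokens: guard.maxOutputTokens } };
	return { ok: true, req };
}
```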

Content guards

Content guards inspect the prompt text for PII patterns and keyword blocklists.
{
	"target": { "kind": "llm_endpoint", "endpoint": "chat.completions" },
	"action": "allow",
	"contentGuard": {
		"piiPatterns": [
			"\\b\\d{3}-\\d{2}-\\d{4}\\b",
			"\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}\\b"
		],
		"keywordBlocklist": ["confidential", "top-secret"],
		"denyOnMatch": true
	}
}
  • piiPatterns — regex patterns applied to the full prompt text (case-insensitive)
  • keywordBlocklist — literal string matching (case-insensitive)
  • denyOnMatch: true — block the request; false — log an alert only
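A sketch of the scan logic, treating blocklist entries as literal strings for simplicity (the source also allows globs there); names are illustrative:

```typescript
interface ContentGuard {
	piiPatterns?: string[];
	keywordBlocklist?: string[];
	denyOnMatch: boolean;
}

// Scan prompt text against the guard. Both checks are case-insensitive,
// as documented; denyOnMatch picks between blocking and alert-only.
function scanContent(guard: ContentGuard, text: string): "pass" | "deny" | "alert" {
	const lower = text.toLowerCase();
	const hit =
		(guard.piiPatterns ?? []).some(p => new RegExp(p, "i").test(text)) ||
		(guard.keywordBlocklist ?? []).some(k => lower.includes(k.toLowerCase()));
	if (!hit) return "pass";
	return guard.denyOnMatch ? "deny" : "alert";
}
```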

logContent

When logContent: true is set on a matching rule, the full prompt messages and completion text are stored in llm_call_bodies and linked to the audit event. This is disabled by default to avoid storing sensitive data.
{
	"target": { "kind": "llm_model", "model": "*" },
	"action": "allow",
	"logContent": true
}
Enabling logContent stores prompt and completion text verbatim. Ensure this complies with your data retention and privacy policies before enabling. Consider combining with contentGuard to redact PII before logging.

Complete policy example

This example demonstrates a comprehensive policy for a production deployment:
{
	"name": "Production LLM governance policy",
	"virtualKeySlug": "vk_openai_prod",
	"rules": [
		{
			"target": { "kind": "llm_model", "model": "gpt-4o" },
			"action": "deny",
			"conditions": {
				"metadata.tier": { "in": ["free", "trial"] }
			}
		},

		{
			"target": { "kind": "llm_model", "model": "gpt-4*" },
			"action": "allow",
			"conditions": {
				"metadata.tier": "enterprise"
			},
			"limit": {
				"tokens": 1000000,
				"dollars": 50,
				"per": "day"
			},
			"tokenGuard": {
				"maxInputTokens": 32000,
				"maxRequestMaxTokens": 4096
			},
			"logContent": false
		},

		{
			"target": { "kind": "llm_endpoint", "endpoint": "chat.completions" },
			"action": "allow",
			"limit": {
				"requests": 100,
				"per": "minute"
			},
			"contentGuard": {
				"piiPatterns": ["\\b\\d{3}-\\d{2}-\\d{4}\\b"],
				"keywordBlocklist": ["internal-secret", "confidential"],
				"denyOnMatch": false
			}
		},

		{
			"target": { "kind": "llm_model", "model": "*" },
			"action": "alert"
		}
	]
}
How this policy works:
  1. Deny GPT-4o for free/trial tier users
  2. Allow GPT-4 family for enterprise users with daily spend and token limits
  3. Allow chat completions for everyone with a rate limit and content audit (alert-only, not deny)
  4. Alert on any other model (catch-all — no implicit allow, so unknown models are alerted not denied)

Managing policies via SDK

// List all policies
const { data } = await igris.policies.list({ virtualKeySlug: "vk_openai_prod" });

// Create a policy
const policy = await igris.policies.create({
	name: "Block GPT-4 for free tier",
	virtualKeySlug: "vk_openai_prod",
	rules: [
		{
			target: { kind: "llm_model", model: "gpt-4*" },
			action: "deny",
			conditions: { "metadata.tier": "free" },
		},
		{
			target: { kind: "llm_model", model: "*" },
			action: "allow",
		},
	],
});

// Update a policy
await igris.policies.update(policy.id, { enabled: false });

// Delete a policy
await igris.policies.delete(policy.id);