Policies - Rate Limits and Usage Guidelines

RL policies are versioned snapshots of what your AI has learned. As the system learns, it creates new versions you can test and promote to production.

What You Can Do. Each policy is a snapshot of what the AI learned at a specific time. You can:

See which version is currently running
Compare different versions to see which performs better
Promote a better version to production
Rollback to a previous version if needed

01. Safe Deployment Pipeline

New policies go through a 3-stage pipeline before reaching production:

1. Shadow Mode

The new RL policy runs alongside the existing system. Both make decisions, but only the existing system's decisions are used. The new policy's decisions are logged for comparison.

2. Canary Release

The new policy is exposed to 5% → 10% → 25% → 50% → 100% of traffic gradually. Automatic rollback triggers if success rate drops below 95% or latency increases by more than 20%.

3. Production

The policy serves all traffic. Previous version is kept for instant rollback. Constitutional constraints remain active regardless of which policy is running.

02. When Do You Need This?

Testing New Improvements

Your AI just trained on thousands of new interactions. Before making it live for everyone, check its performance first. Shadow mode lets you compare without risk.

A/B Testing

Run experiments by comparing two policy versions side-by-side. See which one gives better search results, more accurate answers, or happier users.

Quick Rollbacks

If a new version isn't working as expected, roll back to the previous stable version with no downtime. Rollback is automatic if metrics degrade.

Per-Request Control

Opt out of RL influence on any request with use_rl_ranking: false. Get pure retrieval results without any policy influence when you need it.

03. Available Endpoints

04. Tier Access

View policies, status, comparePro ($99/mo)

Trigger trainingPro ($99/mo)

Promote / Rollback policiesPro ($99/mo) + admin role

05. Code Examples

Check Policy Status

Python

import os
import requests

BASE = "https://api.hebbrix.com/v1"
H = {"Authorization": f"Bearer {os.environ['HEBBRIX_API_KEY']}"}

# GET /v1/policies/status: current production policy
r = requests.get(f"{BASE}/policies/status", headers=H)
status = r.json()

prod = status.get("production")
if prod:
    print(f"Current production version: {prod['version']}")
    print(f"Promoted at: {prod.get('promoted_at')}")
    print(f"Metrics: {prod.get('metrics', {})}")
print(f"Total versions: {status['total_versions']}")

# GET /v1/policies: paginated list of trained policy versions
r = requests.get(f"{BASE}/policies", headers=H)
page = r.json()
for policy in page["items"]:
    tag = "ACTIVE" if policy["status"] == "production" else policy["status"].upper()
    print(f"Version {policy['version']} ({tag})")
if page["has_more"]:
    # Fetch next page with ?cursor=page["next_cursor"]
    pass

Promote a New Policy

Python

# POST /v1/policies/compare: side-by-side eval of baseline vs candidate
r = requests.post(
    f"{BASE}/policies/compare",
    headers=H,
    json={
        "baseline_version": "v1.2.4",
        "candidate_version": "v1.2.5",
        "agent_type": "memory_manager",
    },
)
comparison = r.json()

print(f"Improvement: {comparison['is_improvement']}")
print(f"Recommendation: {comparison['recommendation']}")

# If the candidate wins, promote it (POST /v1/policies/promote)
if comparison["is_improvement"]:
    r = requests.post(
        f"{BASE}/policies/promote",
        headers=H,
        json={"version": "v1.2.5", "agent_type": "memory_manager"},
    )
    print("Promoted v1.2.5 to production!")
else:
    print("New version doesn't perform better. Keeping current.")

Emergency Rollback

Python

# POST /v1/policies/rollback: restore a previous version to production
# Requires admin role + Pro tier. Both version and agent_type are required.
r = requests.post(
    f"{BASE}/policies/rollback",
    headers=H,
    json={
        "version": "v1.2.4",
        "agent_type": "memory_manager",
        "reason": "regression in retrieval quality",  # optional, logged for audit
    },
)
result = r.json()

print(f"Rolled back to version: {result['version']}")
print(f"Promoted at: {result.get('promoted_at')}")

06. Best Practices

Always Compare First

Never promote a new policy without comparing it to the current one.

Monitor After Promotion

Watch for changes in error rates or performance degradation after promoting.

Keep Previous Versions

Don't delete old versions immediately. Keep at least 2-3 stable versions for rollback.

Have a Rollback Plan

Know how to rollback quickly. Save the rollback command somewhere accessible.

This Is Automatic. Hebbrix creates new RL policies when training is triggered, either on a 6-hour schedule (when enough interaction data exists) or on-demand via POST /v1/policies/train. After training completes, a candidate policy is created. Your job is to decide when to promote a new, better-performing policy to production.

RL Policies

01. Safe Deployment Pipeline

1. Shadow Mode

2. Canary Release

3. Production

02. When Do You Need This?

Testing New Improvements

A/B Testing

Quick Rollbacks

Per-Request Control

03. Available Endpoints

04. Tier Access

05. Code Examples

Check Policy Status

Promote a New Policy

Emergency Rollback

06. Best Practices

Always Compare First

Monitor After Promotion

Keep Previous Versions

Have a Rollback Plan