Hebbrix
Advanced

RL Policies

Think of RL policies like different versions of your AI's brain - as the system learns, it creates improved versions you can test and promote to production.

What You Can Do

Each policy is a snapshot of what the AI learned at a specific time. You can:

  • See which version is currently running
  • Compare different versions to see which performs better
  • Promote a better version to production
  • Rollback to a previous version if needed

When Do You Need This?

Testing New Improvements

Your AI just trained on thousands of new user interactions. Before making it live for everyone, check its performance first—compare it with the current version.

A/B Testing

Run experiments by comparing two policy versions side-by-side. See which one gives better search results, more accurate answers, or happier users.

Quick Rollbacks

If a new version isn't working as expected, instantly rollback to the previous stable version—no downtime.

Version Control

Keep track of your AI's evolution over time. See exactly when changes were made and which version performed best for your specific use case.

Available Endpoints

Code Examples

Check Policy Status

Python
from hebbrix import Hebbrix

client = Hebbrix()

# Get current policy status
status = client.policies.get_status()

print(f"Current production version: {status['production_version']}")
print(f"Active since: {status['activated_at']}")
print(f"Performance score: {status['performance_score']}")

# List all available policy versions
policies = client.policies.list()

for policy in policies:
    is_active = "ACTIVE" if policy['is_production'] else "AVAILABLE"
    print(f"Version: {policy['version']} - {is_active}")

Promote a New Policy

Python
# Compare two policy versions first
comparison = client.policies.compare(
    version_a="v1.2.4",  # Current production
    version_b="v1.2.5"   # New candidate
)

print(f"Current version accuracy: {comparison['version_a']['accuracy']}%")
print(f"New version accuracy: {comparison['version_b']['accuracy']}%")

# If new version is better, promote it
if comparison['version_b']['accuracy'] > comparison['version_a']['accuracy']:
    result = client.policies.promote(version="v1.2.5")
    print(f"Promoted v1.2.5 to production!")
else:
    print("New version doesn't perform better. Keeping current version.")

Emergency Rollback

Python
# Rollback to previous version
result = client.policies.rollback()

print(f"Rolled back to version: {result['version']}")
print(f"This was the production version from: {result['previous_activated_at']}")

Best Practices

Always Compare First

Never promote a new policy without comparing it to the current one.

Monitor After Promotion

Watch for changes in error rates or performance degradation after promoting.

Keep Previous Versions

Don't delete old versions immediately. Keep at least 2-3 stable versions for rollback.

Have a Rollback Plan

Know how to rollback quickly. Save the rollback command somewhere accessible.

Don't Worry, It's Automatic!

Hebbrix automatically creates new RL policies in the background as your system learns from user interactions. You don't need to manually trigger training—it happens automatically. Your only job is to decide when to promote a new, better-performing policy to production.

Next Steps

Assistant

Ask me anything about Hebbrix