RunRL

Reinforcement Learning for LLMs and Agents

Improve your models' performance and reliability on any task.

1. Define your task

Submit prompts and create custom reward functions that evaluate model outputs on your specific tasks.

2. Run RL

Our platform applies the reinforcement learning algorithms behind DeepSeek R1 to optimize your model's performance.

3. Get Better Results

Deploy your improved model, optimized against your specific reward criteria.
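
To make step 2 more concrete: DeepSeek R1's published training recipe uses Group Relative Policy Optimization (GRPO), which samples several completions per prompt and scores each one relative to the others in its group. The page does not say which algorithm the platform runs internally, so the following is only a minimal sketch of that one idea; the function name `group_relative_advantages` and the `eps` parameter are illustrative assumptions, not part of the runrl package.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of completions sampled for ONE prompt.

    Each completion's advantage is its reward minus the group mean,
    divided by the group standard deviation. `eps` (an assumption of
    this sketch) avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for one prompt, scored by a reward function.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that score above their group's average get a positive advantage and are reinforced; those below average are discouraged. This is why a reward function that only says "good" or "bad" can be enough of a training signal.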

Specialized Models for Your Approach

Tired of tuning prompts to make generic models do what you want? Train your model to be good at your task—all you need to tell us is what's good and what's bad. We'll make the model perform reliably.

Learn how we taught a research agent to use tools, beating o3-mini on performance and cost.

Already know what you want?
Start training in seconds.

Built for researchers and developers

$ pip install runrl


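from runrl import RunRL

# Create a client, then launch a training run.
client = RunRL()
client.create_run(
    model_name="runrl/dsp",                   # model to train
    prompt_file="math_prompts.jsonl",         # prompts for your task
    reward_file="steganography_reward.py",    # reward function that scores outputs
)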
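
The snippet above points at a prompt file and a reward file without showing what goes in them. As a rough illustration only, here is what a math task might look like. The JSONL field names (`prompt`, `expected`), the helper `write_prompts`, and the `reward_fn(completion, expected)` signature are all assumptions made for this sketch, not RunRL's documented interface; check the documentation for the actual format.

```python
import json
import re


def write_prompts(path, problems):
    """Write one JSON object per line (JSONL).

    The "prompt" and "expected" field names are hypothetical.
    """
    with open(path, "w", encoding="utf-8") as f:
        for question, answer in problems:
            f.write(json.dumps({"prompt": question, "expected": answer}) + "\n")


def reward_fn(completion: str, expected: str) -> float:
    """Hypothetical reward: 1.0 if the last number in the model's
    output equals the expected answer, otherwise 0.0."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0  # no number in the output, so no credit
    return 1.0 if float(numbers[-1]) == float(expected) else 0.0


# Example usage (not executed here):
# write_prompts("math_prompts.jsonl", [("What is 6 * 7?", "42")])
```

The reward returns a plain number: higher means better. A real reward function can be as strict or as lenient as your task needs, as long as it separates good outputs from bad ones.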

Specialized Enterprise Agents

Custom reward development

We'll work with you to define targets, measure your agent's performance, and help you outperform closed models.

World-class RL expertise

Work directly with our RL research team on your problems.

Integration with your stack

Seamlessly deploy your optimized agents into your existing infrastructure and workflows.

Book a 15-min Call

Reinforcement Learning for LLMs and Agents

RL for Chemistry Models

RL for Web Agents

RL for Code Generation

RL for Voice Agents