Reinforcement Learning for LLMs and Agents
Improve your models' performance and reliability on any task.

Submit prompts and create custom reward functions that evaluate model outputs on your specific tasks.
Our platform applies the reinforcement learning algorithms behind DeepSeek R1 to optimize your model's performance.
Deploy the improved model, optimized against your specific reward criteria.
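For a sense of what the optimization step does with your reward scores: DeepSeek R1 was trained with Group Relative Policy Optimization (GRPO), which samples several completions per prompt and scores each one relative to the rest of its group. The sketch below shows only that normalization step. The function name, the epsilon value, and the use of the population standard deviation are assumptions for illustration, not a description of how the platform implements it.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one prompt's group of sampled completions.

    Illustrative only: eps and the population standard deviation are assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # Completions above the group mean get a positive advantage, those below a negative one.
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions for the same prompt, scored by a reward function.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Because the scores are compared within a group, the reward function only needs to rank outputs sensibly; it does not need a calibrated absolute scale.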
Tired of tuning prompts to make generic models do what you want? Train your model to be good at your task: all you need to tell us is what's good and what's bad. We'll make the model perform reliably.
Learn how we taught a research agent to use tools › Beating o3-mini on performance and cost
from runrl import RunRL

client = RunRL()

# Start a training run from a base model, a prompt set, and a reward function.
client.create_run(
    model_name="runrl/dsp",
    prompt_file="math_prompts.jsonl",
    reward_file="steganography_reward.py",
)
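A reward file tells the trainer "what's good and what's bad" by turning each model output into a score. The sketch below shows what such a function might look like for math prompts. The function name, its signature, and the `expected_answer` argument are hypothetical; the interface a reward file must actually expose is not specified here.

```python
import re

# Hypothetical interface: the name `reward_fn`, its signature, and the
# `expected_answer` argument are assumptions for illustration only.
def reward_fn(completion: str, expected_answer: str) -> float:
    """Return 1.0 if the last number in the completion equals the expected answer, else 0.0."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0  # no numeric answer at all counts as a failure
    return 1.0 if float(numbers[-1]) == float(expected_answer) else 0.0
```

A binary score like this is the simplest option; partial credit (for example, for well-formed reasoning with a wrong final answer) can be added by returning intermediate values.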
We'll work with you to define targets, measure your agent's performance, and help you outperform closed models.
Work directly with our RL research team on your problems.
Seamlessly deploy your optimized agents into your existing infrastructure and workflows.