RL environments

Production-grade UI and MCP environments for agent training and evaluation. Each environment includes prompts, verifiers, and reward logic for controlled experimentation.

UI and MCP environments with full tool inventories, prompts, and workflows
Deterministic Playwright automations with structured validation
Interactive agent runs with complete tool-environment traces
Real-time leaderboards, QA rubrics, and structured environment metadata
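The building blocks listed above — a prompt, a verifier, and reward logic bundled per task — can be sketched roughly as follows. This is an illustrative sketch only; the class and field names are assumptions, not Turing's actual schema.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EnvTask:
    """One environment task: a prompt, a verifier, and reward logic."""
    prompt: str
    verifier: Callable[[str], bool]   # checks the agent's final trace/output
    reward: Callable[[bool], float]   # maps the verifier outcome to a scalar

# Hypothetical UI task: the agent must submit a form and see a confirmation.
task = EnvTask(
    prompt="Fill out the checkout form and submit the order.",
    verifier=lambda trace: "order confirmed" in trace.lower(),
    reward=lambda ok: 1.0 if ok else 0.0,
)

trace = "Clicked submit; page shows: Order Confirmed #1234"
score = task.reward(task.verifier(trace))  # 1.0 — the verifier matched
```

Keeping the verifier a pure function of the recorded trace is what makes runs deterministic and replayable for controlled experimentation.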

Benchmarks and evaluation

Reproducible scoring across unified execution environments, built on real defects and tasks, with semantic-aware tests and versioned runs for full auditability.
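One generic way to make a "versioned run" auditable, as described above, is to derive a stable ID from a canonical serialization of the run record. This is a sketch of the general technique, not Turing's actual scheme; every field name here is an assumption.

```python
import hashlib
import json

# Hypothetical run record: benchmark name, version label, per-task results.
run = {
    "benchmark": "SWE-bench++",
    "version": "1.0",
    "results": {"task_001": "pass", "task_002": "fail"},
}

# Canonical JSON (sorted keys) makes the hash reproducible across re-runs,
# so the ID can serve as an audit handle for this exact result set.
run_id = hashlib.sha256(
    json.dumps(run, sort_keys=True).encode("utf-8")
).hexdigest()[:12]
```

Any change to the results or the benchmark version yields a different ID, which is what ties a leaderboard entry back to one specific, reproducible run.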

SWE-bench++

End-to-end evaluation for software engineering agents: 500 public and 7,000+ commercial tasks.

VLM-bench 1.0

700+ open-ended multimodal reasoning tasks across STEM and business domains.

Code Review Bench

Evaluating agentic code partners on difficult review tasks: 1,200 public and 6,296 commercial tasks.

Off-the-shelf data packs

Calibrated, ready-to-deploy datasets built for frontier model evaluation. Each pack ships in standard formats and is compatible with your existing harness.

Turing Terminal-Bench

Hill-climbing Terminal-Bench reasoning tasks in Harbor format; frontier models resolve ~33–40%.

Turing LiveCodeBench

Deterministic algorithmic evaluation for frontier coding models. 1K+ non-public samples in LCB-native JSON format.

HLE++

Graduate-to-PhD headroom sets that preserve measurable pass@k separation after HLE saturation. Off-the-shelf packs of 1k–5k+ items delivered within 24–48 hours.

Expert-verified data

Human-in-the-loop datasets for SFT, RL, and evaluation, built from real enterprise workflows with domain precision and full traceability.

  1. Coding: real-world repo tasks and verified patches
  2. STEM: advanced math, chemistry, physics, and biology
  3. Multimodality: audio, image, and GUI reasoning
  4. Domain-specific: finance, legal, healthcare, and retail
  5. Robotics & Embodied AI: imitation learning and embodied reasoning
  6. Trust & Safety: policy-grounded tasks and adversarial prompts
  7. Infrastructure-as-Code: cloud infrastructure evaluation in real environments

Case studies & collaborations

Turing has partnered with leading AI labs and enterprises to build governed post-training systems that close the gap between research benchmarks and production deployment.

Contribute as a researcher

Join Turing's network of PhDs and Olympiad-level researchers contributing to post-training research in coding, STEM, multimodal evaluation, robotics, and more.

Work for Turing's internal team

Join our internal research and engineering teams building RL environments, benchmarks, and post-training systems.

Principal Research Engineer - RL Gyms (San Francisco)
Research Engineer (Brazil)
Research Engineer (Colombia)
Principal Research Engineer - Code (San Francisco)
Forward Deployed AI Engineer (San Francisco or New York City)
Senior Engineering Manager (San Francisco)

LLM Researchers Happy Hour During ICLR

Join us for an invite-only gathering bringing together AI researchers and enterprise leaders driving real-world AI innovation.

📅 April 23, 2026 (6:00–9:00 PM)