Turing at ICML 2026 | Booth #B406

Explore off-the-shelf data packs, coding benchmarks, and RL environments for post-training, evaluation, reward modeling, and production.

Talk to a Researcher ⟶

Discuss your evaluation or post-training goals with Turing’s technical team

Work with Turing ⟶

Contribute as a domain expert or explore full-time roles

OTS data packs, benchmarks, and RL environments

OTS data packs

Ready to deploy. Validated and calibrated for frontier evaluation, RLVR, reward modeling, benchmarking, fine-tuning, and failure-mode analysis.

Open-MM-RL
MM-STEM-HLE++
HLE++
Advanced Reasoning Rubrics

Request Data Packs

Coding & SWE evaluation

Deterministic benchmarks built on real-repository tasks and verified outcomes. Reusable across evaluation, SFT, and RL.

CyberBench
SWE-bench++
Code Review Bench
Terminal-Bench
LiveCodeBench

Explore Benchmarks

RL environments

Controlled environments for computer-use and MCP agents. Each run includes prompts, tools, workflows, verifiers, reward logic, leaderboards, and full tool-environment traces.

Explore RL Environments

Expert-verified datasets

Human-in-the-loop datasets across enterprise, STEM, and multimodal domains. Domain-precise, fully traceable, and ready for SFT, RL, and evaluation.

Request Data Hub Access

Built with teams advancing frontier AI

Turing partners with AI labs and enterprises to build evaluation-safe data, post-training systems, and deployment-ready workflows. We connect research progress to real-world model performance.

Apply your expertise to frontier AI

Contribute as a researcher

Flexible, project-based work. Design hard tasks, evaluate model outputs, and identify reasoning gaps.

Join the Researcher Network

Join Turing full-time

Build post-training systems, evaluation loops, production infrastructure, and real-world model improvement workflows.

Explore full-time roles

548 Market Street, PMB 18282, San Francisco, CA 94104

Turn model failures into training signal