The Capabilities Concern Is Wrong: Why AI Safety Work Should Be Open
Automating AI Safety Research Requires an Open Ecosystem, Not Bigger Grants
Epsilon: Infrastructure for Structured Agent Workloads
S2ORC CS Enriched: 1.1 Million Computer Science Papers with Structured Metadata
Study Failure: AI-driven GPU Kernel Optimization
Learning to Rank Architectures: A Small Model That Guides Neural Architecture Search
ARIA Benchmark: How Much Machine Learning Do AI Models Actually Know?
ArXiv Research Code Dataset: 129K Research Repositories
ArXivDLInstruct: 778K Research Code Functions for Instruction Tuning
DeltaMLBench: Can AI Agents Improve on Published ML Research?
Teaching Models to Bluff: Measuring Deception, Belief, and Coordination in LLM Secret Hitler
Understanding Recursive Self-Improvement in AI Systems
ML Research Benchmark: Can AI Agents Do Real ML Research?
Introducing Algorithmic Research Group