Algorithmic Research Group
We study how software and industrial systems recursively improve themselves in real-world settings.
// Research
ARIA Benchmark: How Much Machine Learning Do AI Models Actually Know?
A benchmark measuring how much machine learning knowledge AI models actually have.
ArXiv Research Code Dataset: 129K Research Repositories
A dataset of 129K research code repositories associated with arXiv papers.
ArXivDLInstruct: 778K Research Code Functions for Instruction Tuning
An instruction-tuning dataset of 778K functions drawn from research code.
DeltaMLBench: Can AI Agents Improve on Published ML Research?
A benchmark testing whether AI agents can improve on results from published machine learning research.
ML Research Benchmark: Can AI Agents Do Real ML Research?
A benchmark suite for evaluating AI agents on real machine learning research tasks — including task definitions, a baseline agent, and...
// About
Recursive self-improvement is beginning to shape both software and industry. In software, AI systems are increasingly involved in designing, training, and optimizing other AI systems. Progress compounds through algorithmic advances and improvements in computing hardware.
In the physical world, similar dynamics are emerging in robotics, manufacturing, and supply chains, where automated systems increasingly optimize the processes that produce, deploy, and refine them.
Algorithmic Research Group studies recursively self-improving systems in practice. We measure progress, develop tools, and analyze how recursive improvement changes the behavior and capabilities of real-world systems.
Our work focuses on understanding the dynamics, limits, and impacts of self-improving systems across software and industrial domains.
