Code that runs, not code that sounds right.
The planned model learns from compiler results, unit tests, patch application, runtime errors, repository checks, and benchmark harnesses.
Verifier-guided coding model training
The benchmark plan covers LiveCodeBench, HumanEval+, MBPP+, MultiPL-E, Aider editing, RepoQA, SWE-bench Verified, and long-horizon SWE-style tasks.
The repo has a local verifier demo and a training plan. Public scores come only after the checkpoint, harness, compute, and contamination notes are recorded.
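One way to enforce the "record first, publish second" rule is a small check that refuses to release a score while any of the four items is missing. This is a minimal sketch: the `RunRecord` type, its field names, and `can_publish` are hypothetical and do not come from the repo.

```python
from dataclasses import dataclass, fields


@dataclass
class RunRecord:
    """Hypothetical record of what must exist before a score goes public."""
    checkpoint: str = ""           # identifier of the evaluated checkpoint
    harness: str = ""              # evaluation harness and version used
    compute: str = ""              # hardware and budget used for the run
    contamination_notes: str = ""  # notes on train/test overlap checks


def missing_fields(record: RunRecord) -> list[str]:
    """Return the names of all fields that are still empty or whitespace."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


def can_publish(record: RunRecord) -> bool:
    """A score is publishable only when every field has been filled in."""
    return not missing_fields(record)
```

Calling `missing_fields` on a partly filled record lists what still has to be written down, which makes the gate easy to wire into a release script.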
The planned path is baseline evaluation, core SFT, edit SFT, multilingual compiler loops, long-repo training, patch search, CodeWorldModel, verifier RL, and one-shot distillation.
The first public step is the local verifier demo. The next steps are dataset review, baseline evaluation, small checkpoints, edit training, repository tasks, and measured benchmark reports.
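To make the idea of a local verifier concrete, here is a minimal sketch of what such a check might look like for Python candidates. It is not the repo's demo: the `verify` function, the `VerifierResult` type, and the stage names are assumptions chosen for illustration. The sketch first checks that the candidate parses, then runs it together with assertion-style tests in a separate process.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class VerifierResult:
    """Hypothetical result type: outcome plus the stage that decided it."""
    passed: bool
    stage: str   # "compile", "test", or "ok"
    detail: str  # short error message, empty on success


def verify(candidate_src: str, test_src: str, timeout_s: float = 10.0) -> VerifierResult:
    # Stage 1: reject candidates that do not even parse.
    try:
        compile(candidate_src, "<candidate>", "exec")
    except SyntaxError as exc:
        return VerifierResult(False, "compile", str(exc))

    # Stage 2: run candidate plus tests in a fresh interpreter process,
    # so crashes and infinite loops cannot take down the caller.
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "run.py"
        script.write_text(candidate_src + "\n\n" + test_src, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return VerifierResult(False, "test", "timeout")

    if proc.returncode != 0:
        # Keep only the last stderr line (usually the exception name/message).
        lines = proc.stderr.strip().splitlines()
        return VerifierResult(False, "test", lines[-1] if lines else "nonzero exit")
    return VerifierResult(True, "ok", "")
```

A training loop could map `passed` to a reward and use `stage` and `detail` as feedback text. Running untrusted generated code this way gives process isolation only; real sandboxing would need more than a subprocess.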
Compute support is described in the funding brief, but the site focuses on the project: training code models with execution feedback and publishing measured results.
contact@avixosec.xyz