Code LLMs
- [2410.02749] Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
- [2410.02089] RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning
Code Embedding Models
- voyage-code-3: more accurate code retrieval with lower dimensional, quantized embeddings – Voyage AI
Synthetic Tasks
- remove a function body, have the model implement it, validate outputs against the original function
- corrupt function, have LLM find and fix mistakes (AST based)
- translate code into other languages
- take a commit => describe the underlying issue => re-implement the commit from that description
- make up tasks => write tests => implement code
- fill in the middle tasks
- learn to optimize: best-of-N sampling to write faster versions of existing functions
- existing code => tests => new code
- parser guided generation (reject paths that can’t parse)
- GitHub - tree-sitter/tree-sitter: An incremental parsing system for programming tools
- GitHub - lark-parser/lark: Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
- GitHub - amazon-science/incremental-parsing: Incremental Python parser for constrained generation of code by LLMs.
- structured generation (grammar-based): Structured Generation with LLMs
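The first task above (strip a body, regenerate it, check against the original) can be sketched with the standard `ast` module. The stub format, function names, and equality-on-sample-inputs check are illustrative choices, not from any of the linked papers:

```python
import ast
import textwrap

def strip_body(source: str) -> str:
    """Replace every function body with `...`, leaving a stub for the model to fill in."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            node.body = [ast.Expr(ast.Constant(value=...))]
    return ast.unparse(tree)

def outputs_match(original_src: str, candidate_src: str, fn_name: str, inputs) -> bool:
    """Execute both versions and compare their outputs on sample inputs.
    (A real pipeline would sandbox this and add timeouts.)"""
    env_a, env_b = {}, {}
    exec(original_src, env_a)
    exec(candidate_src, env_b)
    return all(env_a[fn_name](*args) == env_b[fn_name](*args) for args in inputs)

original = textwrap.dedent("""\
    def add(a, b):
        return a + b
    """)

stub = strip_body(original)  # "def add(a, b):\n    ..."
```

Checking against concrete inputs rather than source text lets the model produce any equivalent implementation.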
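The AST-based corruption task can likewise be sketched with an `ast.NodeTransformer`. This toy mutator flips only the first `+` to `-`; an actual generator would sample from a catalog of mutation types (off-by-one bounds, swapped arguments, wrong comparison, etc.):

```python
import ast

class FlipAdd(ast.NodeTransformer):
    """Introduce a single plausible bug: turn the first `+` into `-`.
    The LLM's task is then to find and fix the mistake."""
    def __init__(self):
        self.done = False

    def visit_BinOp(self, node: ast.BinOp) -> ast.BinOp:
        self.generic_visit(node)
        if not self.done and isinstance(node.op, ast.Add):
            node.op = ast.Sub()
            self.done = True
        return node

src = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for x in xs:\n"
    "        s = s + x\n"
    "    return s\n"
)
corrupted = ast.unparse(FlipAdd().visit(ast.parse(src)))
```

Mutating the AST instead of raw text guarantees the corrupted program still parses, so the bug is semantic rather than syntactic.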
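Fill-in-the-middle examples are usually built by cutting a document at two points and reordering the pieces with sentinel tokens (prefix-suffix-middle). The sentinel strings below are placeholders, not any particular tokenizer's special tokens:

```python
import random

SENTINELS = ("<PRE>", "<SUF>", "<MID>")  # placeholders; real models use reserved tokens

def make_fim_example(code: str, rng: random.Random) -> str:
    """Cut the code at two random points and emit a PSM-ordered training string:
    the model sees prefix and suffix, then learns to generate the middle."""
    i, j = sorted(rng.sample(range(len(code) + 1), 2))
    prefix, middle, suffix = code[:i], code[i:j], code[j:]
    pre, suf, mid = SENTINELS
    return f"{pre}{prefix}{suf}{suffix}{mid}{middle}"

example = make_fim_example("def add(a, b):\n    return a + b\n", random.Random(0))
```

Because the transform only reorders substrings, the original document is exactly recoverable from any example, which makes the data cheap to generate at scale.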
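A minimal form of parser-guided generation is rejection sampling over complete candidates. The sketch below filters with a full `ast.parse`; the linked tree-sitter and incremental-parsing projects go further by rejecting invalid *prefixes* during decoding, before a candidate is even finished:

```python
import ast

def parses(candidate: str) -> bool:
    """Accept only candidates that parse as Python.
    (Post-hoc filter; incremental parsers can prune paths mid-generation.)"""
    try:
        ast.parse(candidate)
        return True
    except SyntaxError:
        return False

candidates = [
    "def f(x):\n    return x + 1\n",  # valid
    "def f(x:\n    return x\n",       # broken signature, should be rejected
]
valid = [c for c in candidates if parses(c)]
```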