What is Fine-tuning?
Fine-tuning is the process of further training a pre-trained LLM on a smaller, task-specific dataset to specialize its behavior — for a domain (legal, medical), a style (brand voice), a task (classification, structured extraction), or a language.
Also known as: LLM fine-tuning, model fine-tuning, supervised fine-tuning
When to fine-tune vs prompt or RAG
Start with prompt engineering, since it gives the fastest iteration loop. Use RAG when the model needs access to specific knowledge, especially knowledge that changes often. Fine-tune when you need a consistent output style, domain behavior that cannot be conveyed through instructions and examples in the context window, or a small model that reproduces behavior which previously required a large one.
Fine-tuning approaches
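(1) Full fine-tuning: update all model weights. Expensive and compute-heavy, but offers maximum customization.
(2) LoRA / QLoRA: train a small low-rank adapter that modifies the model's behavior while the base weights stay frozen. QLoRA does the same on top of a quantized (typically 4-bit) base model to cut memory further. Much cheaper, and adapters are easy to swap per task.
(3) Instruction tuning: train on (instruction, response) pairs to turn a base model into an assistant.
(4) Preference tuning (DPO, RLHF): train on preference data, such as pairs of preferred and rejected responses, to shape the model's output style.
Approaches (1) and (2) describe which weights are updated, while (3) and (4) describe the training data and objective, so they can be combined: instruction tuning, for example, can be run with LoRA.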
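To make the adapter idea in approach (2) concrete, here is a minimal pure-Python sketch of the LoRA update, assuming the common formulation in which the effective weight is W + (alpha / r) * B * A with W frozen and only A and B trained. The helper names (`matmul`, `lora_effective_weight`, `lora_param_counts`) and the tiny matrix sizes are illustrative assumptions, not part of any library; real training would use a framework such as PyTorch.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying W."""
    scale = alpha / rank
    delta = matmul(b, a)  # shape: (d_out, d_in), same as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters for one layer: (full fine-tuning, rank-r adapter)."""
    return d_out * d_in, rank * (d_in + d_out)


# Toy sizes, chosen only for illustration.
d_out, d_in, rank, alpha = 4, 6, 2, 4
rng = random.Random(0)

W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen base weight
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable, small random init
B = [[0.0] * rank for _ in range(d_out)]                              # trainable, zero init

# With B initialised to zero, the adapter starts as a no-op.
W_eff = lora_effective_weight(W, A, B, alpha, rank)

# Parameter counts for a hypothetical 4096 x 4096 projection at rank 8.
full, adapter = lora_param_counts(4096, 4096, 8)
```

For the hypothetical 4096 x 4096 layer, the adapter trains 65,536 parameters instead of 16,777,216, which is where most of the cost saving comes from. Because the base weight is never modified, swapping tasks only means swapping the small A and B matrices.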
Data quality dominates
Fine-tuning is usually far more sensitive to data quality than to data quantity. 500 carefully curated examples can beat 50,000 noisy ones. Spend time on labeling and example selection, and rerun the same fine-tune with cleaner data before scaling up the dataset. Most failed fine-tuning projects are data problems, not algorithm problems.
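As one illustration of a cheap data-quality pass, the sketch below drops empty, too-short, and duplicate (instruction, response) pairs before training. The record layout (dicts with `instruction` and `response` keys), the function name `clean_examples`, and the 20-character threshold are assumptions made for this example; real curation usually also needs manual review of the labels themselves.

```python
def clean_examples(examples, min_response_chars=20):
    """Drop empty, too-short, and duplicate (instruction, response) pairs."""
    seen = set()
    kept = []
    for ex in examples:
        # Collapse runs of whitespace so trivially different copies compare equal.
        instruction = " ".join(ex.get("instruction", "").split())
        response = " ".join(ex.get("response", "").split())

        if not instruction or len(response) < min_response_chars:
            continue  # missing instruction or uninformative response

        key = (instruction.lower(), response.lower())
        if key in seen:
            continue  # case-insensitive duplicate of an earlier example
        seen.add(key)
        kept.append({"instruction": instruction, "response": response})
    return kept


raw = [
    {"instruction": "Summarize the clause.", "response": "The tenant must give 30 days notice."},
    {"instruction": "summarize  the clause.", "response": "The tenant must give 30 days notice."},
    {"instruction": "Summarize the clause.", "response": "ok"},
    {"instruction": "", "response": "A response with no instruction attached."},
]
cleaned = clean_examples(raw)
print(f"kept {len(cleaned)} of {len(raw)} examples")
```

Here only the first record survives: the second is a near-duplicate, the third has a response that is too short, and the fourth has no instruction.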
Costs and risks
Fine-tuning a frontier-tier model can cost hundreds to thousands of dollars per run; LoRA on a smaller model can cost tens of dollars. Risks: catastrophic forgetting (the model loses general capabilities), overfitting to the fine-tune set, and unintended behavior drift (the training data teaches behaviors you didn't intend). Always evaluate on a held-out set that tests general capability, not just the fine-tune task.
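The held-out evaluation can be sketched as a comparison of the base and fine-tuned models on two sets: one for the fine-tune task and one for general capability. Everything here is a hypothetical illustration: the function names, the exact-match accuracy metric, the toy predictions, and the 0.02 tolerance for a general-capability drop are assumptions, and a real evaluation would use metrics suited to the task.

```python
def accuracy(predictions, references):
    """Exact-match accuracy between two equal-length lists."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty evaluation set")
    return sum(p == r for p, r in zip(predictions, references)) / len(references)


def compare_runs(base, tuned, max_general_drop=0.02):
    """Compare per-set scores of the base and fine-tuned models."""
    task_gain = tuned["task"] - base["task"]
    general_drop = base["general"] - tuned["general"]
    return {
        "task_gain": task_gain,
        "general_drop": general_drop,
        # A large drop on the general set is a warning sign of catastrophic forgetting.
        "forgetting_suspected": general_drop > max_general_drop,
    }


# Toy held-out sets and model outputs, invented for illustration.
task_refs = ["pos", "neg", "pos", "neg"]
general_refs = ["a", "b", "c", "d"]

base_scores = {
    "task": accuracy(["pos", "pos", "pos", "pos"], task_refs),
    "general": accuracy(["a", "b", "c", "d"], general_refs),
}
tuned_scores = {
    "task": accuracy(["pos", "neg", "pos", "neg"], task_refs),
    "general": accuracy(["a", "b", "c", "x"], general_refs),
}

report = compare_runs(base_scores, tuned_scores)
print(report)
```

In this toy run the fine-tuned model gains on the task set but loses accuracy on the general set, so the report flags possible forgetting even though the task metric alone looks like a success.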