What is the fine-tuning vs prompting choice?
When an LLM is not doing what you need, you have three levers: change the prompt, add retrieval (retrieval-augmented generation, or RAG), or fine-tune the model on your own examples. Each costs more than the last. Knowing which lever to pull, and in what order, is a core decision in building LLM applications.
Why it matters
Fine-tuning is expensive and slow, and people often reach for it too early, when a better prompt or RAG would have solved the problem. Picking the right approach saves time and money and usually produces a better result. This judgment is exactly what distinguishes someone who has shipped LLM features from someone who has read about them.
What to learn
- The escalation ladder: prompt, then RAG, then fine-tune
- What fine-tuning is good at (style, format) versus what it is poor at (knowledge)
- Why RAG, not fine-tuning, is usually right for facts
- Data requirements for fine-tuning
- Cost and maintenance of a fine-tuned model
- Parameter-efficient fine-tuning (LoRA)
- Evaluating whether fine-tuning actually helped
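The appeal of parameter-efficient fine-tuning in the list above comes down to simple arithmetic. The sketch below is an illustration, not any library's API: it assumes the standard LoRA formulation, in which a frozen weight matrix W of shape d_out x d_in is adjusted by a low-rank product B @ A, and it compares the number of trainable parameters for one matrix. The function names and the 4096 x 4096, rank 8 sizes are made up for the example.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A,
    where B is d_out x rank and A is rank x d_in."""
    return rank * (d_out + d_in)


# Illustrative sizes only (an assumption, not taken from any specific model).
full = full_finetune_params(4096, 4096)
lora = lora_params(4096, 4096, rank=8)
print(full, lora, f"{lora / full:.2%}")  # 16777216 65536 0.39%
```

Under these assumed sizes, the low-rank update trains well under 1% of the parameters of the full matrix, which is why LoRA lowers the training and storage cost of a fine-tune. It does not remove the need for good training data or for evaluation.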
Common pitfall
Fine-tuning to teach the model new facts. Fine-tuning shapes behavior and style; it is a poor and expensive way to inject knowledge, and whatever knowledge it does inject goes stale as the underlying facts change. For facts, use RAG to supply current information at query time. Reserve fine-tuning for consistent format, tone, or task behavior that prompting cannot reliably achieve.
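To make "supply current information at query time" concrete, here is a toy sketch of the retrieval step. It is deliberately simplified: documents are ranked by word overlap with the question rather than by embeddings, and `tokenize`, `retrieve`, `build_prompt`, and the `DOCS` contents are all hypothetical names and data invented for the example. The resulting prompt would then be sent to whatever model you use.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the question.
    (A real system would more likely use embedding similarity.)"""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: list[str], k: int = 1) -> str:
    """Put the retrieved facts into the prompt so the model need not have memorized them."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Made-up knowledge base; updating a fact means editing this list, not retraining.
DOCS = [
    "The refund window is 30 days from delivery.",
    "Support is available on weekdays from 9 to 5.",
]
prompt = build_prompt("How long is the refund window?", DOCS)
print(prompt)
```

The design point is that the facts live in data you can edit. When the refund policy changes, you change one string in the document store, whereas a fine-tuned model would need new training data and another training run.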
Resources
Primary (free):
- OpenAI — Fine-tuning guide · docs
- Hugging Face — PEFT / LoRA · docs
- Anthropic — When to fine-tune · docs
Practice
Take a task that is failing with a basic prompt. Try to fix it first with a better prompt, then with retrieval, and only then consider fine-tuning. Write down which level solved it and why. Done when you can justify the cheapest approach that worked, and explain when fine-tuning would have been warranted.
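One way to keep the exercise honest is to score every level on the same small evaluation set and stop at the cheapest one that clears a threshold you chose in advance. The sketch below assumes exact-match scoring and uses stub functions in place of real model calls. `accuracy`, `cheapest_passing`, the stubs, and the two-item `EVAL_SET` are hypothetical and exist only to show the shape of the comparison.

```python
from typing import Callable, Optional

Approach = Callable[[str], str]  # takes a question, returns the system's answer


def accuracy(approach: Approach, eval_set: list[tuple[str, str]]) -> float:
    """Fraction of questions answered with an exact match."""
    correct = sum(1 for question, expected in eval_set if approach(question).strip() == expected)
    return correct / len(eval_set)


def cheapest_passing(
    ladder: list[tuple[str, Approach]],
    eval_set: list[tuple[str, str]],
    threshold: float,
) -> Optional[tuple[str, float]]:
    """Walk the ladder from cheapest to most expensive and return the first
    (name, score) that meets the threshold, or None if no level does."""
    for name, approach in ladder:
        score = accuracy(approach, eval_set)
        if score >= threshold:
            return name, score
    return None


# Stand-ins for real systems; in practice each would call a model.
EVAL_SET = [("2+2", "4"), ("capital of France", "Paris")]
ANSWERS = dict(EVAL_SET)


def basic_prompt(question: str) -> str:
    return "4"  # stub: right on one item, wrong on the other


def better_prompt(question: str) -> str:
    return ANSWERS.get(question, "")  # stub: right on both items


ladder = [("basic prompt", basic_prompt), ("better prompt", better_prompt)]
result = cheapest_passing(ladder, EVAL_SET, threshold=0.9)
print(result)  # ('better prompt', 1.0)
```

The same harness answers whether fine-tuning actually helped: add the fine-tuned model as the last rung and compare its score with the rungs below it on data it was not trained on.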
Outcomes
- Escalate in order: prompting first, then RAG, then fine-tuning.
- Explain what fine-tuning is and is not good for.
- Use RAG rather than fine-tuning for factual knowledge.
- Judge whether fine-tuning improved results.