Fine-tuning vs prompting · AI / ML

What is the fine-tuning vs prompting choice?

When an LLM is not doing what you need, you can change the prompt, add retrieval, or fine-tune the model on your own examples. Each option costs more than the one before it. Knowing which lever to pull, and in what order, is a core decision in building LLM applications.

Why it matters

Fine-tuning is expensive and slow, and people reach for it far too early, when a better prompt or retrieval-augmented generation (RAG) would have solved the problem. Picking the right approach saves time and money and usually produces a better result. This judgment is what distinguishes someone who has shipped LLM features from someone who has only read about them.

What to learn

The escalation ladder: prompt, then RAG, then fine-tune
What fine-tuning is good at (style, format) versus what it is poor at (knowledge)
Why RAG, not fine-tuning, is usually right for facts
Data requirements for fine-tuning
Cost and maintenance of a fine-tuned model
Parameter-efficient fine-tuning (LoRA)
Evaluating whether fine-tuning actually helped
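
To make the parameter-efficient item above concrete, here is a minimal sketch of the arithmetic behind LoRA. It assumes a single weight matrix of shape d_out x d_in that is frozen and adapted with two low-rank factors of rank r. The layer size and rank are illustrative choices, not values taken from any particular model.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter on the same matrix.

    LoRA keeps W frozen and learns an update B @ A, where B has shape
    (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_out + d_in)


# Illustrative numbers only: one 4096 x 4096 projection adapted at rank 8.
d_out, d_in, rank = 4096, 4096, 8
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
ratio = lora / full

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"fraction trained: {ratio:.4%}")
```

For this one matrix the adapter trains 65,536 parameters instead of 16,777,216, which is 1/256 of the full count. That gap is why LoRA lowers the cost of fine-tuning, though it does not change what fine-tuning is good at.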

Common pitfall

Fine-tuning to teach the model new facts. Fine-tuning shapes behavior and style; it is a poor and expensive way to inject knowledge, and whatever knowledge it does absorb goes stale as the facts change. For facts, use RAG to supply current information at query time. Reserve fine-tuning for consistent format, tone, or task behavior that prompting cannot reliably achieve.
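
As a rough illustration of supplying facts at query time, the sketch below retrieves the most relevant snippet for a question and places it in the prompt. Everything in it is an assumption made for the example: the documents are made up, the word-overlap scoring is a toy stand-in for a real search index or vector store, and no model is called.

```python
import re

# Hypothetical knowledge base. In a real system this would be a search index
# or vector store that can be updated without retraining the model.
DOCUMENTS = [
    "Refunds are accepted within 30 days of purchase.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by email on weekdays.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: list[str]) -> str:
    """Assemble a prompt that carries the retrieved facts alongside the question."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


prompt = build_prompt("How many days does shipping take?", DOCUMENTS)
print(prompt)
```

When a fact changes, only the entry in DOCUMENTS needs editing. A fine-tuned model would need new training data and another training run to reflect the same change.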

Practice

Take a task that is failing with a basic prompt. Try to fix it first with a better prompt, then with retrieval, and only then consider fine-tuning. Write down which level solved it and why. Done when you can justify the cheapest approach that worked, and explain when fine-tuning would have been warranted.
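
One way to structure this exercise is a small harness that walks the ladder in cost order and stops at the first level that clears a pass threshold. The sketch below is a minimal version under stated assumptions: the three approach functions are hypothetical stubs standing in for real model calls, the test cases are invented, and exact-match scoring is only one possible success criterion.

```python
from typing import Callable, Optional

# An approach maps an input string to an output string.
Approach = Callable[[str], str]


def pass_rate(approach: Approach, cases: list[tuple[str, str]]) -> float:
    """Fraction of cases whose output exactly matches the expected text."""
    hits = sum(1 for text, expected in cases if approach(text).strip() == expected)
    return hits / len(cases)


def cheapest_passing(
    ladder: list[tuple[str, Approach]],
    cases: list[tuple[str, str]],
    threshold: float,
) -> Optional[str]:
    """Try each level in cost order and return the first one that passes."""
    for name, approach in ladder:
        rate = pass_rate(approach, cases)
        print(f"{name}: {rate:.0%} of cases pass")
        if rate >= threshold:
            return name
    return None


# Hypothetical stubs; each would wrap a real model call in practice.
def better_prompt(text: str) -> str:
    return "4" if text == "2+2" else "unknown"


def with_retrieval(text: str) -> str:
    return {"2+2": "4", "capital of France": "Paris"}.get(text, "unknown")


def fine_tuned(text: str) -> str:
    return with_retrieval(text)


cases = [("2+2", "4"), ("capital of France", "Paris")]
ladder = [
    ("prompt", better_prompt),
    ("retrieval", with_retrieval),
    ("fine-tune", fine_tuned),
]
result = cheapest_passing(ladder, cases, threshold=0.9)
print(f"cheapest level that worked: {result}")
```

With these stubs the harness stops at the retrieval level and never evaluates the fine-tuned stub, which mirrors the rule of only considering fine-tuning once the cheaper levels have failed.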

Outcomes

Escalate from prompting to RAG to fine-tuning in order.
Explain what fine-tuning is and is not good for.
Use RAG rather than fine-tuning for factual knowledge.
Judge whether fine-tuning improved results.
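
For the last outcome, a single accuracy number can hide regressions, so it helps to compare the baseline and the fine-tuned model example by example on the same held-out set. The sketch below assumes you already have a pass/fail result per example for each model. The result lists are hypothetical, written only to show the bookkeeping.

```python
def compare(baseline_ok: list[bool], tuned_ok: list[bool]) -> dict[str, int]:
    """Count per-example changes between two runs on the same held-out set."""
    if len(baseline_ok) != len(tuned_ok):
        raise ValueError("both runs must cover the same examples")
    pairs = list(zip(baseline_ok, tuned_ok))
    return {
        "fixed": sum(1 for b, t in pairs if not b and t),
        "regressed": sum(1 for b, t in pairs if b and not t),
        "unchanged": sum(1 for b, t in pairs if b == t),
    }


# Hypothetical per-example results (True means the output was acceptable).
baseline_ok = [True, False, False, True, False, True]
tuned_ok = [True, True, False, False, True, True]

summary = compare(baseline_ok, tuned_ok)
net_gain = summary["fixed"] - summary["regressed"]
print(summary)
print(f"net gain: {net_gain} of {len(baseline_ok)} examples")
```

Here fine-tuning fixes two examples and breaks one, for a net gain of one. Whether that justifies the cost and maintenance of a fine-tuned model is the judgment the outcome asks for.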