Commit 7cd46d7

Merge pull request #19 from rfbatista/fix/fix-typo-extensive
fix: typo extensive
2 parents: 383cebc + 56df9f7

File tree: 1 file changed (+2, -2)

docs/llm-inference-basics/training-inference-differences.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -23,10 +23,10 @@ Common techniques used in LLM training include:
 - **Reinforcement learning**: Allow the model to learn by trial and error, optimizing based on feedback or rewards.
 - **Self-supervised learning**: Learn by predicting missing or corrupted parts of the data, without explicit labels.
 
-Training is computationally intensive, often requiring extensive GPU or TPU clusters. While this initial cost can be very high, it is more or less a one-time expense. Once the model achieves desired accuracy, retraining is usually only necessary to update or improve the model periodically.
+Training is computationally intensive, often requiring expensive GPU or TPU clusters. While this initial cost can be very high, it is more or less a one-time expense. Once the model achieves desired accuracy, retraining is usually only necessary to update or improve the model periodically.
 
 ## Inference: Using the model in real-time
 
 LLM inference means applying the trained model to new data to make predictions. Unlike training, inference happens continuously and in real-time, responding immediately to user input or incoming data. It is the phase where the model is actively "in use." Better-trained and more finely-tuned models typically provide more accurate and useful inference.
 
-Inference compute needs are ongoing and can become very high, especially as user interactions and traffic grow. Each inference request consumes computational resources such as GPUs. While each inference step may be smaller than training in isolation, the cumulative demand over time can lead to significant operational expenses.
+Inference compute needs are ongoing and can become very high, especially as user interactions and traffic grow. Each inference request consumes computational resources such as GPUs. While each inference step may be smaller than training in isolation, the cumulative demand over time can lead to significant operational expenses.
```
