Tokenization breaks raw text into units the model can actually process. Those units are often smaller than words, which is why token counts and word counts do not line up neatly.
This matters across the chatbot, LLM, and fine-tuning posts because context limits, cost, latency, and next-token prediction are all defined in terms of tokens, not words or sentences.
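As a toy illustration of subword splitting, the sketch below runs a greedy longest-match over a tiny made-up vocabulary. This is not any real model's tokenizer (production tokenizers such as BPE or WordPiece learn their vocabularies from data); it only shows why token counts and word counts diverge:

```python
# Toy subword tokenizer: greedy longest-match against a tiny,
# hand-picked vocabulary. Illustrative only -- real tokenizers
# (BPE, WordPiece, etc.) learn merges/vocabularies from a corpus.
VOCAB = {"token", "ization", "break", "s", "raw", "text"}

def tokenize(text: str) -> list[str]:
    tokens = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            # Take the longest vocabulary entry matching at position i.
            for j in range(len(word), i, -1):
                if word[i:j] in VOCAB:
                    tokens.append(word[i:j])
                    i = j
                    break
            else:
                # Unknown span: fall back to a single character.
                tokens.append(word[i])
                i += 1
    return tokens

print(tokenize("Tokenization breaks raw text"))
# 4 words become 6 tokens:
# ['token', 'ization', 'break', 's', 'raw', 'text']
```

Note how "tokenization" splits into two tokens and "breaks" into two more, so the token count (6) exceeds the word count (4); the same effect is why a model's context window fills faster than a naive word count suggests.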
