
Four hours. That is how long I had to wait last week after burning through my Claude tokens in the middle of a full work session. Everything I had planned for that afternoon had to stop. Not because I ran out of ideas. Not because the work was done. Because I ran out of tokens.
If you use Claude, ChatGPT, or Gemini seriously, you have felt this. Tokens are becoming more limited across every major LLM right now. Free users are hitting walls faster than ever. Paid users are running out sooner than they used to. Nobody is talking about how to actually fix it.
This week I want to change that.
First, understand what is actually happening
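Most people think of tokens like messages. Send a message, use a token. That is not how it works. A token is a small chunk of text, roughly a word or part of a word, and every token the model reads or writes counts against your limit.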
Every time you send a message, the LLM re-reads the entire conversation history from the beginning. Every single time. A conversation that is 20 messages deep costs dramatically more per message than one that is 3 messages deep. The longer your conversation runs, the more expensive every reply becomes. This is the number one reason people burn through tokens faster than they expect.
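To put rough numbers on this, here is a minimal sketch in Python. It assumes, purely for illustration, that every message and every reply is a flat 200 tokens and that the full history is re-read on each turn. Real token counts vary by message and by provider, and the function name is made up for this example.

```python
def input_tokens_per_turn(turns: int, tokens_per_message: int = 200) -> list[int]:
    """Tokens the model has to read on each turn when it re-reads the full history.

    Assumes every user message and every reply is `tokens_per_message` tokens,
    a hypothetical flat figure rather than a measured one.
    """
    costs = []
    history = 0
    for _ in range(turns):
        history += tokens_per_message  # your new message joins the history
        costs.append(history)          # the model reads everything so far
        history += tokens_per_message  # the model's reply joins the history
    return costs


costs = input_tokens_per_turn(20)
print("Message 3 reads:", costs[2], "tokens")
print("Message 20 reads:", costs[19], "tokens")
print("Total read over 20 messages:", sum(costs), "tokens")
```

Under these assumptions the twentieth message makes the model read several times more than the third, even though both messages are the same length.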
How to fix it across every major LLM
Time your heavy sessions strategically
This one surprised me. On Claude, your usage limit runs on a rolling five-hour window, not a daily reset. Your tokens also deplete faster during peak usage hours. If you have flexibility in when you work, scheduling your heaviest sessions during off-peak hours stretches your limit further without changing anything else about how you work.
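If it helps to plan around the window, here is a small Python sketch. It assumes the five-hour window opens with the first message of a session and closes exactly five hours later. That is a simplification, the provider's real accounting may differ, and both function names are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=5)  # window length taken from the paragraph above


def window_reset(first_message_at: datetime) -> datetime:
    """When the limit refreshes, assuming the window starts at the first message."""
    return first_message_at + WINDOW


def in_same_window(first_message_at: datetime, later: datetime) -> bool:
    """True if `later` still draws from the same window as the first message."""
    return first_message_at <= later < window_reset(first_message_at)


start = datetime(2025, 1, 6, 9, 0)
print("Limit refreshes at:", window_reset(start).strftime("%H:%M"))
print("13:30 in same window?", in_same_window(start, datetime(2025, 1, 6, 13, 30)))
print("14:15 in same window?", in_same_window(start, datetime(2025, 1, 6, 14, 15)))
```

The practical point is that the clock starts when you do, so the timing of your first message decides when your next full allowance arrives.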
Stop saying hello and thank you
Every word you type costs tokens. Every word the LLM generates costs tokens. Pleasantries like "thank you that was really helpful" before your next question add up across hundreds of interactions. Get straight to the point every time.
Start a new conversation when you switch tasks
When you finish one task and move to another, do not continue in the same chat. Open a fresh conversation. You immediately reset the context window cost back to zero. This single habit will make the biggest difference to how long your tokens last.
Save your memory files as markdown not PDF
If you upload context documents to your LLM as PDFs, the model can end up processing layout, formatting, and sometimes page images on top of the text it actually needs. Switch to markdown instead. Copy the text out and save your document as “filename.md” rather than “filename.pdf”. Same information. A fraction of the context window used. Takes two minutes to fix right now.
Use the right model for the right task
Not every task needs the most powerful model. Claude has Haiku, Sonnet, and Opus. ChatGPT has GPT-4o mini and GPT-4o. Gemini has Flash and Pro. The bigger the model, the more each message counts against your limit. Use the lighter model for simple tasks like summarising or drafting emails. Save the heavy model for complex reasoning and deep research.
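If you script any of your LLM work, the same rule can be written down in a few lines of Python. This is only a sketch: the task labels are hypothetical, and the model names are the product names from the paragraph above, not real API identifiers.

```python
# Hypothetical task labels for the "heavy" examples named in the text.
HEAVY_TASKS = {"complex_reasoning", "deep_research"}

# Lightest and heaviest option per provider, as product names (not API model IDs).
MODELS = {
    "claude": {"light": "Haiku", "heavy": "Opus"},
    "chatgpt": {"light": "GPT-4o mini", "heavy": "GPT-4o"},
    "gemini": {"light": "Flash", "heavy": "Pro"},
}


def pick_tier(task: str) -> str:
    """Use the heavy tier only for tasks known to need it; default to light."""
    return "heavy" if task in HEAVY_TASKS else "light"


def pick_model(provider: str, task: str) -> str:
    """Map a provider and a task label to the model name to reach for."""
    return MODELS[provider][pick_tier(task)]


print(pick_model("claude", "summarise"))
print(pick_model("gemini", "deep_research"))
```

Defaulting to the light tier is a design choice in this sketch: start cheap, and move up only when the answer is not good enough.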
Batch your questions into one message
Every time you send a separate message, the LLM reloads the entire conversation history. Sending three separate messages costs at least three times as much context overhead as one message with three questions, because the history is re-read for each one. Write everything you need in a single prompt. It saves tokens and usually gets you a better answer.
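Here is the comparison as a quick Python sketch. The numbers are assumptions chosen for illustration: an existing conversation of 2,000 tokens, questions of 50 tokens each, and answers of 300 tokens each. Neither function reflects any provider's real billing.

```python
def separate_input_tokens(history: int, question: int, answer: int, n: int) -> int:
    """Tokens read in total when n questions are sent one message at a time."""
    total = 0
    for _ in range(n):
        history += question  # the new question joins the history
        total += history     # the model re-reads everything so far
        history += answer    # the reply joins the history for next time
    return total


def batched_input_tokens(history: int, question: int, n: int) -> int:
    """Tokens read when all n questions go out in a single message."""
    return history + n * question


separate = separate_input_tokens(history=2000, question=50, answer=300, n=3)
batched = batched_input_tokens(history=2000, question=50, n=3)
print("Three separate messages:", separate, "tokens read")
print("One batched message:", batched, "tokens read")
```

With these figures the separate route reads a little over three times as much, because each earlier answer is also re-read on the later turns.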
Use Projects on Claude to cache your files
If you are uploading the same document across multiple conversations you are paying the token cost every single time. Claude Projects lets you upload your files once and reference them across every conversation inside that project without re-paying the token cost. Upload once. Stop wasting tokens every conversation.
The real problem nobody is saying out loud
Tokens are getting more expensive and limits are getting tighter across every platform. This is not going to reverse. The students and professionals who learn to work efficiently within these limits right now will have a significant advantage over everyone who just waits for the reset.
Treat your tokens like a resource. Because that is exactly what they are.
Where to Start
If you only do one thing from this post today, switch your memory files from PDF to markdown. Open your file, copy the text, save it as “filename.md”, and upload that next time instead. Two minutes. You will feel the difference immediately.
See you next Tuesday.
Kaishu Kagami
Founder, TechFuel
