I started this week with deeplearning.ai’s course on semantic caching, created in collaboration with Redis. That sent me down a rabbit hole, exploring different LLM caching strategies and the products that support them.
One such product is AWS Bedrock Prompt Caching. If large parts of your prompts are static (specifically, the prefixes), reprocessing that prefix on every request wastes compute and money. Prompt (or context) caching lets the model process the prefix once, store the result, and reuse it on subsequent requests, reducing costs and improving latency.
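Here is a minimal sketch of what this looks like with the Bedrock Converse API via boto3: you mark the end of the static prefix with a `cachePoint` block, and Bedrock reuses the cached prefix on later calls. The model ID, region, and prompt contents below are placeholders, not recommendations.

```python
import boto3

# Placeholders: swap in your own region, model, and prompt material.
MODEL_ID = "anthropic.claude-3-5-sonnet-20241022-v2:0"
LONG_STATIC_INSTRUCTIONS = "..."  # the large, unchanging prefix (system prompt, docs, few-shot examples)

client = boto3.client("bedrock-runtime", region_name="us-east-1")

def ask(question: str) -> str:
    response = client.converse(
        modelId=MODEL_ID,
        # Everything before the cachePoint is eligible for caching; later
        # requests with the same prefix reuse it instead of reprocessing it.
        system=[
            {"text": LONG_STATIC_INSTRUCTIONS},
            {"cachePoint": {"type": "default"}},
        ],
        messages=[{"role": "user", "content": [{"text": question}]}],
    )
    # The usage block reports cacheReadInputTokens / cacheWriteInputTokens,
    # which tells you whether the cache is actually being hit.
    print(response["usage"])
    return response["output"]["message"]["content"][0]["text"]
```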
Sounds great, right? Let’s check the pricing model. If your requests arrive more than 5 minutes apart, the cache entry expires. If your prefix is below the model’s minimum token threshold, caching won’t activate at all. And if your cache hit rate is low, you pay the premium on cache writes without enough discounted cache reads to earn it back. I highly recommend reading the “How Much Does Bedrock Prompt Caching Cost?” section in the article “Amazon Bedrock Prompt Caching”.
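To make that trade-off concrete, here is a back-of-the-envelope break-even check. The multipliers are illustrative assumptions (roughly, a 25% premium on cache writes and a 90% discount on cache reads); check the linked article for the current numbers for your model.

```python
# Back-of-the-envelope break-even check for prompt caching.
# Multipliers are illustrative assumptions; verify current Bedrock pricing.
WRITE_MULTIPLIER = 1.25  # cache writes cost a premium over plain input tokens
READ_MULTIPLIER = 0.10   # cache reads are heavily discounted

def caching_saves_money(prefix_tokens: int, hits_within_ttl: int, base_price_per_token: float) -> bool:
    """Compare one cache write plus N cached reads against N+1 uncached requests."""
    with_cache = prefix_tokens * base_price_per_token * (WRITE_MULTIPLIER + READ_MULTIPLIER * hits_within_ttl)
    without_cache = prefix_tokens * base_price_per_token * (1 + hits_within_ttl)
    return with_cache < without_cache

# With these multipliers, a single reuse inside the 5-minute window already pays off
# (1.25 + 0.10 < 1 + 1), while a prefix that is never reused costs you the write premium.
print(caching_saves_money(prefix_tokens=2000, hits_within_ttl=1, base_price_per_token=3e-6))  # True
print(caching_saves_money(prefix_tokens=2000, hits_within_ttl=0, base_price_per_token=3e-6))  # False
```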