How semantic caching reduces LLM API calls

Bill Doerrfeld | May 5, 2025

My latest for The New Stack explores semantic caching, an emerging strategy to optimize agentic AI.


Semantic caching is like typical caching, but for AI. It could eliminate a lot of redundant API calls to LLMs, reducing costs and improving performance.


My latest for The New Stack explores semantic caching — what it is, how it works, and what the benefits are. According to the sources, semantic caching is poised to become more of a standard practice for optimizing how applications behave with AI, reducing latency and lowering the bar as costs increase.


Featured image credit: Donald Wu


Read: What Is Semantic Caching?

Other Blog Posts

Close-up of a glowing laptop keyboard in blue light, viewed at an angle with the screen above
By Bill Doerrfeld May 25, 2026
My latest InfoWorld feature explores how Model Context Protocol (MCP) supports context engineering for AI-assisted coding.
A set of metal keys on a keyring resting on a wooden surface.
By Bill Doerrfeld May 22, 2026
My latest for Nordic APIs explores 10 API key security risks and what to use alongside keys for stronger API security.
By Bill Doerrfeld May 18, 2026
The yearly API conference, apidays New York, is a hotbed for solid discussion on what's top of mind in the API space, and as MC I had a front row seat.
By Bill Doerrfeld May 13, 2026
My latest for CIO Online features real results form CIOs actively deploying AI agents to empower sales and revenue teams.
By Bill Doerrfeld May 12, 2026
Reports say consumers are souring on AI everywhere, all the time. So, at the risk of losing trust, or even potential business, is adding AI to an existing product really worth it?
By Bill Doerrfeld May 1, 2026
Cloudflare rebuilt Next.js over a weekend using agentic coding.
By Bill Doerrfeld April 20, 2026
My InfoWorld feature reviews the key building blocks in agentic systems and with real-world examples from Shopify, Block, and others.
By Bill Doerrfeld March 31, 2026
My latest InfoWorld feature explores what makes an enterprise MCP registry effective, from semantic discovery to governance and security for AI agents.
By Bill Doerrfeld March 30, 2026
My first-ever contribution to CSO Online looks at the shifting landscape, from perimeter-based security to API security, and how CISOs are responding.
By Bill Doerrfeld March 29, 2026
My latest feature for The New Stack looks into solutions being proposed to fix open source Slopmageddon.