A simple LRU Cache in Python
December 14, 2025
In a previous post, I explained some of the basic concepts of how caches work in hardware at the CPU level. As I mentioned at the start of that post, caching is used in many different areas of computer science. So how can you use this new knowledge you have gained? One common reason to implement a cache is to avoid expensive database calls.
Let's say that you have a web app where you are displaying movie information. When a user searches for a movie, your backend will query your database for the information related to that movie. For a simple web app with a small number of active users, you probably won't run into big performance issues. But we're going to imagine that we have millions of active users because we have created the best movie web app known to mankind.
With so many users, you start to look into where your costs are coming from and realize there is a big line item for I/O requests. Well, that makes sense, we do have millions of active users! But you pull up another dashboard and notice that every time a new movie is released, an overwhelming majority of the requests are for that movie. Armed with your knowledge of caching, you decide to implement a cache to try to reduce the number of calls your backend makes to the database.
Implementation
Caches can become quite complex, especially in hardware, where keeping all of the data up to date across a growing number of cores takes real work. For our example, we will be looking at a very simplified version. We won't be writing any data to the database, so our cache will only contain data that we have read. Additionally, the cache will live in memory, and we will provide a ttl parameter when we instantiate the class.
from datetime import datetime

class LRUCache:
    def __init__(self, size: int, ttl: int):
        """
        :param size: Maximum number of entries in the cache
        :param ttl: Time To Live (minutes) - length of time for an object
            in the cache to be active. Once the ttl has passed, the item
            is treated as a cache miss.
        """
        self.cache = {}
        self.size = size
        self.ttl = ttl * 60  # ttl is given in minutes; we compare in seconds

    def check(self, key):
        """Return True if key is cached and still within its ttl."""
        if key in self.cache:
            diff = datetime.now() - self.cache[key]["time"]
            # total_seconds() covers the whole span; timedelta.seconds
            # wraps around at one day and would hide long-expired entries
            return diff.total_seconds() < self.ttl
        return False

    def get(self, key):
        return self.cache[key]

    def add(self, key, value):
        if len(self.cache) < self.size:
            self.cache[key] = {"time": datetime.now(), "data": value}
        else:
            # Cache is full: scan for the oldest entry and evict it
            lru_counter = {"time": datetime.now(), "key": None}
            for item in self.cache:
                if self.cache[item]["time"] < lru_counter["time"]:
                    lru_counter = {"time": self.cache[item]["time"], "key": item}
            del self.cache[lru_counter["key"]]
            self.cache[key] = {"time": datetime.now(), "data": value}
Our cache is pretty simple, but let's break it down. We start by creating a class called LRUCache. This class takes two required parameters. The first is the size of the cache. Without it, we would end up with a cache that grows without bound, which would eventually slow down our program and run out of memory. How big you make this depends on your use case. For our movie app, we might make the size close to the number of movies released each week, as these will be the most accessed movies.
The second parameter is ttl, which is short for Time To Live. A concept common in networking as well as caching, ttl tells us how long our data is "active". At some point our underlying data will be updated, so we want to make sure we don't keep serving stale data from the cache. For example, if users can rate movies, we want cached entries to be pulled fresh periodically so the ratings stay current. If we are okay without real-time data, we might set this to 24 hours so the cache refreshes daily.
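Since the constructor takes ttl in minutes, that daily refresh would look something like this (the size of 500 here is an arbitrary stand-in, not a recommendation):

# ttl is in minutes, so 24 hours is 24 * 60 = 1440 minutes
cache = LRUCache(size=500, ttl=24 * 60)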
In order to create a cache, we will need some sort of unique key to index into our data structure. In CPUs, the index bits of the memory address serve this purpose. Here, we can use a unique value that we are already querying with. In our movie example, we will probably have some sort of movie_id in our database that can be used.
Looking at the methods we have created, the first one is a check method. This simply checks whether the data is in the cache and returns a boolean. Additionally, if the data is in the cache, we verify that it is not older than the ttl value we have set. If it is, we return False, as if it were a normal cache miss.
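For instance, with a fresh cache and a one-minute ttl (the values here are arbitrary), check behaves like this:

cache = LRUCache(size=2, ttl=1)
cache.add("abc", 123)
cache.check("abc")  # True: present and fresh
cache.check("xyz")  # False: never cached
# Once more than a minute passes, the entry goes stale
# and cache.check("abc") returns False as well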
The get method... well, it gets the data based on the key. Note that it returns the whole cache entry, timestamp included, so the actual value lives under the "data" key.
Last but not least, the add method is where a lot of the heavy lifting is done. Since we cap the cache at a set size to avoid memory issues, the first thing we do is check whether the cache is already at its max size. If it is not, we simply add our data to the cache under the key. If we have hit our max size, we have to kick out the Least Recently Used item. We first create an lru_counter dictionary with the current time and a key of None. Then we iterate through the cache and compare each item's time to the time in our lru_counter; if it is earlier, we update the lru_counter with that item's time and key. Once we have gone through every item, we delete the entry named by the lru_counter's key and then add in the data we want.
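Putting the three methods together, here is a minimal sketch of how the cache might sit in front of the movie database. The fetch_movie_from_db function is a hypothetical stand-in for your real query layer:

cache = LRUCache(size=100, ttl=60)

def fetch_movie_from_db(movie_id):
    # Stand-in for the real (expensive) database query
    return {"id": movie_id, "title": "..."}

def get_movie(movie_id):
    # Serve from the cache when the entry exists and is still fresh
    if cache.check(movie_id):
        return cache.get(movie_id)["data"]
    # Otherwise hit the database and cache the result for next time
    data = fetch_movie_from_db(movie_id)
    cache.add(movie_id, data)
    return data

This check-then-get-or-add flow is exactly what the Suggestions section below proposes rolling into a single method.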
Walk through
First, we will create a cache that holds three lines with a ttl of two minutes, then add three lines of data:
cache = LRUCache(3, 2)
cache.add("abc", 123)
cache.add("def", 456)
cache.add("ghi", 789)
If we look at cache.cache, we can see the current contents of our cache:
{
    'abc': {'time': datetime.datetime(2025, 12, 14, 19, 59, 32, 216167), 'data': 123},
    'def': {'time': datetime.datetime(2025, 12, 14, 20, 3, 9, 898178), 'data': 456},
    'ghi': {'time': datetime.datetime(2025, 12, 14, 20, 4, 31, 427328), 'data': 789}
}
We see that abc was the first item in the cache, and the cache is now at capacity. Now, if we add a line with the key zzz:
cache.add("zzz", 999)
we can see that the cache has kicked out abc:
{
    'def': {'time': datetime.datetime(2025, 12, 14, 20, 3, 9, 898178), 'data': 456},
    'ghi': {'time': datetime.datetime(2025, 12, 14, 20, 4, 31, 427328), 'data': 789},
    'zzz': {'time': datetime.datetime(2025, 12, 14, 20, 8, 24, 557737), 'data': 999}
}
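Immediately after the eviction, check confirms it:

cache.check("abc")  # False: evicted, treated as a miss
cache.check("zzz")  # True: still cached and fresh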
Suggestions
We now have a basic functioning LRU Cache in Python. I glossed over A LOT with this implementation, so here are some things to consider if you actually need to implement this.
- It is very likely that you have your application deployed to more than one server. This muddies the water a bit, just as it does for a multi-core cache in hardware. You may want to deploy the cache to its own container that each server calls. You will also probably have to add some locking functionality so that, if you are also using the cache for writes, data accesses stay sequential and coherent.
- I wrote the code quickly. You might want to create a CacheLine class instead of just a nested dictionary to keep things tidier (a rough sketch follows this list).
- As it is now, you have to call the methods in order, starting with check and then either get or add. A better implementation might have these all rolled into one.
- The time for each item is the time when the line was inserted into the cache. If our ttl is set to expire daily, this could lead to problems. We know that when a movie is released, people will go see it all weekend. If we are not updating any data like ratings, then every time we read the data from the cache, it might be beneficial to update the time as well.
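As a starting point for the last few suggestions, here is a minimal sketch. The lookup method name, the fetch callback, and the Lock are my own additions, not part of the class above: it uses a CacheLine dataclass, rolls check/get/add into one call, refreshes the timestamp on every read, and serializes access within a single process.

from dataclasses import dataclass
from datetime import datetime
from threading import Lock
from typing import Any, Callable

@dataclass
class CacheLine:
    time: datetime
    data: Any

class LRUCache:
    def __init__(self, size: int, ttl: int):
        self.cache = {}
        self.size = size
        self.ttl = ttl * 60  # minutes -> seconds
        self.lock = Lock()   # protects the cache across threads in one process

    def lookup(self, key, fetch: Callable[[], Any]):
        """Return cached data if fresh; otherwise call fetch() and cache it."""
        with self.lock:
            now = datetime.now()
            line = self.cache.get(key)
            if line is not None and (now - line.time).total_seconds() < self.ttl:
                line.time = now  # a read counts as a "use" for LRU purposes
                return line.data
            data = fetch()  # the expensive call, e.g. the database query
            if key not in self.cache and len(self.cache) >= self.size:
                # Evict the line with the oldest timestamp
                oldest = min(self.cache, key=lambda k: self.cache[k].time)
                del self.cache[oldest]
            self.cache[key] = CacheLine(time=now, data=data)
            return data

Usage then collapses to a single call, for example cache.lookup(movie_id, lambda: fetch_movie_from_db(movie_id)). Note that this only handles threads within one process; coordinating a cache across multiple servers is a bigger problem, as the first suggestion above points out.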