System Design 101 - Caching

In the previous post, we discussed the top system design concepts that software engineers should know. One of them is caching.

Caching is an essential concept in system design that involves storing frequently accessed data in memory or on disk to improve system performance and reduce the need to retrieve data from slower data sources, such as databases or network storage. Caching can be implemented at various levels of a system, including application-level caching, database-level caching, and network-level caching.

In this blog post, we'll explore the concept of caching in more detail and discuss some real-world examples and code samples.

Why Caching is Important

Caching is important in system design for several reasons. First, it can help to improve system performance by reducing the amount of time it takes to access frequently used data. By storing data in memory or on disk, rather than retrieving it from a slower data source, such as a database or network storage, caching can significantly reduce system latency and improve response times.

Second, caching can help to reduce the load on data sources, such as databases or network storage. By storing frequently accessed data in memory or on disk, caching can reduce the number of requests made to these data sources, which can help to improve their performance and reduce the risk of overloading them.

Third, caching can help to improve system scalability by reducing the need for additional hardware or infrastructure. By improving system performance and reducing the load on data sources, caching can help to extend the life of existing hardware and infrastructure, and delay the need for additional investments.

Types of Caching

Various types of caching can be implemented in a system, including:

Application-level Caching

Application-level caching involves caching data within the application itself, typically in memory. This can be implemented using various caching frameworks, such as Memcached or Redis. Application-level caching can be useful for storing frequently accessed data, such as user profiles or session data, and can significantly improve application performance.
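As a rough sketch of application-level caching, here's how a web application might cache user profiles in Redis using the redis-py client. The get_user_from_db function, the key format, and the connection settings are placeholders for this example, not part of any particular application:

import json
import redis

# Connect to a local Redis server (host and port are assumptions for this sketch)
r = redis.Redis(host="localhost", port=6379, db=0)

def get_user_from_db(user_id):
    # Placeholder for a real database query
    return {"id": user_id, "name": "Example User"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        # Cache hit: deserialize and return the cached profile
        return json.loads(cached)
    # Cache miss: load from the database and cache the result for 5 minutes
    user = get_user_from_db(user_id)
    r.set(key, json.dumps(user), ex=300)
    return user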

Database-level Caching

Database-level caching involves caching query results within the database itself, typically in memory. This can be implemented using various caching frameworks, such as Oracle Coherence or Amazon ElastiCache. Database-level caching can be useful for reducing the load on databases and improving database performance.

Network-level Caching

Network-level caching involves caching data at various points within the network, such as at edge servers or content delivery networks (CDNs). This can be implemented using various caching frameworks, such as Varnish or Cloudflare. Network-level caching can be useful for improving system performance and reducing the load on data sources, such as origin servers or databases.
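At the network level, caching behavior is usually controlled through HTTP headers that CDNs, reverse proxies, and browsers honor, rather than through application code. As a minimal sketch, here's a Flask endpoint (Flask is used purely for illustration) that marks its response as cacheable by shared caches for 10 minutes:

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/products")
def products():
    response = jsonify([{"id": 1, "name": "Example product"}])
    # Allow shared caches (CDNs, reverse proxies) to store this response for 10 minutes
    response.headers["Cache-Control"] = "public, max-age=600"
    return response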

Caching Patterns

Let's dive into some of the caching patterns that can be used when implementing caching in your systems.

  1. Write-through caching: This pattern involves writing data to both the cache and the underlying data store at the same time. This ensures that the cache and the data store remain consistent at all times. However, it can also result in slower write performance due to the additional write operations.

  2. Write-around caching: With this pattern, data is written directly to the data store and is only cached later, when it is read. This keeps the cache from being filled with data that is written but rarely read and improves write performance, but it can result in slower reads the first time data is accessed.

  3. Write-back caching: This pattern involves writing data to the cache first and then asynchronously writing it back to the data store at a later time. This can improve write performance by reducing the number of writes to the data store, but it can also introduce the risk of data loss if the system crashes before the data is written back to the data store.

  4. Cache-aside caching: With this pattern, the application retrieves data from the data store on a cache miss and stores it in the cache for subsequent reads. When the data is updated, it is written directly to the data store and the corresponding cache entry is invalidated. This can provide good read and write performance, but it requires more cache management by the application to ensure consistency (a minimal sketch of this pattern follows below).

  5. Read-through caching: With this pattern (sometimes called cache-through), reads go through the cache: the cache is checked first, and on a miss the data is loaded from the data store into the cache. When data is updated, it is typically written to both the cache and the data store. This can provide good read performance, but it can result in slower write performance due to the additional write operations.

It's important to consider the specific use case and requirements when selecting a caching pattern. Each pattern has its own strengths and weaknesses, and the appropriate pattern will depend on factors such as the read and write performance requirements, the consistency requirements, and the overall system architecture.
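To make a few of these patterns concrete, here's a minimal in-memory sketch of cache-aside reads alongside write-through and write-around updates. Both db and cache are plain dictionaries standing in for a real data store and a real cache:

# A plain dictionary standing in for the real data store
db = {"user:1": {"name": "Alice"}}
cache = {}

def read_cache_aside(key):
    # Cache-aside: check the cache first, then fall back to the
    # data store and populate the cache on a miss
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value

def write_through(key, value):
    # Write-through: write to the cache and the data store together
    cache[key] = value
    db[key] = value

def write_around(key, value):
    # Write-around: write only to the data store and drop any cached copy
    db[key] = value
    cache.pop(key, None)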

Caching Solutions

Let's take a look at some popular caching solutions used in the industry:

  1. Redis: Redis is an in-memory key-value store that supports a wide variety of data structures such as strings, hashes, lists, and sets. It is often used as a cache for frequently accessed data in web applications, message brokers, and other distributed systems.

  2. Memcached: Memcached is an open-source, distributed memory caching system that is commonly used to speed up web applications. It allows you to store key-value pairs in memory and supports a wide range of programming languages including Java, PHP, and Python.

  3. Amazon ElastiCache: Amazon ElastiCache is a managed, in-memory data store service provided by Amazon Web Services (AWS). It supports both Memcached and Redis and can be used to cache frequently accessed data in web applications running on AWS.

  4. Varnish Cache: Varnish Cache is a web application accelerator that is designed to improve the performance of websites. It works by caching frequently accessed content, such as images and CSS files, and serving them directly from memory instead of fetching them from the web server.

  5. Squid: Squid is a caching proxy server that is commonly used to speed up web browsing. It works by caching frequently accessed web pages, and serving them directly from memory instead of fetching them from the web server. Squid can also be used as a reverse proxy to improve the performance of web applications.

  6. Hazelcast: Hazelcast is an open-source, in-memory data grid that is often used for distributed caching. It allows you to store data in memory across multiple nodes in a cluster, providing high availability and low-latency access to frequently accessed data.

These are just a few examples of the many caching solutions available in the industry. The choice of caching solution will depend on the specific requirements of your application, such as the amount of data to be cached, the level of availability required, and the expected access patterns.
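For instance, working with Memcached from Python is typically just a matter of setting and getting keys through a client library. A small sketch, assuming a Memcached server on the default port and the pymemcache client:

from pymemcache.client.base import Client

# Connect to a local Memcached server (the address is an assumption for this sketch)
client = Client(("localhost", 11211))

# Store a value with a 60-second expiration, then read it back (values come back as bytes)
client.set("greeting", "hello", expire=60)
print(client.get("greeting"))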

Real-World Examples

Here are some real-world examples of caching in action:

  1. Facebook: Facebook uses Memcached extensively for caching user profiles, photos, and other frequently accessed data. According to Facebook, Memcached allows them to handle over 200 million active users with a relatively small number of servers.

  2. Amazon: Amazon uses Amazon ElastiCache for caching frequently accessed data in their online store. ElastiCache allows Amazon to reduce the load on their databases and improve response times for customers.

  3. Netflix: Netflix uses a combination of application-level caching and network-level caching to improve system performance and reduce the load on its infrastructure. Its application-level caching layer, EVCache, is a distributed cache built on Memcached, while its Open Connect CDN caches video content at the network edge, close to viewers.

Code Sample - Implementing Caching in Python

Now that we've seen some of the benefits of caching, let's look at how we can implement caching in Python.

Using a Simple Cache

One of the simplest ways to implement caching in Python is to use a dictionary to store the cached data. Here's a simple example:

import time

# Create a dictionary to store the cached data
cache = {}

# A function that takes a long time to execute
def slow_function(arg):
    time.sleep(5) # Simulate a long computation
    return arg.upper()

# A wrapper function that adds caching to the slow function
def cached_function(arg):
    # Check if the result is already in the cache
    if arg in cache:
        return cache[arg]
    else:
        # If the result is not in the cache, compute it and store it in the cache
        result = slow_function(arg)
        cache[arg] = result
        return result

In this example, we have a slow function slow_function that takes a long time to execute. We've also created a dictionary called cache to store the cached data.

We've then defined a wrapper function called cached_function that calls slow_function but adds caching. The cached_function first checks if the result for the given argument arg is already in the cache dictionary. If it is, it returns the cached result. Otherwise, it calls slow_function to compute the result, stores the result in the cache dictionary, and returns the result.

Here's an example of how to use the cached_function:

# Call the slow function (this takes 5 seconds)
result1 = slow_function("hello")
print(result1)

# Call the cached function (this also takes 5 seconds)
result2 = cached_function("hello")
print(result2)

# Call the cached function again (this should be fast since the result is cached)
result3 = cached_function("hello")
print(result3)

As you can see, the first time we call slow_function, it takes 5 seconds to execute. However, when we call cached_function with the same argument, it also takes 5 seconds to execute since the result is not cached yet. But when we call cached_function again with the same argument, it returns immediately since the result is already cached.

This simple caching mechanism can be very effective in speeding up our application if we have functions that take a long time to execute and return data that doesn't change frequently. However, it has some limitations. The cache grows without bound, since entries are never evicted or expired, and if the underlying data changes frequently, the cache may become stale and return outdated results. In such cases, we need a more sophisticated caching mechanism that can handle invalidation and expiration of cached data.

Using a Cache Library

Python provides several cache libraries that we can use to implement caching. One popular library is cachetools, which provides a variety of caching algorithms and features. Here's an example of how to use cachetools:

from cachetools import cached, TTLCache
import time

# A function that takes a long time to execute
@cached(cache=TTLCache(maxsize=100, ttl=300))
def slow_function(arg):
    time.sleep(5) # Simulate a long computation
    return arg.upper()

In this example, we've used the @cached decorator from cachetools to add caching to slow_function. The TTLCache holds up to 100 entries (maxsize=100), and each cached result expires after 300 seconds (ttl=300), so stale results are automatically discarded.
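Assuming the decorated function above, the first call with a given argument pays the full cost, while repeated calls within the 300-second TTL are served from the cache:

start = time.time()
print(slow_function("hello"))  # First call: takes about 5 seconds
print(slow_function("hello"))  # Second call: served from the cache almost instantly
print(f"Total time: {time.time() - start:.1f} seconds")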

Cache Maintenance

Once we have implemented caching in our system, we need to ensure that the cache is maintained properly. This includes adding and removing items as necessary and ensuring that the cache stays within the specified size limit. One way to do that is to use a proper cache eviction policy.

Cache eviction policies determine which cache entries are removed from the cache when the cache becomes full. There are several different eviction policies available, each with its own advantages and disadvantages. Let's take a look at a few common eviction policies:

  1. Least Recently Used (LRU): The LRU policy removes the least recently used cache entry when the cache becomes full. This policy is based on the assumption that cache entries that have not been accessed recently are less likely to be accessed again in the near future (a small sketch of this policy appears later in this section).

  2. First In, First Out (FIFO): The FIFO policy removes the oldest cache entry when the cache becomes full. This policy is based on the assumption that the oldest cache entries are the least likely to be accessed again in the near future.

  3. Least Frequently Used (LFU): The LFU policy removes the least frequently used cache entry when the cache becomes full. This policy is based on the assumption that cache entries that have been accessed less frequently are less likely to be accessed again in the near future.

  4. Random: The random policy removes a cache entry at random when the cache becomes full. This policy is simple to implement, but may not be the most efficient as it can lead to the removal of frequently accessed cache entries.

It's important to choose the right eviction policy based on the specific requirements of your application. For example, if your application has a high read-to-write ratio and frequently accessed cache entries are more likely to be accessed again in the near future, the LRU policy may be the best choice. However, if your application has a high write-to-read ratio and frequently accessed cache entries are not likely to be accessed again in the near future, the LFU policy may be more appropriate.
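As a rough illustration of how LRU eviction works, here's a small cache built on Python's collections.OrderedDict. A production system would normally rely on a library or on the caching server's built-in eviction policy rather than hand-rolling this:

from collections import OrderedDict

class LRUCache:
    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        # Mark the entry as most recently used
        self.entries.move_to_end(key)
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.maxsize:
            # Evict the least recently used entry
            self.entries.popitem(last=False)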

In addition to these basic eviction policies, there are more advanced ones. For example, the LRU-K policy tracks the last K references to each cache entry, where K is a configurable parameter, and evicts the entry whose K-th most recent access is furthest in the past; this makes eviction decisions less sensitive to one-off bursts of access than plain LRU. Other policies also take into account factors such as the size of cache entries or the cost of re-creating them.

Conclusion

Caching is an important technique for improving the performance and scalability of software systems. By storing frequently accessed data in memory, we can reduce the amount of time spent accessing slower storage systems, such as disks or databases.

However, it's important to use caching judiciously and to consider factors such as cache size, cache eviction policies, and cache consistency when designing and implementing caching in our systems.

Thank you for staying with me so far. Hope you liked the article. You can connect with me on LinkedIn where I regularly discuss technology and life. Also, take a look at some of my other articles and my YouTube channel. Happy reading. 🙂