Consistent Hashing in Distributed Systems

By Ali Adnan

Load Balancing is a key concept in distributed systems, it is one of the challenging problem to make sure there is equal distribution, hashing is important to achieve this and consistent hashing is one of the solid ways to do it

What is Hashing?

Hashing is the process of mapping one piece of data. A function is usually used for mapping objects to hash code known as a hash function.

For example, a hash function can be used to map random size strings to some fixed number between 0 … N. Like

“abc” -> 123

Load Balancing

Load balancing is about distributing a set of tasks over a set of resources. For example, we use load balancing to distribute the API requests to web server instances.

Mod-n Hashing

In this technique, each key is hashed using a hashing function to transform an input into an integer. Then, we perform a modulo based on the number of nodes.

mod-n-hashing

The benefit of this approach is its statelessness. We don’t have to keep any state to remind us that request2 was routed to node 1. But things would change we add or remove a node.

mod-n-hashing-conclusion

This results in lots of performance issues, because now requests will move to other nodes and there will be lots of caches miss, to handle this we use consistent hashing.

Consistent Hashing

Its a technique, released in a paper by Karger in 1977 that works on fixed keys-slots ring algorithm to find the node, in the following example its just 8 portions ring, in real world example it could be a lot bigger.

consistent-hashing-ring

We map our nodes/server on this ring using a hash function, in the following case, there are three nodes.

consistent-hashing-ring-nodes

When it receives the request, it gets the hash of the requests and checks the ring clock-wise and routes the request to first node it finds in the range. Lets take example of three requests

f(request1) = 1

f(request2) = 4

f(request3) = 6

so these requests will be handled as following

consistent-hashing-ring-nodes-requests

Request1 will be handled by N2, request2 by N3 and request3 by N1, so far so good now lets suppose we need to add a new node between N2 and N3 to handle more requests.

consistent-hashing-ring-nodes-requests-conclusion

Now request2 will route to N4 instead of N3, rest is all unchanged, which is great but you will analyze that traffic won’t be evenly distributed to all nodes another case would be if N2 is crashed or removed, then all requests will route to N4 which will eventually overload N4, so solve this virtual nodes are created and multiple hashing functions are used to get different indices for nodes as following

consistent-hashing-ring-final-conclusion

In a excellent consistent hashing algorithm only 1/n keys are mapped again where n is number of nodes. O(1) to add or remove nodes.

DiskSweep Utility for MacOS

March 30, 2025 No Comments

Is Your Mac Running Out of Space? We’ve created DiskSweep – two lightweight utilities (Python + Bash) to help you find and manage large files. Perfect for

The Accountability Revolution: How AI is Forcing Recruitment Out of the Shadows

March 28, 2025 No Comments

Let’s be real: hiring has always been a messy, biased, and often unfair game. A manager rejects a candidate for a typo. A recruiter hires

The Silent Candidate Killer: How Murky Hiring Processes Erode Trust (And How to Fix It)

March 28, 2025 No Comments

Imagine spending weeks polishing your resume, researching the company, and practicing answers to “Tell me about a time you failed” — only to submit your