Amin Qurjili

Distributed Locks, why we need them, their implementation strategies and a simple concrete distributed implementation using Java and Zookeeper

December 25, 2022
5 min
Amin Qurjili

What is Distributed Lock?

In software development, a distributed lock is a synchronization mechanism that allows multiple processes or threads to coordinate access to a shared resource in a distributed system.

A distributed lock is used to ensure that only one process or thread can access a shared resource at a time, in order to prevent race conditions and other synchronization issues. It is called a “distributed” lock because it can be used to synchronize access to shared resources across multiple nodes (instances or machines) in a distributed system.

Distributed locks can be used in a variety of applications, including distributed databases, distributed file systems, distributed transactional applications and distributed cache systems. They are an important tool for building scalable and reliable distributed systems.

Why we need distributed locks?

Distributed locks are used in distributed systems to ensure that only one process or thread can access a shared resource at a time. This is necessary because in a distributed system, multiple processes or threads may be running concurrently and may try to access the same resource simultaneously. If this happens, it can lead to race conditions and other synchronization issues like dirty read and writes and also read and write skews that can result in incorrect or inconsistent behavior or data.

By using a distributed lock, processes or threads can coordinate access to the shared resource and ensure that only one of them can access it at a time. This helps to prevent race conditions and other synchronization issues, and ensures that the shared resource is accessed in a consistent and predictable manner.

Distributed locks are particularly useful in distributed systems that are highly concurrent, where multiple processes or threads may be accessing shared resources simultaneously. They can also be useful in systems that need to ensure strict consistency and reliability, such as distributed databases.

Overall, distributed locks are an important tool for building scalable and reliable distributed systems, and are used to ensure that shared resources are accessed in a consistent and predictable manner.

How distributed locks are implemented?

There are several ways to implement a distributed lock in a distributed system, and the most stable way will depend on the specific requirements and constraints of your system. Some common approaches include:

Mutual exclusion locks: This is a simple technique that uses a shared resource, such as a file or database record (Zookeeper, Redis, ETCD, etc.), to implement a lock. When a process or thread acquires the lock, it modifies the shared resource in a way that indicates that the lock is held, and other processes or threads are unable to acquire the lock until it is released.
Leases: This technique involves granting a process or thread a lease on a shared resource for a fixed period of time. When the lease expires, the process or thread must renew the lease in order to continue holding the lock. This can help to prevent stale locks and ensure that the lock is eventually released if the process or thread holding the lock fails.
Quorums: This technique involves using a quorum of nodes in a distributed system to implement a lock. A quorum is a majority of the nodes in the system, and in order to acquire the lock, a process or thread must obtain a vote from a majority of the nodes in the quorum. This can help to ensure that the lock is only granted to processes or threads that have the support of a majority of the nodes in the system.

However, there is no silver bullet way to implement a distributed lock and the most stable way to implement a distributed lock will depend on the specific requirements and constraints of your system, as well as the trade-offs you are willing to make in terms of complexity, performance, and fault tolerance. It is often best to choose an approach that strikes a balance between these factors, and that is well-suited to the needs of your particular application.

How to implement a distributed lock using Zookeeper in Java:

For creating a very simple distributed lock using Zookeeper in java you can follow down these steps and use the very simplified code snippet provided below:

First, you will need to set up a Zookeeper cluster and create a client connection to it from your Java application. You can use the Zookeeper Java client library to do this.
Next, you will need to create a lock node in the Zookeeper tree. This node will be used to hold the lock state and to coordinate access to the shared resource. You can create the node using the ‘create()’ method of the Zookeeper client.
To acquire the lock, your Java application will need to create a child node under the lock node using the ‘create()’ method, and set a watch on the lock node using the ‘exists()’ method. If the child node is successfully created, the lock has been acquired and the shared resource can be accessed. If the child node cannot be created, it means that the lock is already held by another process, and the application will need to wait until the lock is released before trying to acquire it again.
To release the lock, your Java application will need to delete the child node it created using the ‘delete()’ This will allow other processes to acquire the lock and access the shared resource.

package com.qurjili.distributedlock;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

public class Locker {
// Set up a ZooKeeper client and connect to the cluster
ZooKeeper locker = new ZooKeeper("localhost:2181", 3000, null);

// Create the lock node in the ZooKeeper tree
locker.create("/lock",new byte[0],ZooDefs.Ids.OPEN_ACL_UNSAFE,CreateMode.PERSISTENT);

while(true){
try {
// Try to create a child node under the lock node
locker.create("/lock/node", new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

// If the node was successfully created, we have acquired the lock
break;
} catch (KeeperException.NodeExistsException e) {
// If the node already exists, it means the lock is held by another process
// Set a watch on the lock node and wait for it to be released
locker.exists("/lock", true);
}
}

// Access the shared resource here
// Access the shared resource here
// Access the shared resource here
// Access the shared resource here
// Access the shared resource here
// Access the shared resource here
// Access the shared resource here

// Release the lock by deleting the child node
locker.delete("/lock/node",-1);
}

**Resource One**: In the example code I provided, new byte[0] creates a new zero-length byte array. This byte array is used as the initial data for the nodes being created in the Zookeeper tree.

In Zookeeper, nodes in the tree can have data associated with them, which is stored as an array of bytes. When creating a node using the create() method of the Zookeeper client, you can specify the initial data for the node as an argument. In the example code, the initial data for the lock node and the child nodes is an empty byte array, which means that the nodes have no data associated with them.

The use of an empty byte array as the initial data for the nodes is just a convenience in this example code, and is not necessary for the functioning of the distributed lock. In a real application, you may want to store additional data in the nodes, such as metadata or state information related to the lock or the shared resource.

Overall, the use of new byte[0] in the example code is just a way to specify that the nodes being created have no initial data, and is not a critical part of the distributed lock implementation.

**Resource Two**: In the example code, ZooDefs.Ids.OPEN_ACL_UNSAFE is the identifier for an access control list (ACL) that grants full permissions to all users. It is defined in the ZooDefs class, which is part of the Apache Zookeeper Java client library.

An ACL is a list of permissions that controls who can access a node in the Zookeeper tree. When creating or modifying a node, you can specify an ACL to control who has access to the node. The ZooDefs.Ids.OPEN_ACL_UNSAFE ACL grants full permissions to all users, which means that any user can read, write, and delete the node.

This ACL is called “unsafe” because it provides no security or access control, and should only be used in cases where security is not a concern. In a production system, you should use a more secure ACL that limits access to the node to specific users or groups.

In the example code, the ZooDefs.Ids.OPEN_ACL_UNSAFE ACL is used when creating the lock node and the child nodes under it. This allows any process to acquire the lock and access the shared resource, and is suitable for the purposes of the example code. However, in a real application, you should use a more secure ACL that limits access to the lock and the shared resource to authorized users or processes.

**Resource Three**: In the context of the Java Apache Zookeeper client library, CreateMode.PERSISTENT and CreateMode.EPHEMERAL are constants that are used to specify the type of node to create in a Zookeeper ensemble and indicates the lock lifetime.

CreateMode.PERSISTENT indicates that the node being created is a persistent node. A persistent node is a node in the Zookeeper ensemble that is not automatically deleted when the client that created it disconnects. Persistent nodes are used to store data that needs to be persisted across client connections.

CreateMode.EPHEMERAL indicates that the node being created is an ephemeral node. An ephemeral node is a node in the Zookeeper ensemble that is automatically deleted when the client that created it disconnects. Ephemeral nodes are used to store data that is not needed beyond the lifetime of the client that created it.

These constants are typically used when creating nodes in the Zookeeper ensemble using the create() method of the Zookeeper class.

*Note: This is just a basic example of how to use Zookeeper to implement a distributed lock in Java. You may need to add additional error handling and other features to make the lock more robust and reliable in your application.

*Note: Since Zookeeper connection pool is limited you must consider creating a application scoped connection to mitigate timeouts or connection limits.

*Note: a wise tip for keeping order of request for a lock can be to create a queue and when a thread request a lock which already exists; add those requests to that queue so threads can acquire their locks by the order which they requested.

Keywords

Comments (0)

Leave your comment

Cancel reply

Test Driven Development (TDD): A Strategic Approach to Mitigating Risks in Enterprise Software Projects

Embarking on complex enterprise projects early in my career and advancing into management within few years, my journey into risk management commenced over a decade ago. This experience laid bare the intrinsic connection

November 20, 2023
11 Ways to Overcome Procrastination – Easy tips to stop putting things off

Key points Procrastination is not a time management problem; rather, it’s likely due to difficulty managing negative feelings like boredom or anxiety. But avoiding negative emotions—and important tasks—tends to lead to much worse

September 25, 2023