Troubleshooting Blocked Threads On ServiceNow MID Server

Troubleshooting blocked JVM threads on ServiceNow MID Servers is crucial for maintaining the stability and performance of Java applications integrated with the ServiceNow platform. This guide walks through the problem step by step: configuring the MID Server, integrating a Java application with ServiceNow, setting up an EC2 instance, simulating blocked threads, and analyzing them with the yCrash tool.

Blocked Threads

Blocked threads in ServiceNow can occur due to various reasons:

  1. Resource Contention: Threads may become blocked when they are contending for resources that are held by other threads. This often happens in multi-threaded applications where multiple threads are competing for access to shared resources such as database connections, locks, or I/O operations.
  2. Synchronization Issues: Blocked threads can also result from synchronization issues, such as deadlocks or livelocks. Deadlocks occur when two or more threads are waiting for each other to release resources, while livelocks happen when threads are in a state of perpetual contention, unable to make progress.
  3. Long-running Operations: Threads may become blocked while waiting for long-running operations to complete, such as network requests, database queries, or file I/O. If these operations take too long to complete, they can cause other threads to block while waiting for the resources they need.
  4. Thread Starvation: Blocked threads can also be caused by thread starvation, where certain threads are unable to obtain the CPU time or resources they need to execute, often due to higher priority threads monopolizing resources.
  5. External Dependencies: Blocked threads can be a result of external dependencies, such as third-party APIs or services, which may experience delays or downtime, causing threads to block while waiting for responses or resources from these external systems.

Simulating Blocked Threads

The Java program given below simulates blocked threads on any machine or container in which it is launched:

public class BlockedThreads {

    public static void start() {
        for (int counter = 0; counter < 9; ++counter) {
            new AppThread().start();
        }
    }

    public static void main(String[] args) {
        start(); 
    }
}

class AppThread extends Thread {

    @Override
    public void run() {
        AppObject.getSomething();
    }
}

class AppObject {
    
    private static final long BLOCK_TIME = 5 * 60 * 1000; // 5 minutes
    
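    // static synchronized: the monitor is the AppObject class itself,
    // so only one thread at a time can be inside this method
    public static synchronized void getSomething() {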
        long startTime = System.currentTimeMillis();

        while (System.currentTimeMillis() - startTime <= BLOCK_TIME) {
            try {
                Thread.sleep(1000); 
            } catch (InterruptedException e) {
                System.out.println(Thread.currentThread().getName() + " was interrupted.");
                return; 
            }
        }
    }
}

The ‘BlockedThreads’ class is designed to demonstrate thread blocking in Java through the use of synchronized methods and sleep intervals. When the program is run, the ‘main()’ method invokes the ‘start()’ method. This ‘start()’ method creates 9 instances of the ‘AppThread’ class. Each instance runs as its own separate thread and is responsible for calling the ‘getSomething()’ method on the ‘AppObject’ class.

Each ‘AppThread’ attempts to access the synchronized ‘getSomething()’ method. Because the method is both static and synchronized, a thread must acquire the lock on the ‘AppObject’ class before entering it, so only one thread can execute ‘getSomething()’ at a time. The first thread to acquire this lock prevents any other thread from entering the method until it has finished its work.

Within the ‘getSomething()’ method, a while loop keeps the thread inside the method for 5 minutes, sleeping in one-second intervals. Because ‘sleep()’ does not release the lock, the thread holds the lock for the entire period. The ‘sleep()’ method is used here to simulate a time-consuming operation.

While the first thread sleeps inside the synchronized ‘getSomething()’ method, any other ‘AppThread’ that attempts to enter this method goes into the BLOCKED state, because the sleeping thread still owns the lock. These threads wait for the first thread to exit the ‘getSomething()’ method and release the synchronization lock.

Once the first thread finishes (after 5 minutes), it releases the lock, and the next waiting thread acquires the lock and enters the method. This sequence continues until all 9 threads have had a chance to execute the ‘getSomething()’ method.
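The states described above can be observed directly with ‘Thread.getState()’. The following is a minimal sketch, not part of the original program: the class name ‘BlockedStateDemo’, the ‘release’ flag, and the short sleep intervals are all assumptions made so that the demo finishes in well under a second instead of 5 minutes.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockedStateDemo {

    private static final Object LOCK = new Object();
    private static volatile boolean release = false; // hypothetical flag to end the demo early

    // Holds LOCK while sleeping in short intervals, similar to getSomething() in the article.
    private static void holdLock() {
        synchronized (LOCK) {
            while (!release) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }
    }

    // Starts threadCount contending threads and returns how many were observed BLOCKED.
    public static int countBlocked(int threadCount) {
        release = false;
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(BlockedStateDemo::holdLock, "demo-" + i);
            t.start();
            threads.add(t);
        }
        int blocked = 0;
        long deadline = System.currentTimeMillis() + 5000; // give up after 5 seconds
        try {
            while (System.currentTimeMillis() < deadline) {
                blocked = 0;
                for (Thread t : threads) {
                    if (t.getState() == Thread.State.BLOCKED) {
                        blocked++;
                    }
                }
                if (blocked == threadCount - 1) {
                    break; // every thread except the lock holder is waiting
                }
                Thread.sleep(20);
            }
            release = true; // let the holder, then each waiter, pass through
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            release = true;
            Thread.currentThread().interrupt();
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("BLOCKED threads: " + countBlocked(9));
    }
}
```

With 9 threads, one thread owns the lock (and is itself in TIMED_WAITING while it sleeps), so the other 8 are the ones expected to show up as BLOCKED.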

BLOCKED Threads in ServiceNow MID server

Now let’s simulate this blocked threads scenario in the ServiceNow MID Server environment. After compiling the program, let’s create a JAR (Java Archive) file that contains all three compiled classes by issuing the command below:

jar cf BlockedThreads.jar BlockedThreads.class AppThread.class AppObject.class

Once the JAR file is created, let’s upload and run this program in the ServiceNow MID Server as documented in the MID Server setup guide. This guide provides a detailed walkthrough on how to run a custom Java application in the ServiceNow MID Server infrastructure. It walks through the following steps:

  1. Creating a ServiceNow application
  2. Installing MID Server in AWS EC2 instance
  3. Configuring MID Server
  4. Installing Java application within MID Server
  5. Running Java application from MID Server

We strongly encourage you to check out the guide if you are not sure how to run custom Java applications in the ServiceNow MID Server infrastructure.

Blocked threads diagnosis in ServiceNow using yCrash

yCrash is a lightweight monitoring tool designed to pinpoint performance bottlenecks and provide actionable recommendations within the ServiceNow environment. In fact, ServiceNow itself uses yCrash internally to troubleshoot its performance problems.

When the blocked threads scenario occurred on the ServiceNow MID Server, yCrash was monitoring the micro-metrics of the ServiceNow environment. It promptly detected the blocked threads and reported them in detail on its dashboard, enabling efficient resolution.

Fig 1: Thread Dump reporting blocked threads

As shown in the above figure, the machine learning algorithms in the tool identified that ‘Thread-0’ in the ServiceNow application is unable to proceed because it is “stuck” in the sleep0() method, and that it is blocking 8 other threads in the application.

Fig 2: BLOCKED Threads Transitive Graph

The tool also reported a transitive dependency graph of the BLOCKED threads, as shown in the above figure. This graph provides a visual representation of the BLOCKED threads, which makes the problem easy to understand. In the graph, the red circle is ‘Thread-0’, which is blocking 8 other threads in the application.

Fig 3: Thread stack trace view

The tool also reported the stack traces of the threads involved in this bottleneck. The above figure is the stack trace of ‘Thread-0’, which is blocking 8 other threads. The stack trace clearly points to the code path in the application that is causing the bottleneck: ‘Thread-0’ acquired the lock on the ‘AppObject’ class and then went to sleep.
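The same owner-to-waiter relationship can also be read from inside the JVM through the standard ‘java.lang.management’ API. The sketch below is only an illustration and says nothing about how yCrash works internally; the class name ‘LockOwnerReport’ and its method are hypothetical. It counts, for each lock-owning thread, how many threads are BLOCKED waiting on it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

public class LockOwnerReport {

    // Maps each lock-owning thread name to the number of threads BLOCKED on its lock.
    public static Map<String, Integer> blockedCountByOwner() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Integer> counts = new TreeMap<>();
        // false, false: skip the optional locked-monitor and synchronizer details
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // thread ended while the dump was being taken
            }
            String owner = info.getLockOwnerName();
            if (info.getThreadState() == Thread.State.BLOCKED && owner != null) {
                counts.merge(owner, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // In a JVM with no lock contention this prints nothing.
        blockedCountByOwner().forEach((owner, count) ->
                System.out.println(owner + " is blocking " + count + " thread(s)"));
    }
}
```

If a method like this were called from within the simulated program while the first thread holds the lock, the map would be expected to contain a single entry for the lock owner with the waiting threads counted against it.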

To address this problem and ensure the application remains responsive, it is necessary to review the logic within the ‘getSomething()’ method and consider how long ‘Thread-0’ should hold onto the lock, especially when it’s sleeping. Making adjustments so that the lock is held for the minimum necessary time, or removing long sleep intervals when holding important locks, would prevent other threads from being blocked and improve the application’s concurrency and responsiveness. You can see the report there.
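One way to apply this advice is to move the slow work outside the synchronized region and hold the lock only for the brief update of shared state. The sketch below assumes the slow part does not itself need the lock, which may or may not be true for a real application; the class ‘NarrowLockScope’, the ‘completed’ counter, and the ‘workMillis’ parameter are hypothetical stand-ins.

```java
public class NarrowLockScope {

    private static final Object LOCK = new Object();
    private static int completed = 0; // shared state guarded by LOCK

    public static void getSomething(long workMillis) {
        try {
            Thread.sleep(workMillis); // stand-in for slow work, done with no lock held
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        synchronized (LOCK) {
            completed++; // lock is held only for this short shared update
        }
    }

    public static int completedCount() {
        synchronized (LOCK) {
            return completed;
        }
    }

    // Runs threadCount threads concurrently and returns the elapsed time in milliseconds.
    public static long runAll(int threadCount, long workMillis) {
        synchronized (LOCK) {
            completed = 0;
        }
        Thread[] threads = new Thread[threadCount];
        long start = System.nanoTime();
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> getSomething(workMillis));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = runAll(9, 200);
        System.out.println(completedCount() + " threads finished in " + elapsed + " ms");
    }
}
```

Because the threads no longer queue behind one another for the duration of the slow work, the total time should be close to a single work interval rather than nine of them in sequence.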

If you need to troubleshoot performance issues in your ServiceNow deployment using yCrash, feel free to sign up here to start using the free cloud-based tier. Alternatively, if your security requirements as a large enterprise prevent you from sending data to the cloud, you can register here to access the on-premises installation of yCrash.

Conclusion

In summary, yCrash’s analysis of blocked threads offers valuable insights into thread-related issues, facilitating efficient resolution. The integration with ServiceNow, combined with automated incident reporting, enhances IT service management efficiency. This integration is particularly beneficial for addressing blocked thread challenges within large applications integrated with ServiceNow, ultimately contributing to a more resilient and efficient IT environment. If you want to diagnose performance problems in your ServiceNow deployment using yCrash, you may register here.
