Overhead added by collecting thread dumps

A thread dump is a snapshot of all the threads running in a Java process. It’s a vital artifact for troubleshooting production problems such as CPU spikes, application unresponsiveness, poor response times, hung threads, and high memory consumption. To facilitate troubleshooting, we have seen enterprises capture thread dumps on a periodic basis (every 5 minutes or every 2 minutes). We were curious to learn how much overhead periodic thread dump capture adds, so we set out to conduct the case study below.
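To make the idea of a "snapshot of all the threads" concrete, here is a minimal sketch that uses the standard `Thread.getAllStackTraces()` call to count the live threads in the current JVM by state. The class name `ThreadStateSummary` is made up for illustration; it is not part of the study.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class ThreadStateSummary {

    // Takes a snapshot of all live threads in this JVM and counts them by state.
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            counts.merge(thread.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByState().forEach((state, count) -> System.out.println(state + ": " + count));
    }
}
```

A large number of BLOCKED threads in such a summary is the kind of signal that usually prompts a closer look at the full stack traces.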

Environment

For our study, we chose the open source Spring Boot Pet Clinic application. Pet Clinic is a poster child application that was developed to demonstrate the features of the Spring Boot framework.

We ran this application on OpenJDK 11 and deployed it on an Amazon AWS t2.medium EC2 instance, which has 16GB RAM and 2 CPUs. The test was orchestrated using the Apache JMeter stress testing tool. We used AWS CloudWatch to measure CPU and memory utilization. In a nutshell, here are the tools and technologies we used to conduct this case study:

  • OpenJDK 11
  • AWS EC2
  • AWS CloudWatch
  • Apache JMeter

Test Scenario

In this environment, we conducted 3 tests:

  1. Baseline Test – In this scenario, we ran the Pet Clinic application under a JMeter load of 200 concurrent users for 20 minutes, without capturing any thread dumps.
  2. Thread dumps every 5 minutes Test – In this scenario, we ran the Pet Clinic application using the same JMeter script for 20 minutes with 200 concurrent users. However, we captured a thread dump from the Pet Clinic application every 5 minutes.
  3. Thread dumps every 2 minutes Test – In this scenario, we ran the Pet Clinic application using the same JMeter script for 20 minutes with 200 concurrent users. However, we captured a thread dump from the Pet Clinic application every 2 minutes.

Note: If you don’t know how to capture a thread dump, see "How to capture thread dumps? 8 options" for more details.
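As one illustration of periodic capture, the sketch below takes a thread dump from inside the JVM at a fixed interval using the standard `ThreadMXBean` API and a scheduled executor. This is an assumption made for demonstration purposes: the study does not state which capture method it used, and dumps are often taken from outside the process with a tool such as jstack. The class name `PeriodicThreadDump`, the scheduler thread name, and writing to standard output are all hypothetical choices.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class PeriodicThreadDump {

    // Builds a text dump of every live thread with its full stack trace.
    static String captureDump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Request lock details only when this JVM supports them.
        ThreadInfo[] infos = bean.dumpAllThreads(
                bean.isObjectMonitorUsageSupported(),
                bean.isSynchronizerUsageSupported());
        StringBuilder dump = new StringBuilder();
        for (ThreadInfo info : infos) {
            dump.append('"').append(info.getThreadName()).append("\" ")
                .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                dump.append("    at ").append(frame).append('\n');
            }
            dump.append('\n');
        }
        return dump.toString();
    }

    // Schedules a dump every intervalMinutes on a daemon thread.
    static ScheduledExecutorService start(long intervalMinutes) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(runnable -> {
                    Thread thread = new Thread(runnable, "thread-dump-scheduler");
                    thread.setDaemon(true); // do not keep the JVM alive
                    return thread;
                });
        scheduler.scheduleAtFixedRate(
                () -> System.out.println(captureDump()),
                intervalMinutes, intervalMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

Calling `PeriodicThreadDump.start(5)` or `PeriodicThreadDump.start(2)` would correspond to the two intervals tested here; in practice the output would more likely go to a timestamped file than to standard output.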

Test Results

We captured average CPU and memory utilization from AWS CloudWatch, and average response time and throughput from JMeter. The data collected from all test scenarios is summarized in the table below.

Data collected       Baseline test   Every 5 minutes test   Every 2 minutes test
Avg CPU Usage        8.35%           10.40%                 7.92%
Avg Memory Usage     20.80%          19.90%                 19.60%
Avg Response Time    3901 ms         3888 ms                3770 ms
Avg Throughput       24.4/sec        25.8/sec               24.8/sec

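As you can see, there is no noticeable difference in CPU and memory consumption. Average CPU usage was about 2 percentage points higher than the baseline in the every-5-minutes test, but slightly lower than the baseline in the every-2-minutes test, so the variation does not track how often thread dumps were captured. Similarly, there is no noticeable difference in average response time and transaction throughput.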
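For readers who want to quantify the differences, the following sketch computes the relative change of each test against the baseline, using the averages reported in the results table. The class and method names are made up for illustration, and the calculation is plain arithmetic rather than part of the original study.

```java
// Hypothetical helper, for illustration only.
final class BaselineDelta {

    // Relative change of an observed value against the baseline, in percent.
    static double percentChange(double baseline, double observed) {
        return (observed - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Each row: baseline, every-5-minutes test, every-2-minutes test.
        String[] metrics = {"Avg CPU Usage (%)", "Avg Memory Usage (%)",
                            "Avg Response Time (ms)", "Avg Throughput (/sec)"};
        double[][] values = {
            {8.35, 10.40, 7.92},
            {20.80, 19.90, 19.60},
            {3901, 3888, 3770},
            {24.4, 25.8, 24.8},
        };
        for (int i = 0; i < metrics.length; i++) {
            System.out.printf("%-24s 5 min: %+.2f%%   2 min: %+.2f%%%n",
                    metrics[i],
                    percentChange(values[i][0], values[i][1]),
                    percentChange(values[i][0], values[i][2]));
        }
    }
}
```

Because average CPU usage is low in all three runs, a small absolute difference shows up as a large relative one, so both views are worth reading together.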

Conclusion

Based on our study, we can conclude that there is no noticeable overhead in capturing thread dumps at 5-minute or 2-minute intervals.
