Diagnose CPU spike in a non-intrusive manner!

In this post, we discuss a non-intrusive approach (i.e., an approach that doesn’t add any noticeable overhead to the application) to diagnosing CPU spikes. Because the overhead is negligible, you can use this approach in your production environment to troubleshoot CPU spikes.

Works on all JVM languages:
This approach can be used to troubleshoot CPU spikes in any programming language that runs on the Java Virtual Machine (JVM), such as Java, Scala, Kotlin, JRuby, and Jython.

Step 1: Capture 360° data

You can use the open source yCrash data script to capture 360° data from your application stack. This script captures 16 different artifacts from your application stack (GC log, thread dump, heap substitute, netstat, iostat, ...) and runs in less than 30 seconds, so it doesn’t add any measurable overhead to your application. You can trigger this script from any platform (all Linux flavors, Windows, ...) and any environment (bare metal, cloud, containers, k8s, ...).

Fig: 360-degree data

Here are the steps to run this script:

1. Download the latest yc-data-script from this location.

2. Unzip the downloaded yc-agent-latest.zip file. (Say you unzip it into the ‘/opt/workspace/yc-agent-latest’ folder.)

3. In the unzipped folder, you will find a yc-data-script build for each operating system:

a) linux/yc – If you are running on Unix/Linux, then use this script.

b) windows/yc.exe – If you are running on Windows, then use this script.

c) mac/yc – If you are running on macOS, then use this script.

4. You can execute the yc script by issuing the following command:

./yc -j {JAVA_HOME} -onlyCapture -p {PID}

Where,

JAVA_HOME is the home directory where JDK is installed

PID is the target JVM’s process ID

Example:

./yc -j /usr/java/jdk1.8.0_141 -onlyCapture -p 15326

When you pass the above arguments, yc-data-script captures all the application-level and system-level artifacts/logs from your application stack for analysis. The captured artifacts are compressed into a zip file and stored in the directory from which the command was executed. The zip file name follows the format ‘yc-YYYY-MM-DDTHH-mm-ss.zip’. Example: ‘yc-2021-03-06T14-02-42.zip’.
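If several captures accumulate in the same directory, the timestamp in the file name can be used to pick the most recent one. The following is a minimal Python sketch, assuming only the naming format described above; the helper names capture_time and latest_capture are hypothetical and are not part of yc-data-script.

```python
from datetime import datetime

# Naming format from the post: yc-YYYY-MM-DDTHH-mm-ss.zip
NAME_FORMAT = "yc-%Y-%m-%dT%H-%M-%S.zip"


def capture_time(filename):
    """Parse the capture timestamp embedded in a yc zip file name."""
    return datetime.strptime(filename, NAME_FORMAT)


def latest_capture(filenames):
    """Return the most recent capture name, or None if none match the format."""
    stamped = []
    for name in filenames:
        try:
            stamped.append((capture_time(name), name))
        except ValueError:
            continue  # not a yc capture file, skip it
    return max(stamped)[1] if stamped else None
```

For example, latest_capture(os.listdir(".")) would return the newest capture in the current directory (after importing os).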

Step 2: Analyze captured data

Once you have captured the data, you can analyze it by uploading the captured zip file to the yCrash server. The yCrash server analyzes all the captured data and generates one unified root cause analysis report instantly. Note: yCrash has a free tier, which you can use for CPU diagnosis. In the yCrash incident report you will see a ‘CPU consumption by thread’ section under the ‘Thread’ report (as shown below):

Fig: CPU consumption by threads reported by yCrash

This section will show all the CPU consuming threads and the exact lines of code they are working on. Equipped with this information, you can spot the ‘black sheep’ lines of code that are causing the CPU to spike up.

How does it work?
‘Thread dump’ and ‘top -H -p {PROCESS_ID}’ are two of the artifacts that the yCrash data script captures. The ‘top -H -p {PROCESS_ID}’ command lists the threads within the specified process, showing each thread’s ID along with its CPU and memory consumption. The thread dump shows the code path each thread is executing. The yCrash tool combines these two data sets to produce the above report. For more details, see this case study on troubleshooting CPU spikes in a major trading application.
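To make the correlation concrete, here is a minimal Python sketch of how the two artifacts can be joined by hand. It is not yCrash's implementation. It assumes the default column layout of top (thread ID first, %CPU ninth) and the classic HotSpot thread dump format (as in JDK 8), where each thread header carries the native thread ID in hexadecimal as nid=0x...; newer JDKs may print this field differently. The sample data, thread names, and com.example classes are made up for illustration.

```python
import re

# Header of a classic HotSpot thread dump entry, e.g.
# "worker-1" #12 prio=5 ... nid=0x3be2 runnable [...]
HEADER = re.compile(r'^"(?P<name>.+?)".*\bnid=0x(?P<nid>[0-9a-fA-F]+)')


def parse_top_threads(top_output):
    """Return {decimal_thread_id: cpu_percent} from `top -H` data rows."""
    usage = {}
    for line in top_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric thread ID; %CPU is assumed to be
        # the 9th column, as in top's default layout.
        if len(fields) >= 12 and fields[0].isdigit():
            usage[int(fields[0])] = float(fields[8])
    return usage


def parse_thread_dump(dump_text):
    """Return {native_thread_id: {"name": ..., "frames": [...]}}."""
    threads = {}
    current = None
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = {"name": header.group("name"), "frames": []}
            # nid is hexadecimal; top reports the same ID in decimal.
            threads[int(header.group("nid"), 16)] = current
        elif not line.strip():
            current = None  # a blank line ends the current entry
        elif current is not None and line.strip().startswith("at "):
            current["frames"].append(line.strip()[3:])
    return threads


def hot_threads(top_output, dump_text, limit=3):
    """Return (cpu_percent, thread_name, top_frame) for the busiest threads."""
    usage = parse_top_threads(top_output)
    threads = parse_thread_dump(dump_text)
    busiest = sorted(usage.items(), key=lambda item: item[1], reverse=True)
    report = []
    for tid, cpu in busiest[:limit]:
        thread = threads.get(tid)
        if thread is None:
            continue  # thread not present in the dump
        top_frame = thread["frames"][0] if thread["frames"] else None
        report.append((cpu, thread["name"], top_frame))
    return report


# Hypothetical sample data for illustration only.
SAMPLE_TOP = '''\
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
15330 app       20   0 4567890 512340  20480 R 98.7  6.3   1:23.45 java
15331 app       20   0 4567890 512340  20480 S  0.3  6.3   0:00.12 java
'''

SAMPLE_DUMP = '''\
"worker-1" #12 prio=5 os_prio=0 tid=0x00007f2a4c0e1000 nid=0x3be2 runnable [0x00007f2a2d5f4000]
   java.lang.Thread.State: RUNNABLE
    at com.example.Hot.spin(Hot.java:17)
    at java.lang.Thread.run(Thread.java:748)

"worker-2" #13 prio=5 os_prio=0 tid=0x00007f2a4c0e3000 nid=0x3be3 waiting on condition [0x00007f2a2d4f3000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
'''

if __name__ == "__main__":
    for cpu, name, frame in hot_threads(SAMPLE_TOP, SAMPLE_DUMP):
        print(f"{cpu:5.1f}%  {name}  {frame}")
```

The key step is the ID conversion: thread 15330 in the top output is the same thread as nid=0x3be2 in the dump, so the stack trace under that header shows the code the CPU-heavy thread is running.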

I hope this approach helps you isolate CPU-consuming lines of code effectively. Happy Troubleshooting!!

5 thoughts on “Diagnose CPU spike in a non-intrusive manner!”

David Griffiths says:

July 4, 2023 at 1:57 pm

Hi, do you have any examples of applications that exhibit such CPU spikes?

1. Ram Lakshmanan says:
  
  July 5, 2023 at 2:43 pm
  
  Hello David! BuggyApp is a simple open-source project that simulates various performance problems. One of the problems it can simulate is a CPU spike. You can run this app either from the command line or from the web. More details about this open-source project can be found here: https://github.com/ycrash/buggyapp
  
  1. David Griffiths says:
    
    July 5, 2023 at 6:19 pm
    
    Cool, thanks!
