How to Troubleshoot High CPU Usage in a Java Application
Author: 烟雨楼台笑江湖
Troubleshooting high CPU usage in a Java application
1. Use top to find the process with high CPU usage
top
top - 14:16:06 up 27 min,  3 users,  load average: 0.00, 0.01, 0.02
Tasks:  88 total,   2 running,  86 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :   999692 total,   454684 free,   206396 used,   338612 buff/cache
KiB Swap:   204796 total,   204796 free,        0 used.   614532 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1113 root      20   0 2583028 131816  15412 S   0.3 13.2   0:01.65 java
    1 root      20   0  128164   6832   4060 S   0.0  0.7   0:00.66 systemd
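If many processes are running, it can be easier to capture a sorted snapshot non-interactively. A minimal sketch, assuming a reasonably recent procps-ng top (the -o sort flag is not available on very old versions):

# one batch-mode iteration, sorted by CPU, keeping the header plus the top few processes
top -b -n 1 -o %CPU | head -n 15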
2. Use top -H -p [pid] to find the thread with high CPU usage
top -H -p 1113
top - 14:17:34 up 28 min,  3 users,  load average: 0.00, 0.01, 0.02
Threads:  19 total,   0 running,  19 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :   999692 total,   454800 free,   206272 used,   338620 buff/cache
KiB Swap:   204796 total,   204796 free,        0 used.   614652 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 1114 root      20   0 2583028 131816  15412 S  0.0 13.2   0:00.74 java
 1115 root      20   0 2583028 131816  15412 S  0.0 13.2   0:00.00 java
 1116 root      20   0 2583028 131816  15412 S  0.0 13.2   0:00.02 java
 1117 root      20   0 2583028 131816  15412 S  0.0 13.2   0:00.00 java
 1118 root      20   0 2583028 131816  15412 S  0.0 13.2   0:00.00 java
3. Use jstack -l [pid] to view the JVM's thread information
jstack -l 1113
[root@node01 ~]# jstack -l 1113
2023-01-06 14:20:52
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode):

"Attach Listener" #17 daemon prio=9 os_prio=0 tid=0x00007fb904001000 nid=0x4b2 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   Locked ownable synchronizers:
    - None

"DestroyJavaVM" #16 prio=5 os_prio=0 tid=0x00007fb93400c800 nid=0x45a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   Locked ownable synchronizers:
    - None

"FileWatchService" #15 prio=5 os_prio=0 tid=0x00007fb934788800 nid=0x46a waiting on condition [0x00007fb90d3d2000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000e392af58> (a org.apache.rocketmq.common.CountDownLatch2$Sync)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
    at org.apache.rocketmq.common.CountDownLatch2.await(CountDownLatch2.java:114)
    at org.apache.rocketmq.common.ServiceThread.waitForRunning(ServiceThread.java:139)
    at org.apache.rocketmq.srvutil.FileWatchService.run(FileWatchService.java:70)
    at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
    - None

"NettyEventExecutor" #10 prio=5 os_prio=0 tid=0x00007fb934787800 nid=0x469 waiting on condition [0x00007fb90d4d3000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000e13b8328> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at org.apache.rocketmq.remoting.netty.NettyRemotingAbstract$NettyEventExecutor.run(NettyRemotingAbstract.java:589)
    at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
    - None

"NettyNIOBoss_1" #14 prio=5 os_prio=0 tid=0x00007fb934787000 nid=0x468 runnable [0x00007fb90d5d4000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000e1c8e928> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x00000000e1c83868> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000e1c82430> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
    at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:68)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:810)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
    - None

"NSScheduledThread1" #12 prio=5 os_prio=0 tid=0x00007fb93473c000 nid=0x467 waiting on condition [0x00007fb90d6d5000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000e12f8ad0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
    - None

"ServerHouseKeepingService" #11 daemon prio=5 os_prio=0 tid=0x00007fb93457c800 nid=0x466 in Object.wait() [0x00007fb920a25000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.util.TimerThread.mainLoop(Timer.java:552)
    - locked <0x00000000e13b8b70> (a java.util.TaskQueue)
    at java.util.TimerThread.run(Timer.java:505)
   Locked ownable synchronizers:
    - None

"AsyncAppender-Worker-RocketmqNamesrvAppender" #9 daemon prio=5 os_prio=0 tid=0x00007fb934549000 nid=0x465 waiting on condition [0x00007fb920b26000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000e128ee68> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
    at ch.qos.logback.core.AsyncAppenderBase$Worker.run(AsyncAppenderBase.java:289)
   Locked ownable synchronizers:
    - None

"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007fb9340e3800 nid=0x463 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   Locked ownable synchronizers:
    - None

"C1 CompilerThread1" #7 daemon prio=9 os_prio=0 tid=0x00007fb9340e0800 nid=0x462 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   Locked ownable synchronizers:
    - None

"C2 CompilerThread0" #6 daemon prio=9 os_prio=0 tid=0x00007fb9340de800 nid=0x461 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   Locked ownable synchronizers:
    - None

"Signal Dispatcher" #5 daemon prio=9 os_prio=0 tid=0x00007fb9340dc000 nid=0x460 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   Locked ownable synchronizers:
    - None

"Surrogate Locker Thread (Concurrent GC)" #4 daemon prio=9 os_prio=0 tid=0x00007fb9340db000 nid=0x45f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   Locked ownable synchronizers:
    - None

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fb9340a8000 nid=0x45e in Object.wait() [0x00007fb9218f5000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000e0008ed0> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
    - locked <0x00000000e0008ed0> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
   Locked ownable synchronizers:
    - None

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fb9340a5000 nid=0x45d in Object.wait() [0x00007fb9219f6000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000e0006bf8> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
    - locked <0x00000000e0006bf8> (a java.lang.ref.Reference$Lock)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
   Locked ownable synchronizers:
    - None

"VM Thread" os_prio=0 tid=0x00007fb93409b800 nid=0x45c runnable

"Concurrent Mark-Sweep GC Thread" os_prio=0 tid=0x00007fb93402c000 nid=0x45b runnable

"VM Periodic Task Thread" os_prio=0 tid=0x00007fb9340e6800 nid=0x464 waiting on condition

JNI global references: 318
Pay close attention to the nid values here: they are in hexadecimal. Take the thread ID obtained earlier with top -H -p 1113 and convert it to hex (for example, 1114 above corresponds to 0x45a), then use that value to filter the dump: jstack -l 1113 | grep 45a
[root@node01 ~]# jstack -l 1113 | grep 45a
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fb9340a5000 nid=0x45a in Object.wait() [0x00007fb9219f6000]
This pinpoints the problematic thread.
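The decimal-to-hex conversion and the filtering can be scripted. A minimal sketch, reusing the example PID 1113 and thread ID 1114 from above (substitute your own values); grep's -A option also prints the stack frames below the matching header line:

TID=1114                                   # decimal thread id reported by top -H
NID=$(printf '%x' "$TID")                  # jstack prints thread ids in hex as nid=0x...
jstack -l 1113 | grep -A 20 "nid=0x$NID"   # header line plus up to 20 following lines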
This section only records the procedure; I did not have a suitable example at hand, so the command output shown above may not reflect an actual problem. I will update it when I run into a real case.
If the CPU spike turns out to be caused by full GC, you probably need to look at memory instead, for example a heap that is too small causing frequent full GCs.
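To check the GC angle, sampling the GC counters is usually enough. A minimal sketch, again assuming the example PID 1113:

jstat -gcutil 1113 1000 10    # heap occupancy and GC counters, sampled every 1 s, 10 times
# If the FGC (full GC count) and FGCT (full GC time) columns grow quickly while CPU is high,
# the problem is memory pressure rather than application code.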
Hands-on analysis of a CPU usage spike (Java application)
Plenty of blog posts have already analyzed high CPU usage in Java applications.
It almost always comes down to some thread performing a CPU-intensive operation.
Our main job is to use Linux commands and the JDK tools to find that specific thread and then analyze the code it is executing.
This article is fairly simple; treat it as a record of the process.
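For reference, the whole procedure can be strung together into a rough script. This is only a sketch under some assumptions: there is exactly one java process of interest, GNU procps tools are installed, and the per-thread %CPU that ps reports is averaged over the thread's lifetime (top -H remains the better way to catch a live spike):

#!/bin/sh
PID=$(pgrep -n java)                                                     # newest java process
TID=$(ps -Lp "$PID" -o tid=,pcpu= --sort=-pcpu | awk 'NR==1{print $1}')  # busiest thread id
NID=$(printf '%x' "$TID")                                                # hex form used in jstack's nid=
jstack -l "$PID" | grep -A 30 "nid=0x$NID"                               # that thread's stack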
1. Preparation
1.1 Install and start Tomcat in Docker
# 1. Pull the Tomcat image
hxw@hxwdeMacBook-Pro ~ % docker pull tomcat
# 2. Once it is downloaded, start a Tomcat container
#    Tomcat listens on 8080 by default; map it to port 8088 on the Mac
hxw@hxwdeMacBook-Pro ~ % docker run -p 8088:8080 -d tomcat
# 3. After it starts, check it with docker ps
hxw@hxwdeMacBook-Pro ~ % docker ps
CONTAINER ID   IMAGE           COMMAND             CREATED       STATUS          PORTS                    NAMES
606ca0c7f783   tomcat:8.0.15   "catalina.sh run"   4 hours ago   Up 42 minutes   0.0.0.0:8088->8080/tcp   gracious_greider
# 4. Verify that Tomcat started
#    Open localhost:8088 in a browser; the Tomcat welcome page should appear
1.2 Prepare the WAR package and copy it into the Docker Tomcat
The WAR contains a method that drives CPU usage up; it is explained at the end of the article.
# 1. Prepare the WAR package, named SpringTest.war
# 2. Copy it into the container
hxw@hxwdeMacBook-Pro ~ % docker cp SpringTest.war 606ca0c7f783:/usr/local/tomcat/webapps
# 3. Restart Tomcat
hxw@hxwdeMacBook-Pro ~ % docker restart 606ca0c7f783
2. Analyzing the CPU spike
The relevant method in SpringTest.war has to be invoked first to trigger the problem; how it is triggered is not shown here and is explained at the end of the article.
Note: all of the following operations are performed inside the Tomcat container.
2.1 Enter the Tomcat container and install the required tools
# 1. Enter the container with docker exec
hxw@hxwdeMacBook-Pro ~ % docker exec -it 606ca0c7f783 /bin/bash
root@606ca0c7f783:/usr/local/tomcat#
# 2. Install a JDK; this image does not include one by default
root@606ca0c7f783:/usr/local/tomcat# apt-get update
root@606ca0c7f783:/usr/local/tomcat# apt-get install default-jdk
root@606ca0c7f783:/usr/local/tomcat# apt-get install less
The main tools used are top and the JDK (jstack).
The less command is also missing from this Tomcat image and needs to be installed (if you need other commands, install them yourself in the usual Debian way).
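Depending on the base image, top and ps may also be missing; on Debian-based images they come from the procps package. A sketch of the extra install step, in case your image needs it (not required in the image used here):

root@606ca0c7f783:/usr/local/tomcat# apt-get install -y procps   # provides top, ps, free, ...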
2.2 Analyze process CPU usage with top
root@606ca0c7f783:/usr/local/tomcat# top
top - 07:42:09 up  8:50,  0 users,  load average: 1.06, 0.41, 0.25
Tasks:   3 total,   0 running,   2 sleeping,   0 stopped,   0 zombie
%Cpu(s): 22.0 us, 49.9 sy,  0.0 ni, 28.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   2036420 total,  1895796 used,   140624 free,   140304 buffers
KiB Swap:  1048572 total,    64248 used,   984324 free.   817248 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    1 root      20   0 3318896 400936  13704 S 101.3 19.7   0:52.43 java
   58 root      20   0  169276  10464   3052 S   0.0  0.5   0:00.22 bash
 3186 root      20   0  170660   9408   4084 R   0.0  0.5   0:00.00 top
It is easy to see that the process with PID 1 is using 101% CPU; this is the process driving the spike.
2.3 Find the thread that is driving this process's CPU up
# The 1 here is the PID of the CPU-heavy process found above
root@606ca0c7f783:/usr/local/tomcat# top -H -p 1

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 3388 root      20   0 3319700 392824  13732 R 99.8 19.3   1:29.68 java
    1 root      20   0 3319700 392824  13732 S  0.0 19.3   0:00.05 java
   27 root      20   0 3319700 392824  13732 S  0.0 19.3   0:00.00 java
   28 root      20   0 3319700 392824  13732 S  0.0 19.3   0:02.65 java
   29 root      20   0 3319700 392824  13732 S  0.0 19.3   0:00.36 java
   30 root      20   0 3319700 392824  13732 S  0.0 19.3   0:00.37 java

The thread with PID 3388 is using 99.8% CPU, so that is the thread we need to analyze.
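As an alternative to interactive top, the pidstat tool from the sysstat package can print the same per-thread CPU breakdown non-interactively. It is not part of this image, so this is only a sketch assuming sysstat has been installed (apt-get install sysstat):

root@606ca0c7f783:/usr/local/tomcat# pidstat -t -p 1 1 5   # per-thread (-t) CPU for PID 1, every 1 s, 5 samples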
2.4 Inspect the process's stack with jstack
# 1. Dump the stack of process 1 into 2.txt
root@606ca0c7f783:/usr/local/tomcat# jstack 1 > 2.txt
# 2. Convert the thread ID to hexadecimal (3388 is the thread found in 2.3)
root@606ca0c7f783:/usr/local/tomcat# printf "%x\n" 3388
d3c
# 3. Look up the d3c thread's information
root@606ca0c7f783:/usr/local/tomcat# grep d3c 2.txt
"http-nio-8080-exec-7" daemon prio=10 tid=0x00000040b4005000 nid=0xd3c runnable [0x00000041100b6000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:345)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    - locked <0x00000000d6d2ff58> (a java.io.BufferedOutputStream)
    at java.io.PrintStream.write(PrintStream.java:482)
    - locked <0x00000000d6d2ff38> (a java.io.PrintStream)
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
    at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
    at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
    - locked <0x00000000d6d30088> (a java.io.OutputStreamWriter)
    at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)
    at java.io.PrintStream.newLine(PrintStream.java:546)
    - eliminated <0x00000000d6d2ff38> (a java.io.PrintStream)
    at java.io.PrintStream.println(PrintStream.java:737)
    - locked <0x00000000d6d2ff38> (a java.io.PrintStream)
    at org.apache.tomcat.util.log.SystemLogHandler.println(SystemLogHandler.java:237)
    # The specific application code shows up here
    at com.example.controller.JvmTest.incre(JvmTest.java:21)
The analysis finally points to the JvmTest.incre() method as the cause of the CPU spike.
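To be confident the hot frame is not a one-off, it helps to take a few dumps a couple of seconds apart and check that the same nid keeps showing the same stack. A minimal sketch, reusing the container paths and the 0xd3c thread from above:

root@606ca0c7f783:/usr/local/tomcat# for i in 1 2 3; do jstack 1 > "dump_$i.txt"; sleep 2; done
root@606ca0c7f783:/usr/local/tomcat# grep -A 15 "nid=0xd3c" dump_*.txt | less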
2.5 Verify the analysis against the code
Since the analysis points to JvmTest.incre(), let us look at the corresponding code directly.
@Controller
@RequestMapping("/jvmtest")
public class JvmTest {

    @RequestMapping(path = "/increment", method = RequestMethod.GET)
    public String incre() {
        int i = 0;
        for (int j = 0; j < 10000000; j++) {
            System.out.println(i++);
        }
        return "success";
    }
}
It is easy to see that this for loop, with its ten million System.out.println calls, is what drives the CPU up.
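For completeness, the spike can be reproduced and watched while it runs. The URL below is an assumption pieced together from the WAR name and the mappings above (context path /SpringTest, controller mapping /jvmtest/increment); adjust it to your actual deployment:

# on the host (port 8088 is mapped to the container's 8080):
hxw@hxwdeMacBook-Pro ~ % curl "http://localhost:8088/SpringTest/jvmtest/increment" &
# inside the container, an http-nio-8080-exec-* thread should pin a core while the loop runs:
root@606ca0c7f783:/usr/local/tomcat# top -H -p 1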
Note:
High CPU usage in a Java application is not hard to track down; this standard procedure will locate the culprit in the vast majority of cases.
Summary
The above is based on my own experience; I hope it serves as a useful reference.