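Background
While troubleshooting, I came across a case where an inappropriate use of synchronized delayed an interface's responses. Pseudocode for the problem: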
```java
public synchronized Object businessMethod(Object params) {
    Object ret = xxxx;
    Object response = httpClient.execute(params);
    // business logic
    ... ...
    return ret;
}
```
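When the remote HTTP service is slow to respond, every thread calling this method blocks, and the interface eventually times out. synchronized was not needed at all in this scenario.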
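To make the effect concrete, here is a minimal sketch that times the same slow call with and without a method-level lock. Everything in it is an assumption for illustration: the class `SynchronizedScopeDemo`, its method names, and the use of `Thread.sleep` as a stand-in for `httpClient.execute(params)`.

```java
import java.util.ArrayList;
import java.util.List;

class SynchronizedScopeDemo {

    // Stand-in for the slow remote HTTP call (hypothetical; it only sleeps).
    private static void slowRemoteCall(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Same shape as the problematic method: the lock is held during the slow call.
    synchronized void withLock(long millis) {
        slowRemoteCall(millis);
    }

    // No lock: callers wait for the remote service, not for each other.
    void withoutLock(long millis) {
        slowRemoteCall(millis);
    }

    // Starts `threads` callers at once and returns the wall-clock time until all finish.
    static long elapsedMillis(boolean locked, int threads, long callMillis) {
        SynchronizedScopeDemo demo = new SynchronizedScopeDemo();
        List<Thread> pool = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                if (locked) {
                    demo.withLock(callMillis);
                } else {
                    demo.withoutLock(callMillis);
                }
            });
            pool.add(t);
            t.start();
        }
        try {
            for (Thread t : pool) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("synchronized:   " + elapsedMillis(true, 4, 100) + " ms");
        System.out.println("unsynchronized: " + elapsedMillis(false, 4, 100) + " ms");
    }
}
```

With the lock, the four calls run one after another, so the total time grows with the number of callers. Without it, the calls overlap and the total stays close to the duration of a single call.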
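This raises a question: how can we detect thread blocking caused by synchronized or java.util.concurrent.locks.Lock?
Analysis
Starting from the object
One approach is to start from the object. The monitor on an object exposes the following information:
- the thread that holds the object's lock
- the reentrancy count of the thread that holds the lock
- the threads currently contending for the lock
- the threads that have called wait and are waiting for notify
JVMTI provides the following interface to retrieve this information: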
```c
typedef struct {
    jthread owner;
    jint entry_count;
    jint waiter_count;
    jthread* waiters;
    jint notify_waiter_count;
    jthread* notify_waiters;
} jvmtiMonitorUsage;

// Get information about the object's monitor.
// The fields of the jvmtiMonitorUsage structure are filled in with
// information about usage of the monitor.
jvmtiError GetObjectMonitorUsage(jvmtiEnv* env, jobject object, jvmtiMonitorUsage* info_ptr)
```
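However, this approach gives us no obvious place to start.
Starting from the thread
If several threads are contending for one lock, the thread that owns the lock is the one blocking all the others. Printing that owner thread's stack then lets us analyze the blocking problem.
So these two pieces of thread information matter most to us:
- the locks the thread already owns
- the lock the thread is currently contending for
JVMTI provides interfaces for both: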
Get Owned Monitor Info
```c
jvmtiError GetOwnedMonitorInfo(jvmtiEnv* env, jthread thread, jint* owned_monitor_count_ptr, jobject** owned_monitors_ptr)
```
Get Current Contended Monitor
```c
jvmtiError GetCurrentContendedMonitor(jvmtiEnv* env, jthread thread, jobject* monitor_ptr)
```
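The JVMTI functions are native. From plain Java, comparable per-thread information is available through java.lang.management.ThreadInfo. The sketch below is illustrative only: the class `ThreadLockView`, its helper methods, and the thread names "owner" and "waiter" are hypothetical, and it assumes a JVM that supports object monitor usage reporting.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class ThreadLockView {

    // Locks the thread already owns: object monitors plus ownable synchronizers.
    static List<String> ownedLocks(ThreadInfo info) {
        List<String> owned = new ArrayList<>();
        for (MonitorInfo monitor : info.getLockedMonitors()) {
            owned.add(monitor.toString()); // className@identityHashCode (hex)
        }
        for (LockInfo sync : info.getLockedSynchronizers()) {
            owned.add(sync.toString());
        }
        return owned;
    }

    // Lock the thread is currently waiting for, or null if it is not waiting on one.
    static String contendedLock(ThreadInfo info) {
        LockInfo lock = info.getLockInfo();
        return lock == null ? null : lock.toString();
    }

    // Hypothetical demo: "owner" holds a monitor while "waiter" blocks on it.
    // Returns { locks owned by "owner", lock contended by "waiter" }.
    static String[] demo() {
        Object monitor = new Object();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread owner = new Thread(() -> {
            synchronized (monitor) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "owner");
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                // nothing to do; the point is only to contend for the monitor
            }
        }, "waiter");
        try {
            owner.start();
            held.await();                 // "owner" now holds the monitor
            waiter.start();
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);         // wait until "waiter" is really blocked
            }
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            String owned = null;
            String contended = null;
            for (ThreadInfo info : bean.dumpAllThreads(bean.isObjectMonitorUsageSupported(),
                    bean.isSynchronizerUsageSupported())) {
                if ("owner".equals(info.getThreadName())) {
                    owned = ownedLocks(info).toString();
                }
                if ("waiter".equals(info.getThreadName())) {
                    contended = contendedLock(info);
                }
            }
            release.countDown();
            owner.join();
            waiter.join();
            return new String[] {owned, contended};
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("owner holds:      " + result[0]);
        System.out.println("waiter waits for: " + result[1]);
    }
}
```

Both lines should name the same java.lang.Object instance, which is what links a blocked thread back to the thread that blocks it.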
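Putting this together, the overall approach is:
- get information about all threads
- find the most contended lock object
- get the thread information associated with each lock object
- find the thread that owns the most contended lock object
arthas already implements this, so let's look at how it does it.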
arthas: thread -b
The main logic behind the arthas thread -b command is:
```java
public static BlockingLockInfo findMostBlockingLock() {
    // Fetch information about all threads
    ThreadInfo[] infos = threadMXBean.dumpAllThreads(threadMXBean.isObjectMonitorUsageSupported(),
            threadMXBean.isSynchronizerUsageSupported());

    // a map of <LockInfo.getIdentityHashCode, number of thread blocking on this>
    Map<Integer, Integer> blockCountPerLock = new HashMap<Integer, Integer>();
    // a map of <LockInfo.getIdentityHashCode, the thread info that holding this lock
    Map<Integer, ThreadInfo> ownerThreadPerLock = new HashMap<Integer, ThreadInfo>();

    // Walk the threads to collect
    // 1. each contended lock and the number of threads contending for it
    // 2. each lock that is currently held and the thread that holds it
    for (ThreadInfo info: infos) {
        if (info == null) {
            continue;
        }

        LockInfo lockInfo = info.getLockInfo();
        if (lockInfo != null) {
            // the current thread is blocked waiting on some condition
            if (blockCountPerLock.get(lockInfo.getIdentityHashCode()) == null) {
                blockCountPerLock.put(lockInfo.getIdentityHashCode(), 0);
            }
            int blockedCount = blockCountPerLock.get(lockInfo.getIdentityHashCode());
            blockCountPerLock.put(lockInfo.getIdentityHashCode(), blockedCount + 1);
        }

        for (MonitorInfo monitorInfo: info.getLockedMonitors()) {
            // the object monitor currently held by this thread
            if (ownerThreadPerLock.get(monitorInfo.getIdentityHashCode()) == null) {
                ownerThreadPerLock.put(monitorInfo.getIdentityHashCode(), info);
            }
        }

        for (LockInfo lockedSync: info.getLockedSynchronizers()) {
            // the ownable synchronizer currently held by this thread
            if (ownerThreadPerLock.get(lockedSync.getIdentityHashCode()) == null) {
                ownerThreadPerLock.put(lockedSync.getIdentityHashCode(), info);
            }
        }
    }

    // find the thread that is holding the lock that blocking the largest number of threads,
    // i.e. the thread that owns the most contended lock object
    int mostBlockingLock = 0; // System.identityHashCode(null) == 0
    int maxBlockingCount = 0;
    for (Map.Entry<Integer, Integer> entry: blockCountPerLock.entrySet()) {
        if (entry.getValue() > maxBlockingCount && ownerThreadPerLock.get(entry.getKey()) != null) {
            // the lock is explicitly held by another thread.
            maxBlockingCount = entry.getValue();
            mostBlockingLock = entry.getKey();
        }
    }

    if (mostBlockingLock == 0) {
        // nothing found
        return EMPTY_INFO;
    }

    BlockingLockInfo blockingLockInfo = new BlockingLockInfo();
    blockingLockInfo.setThreadInfo(ownerThreadPerLock.get(mostBlockingLock));
    blockingLockInfo.setLockIdentityHashCode(mostBlockingLock);
    blockingLockInfo.setBlockingThreadCount(blockCountPerLock.get(mostBlockingLock));
    return blockingLockInfo;
}
```
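Testing thread -b
The code below simulates blocking caused by synchronized and by Lock:
- the synchronized variant starts 5 threads; only one acquires the lock and the other 4 wait for it;
- the Lock variant also starts 5 threads; only one acquires the lock and the other 4 wait for it;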
```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.locks.ReentrantLock;

public class Main {
    private static final ReentrantLock REENTRANT_LOCK = new ReentrantLock();

    public static void main(String[] args) {
        System.out.println(ManagementFactory.getRuntimeMXBean().getName());
        int num = 5;
        for (int i = 0; i < num; i++) {
            Thread synchronizedT = new Thread(() -> {
                synchronized (Main.class) {
                    sleep();
                }
            });
            synchronizedT.setName("synchronizedT-" + i);
            synchronizedT.start();

            Thread lockT = new Thread(() -> {
                REENTRANT_LOCK.lock();
                try {
                    sleep();
                } finally {
                    REENTRANT_LOCK.unlock();
                }
            });
            lockT.setName("lockT-" + i);
            lockT.start();
        }
    }

    public static void sleep() {
        try {
            Thread.sleep(Long.MAX_VALUE);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```
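I expected thread -b to show two threads: the synchronized thread that acquired its lock and the Lock thread that acquired its lock. In practice thread -b returned a single thread, as shown in the screenshot. The arthas code above leads to the same conclusion: both locks block 4 threads, and the strict > comparison keeps only whichever of the two is iterated first.
The result does not match the expectation. Does this scenario count as a small bug in arthas thread -b?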
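For comparison, here is a rough sketch of a variant that reports every lock owner that is blocking other threads, rather than only the single most contended one. It is not arthas code: the class `AllBlockingLocks`, its methods, and the thread names are hypothetical. Like the arthas logic, it keys locks by identity hash code, so it inherits the assumption that those codes do not collide.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class AllBlockingLocks {

    // Owner thread name -> number of threads waiting for a lock that owner holds.
    static Map<String, Integer> blockedThreadsPerOwner(ThreadInfo[] infos) {
        Map<Integer, Integer> waitersPerLock = new HashMap<>();
        Map<Integer, String> ownerPerLock = new HashMap<>();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue;
            }
            LockInfo wanted = info.getLockInfo();
            if (wanted != null) {
                waitersPerLock.merge(wanted.getIdentityHashCode(), 1, Integer::sum);
            }
            for (MonitorInfo monitor : info.getLockedMonitors()) {
                ownerPerLock.putIfAbsent(monitor.getIdentityHashCode(), info.getThreadName());
            }
            for (LockInfo sync : info.getLockedSynchronizers()) {
                ownerPerLock.putIfAbsent(sync.getIdentityHashCode(), info.getThreadName());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<Integer, Integer> entry : waitersPerLock.entrySet()) {
            String owner = ownerPerLock.get(entry.getKey());
            if (owner != null) { // keep only locks that some thread explicitly holds
                result.merge(owner, entry.getValue(), Integer::sum);
            }
        }
        return result;
    }

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Same scenario as the test program: n threads per lock type, one owner each.
    static Map<String, Integer> demo(int n) {
        Object monitor = new Object();
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            threads.add(new Thread(() -> {
                synchronized (monitor) {
                    await(release);
                }
            }, "sync-" + i));
            threads.add(new Thread(() -> {
                lock.lock();
                try {
                    await(release);
                } finally {
                    lock.unlock();
                }
            }, "lock-" + i));
        }
        try {
            for (Thread t : threads) {
                t.start();
            }
            // Wait until each thread either holds its lock (parked on the latch) or is queued for it.
            for (Thread t : threads) {
                while (t.getState() != Thread.State.BLOCKED && t.getState() != Thread.State.WAITING) {
                    Thread.sleep(10);
                }
            }
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            Map<String, Integer> result = blockedThreadsPerOwner(bean.dumpAllThreads(
                    bean.isObjectMonitorUsageSupported(), bean.isSynchronizerUsageSupported()));
            release.countDown();
            for (Thread t : threads) {
                t.join();
            }
            return result;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(5));
    }
}
```

In this scenario the map should contain one "sync-" owner and one "lock-" owner, each blocking 4 threads. Which thread of each group wins its lock depends on scheduling.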
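Another point: the help text of arthas thread -b does not match the test result:
Summary
- the synchronized keyword and java.util.concurrent.locks.Lock are used to synchronize threads;
- JVMTI provides GetObjectMonitorUsage to query an object's monitor, GetOwnedMonitorInfo to query the monitors a thread owns, and GetCurrentContendedMonitor to query the monitor a thread is contending for;
- ManagementFactory.getThreadMXBean() offers dumpAllThreads, and each ThreadInfo exposes LockInfo, LockedMonitors and LockedSynchronizers.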