一次由于OOM导致锁没有释放的定位流程(结合Arthas)

本文涉及的产品
传统型负载均衡 CLB,每月750个小时 15LCU
EMR Serverless StarRocks,5000CU*H 48000GB*H
应用型负载均衡 ALB,每月750个小时 15LCU
简介: 一次由于OOM导致锁没有释放的定位流程(结合Arthas)
本文基于Spring Cloud Fincheley SR4, Arthas 3.1.4

最近发现业务上返回慢,并且feign的fallback被触发了。查看日志,发现触发了重试,是什么触发的重试呢,通过异常堆栈发现:


Caused by: feign.RetryableException: connect timed out executing GET http://test-service/test-api
        at feign.FeignException.errorExecuting(FeignException.java:65) ~[feign-core-9.7.0.jar!/:?]
        at feign.SynchronousMethodHandler.executeAndDecode(SynchronousMethodHandler.java:105) ~[feign-core-9.7.0.jar!/:?]
        at feign.SynchronousMethodHandler.invoke(SynchronousMethodHandler.java:77) ~[feign-core-9.7.0.jar!/:?]
        at feign.hystrix.HystrixInvocationHandler$1.run(HystrixInvocationHandler.java:107) ~[feign-hystrix-9.7.0.jar!/:?]
        at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:302) ~[hystrix-core-1.5.18.jar!/:1.5.18]
        at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:298) ~[hystrix-core-1.5.18.jar!/:1.5.18]
        at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:46) ~[rxjava-1.3.8.jar!/:1.3.8]
        ... 27 more
Caused by: java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399) ~[?:?]
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242) ~[?:?]
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224) ~[?:?]
        at java.net.Socket.connect(Socket.java:591) ~[?:?]
        at sun.net.NetworkClient.doConnect(NetworkClient.java:177) ~[?:?]
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:474) ~[?:?]
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:569) ~[?:?]
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:242) ~[?:?]
        at sun.net.www.http.HttpClient.New(HttpClient.java:341) ~[?:?]
        at sun.net.www.http.HttpClient.New(HttpClient.java:362) ~[?:?]
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1248) ~[?:?]
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1187) ~[?:?]
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1081) ~[?:?]
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1015) ~[?:?]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587) ~[?:?]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1515) ~[?:?]
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527) ~[?:?]
        at feign.Client$Default.convertResponse(Client.java:150) ~[feign-core-9.7.0.jar!/:?]
        at feign.Client$Default.execute(Client.java:72) ~[feign-core-9.7.0.jar!/:?]
        at org.springframework.cloud.sleuth.instrument.web.client.feign.TracingFeignClient.execute(TracingFeignClient.java:91) ~[spring-cloud-sleuth-core-2.0.3.RELEASE.jar!/:2.0.3.RELEASE]
        at org.springframework.cloud.sleuth.instrument.web.client.feign.LazyTracingFeignClient.execute(LazyTracingFeignClient.java:55) ~[spring-cloud-sleuth-core-2.0.3.RELEASE.jar!/:2.0.3.RELEASE]
        at org.springframework.cloud.openfeign.ribbon.RetryableFeignLoadBalancer$1.doWithRetry(RetryableFeignLoadBalancer.java:103) ~[spring-cloud-openfeign-core-2.0.3.RELEASE.jar!/:2.0.3.RELEASE]
        at org.springframework.cloud.openfeign.ribbon.RetryableFeignLoadBalancer$1.doWithRetry(RetryableFeignLoadBalancer.java:88) ~[spring-cloud-openfeign-core-2.0.3.RELEASE.jar!/:2.0.3.RELEASE]
        at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:287) ~[spring-retry-1.2.4.RELEASE.jar!/:?]
        at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:180) ~[spring-retry-1.2.4.RELEASE.jar!/:?]
        at org.springframework.cloud.openfeign.ribbon.RetryableFeignLoadBalancer.execute(RetryableFeignLoadBalancer.java:88) ~[spring-cloud-openfeign-core-2.0.3.RELEASE.jar!/:2.0.3.RELEASE]
        at org.springframework.cloud.openfeign.ribbon.RetryableFeignLoadBalancer.execute(RetryableFeignLoadBalancer.java:54) ~[spring-cloud-openfeign-core-2.0.3.RELEASE.jar!/:2.0.3.RELEASE]
        at com.netflix.client.AbstractLoadBalancerAwareClient$1.call(AbstractLoadBalancerAwareClient.java:104) ~[ribbon-loadbalancer-2.2.5.jar!/:2.2.5]
        at com.netflix.loadbalancer.reactive.LoadBalancerCommand$3$1.call(LoadBalancerCommand.java:303) ~[ribbon-loadbalancer-2.2.5.jar!/:2.2.5]
        at com.netflix.loadbalancer.reactive.LoadBalancerCommand$3$1.call(LoadBalancerCommand.java:287) ~[ribbon-loadbalancer-2.2.5.jar!/:2.2.5]
        at rx.internal.util.ScalarSynchronousObservable$3.call(ScalarSynchronousObservable.java:231) ~[rxjava-1.3.8.jar!/:1.3.8]
        at rx.internal.util.ScalarSynchronousObservable$3.call(ScalarSynchronousObservable.java:228) ~[rxjava-1.3.8.jar!/:1.3.8]
        at rx.Observable.unsafeSubscribe(Observable.java:10327) ~[rxjava-1.3.8.jar!/:1.3.8]


发现调用微服务test-service连接超时,注意是连接超时,而不是读取超时。接下来,首先查看ribbon连接超时配置,看是否配置的连接超时过短,发现是:


#ribbon默认连接超时
ribbon.ConnectTimeout=500

500ms建立tcp连接已经不短了,所以配置没有问题。从当前实例手动调用微服务实例的接口,看是否是通的:

curl http://test-service的ip:test-service的端口

发现访问成功,没有阻塞。证明网络连接没有问题,我们来看下netstat网络连接状态:

Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0      0 test-5c665cc496-6jkx5:54644 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54666 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54680 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54088 ip-10-238-8-68.eu-central-1.compute.internal:8211 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54674 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54682 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54652 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp       32      0 test-5c665cc496-6jkx5:47092 244-213-201-91.test.com:https CLOSE_WAIT  
tcp        0      0 test-5c665cc496-6jkx5:54662 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54624 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp       32      0 test-5c665cc496-6jkx5:47098 244-213-201-91.test.com:https CLOSE_WAIT  
tcp        0      0 test-5c665cc496-6jkx5:54672 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54630 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:58758 ip-10-238-9-71.eu-central-1.compute.internal:mysql ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:8251 10-238-8-117.api-gateway.test1.svc.cluster.local:43132 FIN_WAIT2   
tcp        0      0 test-5c665cc496-6jkx5:54648 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:58778 ip-10-238-9-71.eu-central-1.compute.internal:mysql ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54646 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54628 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54650 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54632 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:54638 ip-10-238-8-119.eu-central-1.compute.internal:6379 ESTABLISHED 
tcp        0      0 test-5c665cc496-6jkx5:59434 ip-10-238-9-71.eu-central-1.compute.internal:mysql ESTABLISHED 
tcp       32      0 test-5c665cc496-6jkx5:47104 244-213-201-91.test.com:https CLOSE_WAIT  
tcp        1      0 test-5c665cc496-6jkx5:54146 ip-10-238-8-68.eu-central-1.compute.internal:8211 CLOSE_WAIT  
tcp       32      0 test-5c665cc496-6jkx5:47100 244-213-201-91.test.com:https CLOSE_WAIT  
tcp       32      0 test-5c665cc496-6jkx5:47106 244-213-201-91.test.com:https CLOSE_WAIT

发现:

  1. 并没有建立与现在service-test这个微服务任何实例ip的链接
  2. 并没有很多网络连接,也并没有很多Timed_waiting还有close_wait

猜想,可能是调用的service-test的ip地址不是最新的。

现在该Athas登场了,启动Arthas,我们查看下feign调用的真实ip: 由于我们使用了sleuth,所以这里用sleuth监控的feigh client去看调用ip:


watch org.springframework.cloud.sleuth.instrument.web.client.feign.TracingFeignClient execute params[0].url()

对应的代码是:

package org.springframework.cloud.sleuth.instrument.web.client.feign;
final class TracingFeignClient implements Client {
  @Override public Response execute(Request request, Request.Options options)
      throws IOException {
          //.....
      }
}

通过观察,我们发现,确实是以前的ip。 那么为什么没有更新呢,我们来查看下EurekaClient相关的代码,参考我另一篇文章:Spring Cloud Eureka 全解 (4) - 核心流程-服务与实例列表获取详解. 我们来看下PollingServerListUpdater这个类,负责更新的线程池是:

int coreSize = poolSizeProp.get();
ThreadFactory factory = (new ThreadFactoryBuilder())
        .setNameFormat("PollingServerListUpdater-%d")
        .setDaemon(true)
        .build();
_serverListRefreshExecutor = new ScheduledThreadPoolExecutor(coreSize, factory);


谢天谢地,这个线程是有名字的。我们通过Arthas看下线程列表:

[arthas@24]$ thread
thread
Threads Total: 736, NEW: 0, RUNNABLE: 81, BLOCKED: 0, WAITING: 634, TIMED_WAITIN
G: 21, TERMINATED: 0                                                            
ID     NAME                GROUP        PRIORI STATE %CPU   TIME   INTER DAEMON 
11     Log4j2-TF-2-AsyncLo main         5      TIMED 39     138:46 false true   
17     Log4j2-TF-5-AsyncLo main         5      TIMED 37     137:27 false true   
1045   as-command-execute- system       10     RUNNA 19     0:0    false true   
1027   AsyncAppender-Worke system       9      WAITI 0      0:0    false true   
68     AsyncResolver-boots main         5      TIMED 0      0:0    false true   
264    AsyncResolver-boots main         5      WAITI 0      0:0    false true   
1025   Attach Listener     system       9      RUNNA 0      0:0    false true   
9      Common-Cleaner      InnocuousThr 8      TIMED 0      0:0    false true   
151    DataPublisher       main         5      TIMED 0      0:2    false true   
107    DestroyJavaVM       main         5      RUNNA 0      0:37   false false  
69     DiscoveryClient-0   main         5      TIMED 0      0:1    false true   
70     DiscoveryClient-1   main         5      WAITI 0      0:1    false true   
748    DiscoveryClient-Cac main         5      WAITI 0      0:9    false true   
751    DiscoveryClient-Hea main         5      WAITI 0      0:7    false true   
71     DiscoveryClient-Ins main         5      TIMED 0      0:0    false true   
24     Druid-ConnectionPoo main         5      WAITI 0      0:0    false true   
21     Druid-ConnectionPoo main         5      WAITI 0      0:0    false true   
25     Druid-ConnectionPoo main         5      TIMED 0      0:0    false true   
22     Druid-ConnectionPoo main         5      TIMED 0      0:0    false true   
20     Druid-ConnectionPoo main         5      TIMED 0      0:0    false true   
67     Eureka-JerseyClient main         5      WAITI 0      0:0    false true   
3      Finalizer           system       8      WAITI 0      0:0    false true   
119    ForkJoinPool.common main         5      WAITI 0      0:9    false true   
30     I/O dispatcher 1    main         5      RUNNA 0      0:2    false false  
155    NFLoadBalancer-Ping main         5      TIMED 0      0:0    false true   
150    PollingServerListUp main         5      WAITI 0      0:0    false true   
157    PollingServerListUp main         5      WAITI 0      0:0    false true   
2      Reference Handler   system       10     RUNNA 0      0:0    false true   
146    RibbonApacheHttpCli main         5      TIMED 0      0:0    false true   
135    RxComputationSchedu main         5      TIMED 0      1:44   false true   
132    RxIoScheduler-1 (Ev main         5      TIMED 0      0:0    false true   
4      Signal Dispatcher   system       9      RUNNA 0      0:0    false true   
114    SimplePauseDetector main         5      TIMED 0      2:10   false true   
115    SimplePauseDetector main         5      TIMED 0      2:10   false true   
116    SimplePauseDetector main         5      TIMED 0      2:12   false true


发现关键字PollingServerListUp有两个线程,150和157,分别看看他们在干什么:

[arthas@24]$ thread 150
thread 150
"PollingServerListUpdater-0" Id=150 WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@778af81b
    at java.base@11.0.4/jdk.internal.misc.Unsafe.park(Native Method)
    -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@778af81b
    at java.base@11.0.4/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
    at java.base@11.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2081)
    at java.base@11.0.4/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
    at java.base@11.0.4/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
    at java.base@11.0.4/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
    at java.base@11.0.4/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
    at java.base@11.0.4/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
Affect(row-cnt:0) cost in 322 ms.
[arthas@24]$ thread 157
thread 157
"PollingServerListUpdater-1" Id=157 WAITING on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@3adcda2
    at java.base@11.0.4/jdk.internal.misc.Unsafe.park(Native Method)
    -  waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@3adcda2
    at java.base@11.0.4/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
    at java.base@11.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
    at java.base@11.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:917)
    at java.base@11.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1240)
    at java.base@11.0.4/java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:959)
    at com.netflix.loadbalancer.BaseLoadBalancer.setServersList(BaseLoadBalancer.java:475)
    at com.netflix.loadbalancer.DynamicServerListLoadBalancer.setServersList(DynamicServerListLoadBalancer.java:156)
    at com.netflix.loadbalancer.DynamicServerListLoadBalancer.updateAllServerList(DynamicServerListLoadBalancer.java:267)
    at com.netflix.loadbalancer.DynamicServerListLoadBalancer.updateListOfServers(DynamicServerListLoadBalancer.java:250)
    at com.netflix.loadbalancer.DynamicServerListLoadBalancer$1.doUpdate(DynamicServerListLoadBalancer.java:62)
    at com.netflix.loadbalancer.PollingServerListUpdater$1.run(PollingServerListUpdater.java:116)
    at java.base@11.0.4/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base@11.0.4/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at java.base@11.0.4/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.base@11.0.4/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base@11.0.4/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
    Number of locked synchronizers = 1
    - java.util.concurrent.ThreadPoolExecutor$Worker@6ab1c5f9

发现PollingServerListUpdater-1一直等待获取锁,看来问题就在这里了,查看对应的Ribbon代码,发现:

  1. PollingServerListUpdater-1需要获取allServerLock的写锁
  2. allServerLock的读锁,只有runPinger(对于每个实例ping,需要读取实例列表)获取。

那么是不是Ping出了问题,我们看下对应的获取锁的代码BaseLoadBalancer

public void runPinger() throws Exception {
    //省略无用代码
    allLock = allServerLock.readLock();
    allLock.lock();
    allServers = allServerList.toArray(new Server[allServerList.size()]);
    allLock.unlock();

我们看到,这里的代码,并没有用try{lock} finally {unlock}的套路,如果中间代码异常,则会锁不能释放。 并且这里是重入锁,lock多少次就要unlock多少次,少一次,其他线程都获取不到。

allServers = allServerList.toArray(new Server[allServerList.size()]);并没有涉及任何IO,只是数据变换,也没有空指针异常。猜想是发生了OOM异常,导致内存没有分配。检查日志,果然发现了OOM。

这件事告诉我们,对于锁,一定要try{lock} finally {unlock}。就算代码不会抛出任何异常,发生OOM时,也有可能导致锁不能释放

感觉这个代码还是修一下吧,提了个issue给ribbon


相关实践学习
SLB负载均衡实践
本场景通过使用阿里云负载均衡 SLB 以及对负载均衡 SLB 后端服务器 ECS 的权重进行修改,快速解决服务器响应速度慢的问题
负载均衡入门与产品使用指南
负载均衡(Server Load Balancer)是对多台云服务器进行流量分发的负载均衡服务,可以通过流量分发扩展应用系统对外的服务能力,通过消除单点故障提升应用系统的可用性。 本课程主要介绍负载均衡的相关技术以及阿里云负载均衡产品的使用方法。
相关文章
|
Arthas Java 测试技术
服务端接口响应慢怎么定位?阿里开源工具arthas帮你解决。
接口中又调用了三个方法,分别sleep了1秒,3秒,6秒,模拟代码逻辑耗时。 然后先下载arthas
|
Arthas Java 测试技术
将线程死锁检测集成进arthas以提高定位效率
将线程死锁检测集成进arthas以提高定位效率
398 0
|
Arthas Java 测试技术
59-微服务技术栈(高级):在线检测工具Arthas(精准定位Java应用CPU负载过高)
开发者对于生产问题故障的排查、定位,随着微服务的喷发,也不再像是以前那边依赖纯日志、gc日志进行问题排查与定位了,本节开始介绍一个生产环境使用的排错工具Arthas,帮助大家更高效、便捷地实现生产问题排错。
290 0
|
Arthas 监控 Java
使用阿里arthas 定位oom问题原因
1,最近我们生产服务器在做营销活动的时候突然发生oom,堆内存溢出。这个很让人头疼, 我们开发排查了好长时间,最终找到了问题的所在。这里我先买一个关子,问题的原因大家看完文章就一清二白了。
使用阿里arthas  定位oom问题原因
|
Arthas 缓存 前端开发
如何使用 Arthas 定位 Spring Boot 接口超时
公司有个渠道系统,专门对接三方渠道使用,没有什么业务逻辑,主要是转换报文和参数校验之类的工作,起着一个承上启下的作用。
如何使用 Arthas 定位 Spring Boot 接口超时
|
Arthas XML 监控
Spring Boot 微服务性能下降九成!使用 Arthas 定位根因
接收到公司业务部门的开发反馈,应用在升级公司内部框架后,UAT(预生产)环境接口性能压测不达标。
Spring Boot 微服务性能下降九成!使用 Arthas 定位根因
|
Arthas 负载均衡 Dubbo
Arthas 定位 Dubbo 手动注册 Eureka 异常
很久没有写技术分享博客,因为发现一个好的工具确实有点忍不住分享一下,毕竟独乐乐不如众乐乐。这里需要说的主角就是 Artahs。Arthas 使用文档很详细,我这里主要记录一下使用 Arthas 的一点总结。
Arthas 定位 Dubbo 手动注册 Eureka 异常
|
Arthas Cloud Native Java
利用 Arthas 精准定位 Java 应用 CPU 负载过高问题
最近我们线上有个应用服务器有点上头,CPU总能跑到99%,我寻思着它流量也不大啊,为啥能把自己整这么累?于是我登上这台服务器,看看它到底在干啥!
利用 Arthas 精准定位 Java 应用 CPU 负载过高问题
|
Arthas Dubbo Cloud Native
Arthas | 定位线上 Dubbo 线程池满异常
Dubbo 线程池满异常应该是大多数 Dubbo 用户都遇到过的一个问题,本文以 Arthas 3.1.7 版本为例,介绍如何针对该异常进行诊断,主要使用到 dashboard / thread 两个指令。
Arthas | 定位线上 Dubbo 线程池满异常
|
6月前
|
Arthas 监控 Java
Java 诊断利器 Arthas使用
Java 诊断利器 Arthas使用
213 0