ANR (Application Not Responding) - Alibaba Cloud Developer Community

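ANR stands for Application Not Responding.

I. Causes of ANR

An ANR is raised only when the application's UI thread fails to respond within a timeout. The timeout generally has one of two causes.

● The current event never gets a chance to be handled

● The current event is being handled, but the handling takes too long to finish in time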
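
The two causes above can be reproduced off-device with plain Java. The following is a minimal sketch, assuming a single-thread executor as a stand-in for the UI thread; `UiThreadTimeoutDemo` and its method are hypothetical names, not Android APIs. The first event takes too long, so the second one never gets a chance to run before the timeout.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a UI thread: one thread that handles events in order.
class UiThreadTimeoutDemo {

    /**
     * Queues a slow event followed by a trivial one on a single thread and
     * reports whether the trivial event finished within timeoutMs.
     */
    static boolean secondEventHandledInTime(long slowEventMs, long timeoutMs) {
        ExecutorService uiThread = Executors.newSingleThreadExecutor();
        try {
            // Event 1: it is being handled, but it takes too long.
            uiThread.submit(() -> {
                try {
                    Thread.sleep(slowEventMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Event 2: trivial work, but it cannot start until event 1 returns.
            Future<?> second = uiThread.submit(() -> { });
            try {
                second.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // the event did not get handled in time
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            uiThread.shutdownNow(); // stop the demo thread
        }
    }
}
```

Calling `UiThreadTimeoutDemo.secondEventHandledInTime(500, 100)` returns false, because the second event is still waiting behind the first when the timeout expires.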

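II. Types of ANR (three)

1. KeyDispatchTimeout: the most common type. A View's key event or touch event gets no response within 5 seconds.

2. BroadcastTimeout: the onReceive() function of a broadcast receiver (BroadcastReceiver) does not finish within the allowed time (10 seconds).

3. ServiceTimeout: one of the lifecycle functions of a service (Service) does not finish within the allowed time (20 seconds).

Put in terms of the main thread:

 a. A View's click event or touch event gets no response within the allowed time (5 s).
 
 b. The main thread does not finish executing a BroadcastReceiver's onReceive() function within 10 seconds.
 
 c. The main thread does not finish executing a Service lifecycle function within 20 seconds.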
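
For quick reference, the three limits above can be written as a small lookup. This is only a sketch: the enum `AnrTimeout` is a hypothetical name, and the values are the ones quoted in this article; the actual thresholds are defined by the system and may differ by Android version and by whether the component runs in the foreground.

```java
// Hypothetical lookup of the timeouts quoted in the text (not an Android API).
enum AnrTimeout {
    KEY_DISPATCH(5),   // key or touch event not handled
    BROADCAST(10),     // BroadcastReceiver.onReceive() not finished
    SERVICE(20);       // Service lifecycle function not finished

    final int seconds;

    AnrTimeout(int seconds) {
        this.seconds = seconds;
    }

    /** Returns true if work that took elapsedMillis would exceed this limit. */
    boolean isExceededBy(long elapsedMillis) {
        return elapsedMillis > seconds * 1000L;
    }
}
```

For example, `AnrTimeout.KEY_DISPATCH.isExceededBy(6000)` is true, while the same 6 seconds spent in a Service lifecycle function stays under the 20-second limit.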

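III. Typical ANR scenarios

1. The UI thread performs time-consuming work, such as network requests, database operations, or file operations, so it cannot handle user input in time.

2. The UI thread waits for a worker thread to release a lock, so it cannot handle the user's input.

3. A time-consuming animation requires heavy computation, which may overload the CPU.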
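
Scenario 2 can be illustrated in plain Java. In this minimal sketch (the class `LockWaitDemo` is hypothetical), a worker thread holds a lock while the calling thread, playing the role of the UI thread, waits for it. Using `tryLock` with a timeout bounds the wait; a plain `lock()` would block for as long as the worker holds the lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: a worker holds a lock that the "UI thread" (the caller) needs.
class LockWaitDemo {

    /** The worker holds the lock for holdMs; the caller waits at most waitMs for it. */
    static boolean uiThreadGotLock(long holdMs, long waitMs) {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch workerHasLock = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            lock.lock();
            try {
                workerHasLock.countDown();
                Thread.sleep(holdMs); // simulated long work while holding the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        worker.start();
        try {
            workerHasLock.await(); // make sure the worker owns the lock first
            // A plain lock.lock() here would block until the worker releases it.
            if (lock.tryLock(waitMs, TimeUnit.MILLISECONDS)) {
                lock.unlock();
                return true;
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            worker.interrupt(); // stop the demo worker
        }
    }
}
```

With `uiThreadGotLock(2000, 50)` the caller gives up after 50 ms and returns false instead of stalling for two seconds.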

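IV. Locating and analyzing an ANR

1. LogCat log output.

2. The ANR files on the device (under /data/anr/), for example anr_2022-06-27-12-47-52-079

----- pid 8418 at 2022-06-27 12:47:52 -----    <- the pid and the time at which the ANR occurred
Cmd line: com.example.testdemo              <- the package name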
 
 
"main" prio=5 tid=1 Sleeping
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x71b2a1f0 self=0xf1e5ce00
  | sysTid=8418 nice=-10 cgrp=default sched=0/0 handle=0xf238bdc0
  | state=S schedstat=( 1176699784 74358829 585 ) utm=110 stm=6 core=3 HZ=100
  | stack=0xff2d2000-0xff2d4000 stackSize=8192KB
  | held mutexes=
  at java.lang.Thread.sleep(Native method)
  - sleeping on <0x0d9beb8e> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:440)
  - locked <0x0d9beb8e> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:356)
  at com.example.testdemo.MainActivity$1.onClick(MainActivity.java:19)
  at android.view.View.performClick(View.java:7140)
  at com.google.android.material.button.MaterialButton.performClick(MaterialButton.java:1194)
  at android.view.View.performClickInternal(View.java:7117)
  at android.view.View.onKeyUp(View.java:14165)
  at android.widget.TextView.onKeyUp(TextView.java:8543)
  at android.view.KeyEvent.dispatch(KeyEvent.java:2825)
  at android.view.View.dispatchKeyEvent(View.java:13374)
  at android.view.ViewGroup.dispatchKeyEvent(ViewGroup.java:1922)
  at android.view.ViewGroup.dispatchKeyEvent(ViewGroup.java:1922)
  at android.view.ViewGroup.dispatchKeyEvent(ViewGroup.java:1922)
  ... repeated 2 times

"main" prio=5 tid=1 Sleeping

The fields are, in order, the thread name, thread priority, DVM thread id, and DVM thread status.

"main": the main thread, i.e. the activity thread

prio: the Java thread priority; the default is 5 (the normal range is 1-10)

tid: the DVM thread id, not the Linux thread id (that is the sysTid field further down)

Sleeping (other traces may show, for example, Native): the DVM thread status. The usual states are ZOMBIE, RUNNABLE, TIMED_WAIT, MONITOR, WAIT, INITIALIZING, STARTING, NATIVE, VMWAIT, SUSPENDED, UNKNOWN
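
The four header fields can be pulled out of the line with a regular expression. This is a minimal sketch, assuming the header has the shape shown in the sample dump, optionally with the word daemon after the name; `ThreadHeader` is a hypothetical helper, not part of any Android API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for a trace header such as: "main" prio=5 tid=1 Sleeping
class ThreadHeader {
    // "name" [daemon] prio=N tid=N Status
    private static final Pattern HEADER =
            Pattern.compile("^\"([^\"]+)\"(?: daemon)? prio=(\\d+) tid=(\\d+) (\\w+)$");

    final String name;   // thread name
    final int priority;  // Java thread priority
    final int tid;       // VM thread id (not the Linux thread id)
    final String status; // VM thread status

    private ThreadHeader(String name, int priority, int tid, String status) {
        this.name = name;
        this.priority = priority;
        this.tid = tid;
        this.status = status;
    }

    /** Parses one header line; throws if the line does not match the assumed shape. */
    static ThreadHeader parse(String line) {
        Matcher m = HEADER.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a thread header: " + line);
        }
        return new ThreadHeader(m.group(1), Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)), m.group(4));
    }
}
```

Parsing the sample header line yields the name main, priority 5, tid 1 and status Sleeping.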

group="main" sCount=1 dsCount=0 flags=1 obj=0x416eaf18 self=0x416d8650

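This line describes the DVM thread state.

group: the thread group the thread belongs to; the default is "main"

sCount: the number of times the thread has been suspended normally, here 1 (thread suspend count)

dsCount: the number of times the thread has been suspended by the debugger, here 0 (thread dbg suspend count)

obj: the Java thread object associated with this thread, here 0x75720fb8 (thread obj address)

self: the address of the thread itself, here 0x7f7e8af800 (thread pointer address)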

sysTid=30307 nice=0 sched=0/0 cgrp=apps handle=1074565528

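This line describes the Linux thread state and shows thread scheduling information.

sysTid: the native thread id under Linux (Linux thread tid)

nice: the thread's scheduling priority (Linux thread nice value)

cgrp: the control group the thread belongs to (cgroup)

sched: the scheduling policy (cgroup policy/group id)

handle: the address of the thread's handle (handle address)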

state=S schedstat=( 0 0 0 ) utm=5 stm=4 core=3

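This line holds the CPU scheduling statistics and shows more of the thread's current context.

state: the scheduling state (process/thread state). The usual values are "R (running)", "S (sleeping)", "D (disk sleep)", "T (stopped)", "t (tracing stop)", "Z (zombie)", "X (dead)", "x (dead)", "K (wakekill)", "W (waking)". A typical process is normally in S (sleeping); if you find it in a state such as D (disk sleep), T (stopped), or Z (zombie), investigate carefully.

schedstat: the thread's run statistics (CPU run time in ns, CPU wait time in ns, number of time slices)

utm: utime, user space time; the time the thread has spent in user mode (in jiffies)

stm: stime, kernel space time; the time the thread has spent scheduled in kernel mode

core: the id of the CPU core that last ran this thread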
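
As a worked example, utm and stm can be converted from jiffies to milliseconds with millis = jiffies * 1000 / HZ. The sketch below assumes the line also carries an HZ field, as the line in the sample dump above does (HZ=100); `SchedFields` is a hypothetical helper.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading numeric key=value fields from a trace line.
class SchedFields {

    /** Returns the numeric value of key=NNN in the line, or throws if it is absent. */
    static long field(String line, String key) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(key) + "=(\\d+)").matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Missing field: " + key);
        }
        return Long.parseLong(m.group(1));
    }

    /** Converts jiffies to milliseconds for a given tick rate (ticks per second). */
    static long jiffiesToMillis(long jiffies, long hz) {
        return jiffies * 1000 / hz;
    }

    /** User-mode CPU time in milliseconds, from utm and HZ. */
    static long userMillis(String line) {
        return jiffiesToMillis(field(line, "utm"), field(line, "HZ"));
    }

    /** Kernel-mode CPU time in milliseconds, from stm and HZ. */
    static long kernelMillis(String line) {
        return jiffiesToMillis(field(line, "stm"), field(line, "HZ"));
    }
}
```

For the sample line (utm=110 stm=6 HZ=100) this gives 1100 ms of user time and 60 ms of kernel time, which is in line with the first schedstat value of 1176699784 ns, roughly 1177 ms of CPU run time.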

stack=0x7f7dc93000-0x7f7dc95000 stackSize=1020KB

This line gives the stack address range and its size.

held mutexes=

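This line shows whether locks are held. Normally there are four attributes (mutexes: tll=0 tsl=0 tscl=0 ghl=0); 0 means unlocked, and any other value means locked.

tll: thread list lock

tsl: thread suspend lock

tscl: thread suspend count lock

ghl: GC heap lock

The remaining lines are the call stack.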
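
When reading the call stack, the frame that usually matters is the first one belonging to the app's own package. The following is a minimal sketch of that lookup, assuming stack frames start with "at " as in the sample dump; `AnrStackScanner` is a hypothetical helper.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper: finds the topmost stack frame that belongs to the app.
class AnrStackScanner {

    /**
     * Returns the first frame whose class lives under appPackage,
     * without the leading "at ", or empty if there is none.
     */
    static Optional<String> firstAppFrame(List<String> stackLines, String appPackage) {
        String prefix = "at " + appPackage + ".";
        for (String raw : stackLines) {
            String line = raw.trim();
            if (line.startsWith(prefix)) {
                return Optional.of(line.substring("at ".length()));
            }
        }
        return Optional.empty();
    }
}
```

For the sample dump and the package com.example.testdemo, the result is com.example.testdemo.MainActivity$1.onClick(MainActivity.java:19), the click handler that calls Thread.sleep on the main thread.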

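V. Handling an ANR

There are three different cases, and they are generally handled as follows.

1. The main thread is blocked

Start a separate worker thread to handle the time-consuming, blocking work.

2. The CPU is fully loaded, or I/O is blocking

Blocking I/O usually means that file reads and writes or database operations are running on the main thread. These can also be executed asynchronously on a worker thread.

3. Not enough memory

Increase the VM memory, use the largeHeap attribute, check for memory leaks, and so on.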
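
The "move it to a worker thread" advice in cases 1 and 2 can be sketched in plain Java as follows. `BackgroundWork` and its method are hypothetical names; on Android the result would additionally have to be delivered back to the UI thread (for example through a Handler or runOnUiThread), which this sketch leaves out.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical helper that runs slow work on a small pool of worker threads.
class BackgroundWork {
    // Daemon threads so that the pool never keeps the process alive on its own.
    private static final ExecutorService IO_EXECUTOR =
            Executors.newFixedThreadPool(2, runnable -> {
                Thread t = new Thread(runnable, "io-worker");
                t.setDaemon(true);
                return t;
            });

    /** Runs slowTask on a worker thread and returns a future for its result. */
    static <T> CompletableFuture<T> runOffMainThread(Supplier<T> slowTask) {
        return CompletableFuture.supplyAsync(slowTask, IO_EXECUTOR);
    }
}
```

The caller gets a future back immediately and can attach a callback with `thenAccept` rather than blocking on the result, so the calling thread stays free to process other events.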

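VI. Detecting ANRs

Using StrictMode

StrictMode (android.os.StrictMode) is a utility class provided by the Android SDK for detecting policy violations in code. It mainly detects two categories of problems.

Thread policy (ThreadPolicy)

detectCustomSlowCalls: detects custom slow calls
detectDiskReads: detects disk read operations
detectDiskWrites: detects disk write operations
detectNetwork: detects network operations

VM policy (VmPolicy)

detectActivityLeaks: detects Activity leaks
detectLeakedClosableObjects: detects leaked Closeable objects that were never closed
detectLeakedSqlLiteObjects: detects leaked SQLite objects
setClassInstanceLimit: detects whether the number of instances of a class exceeds a limit

As shown, ThreadPolicy can detect potentially time-consuming operations on the main thread. Note that it should only be used in debug builds and must be turned off in versions released to the market. Using StrictMode is simple: run the following code where the app initializes, for example in the onCreate method of the Application or MainActivity class: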

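if (BuildConfig.DEBUG) { // enable StrictMode in debug builds only
    // Thread policy: detect everything, log violations and show a dialog.
    StrictMode.setThreadPolicy(new StrictMode.ThreadPolicy.Builder()
            .detectAll().penaltyLog().penaltyDialog().build());

    // VM policy: detect everything and log violations.
    StrictMode.setVmPolicy(new StrictMode.VmPolicy.Builder()
            .detectAll().penaltyLog().build());
}
 
penaltyLog prints violations to Logcat, penaltyDialog shows a dialog when a violation occurs,
and detectAll enables every detection policy.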

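BlockCanary

BlockCanary is a non-invasive performance-monitoring library. It is used much like LeakCanary, except that LeakCanary monitors memory leaks while BlockCanary mainly monitors jank on the app's main thread. Its basic principle is to use the main thread's message-queue mechanism: it compares the time at which a message dispatch starts with the time at which it ends, and if the difference exceeds a configured threshold, the main thread is judged to be blocked. Integration is simple:

1. Add the dependency in build.gradle

implementation 'com.github.markzhai:blockcanary-android:1.5.0'

2. Configure and initialize it in the Application class (MyBlockCanaryContext is an app-defined subclass of BlockCanaryContext)

BlockCanary.install(this,new MyBlockCanaryContext()).start();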
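
The timing principle described above (compare the time before and after each message dispatch against a threshold) can be illustrated without the library. This is a simplified sketch in plain Java, not BlockCanary's actual implementation; `BlockDetectingDispatcher` is a hypothetical class that stands in for the main thread's message loop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical dispatcher that times each message, mimicking the principle only.
class BlockDetectingDispatcher {
    private final long thresholdMs;
    private final List<String> slowMessages = new ArrayList<>();

    BlockDetectingDispatcher(long thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    /** Runs one message and records its label if handling exceeded the threshold. */
    void dispatch(String label, Runnable message) {
        long start = System.nanoTime();      // dispatch start
        message.run();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000; // dispatch end
        if (elapsedMs > thresholdMs) {
            slowMessages.add(label);         // judged as a block
        }
    }

    /** Labels of all messages that were judged to have blocked the thread. */
    List<String> slowMessages() {
        return slowMessages;
    }
}
```

With a threshold of 50 ms, a message that sleeps for 200 ms is recorded as slow, while an empty message is not.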
