Original source: https://www.oracle.com/technical-resources/articles/javase/jvm-tool-interface.html
The JVM Tool Interface (JVM TI) is a standard native API that allows native libraries to capture events and control a Java Virtual Machine (JVM) for the Java platform. These native libraries are sometimes called agent libraries and are often used as a basis for the Java technology-level tool APIs, such as the Java Debug Interface (JDI) that comes with the Java Development Kit (JDK). Profiler tool vendors will often need to create an agent library that uses JVM TI. This article explores some basics of writing a JVM TI agent library by walking through the heapTracker demo agent available in the JDK downloads.
In the releases prior to JDK 5.0, an agent was loaded into a virtual machine (VM) at initialization with the option -XrunNAME, where NAME is the name of the native shared library or DLL, such as libNAME.so or NAME.dll. For example, when using HPROF and inputting java -Xrunhprof, the library libhprof.so or hprof.dll would be found in the JDK. The VM would cause that library to be dynamically loaded, and the VM would make a call into that library to get it started. The option and the timing of when these libraries were loaded were nonstandard. The -Xrun option would load the libraries after the VM had initialized itself, sometimes resulting in early VM events not being available to the agent. The library itself would use the Java Native Interface (JNI) and either the JVM Debug Interface (JVMDI) for debugging or the experimental JVM Profiling Interface (JVMPI) for profiling. Both of these are being removed from future JDK releases.
Beginning with JDK 5.0, the new standard option is -agentlib, for example, java -agentlib:hprof, although JDK 5.0 still accepts the -Xrun option. The new agent-loading options are documented and official. The library is loaded before the VM has initialized, allowing the agent library to capture early VM events that it could not access before. The library itself then uses JVM TI and JNI for debugging, profiling, or doing anything an agent does. A set of sample JVM TI agents is available in the demo directory of the JDK 5.0 or the JDK 6 download. Source and binaries are included for those interested in creating their own custom agent library.
Buyer Beware: Don’t Accept Agents From Strangers
Because the agent library will be operating in the same process and address space as the VM itself, anything inside the agent code will run in the VM process too. A bad agent can crash the entire VM process with a trivial null pointer dereference. Agents can also be very difficult to get right. Agent libraries must be re-entrant and MT-safe, and they must follow all the JVM TI and JNI rules. For instance, if your agent leaks memory by calling malloc() and not doing the free(), then the VM will appear to have a leak. Allocating too much memory will cause the VM process to fail with an out of memory error. You must pay close attention to detail when you add an agent library.
Native Library Loading
The VM process must be able to locate the native library by way of the platform-specific search rules, so you must either copy the library into your JDK with the other shared libraries or make it accessible through a platform-specific mechanism so that a process can locate it. For example, you can use LD_LIBRARY_PATH on the Solaris Operating Environment or Linux operating system, or you can use PATH on Microsoft Windows. In addition, the agent library must be able to locate all the external symbols it needs from any platform-specific shared libraries. On the Solaris and Linux operating systems, you can use the ldd utility to verify that a native library knows how to find all the necessary externals. Once the VM process has successfully loaded an agent library, it looks for a symbol in it to call and establish the agent-to-VM connection. The native library should expose an exported symbol with the name Agent_OnLoad. This will be the first function called in the agent library.
The Dynamic Tracing (DTrace) Agent
An example VM agent of note is the Dynamic Tracing (DTrace) agent located at the Solaris 10 OS DTrace VM agents project. The dvm.zip file includes the built libraries and the source code to the Solaris OS JVM TI agent.
DTrace is a comprehensive dynamic tracing framework for the Solaris OS that provides a powerful infrastructure to permit administrators, developers, and service personnel to concisely answer arbitrary questions about the behavior of the OS and user programs. The Solaris Dynamic Tracing Guide describes how to use DTrace to observe, debug, and tune system behavior. The guide also includes a complete reference for bundled DTrace observability tools and the D programming language.
The JDK 6 release includes built-in DTrace probes, whereas older JDKs such as 5.0 or 1.4.2 can be limited with respect to DTrace. The VM agent for use with Solaris 10 OS dynamic tracing (DVM agent) is useful for older JDK releases in that it indirectly provides DTrace probes inside the agent library itself. The DVM agent is unusual in that it requests VM events but often does nothing itself with the event besides providing a DTrace probe point. Though this creates some unnecessary overhead, the functionality it provides can be extremely valuable.
Again, the JDK 6 release includes virtually all required built-in DTrace probes, thus eliminating the need for a DVM agent.
Agent Interfaces
Great care must be taken when writing agents because exposing your agent requires that you have a well-planned testing strategy and are familiar with highly recursive and highly re-entrant coding.
In general, byte-code instrumentation (BCI) is the recommended way to instrument class files, and BCI is easy to do in JVM TI. BCI provides a way to inject code into the class file methods, either before the VM sees the class file (ClassFileLoadHook) or by redefining the class files on the fly (RedefineClasses). See this blog entry for more information about BCI. Note: JVM TI does not provide code to do the BCI. Instead, its goal is to allow you to replace class files that have been BCI'd. See the section "BCI and BCI Events" later in this article for more information.
Depending on your needs, it may be helpful to get a basic understanding of what agent interfaces can and cannot do. The documentation is worth browsing to become familiar with these interfaces. Look at the documentation for the following JDKs:
- JDK 5.0 agent API and JNI
- JDK 6 agent API and JNI
Agent Initialization
After the VM has located your shared library and successfully loaded it into the VM process, it looks in the library for Agent_OnLoad. JVM TI has capabilities that must be requested during the Agent_OnLoad execution. This better informs the VM what the agent will need to do and allows for optimal performance based on the capabilities you have requested. To avoid unnecessary overhead in the VM, agents generally should request only the capabilities they need.
So what does agent initialization look like? Following is some code to illustrate. Note: This sample code has been shortened to make it easier to follow, so it is incomplete. The code is mainly from the heapTracker demo JVM TI agent that is shipped with JDK 5.0 or more recent versions. To find all the details and comments, get the complete copy of heapTracker.c in the demo/jvmti/heapTracker directory of any JDK 5.0 or JDK 6 binary download.
    #include <string.h>
    #include "jvmti.h"
    #include "jni.h"

    static jrawMonitorID agent_lock;

    JNIEXPORT jint JNICALL
    Agent_OnLoad(JavaVM *vm, char *options, void *reserved)
    {
        jvmtiEnv *jvmti;
        jvmtiError error;
        jint res;
        jvmtiCapabilities capabilities;
        jvmtiEventCallbacks callbacks;

        // Create the JVM TI environment (jvmti).
        res = (*vm)->GetEnv(vm, (void **)&jvmti, JVMTI_VERSION_1);
        // If res!=JNI_OK generate an error.

        // Parse the options supplied to this agent on the command line.
        parse_agent_options(options);
        // If options don't parse, do you want this to be an error?

        // Clear the capabilities structure and set the ones you need.
        (void)memset(&capabilities, 0, sizeof(capabilities));
        capabilities.can_generate_all_class_hook_events = 1;
        capabilities.can_tag_objects = 1;
        capabilities.can_generate_object_free_events = 1;
        capabilities.can_get_source_file_name = 1;
        capabilities.can_get_line_numbers = 1;
        capabilities.can_generate_vm_object_alloc_events = 1;

        // Request these capabilities for this JVM TI environment.
        error = (*jvmti)->AddCapabilities(jvmti, &capabilities);
        // If error!=JVMTI_ERROR_NONE, your agent may be in trouble.

        // Clear the callbacks structure and set the ones you want.
        (void)memset(&callbacks, 0, sizeof(callbacks));
        callbacks.VMStart = &cbVMStart;
        callbacks.VMInit = &cbVMInit;
        callbacks.VMDeath = &cbVMDeath;
        callbacks.ObjectFree = &cbObjectFree;
        callbacks.VMObjectAlloc = &cbVMObjectAlloc;
        callbacks.ClassFileLoadHook = &cbClassFileLoadHook;
        error = (*jvmti)->SetEventCallbacks(jvmti, &callbacks,
                                            (jint)sizeof(callbacks));
        // If error!=JVMTI_ERROR_NONE, the callbacks were not accepted.

        // For each of the above callbacks, enable this event.
        error = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
                    JVMTI_EVENT_VM_START, (jthread)NULL);
        error = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
                    JVMTI_EVENT_VM_INIT, (jthread)NULL);
        error = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
                    JVMTI_EVENT_VM_DEATH, (jthread)NULL);
        error = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
                    JVMTI_EVENT_OBJECT_FREE, (jthread)NULL);
        error = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
                    JVMTI_EVENT_VM_OBJECT_ALLOC, (jthread)NULL);
        error = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
                    JVMTI_EVENT_CLASS_FILE_LOAD_HOOK, (jthread)NULL);
        // In all the above calls, check errors.

        // Create a raw monitor in the agent for critical sections.
        error = (*jvmti)->CreateRawMonitor(jvmti, "agent data", &(agent_lock));
        // If error!=JVMTI_ERROR_NONE, then you haven't got a lock!

        return JNI_OK; // Indicates to the VM that the agent loaded OK.
    }
Note: The authors have ignored the error returns in this code sample, which is not good practice. Do not copy the preceding code sample without adding checks on the error returns. In JVM TI, because agents run inside the VM itself, the developer should set up extensive error checking. Any error in an agent probably indicates a problem in the implementation of your agent and is something that you should address. An error in Agent_OnLoad should either cause the process to exit or cause the agent to print an error and disable itself. It is important to decide what your agent should do in this situation.
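The sample above calls parse_agent_options(options) without showing it. The following is a minimal sketch of what such a parser might look like, not the demo's actual code. It assumes the option string is a comma-separated list of name=value pairs, and it recognizes a single option, maxDump, a name borrowed from the gdata->maxDump field used later in this article. The AgentOptions struct, the default of 20, the extra output parameter, and the return convention are all illustrative assumptions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical option storage; the demo agent keeps its settings in gdata. */
typedef struct {
    int maxDump;   /* how many stack traces to print at VM death */
} AgentOptions;

/* Parse "name=value,name=value". Returns 0 on success, -1 on a bad option.
 * NULL and "" are treated alike, since no options may have been supplied. */
static int parse_agent_options(const char *options, AgentOptions *out)
{
    out->maxDump = 20;                       /* assumed default */
    if (options == NULL || *options == '\0') {
        return 0;
    }
    const char *p = options;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");        /* length of this token */
        if (len > 8 && strncmp(p, "maxDump=", 8) == 0) {
            char *end = NULL;
            long v = strtol(p + 8, &end, 10);
            if (end != p + len || v <= 0) {
                return -1;                   /* not a positive integer */
            }
            out->maxDump = (int)v;
        } else {
            return -1;                       /* unknown or empty option */
        }
        p += len;
        if (*p == ',') {
            p++;                             /* skip the separator */
        }
    }
    return 0;
}
```

Returning an error code rather than exiting leaves the decision raised in the note above, whether a bad option should stop the process or merely disable the agent, to Agent_OnLoad.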
Event Callbacks
What do you do once you've set up Agent_OnLoad with the proper capabilities and event requests? Once the VM is running, you should start seeing calls to the functions you supplied to SetEventCallbacks. In this agent, those functions follow a naming convention, the cb prefix: cbVMStart, cbVMInit, cbObjectFree, cbVMObjectAlloc, cbClassFileLoadHook, and cbVMDeath. The cbClassFileLoadHook callback is called for each class file image being loaded, so it will likely be called first, because the first few basic system classes are loaded before cbVMStart is called.
JVMTI_EVENT_CLASS_FILE_LOAD_HOOK
Let’s now look at cbClassFileLoadHook, which is called before any class is loaded and, thus, will probably be called first:
    static void JNICALL
    cbClassFileLoadHook(jvmtiEnv *jvmti, JNIEnv* env,
                        jclass class_being_redefined, jobject loader,
                        const char* name, jobject protection_domain,
                        jint class_data_len, const unsigned char* class_data,
                        jint* new_class_data_len, unsigned char** new_class_data)
    {
        enterCriticalSection(jvmti); {
            // Safety check, if VM is dead, skip this.
            if ( !gdata->vmDead ) {
                const char * classname;

                // If you have no classname, dig it out of the class file.
                if ( name == NULL ) {
                    classname = java_crw_demo_classname(class_data,
                                                        class_data_len, NULL);
                } else {
                    classname = strdup(name);
                }

                // Assume you won't change the class file at first.
                *new_class_data_len = 0;
                *new_class_data = NULL;

                // Be careful that you don't track the tracker class.
                if (strcmp(classname, STRING(HEAP_TRACKER_class))!=0) {
                    jint cnum;
                    int systemClass;
                    unsigned char *newClassData;
                    long newLength;

                    // Processed class counter
                    cnum = gdata->ccount++;

                    // Tell java_crw_demo if this is an early class.
                    systemClass = 0;
                    if ( !gdata->vmStarted ) {
                        systemClass = 1;
                    }

                    // Use java_crw_demo to create a new class file.
                    newClassData = NULL;
                    newLength = 0;
                    java_crw_demo(cnum, classname, class_data, class_data_len,
                        systemClass,
                        STRING(HEAP_TRACKER_class),
                        "L" STRING(HEAP_TRACKER_class) ";",
                        NULL, NULL, NULL, NULL,
                        STRING(HEAP_TRACKER_newobj), "(Ljava/lang/Object;)V",
                        STRING(HEAP_TRACKER_newarr), "(Ljava/lang/Object;)V",
                        &newClassData, &newLength, NULL, NULL);

                    // If it did something, make a JVM TI copy.
                    if ( newLength > 0 ) {
                        unsigned char *jvmti_space;
                        jvmti_space = (unsigned char *)
                                      allocate(jvmti, (jint)newLength);
                        (void)memcpy(jvmti_space, newClassData, newLength);
                        *new_class_data_len = (jint)newLength;
                        *new_class_data = jvmti_space;
                    }

                    // Free any malloc space created.
                    if ( newClassData != NULL ) {
                        (void)free((void*)newClassData);
                    }
                }

                // Free the classname (malloc space too).
                (void)free((void*)classname);
            }
        } exitCriticalSection(jvmti);
    }
The function java_crw_demo is a pure native and independent non-JNI library function that this article discusses in the section "BCI and BCI Events." Note that java_crw_demo accepts class data bytes and returns new class data bytes in memory. Because the callback runs inside the agent's critical section, nothing in the agent can disturb the global static data (gdata) while the class load is being processed. In effect, the class load has not occurred. Rather, this event represents the time during which the VM has located the class file and read it into memory, but before it has processed the class data bytes. During this event, the bytes that represent the class can be replaced, and the VM will load the replacement class data bytes. The java_crw_demo library is limited in terms of what gets changed in the class. It will not add methods, fields, or arguments to methods; nor will it change the basic interface or shape of the object. The intent here is to instrument the existing method byte codes.
Because some classes are loaded before the VM start event, what this callback does is quite important. This agent has requested these ClassFileLoadHook events from the beginning, so it needs to be very careful what it does if the VM has not started or been initialized before the callbacks have been made.
Notice that the test on gdata->vmDead offers protection if another thread is trying to terminate the VM. There is no need to process class files if VM death is imminent.
The classname NULL occurs rarely and only occurs when the ClassLoader.defineClass() method is used with a NULL name. When that happens, a java_crw_demo library function gets the name from the class file.
The new class data bytes may include calls to the Tracker class (see the section "BCI and BCI Events"), so the strcmp() against HEAP_TRACKER_class is very important. If you inject calls to the HEAP_TRACKER_class inside the HEAP_TRACKER_class itself, you will create an infinite loop.
The gdata->ccount allows a unique numeric ID for every class loaded. This is passed into the main java_crw_demo function.
Finally, note the use of gdata->vmStarted. A better solution might be coming, but for now, the first classes loaded between Agent_OnLoad and the VM_START event are considered, for lack of a better term, system classes. The java_crw_demo treats these classes, of which there are usually 12 or fewer, in a special way when instrumenting them due to their primordial nature and the state of the VM prior to the VM start event. For more on this subject, check the details of java_crw_demo in the demo/jvmti directory.
The memory allocated by the java_crw_demo library is malloc() memory; it is not JVM TI-allocated memory. The VM gets the new class data bytes through the arguments new_class_data_len and new_class_data. The memory returned back to the VM must be allocated by way of JVM TI Allocate, which is why the malloc() allocated java_crw_demo memory is copied. The java_crw_demo code is neutral code and does not have any dependence on JVM TI or the VM. It’s a C library with standard C library dependencies.
JVMTI_EVENT_VM_START
After some core system classes are loaded and the VM has started but not completely initialized, the VM_START event is posted. You can then call many JNI functions. However, because the VM is not fully initialized, there are limitations in what can be done at this point. At the VM start event, the VM is considered to be out of its primordial phase.
    static void JNICALL
    cbVMStart(jvmtiEnv *jvmti, JNIEnv *env)
    {
        enterCriticalSection(jvmti); {
            jclass klass;
            jfieldID field;
            jint rc;
            static JNINativeMethod registry[2] = {
                {STRING(HEAP_TRACKER_native_newobj),
                 "(Ljava/lang/Object;Ljava/lang/Object;)V",
                 (void*)&HEAP_TRACKER_native_newobj },
                {STRING(HEAP_TRACKER_native_newarr),
                 "(Ljava/lang/Object;Ljava/lang/Object;)V",
                 (void*)&HEAP_TRACKER_native_newarr }
            };

            // Find the tracker class.
            klass = (*env)->FindClass(env, STRING(HEAP_TRACKER_class));

            // Register the native methods to the ones in this library.
            rc = (*env)->RegisterNatives(env, klass, registry, 2);

            // Get the static field "engaged" in this class.
            field = (*env)->GetStaticFieldID(env, klass,
                                             STRING(HEAP_TRACKER_engaged), "I");

            // Set the value of this static field to "1."
            (*env)->SetStaticIntField(env, klass, field, 1);

            // Record that the VM has officially started.
            gdata->vmStarted = JNI_TRUE;
        } exitCriticalSection(jvmti);
    }
For the VM start event, you must first set up the Tracker class used for BCI. First, the JNI function FindClass is called to get the jclass handle. Note that this could trigger a ClassFileLoadHook event. Then the native methods are registered for the Tracker class with JNI RegisterNatives, and the jfieldID handle to engaged is obtained using JNI GetStaticFieldID. By setting this static field to 1, you essentially activate the calls inside this Tracker class. The Tracker source looks like this:
    public class HeapTracker {

        // The static field that controls tracking
        private static int engaged = 0;

        // Calls to this method will result in a call into the agent.
        private static native void _newobj(Object thread, Object o);

        // Calls to this method are injected into the class files.
        public static void newobj(Object o) {
            if ( engaged != 0 ) {
                _newobj(Thread.currentThread(), o);
            }
        }

        // Calls to this method will result in a call into the agent.
        private static native void _newarr(Object thread, Object a);

        // Calls to this method are injected into the class files.
        public static void newarr(Object a) {
            if ( engaged != 0 ) {
                _newarr(Thread.currentThread(), a);
            }
        }
    }
All the Tracker methods that will be called by the BCI classes modified by java_crw_demo are turned off by default until the engaged field value changes to 1. When this occurs, the injected Tracker method calls start calling the native methods that have already been registered. Before this article discusses what occurs next in the native methods, note that these native calls cannot be turned on until the VM has started. It's necessary to be able to call JNI functions in the native code, which is why engaged was set to 1 here and not earlier. If the above Tracker code were to initiate anything more sophisticated, such as calling methods in classes, that would require waiting until the VM initialization phase.
TraceInfo and Tracker Methods
The heapTracker agent’s main function is to find out what is allocating the most space. At each object allocation, it’s important to find out what the stack trace is and then to tag the object with a reference to that trace information. Inside the agent itself, there is a TraceInfo struct, and pointers to these structs will serve as the value of the Object Tags. A tag is any 64-bit value. Along with the TraceInfo struct is support code to create a hash table for quick lookups. Because the creation or lookup of the TraceInfo will probably consume much application time when this agent is activated, speed and efficiency are important. Here are the basics to find TraceInfo:
    static TraceInfo *
    findTraceInfo(jvmtiEnv *jvmti, jthread thread, TraceFlavor flavor)
    {
        TraceInfo *tinfo;
        jvmtiError error;

        tinfo = NULL;

        // The thread could be NULL in some situations, so be careful.
        if ( thread != NULL ) {
            static Trace empty;
            Trace trace;

            // Request a stack trace.
            trace = empty;
            error = (*jvmti)->GetStackTrace(jvmti, thread, 0, MAX_FRAMES+2,
                                            trace.frames, &(trace.nframes));

            // If you get a PHASE error, the VM isn't ready, or it died.
            if ( error == JVMTI_ERROR_WRONG_PHASE ) {
                if ( flavor == TRACE_USER ) {
                    tinfo = emptyTrace(TRACE_BEFORE_VM_INIT);
                } else {
                    tinfo = emptyTrace(flavor);
                }
            } else {
                // If error!=JVMTI_ERROR_NONE, you have serious problems.
                check_jvmti_error(jvmti, error, "Cannot get stack trace");

                // Look up this entry.
                tinfo = lookupOrEnter(jvmti, &trace, flavor);
            }
        } else {
            // If thread==NULL, it's assumed this is before VM_START.
            // But technically this should not happen, no tracking yet.
            if ( flavor == TRACE_USER ) {
                tinfo = emptyTrace(TRACE_BEFORE_VM_START);
            } else {
                tinfo = emptyTrace(flavor);
            }
        }
        return tinfo;
    }
If thread==NULL, that usually means that the VM initialization has not yet occurred, so you cannot get a stack trace. Calling GetStackTrace could also return JVMTI_ERROR_WRONG_PHASE, an error code indicating that the VM is not in the live phase, whereas a value of JVMTI_ERROR_NONE indicates that you did get a stack trace. Performing a lookupOrEnter of this stack trace into the hash table will return a reference to a TraceInfo structure. This pointer to a TraceInfo struct will then be used as the tag on this object. All objects allocated from the same stack trace will have the same tag.
When saving away stack trace information, pay attention to the TraceInfo struct and the number of stack traces received. Each allocation byte code (new or new array byte code) should produce only one distinct stack trace, but how many distinct stack traces will there be in total? It depends on the application, but it's somewhat limited. Also, no cleanup is occurring in lookupOrEnter, so a very long-running application could experience some problems if the total number of allocation traces is very high. The hash table is also a fixed size, another potential problem. To avoid performance issues, it is crucial to be aware of the critical sections, one of which is in the lookupOrEnter() function. Using too many critical sections could dramatically slow the entire application. The thread used here is the current user thread; limit what you are doing in these threads.
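The lookup described above can be sketched as a fixed-size, chained hash table. This is a simplified illustration under stated assumptions rather than the demo's code: frames are reduced to plain integers instead of JVM TI frame records, the struct layouts, the MAX_FRAMES value, the bucket count, and the hash function are invented for the example, and the jvmti and flavor parameters of the real lookupOrEnter are dropped. Locking is omitted, so a real agent would have to wrap the whole function in its critical section.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_FRAMES 6
#define HASH_BUCKET_COUNT 4096   /* fixed size, as the text notes */

/* Simplified stand-ins: frames are plain integers, not JVM TI frame info. */
typedef struct Trace {
    int  nframes;
    long frames[MAX_FRAMES];
} Trace;

typedef struct TraceInfo {
    Trace             trace;
    int               useCount;
    struct TraceInfo *next;      /* chain within one bucket */
} TraceInfo;

static TraceInfo *hashBuckets[HASH_BUCKET_COUNT];

/* Fold the frame values into a bucket index. */
static unsigned hashTrace(const Trace *t)
{
    unsigned h = 0;
    for (int i = 0; i < t->nframes; i++) {
        h = h * 31u + (unsigned)t->frames[i];
    }
    return h % HASH_BUCKET_COUNT;
}

/* Two traces match when they have the same frames in the same order. */
static int sameTrace(const Trace *a, const Trace *b)
{
    return a->nframes == b->nframes &&
           memcmp(a->frames, b->frames,
                  (size_t)a->nframes * sizeof(a->frames[0])) == 0;
}

/* Return the existing entry for this trace, or create one.
 * A real agent must hold its lock around this whole function. */
static TraceInfo *lookupOrEnter(const Trace *t)
{
    unsigned h = hashTrace(t);
    for (TraceInfo *p = hashBuckets[h]; p != NULL; p = p->next) {
        if (sameTrace(&p->trace, t)) {
            return p;
        }
    }
    TraceInfo *fresh = calloc(1, sizeof(*fresh));
    if (fresh == NULL) {
        return NULL;             /* caller decides how to handle this */
    }
    fresh->trace = *t;
    fresh->next = hashBuckets[h];
    hashBuckets[h] = fresh;      /* entries are never removed */
    return fresh;
}
```

The sketch makes both concerns from the text visible: entries are only ever added, and once the number of distinct traces greatly exceeds the bucket count, the chains grow and every lookup inside the critical section gets slower.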
In this particular agent, the object allocation and object free event callbacks are mostly free of critical sections.
JVMTI_EVENT_VM_INIT
After the VM_START event and approximately several hundred class load events, the VM will reach the fully initialized event.
    static void JNICALL
    cbVMInit(jvmtiEnv *jvmti, JNIEnv *env, jthread thread)
    {
        jvmtiError error;

        // Iterate over the entire heap and tag untagged objects.
        error = (*jvmti)->IterateOverHeap(jvmti, JVMTI_HEAP_OBJECT_UNTAGGED,
                                          &cbObjectTagger, NULL);

        enterCriticalSection(jvmti); {
            gdata->vmInitialized = JNI_TRUE;
        } exitCriticalSection(jvmti);
    }
Despite full initialization and setting gdata->vmInitialized, many objects were allocated but were not tracked because the Tracker classes are not turned on until the agent gets the VMStart event. Using the JVM TI IterateOverHeap to traverse the heap, objects can now be tagged. Remember: You cannot track objects unless you tag them.
JVMTI_EVENT_OBJECT_FREE
    static void JNICALL
    cbObjectFree(jvmtiEnv *jvmti, jlong tag)
    {
        TraceInfo *tinfo;

        // Don't bother if dead.
        if ( gdata->vmDead ) {
            return;
        }

        // The object tag is actually a pointer to a TraceInfo struct.
        tinfo = (TraceInfo*)(void*)(ptrdiff_t)tag;

        // Decrement the use count.
        tinfo->useCount--;
    }
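The cast in cbObjectFree relies on a pointer surviving a round trip through a 64-bit tag. The following standalone sketch shows that round trip in isolation. It uses int64_t as a stand-in for jlong so that it compiles without the JDK headers, and the helper names tagFromTraceInfo and traceInfoFromTag are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

typedef int64_t Tag64;           /* stands in for jlong, a 64-bit integer */

typedef struct TraceInfo {
    int useCount;
} TraceInfo;

/* Pointer -> tag: go through intptr_t so 32-bit pointers widen cleanly. */
static Tag64 tagFromTraceInfo(TraceInfo *tinfo)
{
    return (Tag64)(intptr_t)tinfo;
}

/* Tag -> pointer: the inverse conversion. */
static TraceInfo *traceInfoFromTag(Tag64 tag)
{
    return (TraceInfo *)(intptr_t)tag;
}
```

Because a tag of zero means that an object is untagged, a NULL TraceInfo pointer must never be used as a tag. The pointed-to struct also has to stay valid for as long as any tagged object can still be freed, which is one reason the agent never removes TraceInfo entries.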
JVMTI_EVENT_VM_OBJECT_ALLOC
    static void JNICALL
    cbVMObjectAlloc(jvmtiEnv *jvmti, JNIEnv *env, jthread thread,
                    jobject object, jclass object_klass, jlong size)
    {
        TraceInfo *tinfo;

        // Don't bother if dead.
        if ( gdata->vmDead ) {
            return;
        }

        // Create a stack trace and tag the object.
        tinfo = findTraceInfo(jvmti, thread, TRACE_VM_OBJECT);
        tagObjectWithTraceInfo(jvmti, object, tinfo);
    }
JVMTI_EVENT_VM_DEATH
As the name implies, the JVM TI event named VM death is the last VM event. However, due to multiple threads, other event callbacks could still be in progress during this event callback. Depending on thread priorities, it’s hard to predict the timing of the code in these final callbacks. Some agents use a lock and a counter to keep track of the active callbacks, then wait in this VM death callback until the count reaches zero. Be careful here, and don’t assume that all event callbacks are completed.
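The lock-and-counter technique mentioned above can be sketched as follows. This is an illustration, not code from heapTracker: it uses C11 atomics in place of a JVM TI raw monitor so that it stands alone, and the function names callbackEnter, callbackExit, and beginShutdown are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int  activeCallbacks;  /* callbacks currently running */
static atomic_bool vmDead;           /* set once VM death begins */

/* Call at the top of every event callback. Returns false if the callback
 * should return immediately because the VM is shutting down. */
static bool callbackEnter(void)
{
    /* Count first, then check the flag, so the shutdown path can never
     * miss a callback that is about to start working. */
    atomic_fetch_add(&activeCallbacks, 1);
    if (atomic_load(&vmDead)) {
        atomic_fetch_sub(&activeCallbacks, 1);
        return false;
    }
    return true;
}

/* Call on every exit path of a callback that entered successfully. */
static void callbackExit(void)
{
    atomic_fetch_sub(&activeCallbacks, 1);
}

/* Call from the VM death callback: refuse new work, then report whether
 * all in-flight callbacks have drained. A real agent would wait on a
 * monitor, ideally with a timeout, rather than poll this value. */
static bool beginShutdown(void)
{
    atomic_store(&vmDead, true);
    return atomic_load(&activeCallbacks) == 0;
}
```

The ordering inside callbackEnter is the important design choice. If the flag were checked before the count was raised, a callback could pass the check, be preempted, and start touching agent data after the death callback had already seen a count of zero.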
    static void JNICALL
    cbVMDeath(jvmtiEnv *jvmti, JNIEnv *env)
    {
        jvmtiError error;

        // IterateOverHeap can see garbage, so force a GC first.
        error = (*jvmti)->ForceGarbageCollection(jvmti);

        // Notice that you hold no locks on this call, that's important.
        error = (*jvmti)->IterateOverHeap(jvmti, JVMTI_HEAP_OBJECT_EITHER,
                                          &cbObjectSpaceCounter, NULL);

        enterCriticalSection(jvmti); {
            jclass klass;
            jfieldID field;
            jvmtiEventCallbacks callbacks;

            // Find the heap tracker class.
            klass = (*env)->FindClass(env, STRING(HEAP_TRACKER_class));

            // Get the static "engaged" field.
            field = (*env)->GetStaticFieldID(env, klass,
                                             STRING(HEAP_TRACKER_engaged), "I");

            // Set the engaged field to "0," turns off BCI calls in
            // Tracker class.
            (*env)->SetStaticIntField(env, klass, field, 0);

            // Clear the callbacks struct and clear the JVM TI callbacks.
            (void)memset(&callbacks, 0, sizeof(callbacks));
            error = (*jvmti)->SetEventCallbacks(jvmti, &callbacks,
                                                (jint)sizeof(callbacks));

            // Consider the VM dead at this point.
            gdata->vmDead = JNI_TRUE;

            if ( gdata->traceInfoCount > 0 ) {
                TraceInfo **list;
                int count;
                int i;

                // Allocate space for a sorted list of TraceInfos.
                stdout_message("Dumping heap trace information\n");
                list = (TraceInfo**)calloc(gdata->traceInfoCount,
                                           sizeof(TraceInfo*));
                count = 0;
                for ( i = 0 ; i < HASH_BUCKET_COUNT ; i++ ) {
                    TraceInfo *tinfo;
                    tinfo = gdata->hashBuckets[i];
                    while ( tinfo != NULL ) {
                        if ( count < gdata->traceInfoCount ) {
                            list[count++] = tinfo;
                        }
                        tinfo = tinfo->next;
                    }
                }

                // Sort the list and print out the top ones.
                qsort(list, count, sizeof(TraceInfo*), &compareInfo);
                for ( i = 0 ; i < count ; i++ ) {
                    if ( i >= gdata->maxDump ) {
                        break;
                    }
                    printTraceInfo(jvmti, i+1, list[i]);
                }

                // Free the space you allocated.
                (void)free(list);
            }
        } exitCriticalSection(jvmti);
    }
To summarize, first use JVM TI to force garbage collection in order to iterate over the heap and get a count, per stack trace, of the objects currently allocated. Next, turn the Tracker class off and disconnect all the JVM TI callbacks, keeping in mind that some callbacks may still be active and that you have turned off any future callbacks by removing their addresses from the JVM TI environment. You can accomplish the same thing by disabling the events. Finally, construct a single-dimensioned list of all the TraceInfo structures, sort it by allocation amount, and print out up to gdata->maxDump of the stack traces that allocated the most memory.
总之,首先使用JVM TI强制垃圾收集,以便遍历堆并获得当前分配的对象的每个堆栈跟踪的计数。接下来,关闭Tracker类并断开所有JVM TI回调,请记住一些回调可能仍然是活动的,并且您已经通过从JVM TI环境中删除它们的地址来关闭任何未来的回调。您可以通过禁用事件来实现同样的目的。最后,构造一个包含所有TraceInfo结构的一维列表,按分配量对其排序,并输出分配最多内存的堆栈跟踪直到gdata->maxDump。
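The lock-and-counter approach mentioned for the VM death callback is not part of the listing above, so here is a minimal sketch of what it might look like. All names (callbackEnter, callbackExit, waitForCallbacks) are hypothetical, and POSIX threads stand in for the JVM TI raw monitor functions (RawMonitorEnter, RawMonitorWait, RawMonitorNotifyAll) that a real agent would more likely use.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical names throughout; pthreads stand in for JVM TI raw monitors. */
pthread_mutex_t callbackLock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  callbackDone = PTHREAD_COND_INITIALIZER;
int activeCallbacks = 0;  /* callbacks currently executing */
int vmDead = 0;           /* set as soon as VM death processing starts */

/* Call at the top of every event callback.
 * Returns 1 if the callback may proceed, 0 if the VM is already dying. */
int callbackEnter(void) {
    int admitted;
    pthread_mutex_lock(&callbackLock);
    admitted = !vmDead;
    if (admitted) {
        activeCallbacks++;
    }
    pthread_mutex_unlock(&callbackLock);
    return admitted;
}

/* Call at the bottom of every callback that callbackEnter() admitted. */
void callbackExit(void) {
    pthread_mutex_lock(&callbackLock);
    assert(activeCallbacks > 0);
    if (--activeCallbacks == 0) {
        pthread_cond_broadcast(&callbackDone);  /* wake the VM death waiter */
    }
    pthread_mutex_unlock(&callbackLock);
}

/* Call from the VM death callback: refuse new callbacks first,
 * then block until the ones already running have finished. */
void waitForCallbacks(void) {
    pthread_mutex_lock(&callbackLock);
    vmDead = 1;
    while (activeCallbacks > 0) {
        pthread_cond_wait(&callbackDone, &callbackLock);
    }
    pthread_mutex_unlock(&callbackLock);
}
```

Setting the flag before waiting matters in this sketch: otherwise new callbacks could keep arriving and the count might never reach zero.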
Object Tagging
As you can see, working with VM agent code can be challenging. Look at the higher-level view to get a better grasp. This example code, called heapTracker, exists to track all object allocations in the heap, saving the stack trace where each object was allocated. Using BCI, this agent incorporates additional byte codes around the object allocations to capture the stack trace and tag the objects that were allocated with that stack trace. As the VM executes your byte code, it also executes the byte code that was added, calling the Tracker methods, which will then call the native methods that are registered for the Tracker class. Native methods create a TraceInfo struct and tag the object with that struct address.
如您所见,使用VM代理代码具有挑战性。查看更高级别的视图以更好地理解。这个示例代码称为heapTracker,用于跟踪堆中的所有对象分配,保存每个对象分配位置的堆栈跟踪。使用BCI,该代理在对象分配周围合并了额外的字节代码,以捕获堆栈跟踪并标记使用该堆栈跟踪分配的对象。当VM执行您的字节代码时,它也会执行添加的字节代码,调用Tracker方法,然后Tracker方法将调用为Tracker类注册的本机方法。本机方法创建一个TraceInfo结构体,并用该结构体地址标记对象。
Objects that have non-zero tags are treated differently. You need to tag every object you care about, which in this case means all objects. Only tagged objects are reported in a JVMTI_EVENT_OBJECT_FREE event, so tagging is the only way to learn when an object has been freed and, by extension, which objects are still allocated. Identifying each object uniquely would require a unique tag value for every object. You could, for example, tag each object with an integer counter, but then all you would have is the counter, which records only the order in which objects were allocated. You could capture more specific data if that counter were used to index additional data about an object. You could use the counter technique to track details about every 10th object, for example, or perhaps the last 1000 objects allocated. There are many possibilities.
具有非零标记的对象将被区别对待。您需要为所关心的每个对象打上标记,在本例中就是所有对象。只有带标记的对象才会在JVMTI_EVENT_OBJECT_FREE事件中被报告,因此标记是得知对象何时被释放、进而得知哪些对象当前仍处于分配状态的唯一方式。要唯一标识每个对象,就需要为每个对象提供唯一的标记值。例如,您可以用一个整数计数器标记对象,但这样您得到的只有这个计数器,它只能表示对象被分配的先后顺序。不过,如果用该计数器去索引关于对象的其他数据,就可以捕获更具体的数据。例如,您可以使用计数器技术跟踪每第10个对象的细节,或者最近分配的1000个对象。有很多可能性。
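As an illustration of the counter technique, the following sketch uses the tag as an index into a fixed-size table that remembers only the most recent allocations. It is not taken from heapTracker; AllocRecord, recordAllocation, and findAllocation are hypothetical names, and the sketch ignores locking, which a real multithreaded agent would need.

```c
#include <assert.h>
#include <stddef.h>

#define RECENT_SLOTS 1000  /* how many recent allocations to remember */

/* Hypothetical per-object record; a real agent might store a TraceInfo pointer. */
typedef struct {
    long long tag;   /* tag given to the object, 0 while the slot is unused */
    long long size;  /* object size in bytes, as reported to the agent */
} AllocRecord;

AllocRecord recent[RECENT_SLOTS];
long long nextTag = 1;  /* tags start at 1 because 0 means "untagged" */

/* Record a new allocation and return the tag the object should receive. */
long long recordAllocation(long long size) {
    long long tag;
    AllocRecord *slot;

    assert(size >= 0);
    tag = nextTag++;
    slot = &recent[tag % RECENT_SLOTS];
    slot->tag = tag;    /* overwrites the record written RECENT_SLOTS tags ago */
    slot->size = size;
    return tag;
}

/* Look up the record for a tag; NULL if untagged or already overwritten. */
const AllocRecord *findAllocation(long long tag) {
    const AllocRecord *slot;

    if (tag <= 0) {
        return NULL;
    }
    slot = &recent[tag % RECENT_SLOTS];
    return slot->tag == tag ? slot : NULL;
}
```

The value returned by recordAllocation() is what the agent would pass to SetTag. When an object-free event later reports a tag, findAllocation() shows whether details for that object are still being kept.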
A fairly common and low-impact shortcut when using JDK 6 is to tag only the actual Class objects and then use a JVM TI call to FollowReferences (new in JDK 6) to quickly sum up the object counts based solely on the type of classes they are. In this case, BCI is not needed, but you won’t know where the actual objects were allocated, just that they were allocated. Tagging only the Class objects creates very little overhead.
在使用JDK 6时,一种相当常见且影响较小的快捷方式是仅标记实际的Class对象,然后使用JVM TI调用FollowReferences (JDK 6中的新功能)来仅根据类的类型快速汇总对象计数。在这种情况下,不需要BCI,但是您不知道实际对象被分配到哪里,只知道它们被分配了。只标记Class对象只会产生很少的开销。
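The bookkeeping behind this shortcut can be sketched in a few lines. The assumption here is that each Class object has been tagged with a small integer, and that the heap callback passed to FollowReferences receives the tag of each object's class along with the object's size. The names ClassTotals and countObject are hypothetical, and the function shows only what the body of such a callback might do.

```c
#include <assert.h>

#define MAX_CLASSES 4096  /* assumed upper bound on the number of tagged classes */

typedef struct {
    long long count;  /* instances seen */
    long long bytes;  /* total size of those instances */
} ClassTotals;

/* Indexed by class tag; slot 0 collects objects whose class is untagged. */
ClassTotals totals[MAX_CLASSES];

/* Hypothetical body of a heap callback: tally one object by its class tag. */
void countObject(long long classTag, long long size) {
    assert(size >= 0);
    if (classTag < 0 || classTag >= MAX_CLASSES) {
        classTag = 0;  /* out-of-range tags are lumped in with untagged classes */
    }
    totals[classTag].count++;
    totals[classTag].bytes += size;
}
```

A real callback would probably also need to guard against counting the same object twice when it is reached through more than one reference, for example by tagging objects as they are visited.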
The VM’s garbage collector manages allocations by compacting, rearranging, and doing whatever is necessary to reclaim space and provide the space needed for allocations. In the process, objects get moved, which is why having a particular address in the process memory is not very helpful and why tags are used. If you want access to a tagged object, you can get the JNI jobject handle to an object with GetObjectsWithTags, and you can use any of the JNI or JVM TI calls to access that object through that JNI handle.
VM的垃圾收集器通过压缩、重新安排和执行任何必要的回收空间和提供分配所需的空间来管理分配。在进程中,对象会被移动,这就是为什么在进程内存中使用特定地址没有多大帮助,以及为什么要使用标记。如果希望访问带标记的对象,可以使用GetObjectsWithTags获取对象的JNI jobject句柄,并且可以使用任何JNI或JVM TI调用通过该JNI句柄访问该对象。
So how do you tag an object? There are multiple ways. You can use the explicit SetTag interface, and you can also simply assign the tag during some of the callbacks from interfaces such as IterateOverHeap. Both of these tagging mechanisms are used in the heapTracker.c example.
那么如何标记一个对象呢?有多种方法。您可以使用显式SetTag接口,也可以在从接口(如IterateOverHeap)回调的某些过程中简单地分配标记。在heapTracker.c示例中使用了这两种标记机制。
BCI and BCI Events
This example agent uses a native BCI library called java_crw_demo (libjava_crw_demo.so or java_crw_demo.dll) that is available as part of the JDK downloads in the demo/jvmti/java_crw_demo directory.
这个示例代理使用一个名为java_crw_demo的本机BCI库(libjava_crw_demo.so或java_crw_demo.dll),该库作为JDK下载的一部分,位于demo/jvmti/java_crw_demo目录下。
The java_crw_demo function provides for very simple byte-code instrumentation (BCI). It’s acceptable for some very basic needs such as these JVM TI demo agents and tools such as HPROF, but it has its limitations. The java_crw_demo function provides basic instrumentation of method entries, method returns, new byte codes, and new array byte codes. It’s written in C and has been used in several of the JVM TI demo agents and in HPROF. The byte codes injected are limited to simple dups and invokestatic byte codes to the methods of a Tracker class.
java_crw_demo函数提供了非常简单的字节码插装(BCI)。对于一些非常基本的需求(如JVM TI演示代理和HPROF等工具),它是可以接受的,但它有其局限性。java_crw_demo函数提供了方法条目、方法返回、新字节码和新数组字节码的基本插装。它是用C语言编写的,已经在几个JVM TI演示代理和HPROF中使用过。注入的字节码被限制为简单的dup,而invokestatic字节码被限制为Tracker类的方法。
Let’s look once more at the HeapTracker class:
public class HeapTracker {

    // The static field that controls tracking
    private static int engaged = 0;

    // Calls to this method will result in a call into the agent.
    private static native void _newobj(Object thread, Object o);

    // Calls to this method are injected into the class files.
    public static void newobj(Object o) {
        if ( engaged != 0 ) {
            _newobj(Thread.currentThread(), o);
        }
    }

    // Calls to this method will result in a call into the agent.
    private static native void _newarr(Object thread, Object a);

    // Calls to this method are injected into the class files.
    public static void newarr(Object a) {
        if ( engaged != 0 ) {
            _newarr(Thread.currentThread(), a);
        }
    }
}
As you can see, there isn’t much to this class. The methods newobj() and newarr() will be called from the injected byte code of the application, which in turn will call the native methods, which have been registered as the functions HEAP_TRACKER_native_newobj() and HEAP_TRACKER_native_newarr() inside the native agent library. Of course, having the newobj() and newarr() methods call native methods was an implementation choice, and the developer used it in this case as the quickest way to get back into the native agent library.
如您所见,这个类没有太多内容。方法newobj()和newarr()将从应用程序注入的字节代码中调用,然后调用本机方法,本机方法已注册为本机代理库中的函数HEAP_TRACKER_native_newobj()和HEAP_TRACKER_native_newarr()。当然,使用newobj()和newarr()方法调用本机方法是一种实现选择,在这种情况下,开发人员将其用作返回本机代理库的最快方法。
The newobj() method needs to be called only in the <init> method of java.lang.Object. Though it is necessary to adjust the stack trace to compensate for the few additional frames, it is an accurate accounting of all objects that are allocated and initialized. The byte-code injection is just a dup and an invokestatic byte-code insertion, along with the necessary constant pool entries for the Tracker class name and newobj() method name. The VM specification does not allow an object to be passed anywhere before it is initialized, so injecting byte codes after every new byte code will trigger verification errors when the object is passed into newobj(). An alternative is to find the new byte codes, add the dup after each, and then insert the invokestatic byte code after the matching <init> method call for the specific class. Some profilers that use BCI may do this, but this small demo does not.
newobj()方法只需要在java.lang.Object的<init>方法中调用。虽然有必要调整堆栈跟踪以补偿少数额外的帧,但它是分配和初始化的所有对象的准确计算。字节码注入只是一个dup和一个invokestatic字节码插入,以及Tracker类名和newobj()方法名所需的常量池条目。VM规范不允许对象在初始化之前传递到任何地方,因此在每个新的字节码之后注入字节码将在对象传递到newobj()时触发验证错误。另一种方法是找到新的字节码,在每个字节码之后添加dup,然后在特定类的匹配方法调用之后插入invokestatic字节码。一些使用BCI的分析器可能会这样做,但这个小演示不会。
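To make the "dup plus invokestatic" injection concrete, here is a minimal sketch of the four bytes involved. It is not the java_crw_demo code; emitTrackerCall is a hypothetical helper, and cpIndex is assumed to be the index of a constant pool method reference for the Tracker method that has already been added to the class file.

```c
#include <assert.h>
#include <stddef.h>

enum {
    OPC_DUP          = 0x59,  /* duplicate the top operand stack value */
    OPC_INVOKESTATIC = 0xb8   /* invoke a static method by constant pool index */
};

/* Write "dup; invokestatic #cpIndex" into out.
 * Returns the number of bytes written (4), or 0 if the buffer is too small. */
size_t emitTrackerCall(unsigned char *out, size_t capacity, unsigned cpIndex) {
    if (capacity < 4) {
        return 0;
    }
    assert(cpIndex > 0 && cpIndex <= 0xffff);  /* index 0 is never a valid entry */
    out[0] = OPC_DUP;
    out[1] = OPC_INVOKESTATIC;
    out[2] = (unsigned char)(cpIndex >> 8);    /* class files are big-endian */
    out[3] = (unsigned char)(cpIndex & 0xff);
    return 4;
}
```

Emitting the bytes is the easy part. Inserting them shifts every later instruction, so branch offsets, exception tables, and line number tables must be adjusted, and the method's maximum stack depth has to grow by one to make room for the duplicated reference.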
For more details about how to modify class files, see the source code to the java_crw_demo library by downloading either JDK 5.0 or JDK 6.
有关如何修改类文件的详细信息,请参见下载JDK 5.0或JDK 6的java_crw_demo库的源代码。
You can use several BCI class libraries. The JDK provides a way for developers to write pure Java technology-based agents using the java.lang.instrument classes and the -javaagent option in JDK 5.0. So you are not limited to writing native C or C++ code when doing BCI.
您可以使用多个BCI类库。JDK为开发人员提供了一种方法,可以使用JDK 5.0中的java.lang.instrument类和-javaagent选项编写纯Java技术的代理。因此,在进行BCI时,您不局限于编写本机C或C++代码。
Conclusion: Countless Solutions to Countless Problems
This article has focused on a particular application of JVM TI for a particular purpose. However, just as there are countless solutions to problems, there are countless varieties of agents that you can write. In general, you will need to experiment and take time to produce a good solution and a robust agent. To gain better insight into your agent's performance, use native tools such as those available in the Solaris 10 OS, DTrace, or Sun Studio Performance Analyzer, a tool that helps assess code performance, identify potential performance problems, and locate where problems occur inside the code.
本文主要介绍JVM TI用于特定目的的特定应用。然而,正如问题有无数种解决方案一样,也有无数种可以编写的代理。一般来说,您需要反复试验并花时间才能得到一个好的解决方案和一个健壮的代理。要更好地了解代理的性能,请使用Solaris 10操作系统中提供的工具、DTrace或Sun Studio Performance Analyzer等本机工具,后者有助于评估代码性能、识别潜在的性能问题,并定位代码中发生问题的位置。