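Author: 王智通
1. Introduction
ajvm is a Java virtual machine that I am currently developing. My goal in writing it is to help programmers understand how a JVM is actually implemented. It is the first open-source Java virtual machine project in China: https://github.com/cloudsec/ajvm. I am also sharing its development notes on ata. The previous four notes covered the class file loader, the disassembler, and the JVM crash-information handler, and ajvm can already run simple Java code. This note begins to describe how ajvm's memory management module is written.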
2. Memory Allocation
Consider the following Java code:
public class test6 {
    public static void main(String args[]) {
        int[] data, data1;
        int i;
        int num = 0;

        data = new int[2];
        for (i = 0; i < 2; i++) {
            data[i] = i;
        }
        data1 = new int[3];
    }
}
First compile it with javac, then inspect the bytecode with ajvm's disassembler:
$ ./wvm -d test/test6.class
Diassember bytecode:

<init> ()V
stack: 1 local: 1
0: aload_0
1: invokespecial #1
4: return

main ([Ljava/lang/String;)V
stack: 3 local: 5
0: iconst_0
1: istore 4
3: iconst_2
4: newarray 10
6: astore_1
7: iconst_0
8: istore_3
9: iload_3
10: iconst_2
11: if_icmpge 13
14: aload_1
15: iload_3
16: iload_3
17: iastore
18: iinc 3 1
21: goto 0xfffffff4
24: iconst_3
25: newarray 10
27: astore_2
28: return
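The two branch operands in this listing are relative offsets, not absolute addresses: if_icmpge prints 13, and goto prints 0xfffffff4, which is -12 shown as an unsigned 32-bit value. The JVM specification builds a signed 16-bit offset from the two operand bytes and adds it to the address of the branch opcode itself. The following is a minimal sketch of that calculation; the function names are my own for illustration and are not taken from the ajvm source.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the two operand bytes (big-endian) into a signed 16-bit offset. */
static int branch_offset(uint8_t hi, uint8_t lo)
{
    int v = (hi << 8) | lo;

    if (v >= 0x8000)        /* sign bit set: the offset is negative */
        v -= 0x10000;
    return v;
}

/* Target = address of the branch opcode + signed offset. */
static int branch_target(int opcode_pc, uint8_t hi, uint8_t lo)
{
    return opcode_pc + branch_offset(hi, lo);
}

/* Checks against the test6 listing. */
static void check_test6_branches(void)
{
    /* goto at 21 with operand bytes 0xff 0xf4 jumps back to iload_3 at 9. */
    assert(branch_target(21, 0xff, 0xf4) == 9);
    /* if_icmpge at 11 with operand bytes 0x00 0x0d jumps to iconst_3 at 24. */
    assert(branch_target(11, 0x00, 0x0d) == 24);
}
```

So the loop body spans offsets 9 to 21, and the loop exits to offset 24, where data1 is allocated.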
The statement data = new int[2]; in the source corresponds to this bytecode instruction:
4: newarray 10
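According to the JVM specification, the newarray instruction pops the element count of the data array from the operand stack, then uses the type operand that follows the opcode to compute how much memory to request. The specification defines the type values as follows: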
#define T_BOOLEAN 4
#define T_CHAR    5
#define T_FLOAT   6
#define T_DOUBLE  7
#define T_BYTE    8
#define T_SHORT   9
#define T_INT     10
#define T_LONG    11
So 10 means this is an int array. The next step is to allocate memory for data from the heap.
void *alloc_newarray_memroy(u1 atype, int count)
{
    void *addr = NULL;

    switch (atype) {
    case T_BOOLEAN:
    case T_BYTE:
        /* 1-byte elements */
        addr = (void *)slab_alloc(jvm_thread_mem, count * sizeof(char));
        break;
    case T_CHAR:    /* a Java char is 16 bits wide, the same as short */
    case T_SHORT:
        addr = (void *)slab_alloc(jvm_thread_mem, count * sizeof(short));
        break;
    case T_INT:
    case T_FLOAT:
        /* 4-byte elements */
        addr = (void *)slab_alloc(jvm_thread_mem, count * sizeof(int));
        break;
    case T_LONG:
    case T_DOUBLE:
        /* 8-byte elements */
        addr = (void *)slab_alloc(jvm_thread_mem, count * sizeof(long long));
        break;
    default:
        error("bad atype value.\n");
        return NULL;
    }

    return addr;
}
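The size calculation can also be separated from the allocation so that bad input is rejected before the heap is touched. The JVM specification says newarray throws NegativeArraySizeException when the count is negative; the sketch below reports that case, an unknown atype, and a multiplication overflow as errors. This is only an illustration under my own assumptions: atype_elem_size and newarray_bytes are hypothetical names, and the element widths are the Java value widths (2 bytes for char), not necessarily what ajvm stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define T_BOOLEAN 4
#define T_CHAR    5
#define T_FLOAT   6
#define T_DOUBLE  7
#define T_BYTE    8
#define T_SHORT   9
#define T_INT     10
#define T_LONG    11

/* Element size in bytes for a newarray atype, or 0 if atype is invalid. */
static size_t atype_elem_size(uint8_t atype)
{
    switch (atype) {
    case T_BOOLEAN: case T_BYTE:   return 1;
    case T_CHAR:    case T_SHORT:  return 2;
    case T_INT:     case T_FLOAT:  return 4;
    case T_LONG:    case T_DOUBLE: return 8;
    default:                       return 0;
    }
}

/* Stores the byte count in *out and returns 0, or returns -1 on bad input. */
static int newarray_bytes(uint8_t atype, int32_t count, size_t *out)
{
    size_t elem = atype_elem_size(atype);

    if (elem == 0 || count < 0)
        return -1;                      /* bad atype or negative size */
    if ((size_t)count > SIZE_MAX / elem)
        return -1;                      /* count * elem would overflow */
    *out = (size_t)count * elem;
    return 0;
}

/* The two arrays from test6: new int[2] and new int[3]. */
static void check_test6_sizes(void)
{
    size_t n = 0;

    assert(newarray_bytes(T_INT, 2, &n) == 0 && n == 8);
    assert(newarray_bytes(T_INT, 3, &n) == 0 && n == 12);
}
```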
ajvm's heap uses the slab algorithm. The slab memory layout looks like this:
-------     ------     ------     ------
|cache| --> |slab| --> |slab| --> |slab|
-------     ------     ------     ------
|cache|
-------
|cache| ...
-------     ------     ------     ------
|cache| --> |slab| --> |slab| --> |slab|
-------     ------     ------     ------
|cache| ...
-------
|cache|
-------     ------     ------     ------
|cache| --> |slab| --> |slab| --> |slab|
-------     ------     ------     ------
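slab.c in the source tree is the complete implementation. Readers who are not familiar with slab allocation can find plenty of material by searching online.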
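For readers who want a feel for the idea before reading slab.c, here is a deliberately small sketch of one cache with a chain of slabs, matching one row of the diagram. It is not ajvm's slab.c: the struct and function names, the fixed OBJS_PER_SLAB, and the use of malloc as the backing store are all my own assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define OBJS_PER_SLAB 8         /* assumption: tiny slabs keep the demo small */

struct slab {
    struct slab *next;          /* next slab in the same cache */
    void *free_list;            /* free objects, linked through their first word */
    size_t in_use;              /* objects currently handed out */
    unsigned char *mem;         /* backing storage for the objects */
};

struct cache {
    size_t obj_size;            /* every object in this cache has this size */
    struct slab *slabs;         /* cache --> slab --> slab --> ... */
};

static void cache_init(struct cache *c, size_t obj_size)
{
    size_t a = sizeof(void *);

    /* A free object must hold a pointer, so round up to a pointer multiple. */
    if (obj_size < a)
        obj_size = a;
    c->obj_size = (obj_size + a - 1) / a * a;
    c->slabs = NULL;
}

/* Add one slab to the cache and thread all its objects onto the free list. */
static struct slab *slab_grow(struct cache *c)
{
    struct slab *s = malloc(sizeof(*s));

    if (!s)
        return NULL;
    s->mem = malloc(c->obj_size * OBJS_PER_SLAB);
    if (!s->mem) {
        free(s);
        return NULL;
    }
    s->free_list = NULL;
    s->in_use = 0;
    for (size_t i = 0; i < OBJS_PER_SLAB; i++) {
        void *obj = s->mem + i * c->obj_size;
        *(void **)obj = s->free_list;
        s->free_list = obj;
    }
    s->next = c->slabs;
    c->slabs = s;
    return s;
}

static void *cache_alloc(struct cache *c)
{
    struct slab *s;
    void *obj;

    for (s = c->slabs; s; s = s->next)      /* first slab with a free object */
        if (s->free_list)
            break;
    if (!s && !(s = slab_grow(c)))
        return NULL;
    obj = s->free_list;
    s->free_list = *(void **)obj;
    s->in_use++;
    return obj;
}

static void cache_free(struct cache *c, void *obj)
{
    uintptr_t p = (uintptr_t)obj;

    for (struct slab *s = c->slabs; s; s = s->next) {
        uintptr_t lo = (uintptr_t)s->mem;
        if (p >= lo && p < lo + c->obj_size * OBJS_PER_SLAB) {
            *(void **)obj = s->free_list;   /* push back onto the owning slab */
            s->free_list = obj;
            s->in_use--;
            return;
        }
    }
}

/* A freed object is the next one handed out again. */
static void cache_demo(void)
{
    struct cache c;
    void *a, *b;

    cache_init(&c, sizeof(int) * 2);
    a = cache_alloc(&c);
    b = cache_alloc(&c);
    assert(a && b && a != b);
    cache_free(&c, a);
    assert(cache_alloc(&c) == a);
}
```

The point of the design is that allocating and freeing a fixed-size object is just a push or pop on a free list. A real allocator keeps several caches for different object sizes and returns empty slabs to the system, which this sketch does not attempt.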
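3. Garbage Collection
GC is something Java programmers generally care about. When memory runs short, the JVM's garbage collection mechanism is triggered.
ajvm uses the most basic approach, reference counting, which requires a new data structure: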
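typedef struct jvm_object {
    int ref_count;              /* number of recorded references */
    CLASS *class;
    void *addr;                 /* start address of the object's memory */
    int size;
    struct list_head list;      /* links the object into jvm_obj_list_head */
} JVM_OBJECT;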
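Once memory has been allocated for an array, a new JVM_OBJECT is created for it: ref_count is initialized to 0, addr points to the start of the array, and size records the size of the array. The JVM_OBJECT is then added to the jvm_obj_list_head list, which is used later during garbage collection.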
int jvm_interp_newarray(u2 len, char *symbol, void *base)
{
    ...
    addr = (void *)alloc_newarray_memroy(atype, count);
    if (!addr) {
        error("slab alloc failed.\n");
        return -1;
    }
    printf("addr: %p\n", addr);

    /* Track the new array so that the GC can find it later. */
    new_obj = create_new_obj(addr, count);
    if (!new_obj) {
        error("create new obj failed.\n");
        return -1;
    }
    ...
}
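When an array is referenced, we use its address to find it in the JVM_OBJECT list and increment its ref_count, which marks the array as in use. Take the instruction shown earlier: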
17: iastore
This instruction references the data array, so all we need to do in the interpreter code for iastore is increment the ref_count that belongs to data:
int jvm_interp_iastore(u2 len, char *symbol, void *base)
{
    int *addr, index, value;

    if (jvm_arg->disass_class) {
        printf("%s\n", symbol);
        return 0;
    }

    /* Operands come off the stack in reverse push order: value, index, arrayref. */
    pop_operand_stack(int, value)
    pop_operand_stack(int, index)
    pop_operand_stack(int, addr)

    printf("addr: %p\tindex: %d\t%d\n", (void *)addr, index, value);
    *(int *)(addr + index) = value;

    /* Count this store as a reference to the array. */
    if (inc_obj_ref(addr, (&jvm_obj_list_head)) == -1) {
        jvm_error(VM_ERROR_INTERP, "inc jvm obj ref failed.\n");
        return -1;
    }

    jvm_pc.pc += len;
    return 0;
}
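The interpreter code above writes to the array without validating its operands. The JVM specification requires iastore to throw NullPointerException when the array reference is null and ArrayIndexOutOfBoundsException when the index is outside the array. Below is a minimal sketch of how those checks might look. The int_array and op_stack types, the error codes, and do_iastore are hypothetical stand-ins that I made up for this example; ajvm's own operand stack macros and object layout are different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { IASTORE_OK = 0, IASTORE_NULL = -1, IASTORE_BOUNDS = -2 };

/* Hypothetical heap representation of an int[]: length plus elements. */
struct int_array {
    int32_t length;
    int32_t *elems;
};

/* Tiny operand stack with pointer-sized slots. */
struct op_stack {
    intptr_t slot[8];
    int top;
};

static void push(struct op_stack *s, intptr_t v)
{
    assert(s->top < 8);
    s->slot[s->top++] = v;
}

static intptr_t pop(struct op_stack *s)
{
    assert(s->top > 0);
    return s->slot[--s->top];
}

/* iastore: ..., arrayref, index, value -> ... */
static int do_iastore(struct op_stack *s)
{
    int32_t value = (int32_t)pop(s);
    int32_t index = (int32_t)pop(s);
    struct int_array *arr = (struct int_array *)pop(s);

    if (!arr)
        return IASTORE_NULL;            /* would be NullPointerException */
    if (index < 0 || index >= arr->length)
        return IASTORE_BOUNDS;          /* would be ArrayIndexOutOfBoundsException */
    arr->elems[index] = value;
    return IASTORE_OK;
}
```

Storing the length next to the elements is what makes the bounds check possible; with only a raw address, as in the listing above, the interpreter would have to look the size up in the JVM_OBJECT list first.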
Memory is allocated for the array data1 in the same way, but it is never referenced, so data1 is the object that the GC will free when it runs.
void start_gc(struct list_head *list_head)
{
    JVM_OBJECT *s;
    struct list_head *p, *q;

    /* The _safe variant allows the current node to be deleted while iterating. */
    list_for_each_safe(p, q, list_head) {
        s = list_entry(p, JVM_OBJECT, list);
        if (s && s->ref_count == 0) {
            printf("free addr: %p\tsize: %d\tref_count: %d\n",
                   s->addr, s->size, s->ref_count);
            list_del(p);
            free_jvm_obj(s);
        }
    }
}
This is the simplest possible GC algorithm for ajvm, and it will be optimized later.
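To see the whole cycle in one place, the sketch below tracks two arrays, bumps the count of the first one twice (once per iastore), and then sweeps. It is a stand-alone illustration, not ajvm code: it uses a plain singly linked list instead of the kernel-style list_head, malloc and free instead of the slab heap, and the names track, inc_ref, sweep and demo_test6 are my own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct obj {
    int ref_count;
    void *addr;             /* memory owned by this tracked object */
    struct obj *next;
};

/* Register a freshly allocated block with ref_count 0. */
static void track(struct obj **head, void *addr)
{
    struct obj *o = malloc(sizeof(*o));

    if (!o)
        abort();
    o->ref_count = 0;
    o->addr = addr;
    o->next = *head;
    *head = o;
}

/* Find the object by address and count one more reference. */
static int inc_ref(struct obj *head, const void *addr)
{
    for (struct obj *o = head; o; o = o->next) {
        if (o->addr == addr) {
            o->ref_count++;
            return 0;
        }
    }
    return -1;
}

/* Free every tracked object whose ref_count is 0; returns how many. */
static int sweep(struct obj **head)
{
    int freed = 0;
    struct obj **pp = head;

    while (*pp) {
        struct obj *o = *pp;
        if (o->ref_count == 0) {
            *pp = o->next;          /* unlink before freeing */
            free(o->addr);
            free(o);
            freed++;
        } else {
            pp = &o->next;
        }
    }
    return freed;
}

/* Mirrors test6: data is stored into twice, data1 is never touched. */
static int demo_test6(void)
{
    struct obj *head = NULL;
    int *data = malloc(2 * sizeof(int));
    int *data1 = malloc(3 * sizeof(int));
    int freed;

    if (!data || !data1)
        abort();
    track(&head, data);
    track(&head, data1);

    for (int i = 0; i < 2; i++) {   /* the two iastore executions */
        data[i] = i;
        if (inc_ref(head, data) != 0)
            abort();
    }

    freed = sweep(&head);           /* only data1 has ref_count 0 */
    assert(head && head->addr == data && head->next == NULL);
    assert(head->ref_count == 2);

    free(data);                     /* tidy up the survivor */
    free(head);
    return freed;
}
```

Note that in this scheme the count only ever goes up, so data would never be reclaimed. A fuller reference-counting collector also decrements the count when a reference goes away, which is one of the optimizations left for later.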
4. Demo Run
Below is how ajvm interprets and executes the Java code above:
$ ./wvm -c test test6
jvm pc init at: 0x630510

main ([Ljava/lang/String;)V
stack: 3 local : 5
code: 0x3 0x36 0x4 0x5 0xbc 0xa 0x4c 0x3 0x3e 0x1d 0x5 0xa2 0x0 0xd 0x2b 0x1d 0x1d 0x4f 0x84 0x3 0x1 0xa7 0xff 0xf4 0x6 0xbc 0xa 0x4d 0xb1
#local at: 0x630540
#stack at: 0x630554

[ 1] iconst_0 pc: 0x630510 -> 0x3
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
[ 2] istore pc: 0x630511 -> 0x36
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
[ 3] iconst_2 pc: 0x630513 -> 0x5
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x2 0x0 0x0
[ 4] newarray pc: 0x630514 -> 0xbc
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x2 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
[ 5] astore_1 pc: 0x630516 -> 0x4c
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x0 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
[ 6] iconst_0 pc: 0x630517 -> 0x3
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
[ 7] istore_3 pc: 0x630518 -> 0x3e
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
[ 8] iload_3 pc: 0x630519 -> 0x1d
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
[ 9] iconst_2 pc: 0x63051a -> 0x5
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x2 0x0
[ 10] if_icmpge pc: 0x63051b -> 0xa2
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x2 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
[ 11] aload_1 pc: 0x63051e -> 0x2b
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
[ 12] iload_3 pc: 0x63051f -> 0x1d
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
[ 13] iload_3 pc: 0x630520 -> 0x1d
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
[ 14] iastore pc: 0x630521 -> 0x4f
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
[ 15] iinc pc: 0x630522 -> 0x84
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x0 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
[ 16] goto pc: 0x630525 -> 0xa7
[ 17] iload_3 pc: 0x630519 -> 0x1d
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x1 0x0 0x0
[ 18] iconst_2 pc: 0x63051a -> 0x5
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x1 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x1 0x2 0x0
[ 19] if_icmpge pc: 0x63051b -> 0xa2
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x1 0x2 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x1 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x1 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
[ 20] aload_1 pc: 0x63051e -> 0x2b
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x0 0x0
[ 21] iload_3 pc: 0x63051f -> 0x1d
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x1 0x0
[ 22] iload_3 pc: 0x630520 -> 0x1d
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x1 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x1 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x1 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x1 0x1
[ 23] iastore pc: 0x630521 -> 0x4f
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x1 0x1
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x1 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x1 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x627c20 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
[ 24] iinc pc: 0x630522 -> 0x84
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x1 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
[ 25] goto pc: 0x630525 -> 0xa7
[ 26] iload_3 pc: 0x630519 -> 0x1d
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x2 0x0 0x0
[ 27] iconst_2 pc: 0x63051a -> 0x5
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x2 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x2 0x2 0x0
[ 28] if_icmpge pc: 0x63051b -> 0xa2
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x2 0x2 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x2 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x2 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
[ 29] iconst_3 pc: 0x630528 -> 0x6
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x3 0x0 0x0
[ 30] newarray pc: 0x630529 -> 0xbc
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x3 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x627c80 0x0 0x0
[ 31] astore_2 pc: 0x63052b -> 0x4d
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x627c80 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x0 0x2 0x0 #stack: 0x0 0x0 0x0
#local: 0x0 0x627c20 0x627c80 0x2 0x0 #stack: 0x0 0x0 0x0
[ 32] return pc: 0x63052c -> 0xb1
#local: 0x0 0x627c20 0x627c80 0x2 0x0 #stack: 0x0 0x0 0x0
jvm stack depth is zero.
interpret bytecode done.