cache_t
回顾objc_object的结构,内部有个cache_t,主要是用作方法缓存,是用哈希表来缓存调用过的方法,主要用来提高方法调用过程中的查找速度;
objc_object的结构如下:
struct objc_class : objc_object { // Class ISA; Class superclass; cache_t cache; // formerly cache pointer and vtable class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags class_rw_t *data() const { return bits.data(); } }
cache_t的结构如下:
struct cache_t { #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16 explicit_atomic<uintptr_t> _maskAndBuckets; mask_t _mask_unused; static constexpr uintptr_t maskShift = 48; static constexpr uintptr_t maskZeroBits = 4; static constexpr uintptr_t maxMask = ((uintptr_t)1 << (64 - maskShift)) - 1; static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << (maskShift - maskZeroBits)) - 1; static_assert(bucketsMask >= MACH_VM_MAX_ADDRESS, "Bucket field doesn't have enough bits for arbitrary pointers."); #if __LP64__ uint16_t _flags; #endif uint16_t _occupied; public: static bucket_t *emptyBuckets(); struct bucket_t *buckets(); mask_t mask(); mask_t occupied(); void incrementOccupied(); void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask); void initializeToEmpty(); unsigned capacity(); bool isConstantEmptyCache(); bool canBeFreed(); #if __LP64__ bool getBit(uint16_t flags) const { return _flags & flags; } void setBit(uint16_t set) { __c11_atomic_fetch_or((_Atomic(uint16_t) *)&_flags, set, __ATOMIC_RELAXED); } void clearBit(uint16_t clear) { __c11_atomic_fetch_and((_Atomic(uint16_t) *)&_flags, ~clear, __ATOMIC_RELAXED); } #endif static size_t bytesForCapacity(uint32_t cap); static struct bucket_t * endMarker(struct bucket_t *b, uint32_t cap); void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld); void insert(Class cls, SEL sel, IMP imp, id receiver); static void bad_cache(id receiver, SEL sel, Class isa) __attribute__((noreturn, cold)); ... }
其中buckets为bucket_t 结构体类型的指针,bucket_t的结构如下:
struct bucket_t { private: explicit_atomic<uintptr_t> _imp; explicit_atomic<SEL> _sel; uintptr_t modifierForSEL(SEL newSel, Class cls) const { return (uintptr_t)&_imp ^ (uintptr_t)newSel ^ (uintptr_t)cls; } public: inline SEL sel() const { return _sel.load(memory_order::memory_order_relaxed); } inline IMP imp(Class cls) const { ... } void set(SEL newSel, IMP newImp, Class cls); ... }
存储的时候_sel作为key,_imp作为value,存入到哈希表中;当调用方法进行查找的时候直接利用sel作为key,来查找对应的实现imp;
根据源码,我们来自己构造一下这种结构,来探究一下内部到底是怎么存储的?测试代码如下:
typedef uint32_t mask_t; struct glt_bucket_t { SEL _sel; IMP _imp; }; struct glt_cache_t { struct glt_bucket_t *_buckets; mask_t _mask; uint16_t _flags; uint16_t _occupied; }; struct glt_objc_class { Class ISA; Class superClass; struct glt_cache_t cache; }; @interface Person : NSObject - (void)test1; - (void)test2; - (void)test3; @end @implementation Person - (void)test1 { } - (void)test2 { } - (void)test3 { } @end void log(Person *person) { struct glt_objc_class *personClass = (__bridge struct glt_objc_class *)[person class]; glt_cache_t cache = personClass->cache; glt_bucket_t *buckets = cache._buckets; NSLog(@"_mask = %d, _occupied = %d", cache._mask, cache._occupied); for (int i = 0; i < cache._mask; i++) { glt_bucket_t bucket = buckets[i]; NSLog(@"方法缓存: %@", NSStringFromSelector(bucket._sel)); } } int main(int argc, char * argv[]) { NSString * appDelegateClassName; @autoreleasepool { Person *person = [[Person alloc] init]; [person test1]; log(person); [person test2]; [person test3]; log(person); } return UIApplicationMain(argc, argv, nil, appDelegateClassName); }
运行结果:
_mask = 3, _occupied = 2 方法缓存: init 方法缓存: test1 方法缓存: (null) _mask = 7, _occupied = 2 方法缓存: test3 方法缓存: (null) 方法缓存: test2 方法缓存: (null) 方法缓存: (null) 方法缓存: (null) 方法缓存: (null)
根据log我们来探究一下_mask、_occupied值为什么会变化?以及在第二个log中init和test1方法为什么没有打印出来?通过源码分析,会发现调用到了下面这样一个函数:
void cache_t::insert(Class cls, SEL sel, IMP imp, id receiver) { mask_t newOccupied = occupied() + 1; unsigned oldCapacity = capacity(), capacity = oldCapacity; if (slowpath(isConstantEmptyCache())) { // Cache is read-only. Replace it. if (!capacity) capacity = INIT_CACHE_SIZE; reallocate(oldCapacity, capacity, /* freeOld */false); } else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) { // Cache小于等于 3/4,此处不做处理 } else {//扩容为原来容量的2倍,即4\8\16... capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE; if (capacity > MAX_CACHE_SIZE) { capacity = MAX_CACHE_SIZE; } //此处会释放掉所有的内容,清空buckets,把存储的内容都设置为null reallocate(oldCapacity, capacity, true); } bucket_t *b = buckets(); mask_t m = capacity - 1; mask_t begin = cache_hash(sel, m); mask_t i = begin; do { //当前缓存中没有找到sel,则直接存储 if (fastpath(b[i].sel() == 0)) { incrementOccupied(); //_occupied++ 增加计数 b[i].set<Atomic, Encoded>(sel, imp, cls); return; } //说明已经插入了无需再次插入 if (b[i].sel() == sel) { return; } //cache_next:说明这个位置已经被占用了、重新寻找下一个位置 //寻找的方式就是进行arm64下是进行-1操作,如果减到0了则直接等于mask } while (fastpath((i = cache_next(i, m)) != begin)); cache_t::bad_cache(receiver, (SEL)sel, cls); }
通过源码我们可以发现:capacity保存了当前的总容量,并且初始的容量为4,也就是哈希表的总长度,如果总容量小于等于3/4不做处理,否则的话会进行扩容,容量会扩充为原来的2倍(4、8、16...),并且在扩容的时候会调用reallocate函数,在其内部会释放掉原来存储的内容,故上面的init方法和test1方法没有打印出来,此外在其内部也会对_mask进行赋值,值为capacity-1;
接下来会插入_sel和_imp,具体步骤为:
1. 先查看_sel & mask后的结果,也就是和begain对比要插入的位置,得到的索引可能会相等(哈希冲突),如果相等,此处解决方式为将值-1后再进行查找,直到找到不相等的再进行插入;
static inline mask_t cache_next(mask_t i, mask_t mask) { return i ? i-1 : mask; }
2. 接着查看哈希表中是否存储即将要插入的_sel,如果没有,则直接将_sel作为key,_imp作为value插入到hash表中,并把_occupied的值+1;
3. 如果已经插入过了即b[i].sel() == sel,则不再插入;
总结
1. _mask和_occupied分别是什么?有什么作用?
_mask的值为哈希表总长度减1,主要是利用_sel & _mask来计算新的_sel要插入的位置,如果存在哈希冲突(也就是&出的结果一致),在arm64下会将递归遍历&后的计算结果使其-1,如果为0则等于mask,直到找到为止;
_occupied主要是用来存储已经缓存的方法数量;
2. bucket数据为什么会为空?
因为是在扩容的时候(4,8,16,...),会清除旧的数据,重新分配内存;
3. cache_t中buckets与_mask、_occupied、_sel、_imp的关系?
buckets中有多个bucket_t,_mask为哈希表的总长度-1,_occupied为已经缓存的方法数量,哈希表存储的时候_sel作为key, _imp作为value进行存储;本文部分内容来可能来源于网络,发布的内容如果侵犯了您的权益,请联系我们尽快删除!