An Analysis of How ConcurrentHashMap Works - Alibaba Cloud Developer Community


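Many collections of interview and written-test questions found online ask about the differences between Hashtable and HashMap. Hashtable is thread-safe and allows neither null keys nor null values; HashMap is not thread-safe and allows a null key. Their parent classes also differ: Hashtable extends Dictionary, while HashMap extends AbstractMap (both implement the Map interface). Because HashMap is not synchronized, sharing it between threads means either writing the synchronization logic yourself or wrapping it with Collections.synchronizedMap(), which returns a synchronized Map but also adds the cost of that synchronization. JDK 1.5 introduced ConcurrentHashMap, which offers simple, safe and comparatively cheap thread-safe access to a hash map.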
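The differences listed above can be checked with a few lines of code. The following is a minimal sketch; the class name MapVariantsDemo and the helper acceptsNullKey are made up for illustration, while the map classes and Collections.synchronizedMap() are the standard JDK ones.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MapVariantsDemo {
    /** Returns true if the map accepts a null key, false if it throws NullPointerException. */
    static boolean acceptsNullKey(Map<String, Integer> map) {
        try {
            map.put(null, 1);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap:           " + acceptsNullKey(new HashMap<String, Integer>()));
        System.out.println("Hashtable:         " + acceptsNullKey(new Hashtable<String, Integer>()));
        System.out.println("ConcurrentHashMap: " + acceptsNullKey(new ConcurrentHashMap<String, Integer>()));

        // The wrapper guards every call with one shared lock around the backing HashMap.
        Map<String, Integer> sync = Collections.synchronizedMap(new HashMap<String, Integer>());
        sync.put("a", 1);
        System.out.println("synchronizedMap get: " + sync.get("a"));
    }
}
```

Only HashMap accepts the null key here; Hashtable and ConcurrentHashMap both reject it with a NullPointerException.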

1. Overview of how ConcurrentHashMap synchronizes

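Both Hashtable and synchronizedMap achieve synchronization through locking: an operation that needs to access the object first acquires the lock and releases it when it finishes. As the earlier Hashtable analysis showed, Hashtable's synchronized locking covers the entire hash table, so every operation locks the whole table. ConcurrentHashMap, by contrast, lets several modifications proceed at the same time. The key is lock striping (also called lock splitting or segment locking): several locks each guard modifications to a different part of the hash table. Internally, ConcurrentHashMap uses segments (Segment) to represent these parts; each segment is in effect a small hash table with its own lock. As long as modifications land on different segments, they can run concurrently. Because writers no longer all queue on one lock, throughput improves markedly compared with locking the whole table. (The code analyzed in this article is the segment-based implementation found in JDK 5 and 6.)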
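The idea of lock striping can be sketched in a few dozen lines. The class below, StripedMap, is a hypothetical teaching example and not the JDK implementation: it keeps one ReentrantLock and one plain HashMap per stripe, and, unlike ConcurrentHashMap, it also takes the lock on reads to stay simple.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical teaching class: a map guarded by several locks instead of one. */
class StripedMap<K, V> {
    private final ReentrantLock[] locks;
    private final Map<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedMap(int stripeCount) {
        locks = new ReentrantLock[stripeCount];
        stripes = (Map<K, V>[]) new Map[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            locks[i] = new ReentrantLock();
            stripes[i] = new HashMap<K, V>();
        }
    }

    /** Picks the stripe (and therefore the lock) responsible for a key. */
    private int stripeFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % locks.length;
    }

    V put(K key, V value) {
        int i = stripeFor(key);
        locks[i].lock();            // only this stripe is blocked
        try {
            return stripes[i].put(key, value);
        } finally {
            locks[i].unlock();
        }
    }

    V get(Object key) {
        int i = stripeFor(key);
        locks[i].lock();            // simplification: the real class reads without locking
        try {
            return stripes[i].get(key);
        } finally {
            locks[i].unlock();
        }
    }
}
```

Two threads calling put with keys that map to different stripes never contend for the same lock, which is the property the segments provide.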

2. Structure of ConcurrentHashMap

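As the figure shows, the whole ConcurrentHashMap is divided into segments. Each segment can be regarded as a Hashtable of its own, guarded by its own lock, and each segment is in turn made up of entries. That is: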

 
    /**
     * The segments, each of which is a specialized hash table
     */
    final Segment<K,V>[] segments;

Conceptually, ConcurrentHashMap consists of three parts: ConcurrentHashMap itself, Segment and HashEntry. HashEntry is defined as follows:

 
    static final class HashEntry<K,V> {
        final K key;
        final int hash;
        volatile V value;
        final HashEntry<K,V> next;

        HashEntry(K key, int hash, HashEntry<K,V> next, V value) {
            this.key = key;
            this.hash = hash;
            this.next = next;
            this.value = value;
        }

        @SuppressWarnings("unchecked")
        static final <K,V> HashEntry<K,V>[] newArray(int i) {
            return new HashEntry[i];
        }
    }

Reads do not need a lock

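As the definition shows, every field except value is final, including next. A new entry can therefore only be added at the head of a chain, never in the middle or at the tail, and an existing chain is never modified in place. To make sure reads see the latest value, value is declared volatile; this avoids locking on reads and so makes them faster.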
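A small sketch makes the head-insertion rule concrete. The names ImmutableChainDemo, Node and prepend are invented for this illustration; the Node class only imitates the shape of HashEntry (final key and next, volatile value).

```java
class ImmutableChainDemo {
    /** Simplified stand-in for HashEntry: only value can change after construction. */
    static final class Node {
        final String key;
        volatile String value;
        final Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    /** Adds at the head: the new node points at the old head, and no existing node is touched. */
    static Node prepend(Node head, String key, String value) {
        return new Node(key, value, head);
    }

    /** Concatenates the keys from head to tail. */
    static String keys(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node head = null;
        head = prepend(head, "a", "1");
        head = prepend(head, "b", "2");
        head = prepend(head, "c", "3");
        System.out.println(keys(head)); // prints "cba": the newest entry is first
    }
}
```

A reader that grabbed the old head before a prepend still sees a complete, unchanged chain, which is why traversal needs no lock.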

Locating a segment

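To locate a segment, and a hash slot inside it, quickly, both the number of segments and the number of hash slots in each segment are powers of two (2^n), so both positions can be found with bit operations alone. With the default concurrency level of 16, which is also the number of segments, the high 4 bits of the hash value select the segment, and the low-order bits (as many as the segment's table length requires) select the slot inside that segment.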

 
    /**
     * Mask value for indexing into segments. The upper bits of a
     * key's hash code are used to choose the segment.
     */
    final int segmentMask;

    /**
     * Shift value for indexing within segments.
     */
    final int segmentShift;

The segmentFor(int hash) method

 
    /**
     * Returns the segment that should be used for key with given hash
     * @param hash the hash code for the key
     * @return the segment
     */
    final Segment<K,V> segmentFor(int hash) {
        return segments[(hash >>> segmentShift) & segmentMask];
    }
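The bit arithmetic in segmentFor can be tried out in isolation. The sketch below is an illustration only: the class SegmentIndexDemo and its methods are made-up names, and deriving the shift and mask from the concurrency level by rounding up to a power of two is an assumption modeled on the power-of-two rule described above.

```java
class SegmentIndexDemo {
    /** Shift that leaves only the top log2(segments) bits (assumed derivation). */
    static int segmentShift(int concurrencyLevel) {
        int sshift = 0;
        int ssize = 1;
        while (ssize < concurrencyLevel) {   // round up to a power of two
            ++sshift;
            ssize <<= 1;
        }
        return 32 - sshift;
    }

    /** Mask equal to (number of segments - 1), valid because the count is a power of two. */
    static int segmentMask(int concurrencyLevel) {
        int ssize = 1;
        while (ssize < concurrencyLevel) {
            ssize <<= 1;
        }
        return ssize - 1;
    }

    /** Same expression as in segmentFor: high bits choose the segment. */
    static int segmentIndex(int hash, int shift, int mask) {
        return (hash >>> shift) & mask;
    }

    /** Same expression as in Segment.put/remove: low bits choose the slot. */
    static int slotIndex(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int shift = segmentShift(16);   // 28
        int mask = segmentMask(16);     // 15
        int hash = 0xA0000007;          // top 4 bits are 1010, low 3 bits are 111
        System.out.println(segmentIndex(hash, shift, mask)); // prints 10
        System.out.println(slotIndex(hash, 8));              // prints 7
    }
}
```

Using the high bits for the segment and the low bits for the slot keeps the two choices independent of each other.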

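Definition of a segment (in the JDK source, Segment extends ReentrantLock, which is where the lock() and unlock() calls seen later come from; the excerpt shows its class comment and fields):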

 
    /*
     * Segments maintain a table of entry lists that are ALWAYS
     * kept in a consistent state, so can be read without locking.
     * Next fields of nodes are immutable (final).  All list
     * additions are performed at the front of each bin. This
     * makes it easy to check changes, and also fast to traverse.
     * When nodes would otherwise be changed, new nodes are
     * created to replace them. This works well for hash tables
     * since the bin lists tend to be short. (The average length
     * is less than two for the default load factor threshold.)
     *
     * Read operations can thus proceed without locking, but rely
     * on selected uses of volatiles to ensure that completed
     * write operations performed by other threads are
     * noticed. For most purposes, the "count" field, tracking the
     * number of elements, serves as that volatile variable
     * ensuring visibility.  This is convenient because this field
     * needs to be read in many read operations anyway:
     *
     *   - All (unsynchronized) read operations must first read the
     *     "count" field, and should not look at table entries if
     *     it is 0.
     *
     *   - All (synchronized) write operations should write to
     *     the "count" field after structurally changing any bin.
     *     The operations must not take any action that could even
     *     momentarily cause a concurrent read operation to see
     *     inconsistent data. This is made easier by the nature of
     *     the read operations in Map. For example, no operation
     *     can reveal that the table has grown but the threshold
     *     has not yet been updated, so there are no atomicity
     *     requirements for this with respect to reads.
     *
     * As a guide, all critical volatile reads and writes to the
     * count field are marked in code comments.
     */

    private static final long serialVersionUID = 2249069246763182397L;

    /**
     * The number of elements in this segment's region.
     */
    transient volatile int count;

    /**
     * Number of updates that alter the size of the table. This is
     * used during bulk-read methods to make sure they see a
     * consistent snapshot: If modCounts change during a traversal
     * of segments computing size or checking containsValue, then
     * we might have an inconsistent view of state so (usually)
     * must retry.
     */
    transient int modCount;

    /**
     * The table is rehashed when its size exceeds this threshold.
     * (The value of this field is always <tt>(int)(capacity *
     * loadFactor)</tt>.)
     */
    transient int threshold;

    /**
     * The per-segment table.
     */
    transient volatile HashEntry<K,V>[] table;

    /**
     * The load factor for the hash table.  Even though this value
     * is same for all segments, it is replicated to avoid needing
     * links to outer object.
     * @serial
     */
    final float loadFactor;

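count holds the number of entries in the segment. It is volatile and is used to coordinate writes and reads so that a read sees an almost up-to-date view of the modifications. The protocol works like this: every modification that changes the structure, such as adding or removing a node (changing a node's value does not count as a structural change), must write count, and every read starts by reading count. This relies on the strengthened volatile semantics of Java 5: a write to a volatile variable happens-before every subsequent read of that same variable. modCount counts the structural changes to the segment; its main purpose is to detect whether a segment changed while several segments were being traversed, which matters for cross-segment operations. threshold is the limit beyond which the table must be rehashed. The table array stores the nodes of the segment; each array element is a hash chain made of HashEntry objects. table is volatile as well, so the latest table reference can be read without synchronization.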
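The write-count-last, read-count-first protocol can be shown with a stripped-down example. CountPublicationDemo below is a hypothetical class, not JDK code: its plain int array stands in for the bins, and it assumes a single writer (in the real Segment, writers are serialized by the segment lock).

```java
class CountPublicationDemo {
    private final int[] data = new int[4]; // plain (non-volatile) storage, capacity fixed for the demo
    private volatile int count;            // plays the role of Segment.count

    /** Writer: change the data first, then write the volatile count last. */
    void add(int v) {
        int c = count;
        data[c] = v;       // plain write
        count = c + 1;     // volatile write: publishes the plain write above
    }

    /** Reader: read the volatile count first, then only look at what it covers. No lock is taken. */
    int sum() {
        int c = count;     // volatile read: everything written before the matching write is visible
        int s = 0;
        for (int i = 0; i < c; i++) {
            s += data[i];
        }
        return s;
    }

    public static void main(String[] args) throws InterruptedException {
        final CountPublicationDemo demo = new CountPublicationDemo();
        Thread writer = new Thread(() -> {
            for (int v = 1; v <= 3; v++) {
                demo.add(v);
            }
        });
        writer.start();
        // Timing-dependent: the reader sees some consistent prefix of the writes.
        System.out.println("sum seen while writing: " + demo.sum());
        writer.join();
        System.out.println("final sum: " + demo.sum()); // prints "final sum: 6"
    }
}
```

The reader never sees a slot counted in count whose contents are not yet visible, which is the "almost up-to-date" guarantee the paragraph above describes.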

Code for the remove operation

 
    /**
     * Remove; match on key only if value null, else match both.
     */
    V remove(Object key, int hash, Object value) {
        lock(); // acquire the segment lock
        try {
            int c = count - 1;
            HashEntry<K,V>[] tab = table; // read the volatile field once into a local
            int index = hash & (tab.length - 1); // index of the bin for this hash
            HashEntry<K,V> first = tab[index]; // head node of the bin
            HashEntry<K,V> e = first;
            while (e != null && (e.hash != hash || !key.equals(e.key)))
                e = e.next; // walk the chain to the node to remove

            V oldValue = null;
            if (e != null) {
                V v = e.value;
                if (value == null || value.equals(v)) { // the value matches too
                    oldValue = v;
                    // All entries following removed node can stay
                    // in list, but all preceding ones need to be
                    // cloned. The clone of the first node points at the
                    // removed node's next, the clone of the second node
                    // points at the clone of the first, and so on; the
                    // clone of the removed node's predecessor becomes
                    // the new head of the bin.
                    ++modCount;
                    HashEntry<K,V> newFirst = e.next;
                    for (HashEntry<K,V> p = first; p != e; p = p.next)
                        newFirst = new HashEntry<K,V>(p.key, p.hash,
                                                      newFirst, p.value);
                    tab[index] = newFirst;
                    count = c; // write-volatile
                }
            }
            return oldValue;
        } finally {
            unlock();
        }
    }
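The cloning step in remove is easier to follow on a tiny list. The sketch below uses invented names (CopyOnRemoveDemo, Node) and drops hashes, values and locking; it only reproduces the loop that rebuilds the nodes in front of the removed one.

```java
class CopyOnRemoveDemo {
    /** Node whose next pointer can never change, as in HashEntry. */
    static final class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    /** Returns the head of a chain without the given key; the original chain is left untouched. */
    static Node remove(Node first, String key) {
        Node e = first;
        while (e != null && !e.key.equals(key)) {
            e = e.next;
        }
        if (e == null) {
            return first;                 // key not present: nothing to do
        }
        Node newFirst = e.next;           // nodes after e are shared, not copied
        for (Node p = first; p != e; p = p.next) {
            newFirst = new Node(p.key, newFirst); // each clone is pushed in front
        }
        return newFirst;
    }

    /** Concatenates the keys from head to tail. */
    static String keys(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node d = new Node("D", null);
        Node a = new Node("A", new Node("B", new Node("C", d)));
        Node afterRemove = remove(a, "C");
        System.out.println(keys(a));           // prints "ABCD": old chain still intact
        System.out.println(keys(afterRemove)); // prints "BAD": cloned prefix is reversed
    }
}
```

Note the side effect: the cloned nodes end up in reverse order, while the tail after the removed node is reused as is. A reader still traversing the old chain is unaffected.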

Code for the put operation

 
    V put(K key, int hash, V value, boolean onlyIfAbsent) {
        lock();
        try {
            int c = count;
            if (c++ > threshold) // ensure capacity: rehash if over the threshold
                rehash();
            HashEntry<K,V>[] tab = table;
            int index = hash & (tab.length - 1);
            HashEntry<K,V> first = tab[index];
            HashEntry<K,V> e = first;
            while (e != null && (e.hash != hash || !key.equals(e.key)))
                e = e.next; // walk the chain looking for the key

            V oldValue;
            if (e != null) { // key already present: replace the value in place
                oldValue = e.value;
                if (!onlyIfAbsent)
                    e.value = value;
            }
            else { // key not found: create a new entry that points at first
                oldValue = null;
                ++modCount;
                tab[index] = new HashEntry<K,V>(key, hash, first, value);
                count = c; // write-volatile
            }
            return oldValue;
        } finally {
            unlock();
        }
    }
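From the caller's side, the onlyIfAbsent flag is what separates the public put from putIfAbsent; that mapping is my reading of the segment-based source and should be taken as an assumption, while the observable behavior shown below is part of the documented ConcurrentHashMap API. The class name PutIfAbsentDemo and the method run are made up for this sketch.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

class PutIfAbsentDemo {
    /** Runs a few calls and returns their results in order. */
    static Object[] run() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        Integer r1 = map.put("k", 1);          // no previous mapping
        Integer r2 = map.put("k", 2);          // replaces 1, returns the old value
        Integer r3 = map.putIfAbsent("k", 3);  // key present: value stays 2, returns 2
        Integer r4 = map.putIfAbsent("x", 9);  // key absent: inserts 9, returns null
        return new Object[] { r1, r2, r3, r4, map.get("k"), map.get("x") };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.asList(run())); // prints [null, 1, 2, null, 2, 9]
    }
}
```

In both cases the return value is the previous value for the key, matching oldValue in the segment code above.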

The read operation

 
    V get(Object key, int hash) {
        if (count != 0) { // read-volatile
            HashEntry<K,V> e = getFirst(hash); // head node of the bin
            while (e != null) {
                if (e.hash == hash && key.equals(e.key)) {
                    V v = e.value;
                    if (v != null)
                        return v;
                    // A null value means a concurrent write has not yet
                    // made this entry fully visible, so re-read it
                    // under the segment lock
                    return readValueUnderLock(e); // recheck
                }
                e = e.next;
            }
        }
        return null;
    }

 
This article is reposted from gaochaojs's 51CTO blog. Original link: http://blog.51cto.com/jncumter/1827861. To repost it, please contact the original author.
