【源码系列】Java中的数据结构——数组与ArrayList2-阿里云开发者社区

2.3.4 get方法

我们知道ArrayList实际上就是数组的封装，所以对于具有随机访问的数组而言，根据下标访问元素实在太简单不过了。

public E get(int index) {
  //下标安全检查
       rangeCheck(index);
       return elementData(index);
}

2.3.5 set方法

修改index下标的元素为element

public E set(int index, E element) {
  //下标检查
        rangeCheck(index);
  //旧的值
        E oldValue = elementData(index);
        //更新值
        elementData[index] = element;
        //返回旧的值
        return oldValue;
    }

2.3.6 size方法

public int size() {
        return size;
    }

其实就是把size返回，但这里要注意的是size的意义，它不是指当前ArrayList的容量有多少，而是指当前ArrayList中有多少元素。

2.3.7 indexOf方法

public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }

该方法就是遍历查找是否有对象o，如果找到则返回下标。

2.3.8 toArray方法

public Object[] toArray() {
        return Arrays.copyOf(elementData, size);
    }

需要注意的是该方法是将其elementData对象数组从内存中复制了一份，而且长度就是元素的数量。所以不必担心修改toArray方法的得到的数组会影响原ArrayList对象。

2.3.9 removeAll和retainAll方法

①removeAll(Collection c)

删除ArrayList中所有和集合中相同的元素

public boolean removeAll(Collection<?> c) {
  //检查该集合是否为空
        Objects.requireNonNull(c);
        return batchRemove(c, false);
    }

该方法的核心在于boolean batchRemove(Collection c, boolean complement)方法，我们点进去看一下，

private boolean batchRemove(Collection<?> c, boolean complement) {
        final Object[] elementData = this.elementData;
        //r为快指针，w为慢指针
        int r = 0, w = 0;
        boolean modified = false;
        try {
            for (; r < size; r++)
                if (c.contains(elementData[r]) == complement)
                  //只有符合条件时，w才自增
                    elementData[w++] = elementData[r];
        } finally {
            // 这里正常情况都是r=size，只有在程序遇到异常时，才可能会遇到r！=size的情况。
            if (r != size) {
                System.arraycopy(elementData, r,
                                 elementData, w,
                                 size - r);
                w += size - r;
            }
            //当w！=size时，即有元素被抛弃了，这时需要把后面无用的元素值置为null
            if (w != size) {
                // 将后面的元素值置为null
                for (int i = w; i < size; i++)
                    elementData[i] = null;
                //modcount加上修改过的元素
                modCount += size - w;
                //更新size
                size = w;
                //此时已经抛弃了一些元素，所以置为true
                modified = true;
            }
        }
        return modified;
    }

可以看到这里用了快慢指针的技巧，当遇到要保留的元素时w再自增，最终保留元素的下标便是0到w-1。

在遍历的过程中调用contains方法看看这个元素是否在集合c中。

这里巧妙的地方在于它把contains的返回值complement参数比较。

这样，Boolean类型的complement参数的作用就类似于一个开关，当为true时，这个函数就是保留在集合c中出现过的元素，否则保留未出现过的元素。

②retainAll(Collection c)

而retainAll也类似，只不过它把开关置为false，这样就会保留在集合c中出现过的元素。

public boolean retainAll(Collection<?> c) {
        Objects.requireNonNull(c);
        return batchRemove(c, true);
    }

2.4 迭代器

2.4.1 普通迭代器Itr

private class Itr implements Iterator<E> {
  //下一个元素下标
        int cursor;       // index of next element to return
        //上一个元素下标
        int lastRet = -1; // index of last element returned; -1 if no such
        //期望修改次数（后面会讲，现在不用管）
        int expectedModCount = modCount;
        Itr() {}
  //判断是否有下一个
        public boolean hasNext() {
            return cursor != size;
        }
  //返回当前停留的元素
        @SuppressWarnings("unchecked")
        public E next() {
          //检查是否有修改，这个之后会讲，现在先不管它
            checkForComodification();
            int i = cursor;
            //如果此时下一次超出数组范围，则抛出没有NoSuchElementException异常
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            //如果此时i还能大于等于数组的长度，唯一的可能就是在遍历过程中有程序对其做了修改，所以此时应抛出并发异常ConcurrentModificationException
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            //下一个元素的下标为i+1
            cursor = i + 1;
            //返回下标为i的元素
            return (E) elementData[lastRet = i];
        }
        public void remove() {
          //如果当前的元素下标小于0，抛出IllegalStateException异常
            if (lastRet < 0)
                throw new IllegalStateException();
            //检查是否有修改
            checkForComodification();
            try {
              //删除上一个元素
                ArrayList.this.remove(lastRet);
                //指针回退
                cursor = lastRet;
                //上一个指针回归-1，这个就说明了不能连续调用remove方法
                lastRet = -1;
                //期望的修改值变为当前的修改值（这些之后会讲）
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }
}

这里有个值得注意的地方——它在定义的迭代器的时候加了个private，那我们如何去创建迭代器呢？

其实它有个Iterator（）方法，

public Iterator<E> iterator() {
        return new Itr();
    }

它就返回了一个迭代器的对象。

这样的好处就在于面对不同的集合类只需调用通用的接口方法iterator（）便可以返回相应的迭代器。

看完上面的源码我们明白了以下几点：

1. 这个迭代器对象只能使用一次，因为它没有初始化重置的方法

2. 这个迭代器只会往前，因为它只有next方法

3. 不能连续调用remove方法，因为调用了一次后，指向上一个元素下标的lastRet变量便被赋值为-1，连续调用会抛出异常

看完上面这个迭代器的源码，你会发现——初看源码会有一种不知所云的感觉，但是去掉这些让我看起来不知所云的操作和边界条件的检查后，是不是感觉简单了起来？

没错，如果不管这些边界条件，不管并发可能带来的错误，那么我们也能写出ArrayList的迭代器。

这里，为了更好理解代码，我这里放上该迭代器使用简单例子

Iterator<Integer> iterator=list.iterator();
while (iterator.hasNext()){
    Integer i=iterator.next();
    iterator.remove();
}

看了源码后是不是更有感觉了呢？

2.4.2 迭代器的增强版ListItr

private class ListItr extends Itr implements ListIterator<E> {
        ListItr(int index) {
            super();
            cursor = index;
        }
        public boolean hasPrevious() {
            return cursor != 0;
        }
        public int nextIndex() {
            return cursor;
        }
        public int previousIndex() {
            return cursor - 1;
        }
        @SuppressWarnings("unchecked")
        public E previous() {
          //检查是否有修改
            checkForComodification();
            //cursor前移
            int i = cursor - 1;
            //检查i是否小于0
            if (i < 0)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            //如果此时i还能大于等于数组的长度，唯一的可能就是在遍历过程中有程序对其做了修改，所以此时应抛出并发异常ConcurrentModificationException
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i;
            return (E) elementData[lastRet = i];
        }
        public void set(E e) {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();
    //修改当前下标的元素值
            try {
                ArrayList.this.set(lastRet, e);
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }
        public void add(E e) {
            checkForComodification();
            try {
                int i = cursor;
                ArrayList.this.add(i, e);
                cursor = i + 1;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }
    }

可以看到ListItr是Itr的子类，在有Itr的功能的基础上，拓展了前移指针的previous（）方法、增加元素的add（E e）方法和修改当前元素的方法set(E e)。源码操作也不难理解，这里就不多介绍了。

2.4.3 干扰检测

①为什么要有干扰检测

在看迭代器的源码，你是不是很疑惑为啥会有执行checkForComodification（）检查？

你是不是很疑惑为什么很多地方会有modCount++，expectedModCount = modCount这种操作？

你是不是很疑惑modCount，expectedModCount 是来用来干什么的呢？

带着这些疑问，我们来探究一下这样做的原因和原理。

首先，不知道你在写程序的时候遇到ConcurrentModificationException（）这个异常，在我们用迭代器遍历过程中，有没有删除增加过集合。比如下面这样：

for (Integer i:list){
      list.add(1);
}

注意：ArrayList的foreach增强遍历本质就是调用了迭代器。

我们在遍历途中不断向集合中添加整数1，这看似好像没有什么问题，可仔细一想，如果不断添加，那么这个迭代器调用的next真的是原来所期望的next吗？看似好像不会影响结果，但实际上这往往会发生一些逻辑上的错误，如果不对这些情况做出一些提示，程序员会非常苦恼，因为他们很难发现这里面的问题。

同时在并发编程中，如果你对共享变量误用了非线程安全的ArrayList，那么有个线程正在遍历ArrayList集合，可以另一个线程对其做了修改，这会不会发生一些错误呢？尽管在这种情况下ArrayList并未做出有效解决方案，但它必须有责任告知你这个情况，所以它会抛出这个ConcurrentModificationException异常。

②干扰检测原理

那么它是如何实现的呢？

核心就modcount和expectedModCount 变量。

还记得我们调用add，remove方法出现modcount++操作吗？

在这里modcount就是累计修改的次数，expectedModCount 就是预期修改的次数。

这里要注意的是modcount是全局的一个属性

protected transient int modCount = 0;

而expectedModCount 是在迭代器内部的一个属性。

言外之意就是该错误检测一定是在迭代器内部实现的。

阅读源码我们发现，在ArrayList每次进程执行add，remove方法时，都会触发一次modcount++。

同时在创建迭代器时，expectedModCount 的初值就被赋值为当时的modcount，

而每次迭代器在进行next和remove方法时都会做一次检测

点开方法查看

final void checkForComodification() {
       if (modCount != expectedModCount)
           throw new ConcurrentModificationException();
}

这个方法实质就是在比较modCount和expectedModCount是否相等，如果不相等就抛出ConcurrentModificationException异常。

检查完后，我们对集合进行修改时，如执行remove方法，那么就会modcount++,经过几步操作后在把modcount的值赋值给expectedModCount ，这是就保证了expectedModCount 与modcount是相等的，但是预期之外的修改可不会执行这句话，所以就会造成modCount和expectedModCount值不一致，从而检查出了错误。

到这里我们终于明白了它的核心思路：

在迭代器里的修改（主要是指add和remove方法）是被允许的，因为这是迭代器内可见的。但是在预期之外的修改是不被允许的，一旦迭代器发现了预期之外的修改（执行checkForComodification（）方法检查modCount和expectedModCount是否相等），那么迭代器就会抛出ConcurrentModificationException异常告诉工程师——啊，我迭代过程中出现问题了，可能会发生一些逻辑错误。

到现在你也明白了这样做的原因和原理，那么我问你个问题——你觉得这个过程完美吗？

看着很完美是吗？

其实不是的，实际上你仔细看看这段

ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;

modCount修改到expectedModCount被更新这里中间差了这么多步，那我们假设在这个过程中其他线程修改了集合，这时modcount又被更新了，但是expectedModCount 还没更新啊。当expectedModCount 更新时，它以为它更新的值是预期中的，但是实际上有预期外的修改混了进来，而它却浑然不知。

所以说，ArrayList是个非线程安全的类，即使在迭代过程中，它有着能感知这种意外的机制，但是这种机制并不是那么灵敏，在一些情况下，它检测不出并发错误。（当然啦，这个机制最主要的是检查工程师在遍历过程中误修改操作，对于并发错误的检测也只是顺带的罢了）

当然啦，源码中也有这么一段注释：

/*
         * If ArrayLists were immutable, or structurally immutable (no
         * adds, removes, etc), we could implement their spliterators
         * with Arrays.spliterator. Instead we detect as much
         * interference during traversal as practical without
         * sacrificing much performance. We rely primarily on
         * modCounts. These are not guaranteed to detect concurrency
         * violations, and are sometimes overly conservative about
         * within-thread interference, but detect enough problems to
         * be worthwhile in practice. To carry this out, we (1) lazily
         * initialize fence and expectedModCount until the latest
         * point that we need to commit to the state we are checking
         * against; thus improving precision.  (This doesn't apply to
         * SubLists, that create spliterators with current non-lazy
         * values).  (2) We perform only a single
         * ConcurrentModificationException check at the end of forEach
         * (the most performance-sensitive method). When using forEach
         * (as opposed to iterators), we can normally only detect
         * interference after actions, not before. Further
         * CME-triggering checks apply to all other possible
         * violations of assumptions for example null or too-small
         * elementData array given its size(), that could only have
         * occurred due to interference.  This allows the inner loop
         * of forEach to run without any further checks, and
         * simplifies lambda-resolution. While this does entail a
         * number of checks, note that in the common case of
         * list.stream().forEach(a), no checks or other computation
         * occur anywhere other than inside forEach itself.  The other
         * less-often-used methods cannot take advantage of most of
         * these streamlinings.
         */

翻译过来就是：

如果 ArrayLists 是不可变的，或者结构上不可变的（没有添加、删除等），我们可以使用 Arrays.spliterator 实现它们的拆分器。相反，我们在不牺牲太多性能的情况下，在遍历过程中检测到尽可能多的干扰。我们主要依赖 modCounts。这些不能保证检测到并发冲突，并且有时对线程内干扰过于保守，但是可以检测到足够的问题以值得实践。为了实现这一点，我们（1）延迟初始化fence和expectedModCount，直到我们需要提交到我们正在检查的状态的最新点；从而提高精度。（这不适用于创建具有当前非惰性值的拆分器的子列表）。 (2) 我们在 forEach 的末尾只执行一个ConcurrentModificationException 检查（对性能最敏感的方法）。当使用 forEach（而不是迭代器）时，我们通常只能在动作之后检测干扰，而不是之前。进一步的 CME触发检查适用于所有其他可能违反假设的情况，例如 null 或给定 size() 的太小的 elementData数组，这可能仅由于干扰而发生。这允许 forEach 的内部循环在没有任何进一步检查的情况下运行，并简化了 lambda解析。虽然这确实需要一些检查，但请注意，在 list.stream().forEach(a) 的常见情况下，除了 forEach本身之外，不会在任何地方进行检查或其他计算。其他不太常用的方法不能利用这些流线化中的大部分。

2.4.4 ArrayListSpliterator拆分迭代器

这个拆分迭代器时jdk8引入的，是为了满足流式编程的需要，用于并行流的拆分，这块我之后再补。（未完待续…）

static final class ArrayListSpliterator<E> implements Spliterator<E> {
        /*
         * If ArrayLists were immutable, or structurally immutable (no
         * adds, removes, etc), we could implement their spliterators
         * with Arrays.spliterator. Instead we detect as much
         * interference during traversal as practical without
         * sacrificing much performance. We rely primarily on
         * modCounts. These are not guaranteed to detect concurrency
         * violations, and are sometimes overly conservative about
         * within-thread interference, but detect enough problems to
         * be worthwhile in practice. To carry this out, we (1) lazily
         * initialize fence and expectedModCount until the latest
         * point that we need to commit to the state we are checking
         * against; thus improving precision.  (This doesn't apply to
         * SubLists, that create spliterators with current non-lazy
         * values).  (2) We perform only a single
         * ConcurrentModificationException check at the end of forEach
         * (the most performance-sensitive method). When using forEach
         * (as opposed to iterators), we can normally only detect
         * interference after actions, not before. Further
         * CME-triggering checks apply to all other possible
         * violations of assumptions for example null or too-small
         * elementData array given its size(), that could only have
         * occurred due to interference.  This allows the inner loop
         * of forEach to run without any further checks, and
         * simplifies lambda-resolution. While this does entail a
         * number of checks, note that in the common case of
         * list.stream().forEach(a), no checks or other computation
         * occur anywhere other than inside forEach itself.  The other
         * less-often-used methods cannot take advantage of most of
         * these streamlinings.
         */
        private final ArrayList<E> list;
        private int index; // current index, modified on advance/split
        private int fence; // -1 until used; then one past last index
        private int expectedModCount; // initialized when fence set
        /** Create new spliterator covering the given  range */
        ArrayListSpliterator(ArrayList<E> list, int origin, int fence,
                             int expectedModCount) {
            this.list = list; // OK if null unless traversed
            this.index = origin;
            this.fence = fence;
            this.expectedModCount = expectedModCount;
        }
        private int getFence() { // initialize fence to size on first use
            int hi; // (a specialized variant appears in method forEach)
            ArrayList<E> lst;
            if ((hi = fence) < 0) {
                if ((lst = list) == null)
                    hi = fence = 0;
                else {
                    expectedModCount = lst.modCount;
                    hi = fence = lst.size;
                }
            }
            return hi;
        }
        public ArrayListSpliterator<E> trySplit() {
            int hi = getFence(), lo = index, mid = (lo + hi) >>> 1;
            return (lo >= mid) ? null : // divide range in half unless too small
                new ArrayListSpliterator<E>(list, lo, index = mid,
                                            expectedModCount);
        }
        public boolean tryAdvance(Consumer<? super E> action) {
            if (action == null)
                throw new NullPointerException();
            int hi = getFence(), i = index;
            if (i < hi) {
                index = i + 1;
                @SuppressWarnings("unchecked") E e = (E)list.elementData[i];
                action.accept(e);
                if (list.modCount != expectedModCount)
                    throw new ConcurrentModificationException();
                return true;
            }
            return false;
        }
        public void forEachRemaining(Consumer<? super E> action) {
            int i, hi, mc; // hoist accesses and checks from loop
            ArrayList<E> lst; Object[] a;
            if (action == null)
                throw new NullPointerException();
            if ((lst = list) != null && (a = lst.elementData) != null) {
                if ((hi = fence) < 0) {
                    mc = lst.modCount;
                    hi = lst.size;
                }
                else
                    mc = expectedModCount;
                if ((i = index) >= 0 && (index = hi) <= a.length) {
                    for (; i < hi; ++i) {
                        @SuppressWarnings("unchecked") E e = (E) a[i];
                        action.accept(e);
                    }
                    if (lst.modCount == mc)
                        return;
                }
            }
            throw new ConcurrentModificationException();
        }
        public long estimateSize() {
            return (long) (getFence() - index);
        }
        public int characteristics() {
            return Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED;
        }
    }

2.4.5 迭代器总结

除了拆分迭代器，ArrayList中的迭代器其实很简单。

如果没有那些边界检查，没有并发错误检测，就算作为初学者的我们也能出色的完成任务。但是作为jdk类库中最常用的集合类，它必须是可靠的，它必须有着极高的性能，同时拥有较高的安全性。

迭代器作为遍历集合的手段，它可以在遍历的过程中不断遍历同时对元素做出修改。在这个过程中它会尽量避免一些错误的发生。

Itr是普遍的迭代器，它只能单向遍历，同时遍历完后就没用了，也无法连续调用remove方法。

ListItr作为Itr的增强版（也是子类），在其基础上拓展了双向遍历的功能，同时支持增删改的操作。

四、总结

总结一下，

ArrayList是对数据结构中数组的封装，其核心是就是对象数组elementData。它最大的特点就是让数组看起来是可以动态扩容的，并且提供各种方便的操作，极大提升了我们日常开发的效率。

对于ArrayList，你需要记住以下几点：

ArrayList的核心是对象数组elementData，它是存放数据的地方。

size并不代表这数组的容量，而是指当前list中元素的数量。

由于elementData数组并未对外开放，你无法直接获取elementData数组，除非你用反射，同时容量也无法直接获取。

ArrayList是看起来支持动态扩容，实际依旧摆脱不了数组的特性——创建容量即确定。扩容原理就是调用System.arrayCopy方法进行数组复制，把原来数组的内容复制到另外一个容量更大的数组里。

ArrayList创建时最好给它一个初始容量来避免后续的扩容。

ArrayList适用于查询多，增删操作少的情况。

在add方法执行后，如果容量不足它会自动扩容，一般情况下扩容后的容量时原来容量的3/2倍

迭代器为了保证迭代过程中不出现预期之外的修改，会进行检查。如果出现了预期之外的修改，会抛出ConcurrentModificationException异常。

整篇源码读下来，其实好处还是很多的，

在对外方法的暴露上，在对内代码的封装和复用上，在方法间的调用上，你可以发现各种设计模式，感叹设计的巧妙。

如此复杂的方法调用在这里被组织的井然有序，这是我们日常开发中需要好好学习的。

好了，这篇的内容到这里就差不多结束了，当然本文也并非讲解所有的源码，只是挑了核心原理的一部分来讲，至于其他方法，个人认为是一些其他方面的封装，这里我就不一一赘述了，感兴趣的同学可以自己去研读

【源码系列】Java中的数据结构——数组与ArrayList2

2.3.4 get方法

2.3.5 set方法

2.3.6 size方法

2.3.7 indexOf方法

2.3.8 toArray方法

2.3.9 removeAll和retainAll方法

2.4 迭代器

2.4.1 普通迭代器Itr

2.4.2 迭代器的增强版ListItr

2.4.3 干扰检测

2.4.4 ArrayListSpliterator拆分迭代器

2.4.5 迭代器总结

四、总结

热门文章

最新文章

相关课程

相关电子书

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

活动广场

任务中心

训练营

直播

乘风者计划

下载

镜像站

技术资料

【源码系列】Java中的数据结构——数组与ArrayList2

2.3.4 get方法

2.3.5 set方法

2.3.6 size方法

2.3.7 indexOf方法

2.3.8 toArray方法

2.3.9 removeAll和retainAll方法

2.4 迭代器

2.4.1 普通迭代器Itr

2.4.2 迭代器的增强版ListItr

2.4.3 干扰检测

2.4.4 ArrayListSpliterator拆分迭代器

2.4.5 迭代器总结

四、总结

热门文章

最新文章

相关课程

相关电子书