内存优化：充满矛盾的SparseArray-阿里云开发者社区

SparseArray可直译为“稀疏数组”，虽然听上去很“松散”，但它其实非常“紧致”。这一篇将会通过分析SparseArray的源码来展现这个类的矛盾之处。

还记得分析RecyclerView缓存机制中提到的SparseArray吗？当时只知道它是一个存放键值对的容器。RecyclerView的回收池使用它按ViewType分类存放被回收的ViewHolder。 HashMap也存放键值对，为啥RecyclerView不用 HashMap？

带着这个疑问，点开SparseArray.java，一探究竟：

/**
 * SparseArrays map integers to Objects.  Unlike a normal array of Objects,
 * there can be gaps in the indices.  It is intended to be more memory efficient
 * than using a HashMap to map Integers to Objects, both because it avoids
 * auto-boxing keys and its data structure doesn‘t rely on an extra entry object
 * for each mapping.
 * SparseArrays建立int值和Object之间的关系
 * 不同于一般的数组对象，SparseArray数组对象间不会存在空隙
 * 相对于HashMap<Integer,Object>更加节省内存，因为避免了键自动装箱以及创建Entry实体 
 *
 * <p>Note that this container keeps its mappings in an array data structure,
 * using a binary search to find keys.  The implementation is not intended to be appropriate for
 * data structures
 * that may contain large numbers of items.  It is generally slower than a traditional
 * HashMap, since lookups require a binary search and adds and removes require inserting
 * and deleting entries in the array.  For containers holding up to hundreds of items,
 * the performance difference is not significant, less than 50%.</p>
 * SparseArrays将键值对保存在数组中并通过二分查找键值。所以在数据量很大的时候速度较慢。
 */
public class SparseArray<E> implements Cloneable { }

读完这段注释好像已经回答了刚才的问题：因为SparseArray相较于HashMap有更好的空间性能。而且RecyclerView回收池中每种viewType默认最多存放5个ViewHolder。少量数据也正好可以忽略SparseArray在性能上的劣势。

取值

好奇 SparseArray是如何做到用时间换空间的？容器类必然会提供存和取的操作，就以取操作为切入点，一探究竟：

public class SparseArray<E> implements Cloneable {
    //存储键的数组
    private int[] mKeys;
    //存储值的数组
    private Object[] mValues;
    
    /**
     * Gets the Object mapped from the specified key, or <code>null</code>
     * if no such mapping has been made.
     * 根据给定的key取对应value，如果没有取到则返回null
     */
    public E get(int key) {
        return get(key, null);
    }

    /**
     * Gets the Object mapped from the specified key, or the specified Object
     * if no such mapping has been made.
     * 根据给定的key取对应value
     */
    @SuppressWarnings("unchecked")
    public E get(int key, E valueIfKeyNotFound) {
        //用二分查找获取值的索引
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i < 0 || mValues[i] == DELETED) {
            return valueIfKeyNotFound;
        } else {
            //成功获取值
            return (E) mValues[i];
        }
    }
}

class ContainerHelpers {
    // This is Arrays.binarySearch(), but doesn’t do any argument validation.
    //二分查找
    static int binarySearch(int[] array, int size, int value) {
        int lo = 0;
        int hi = size - 1;

        while (lo <= hi) {
            final int mid = (lo + hi) >>> 1;
            final int midVal = array[mid];
            //递增序列
            if (midVal < value) {
                lo = mid + 1;
            } else if (midVal > value) {
                hi = mid - 1;
            } else {
                return mid;  // value found
            }
        }
        //若查找失败，则将索引置反（变为负数）
        return ~lo;  // value not present
    }
}

看到这里可以做一个阶段性总结：

SparseArray用两个长度相同的数组分别存储键和值，键是int类型的，值是Object类型的。而且一对键值在各自数组中的索引相同。
取值算法如下：按键取值时，会二分查找键数组。若找到，则用相同的索引去值数组取出值。
不由得产生一个疑问：键数组可以使用二分法查找，难道它是有序的吗？看下存值逻辑是否能找到更多线索。

存值

public class SparseArray<E> implements Cloneable {
    /**
     * Adds a mapping from the specified key to the specified value,
     * replacing the previous mapping from the specified key if there
     * was one.
     * 存值，相同键的新值会覆盖旧值
     */
    public void put(int key, E value) {
        //用二分查找获得键索引（键在键数组中的位置）
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        //1.若找到键索引（表示该键已存在），则用新值覆盖旧值
        if (i >= 0) {
            mValues[i] = value;
        }
        //若未找到存储索引，则存储新值
        else {
            //将负值索引恢复成正值（二分查找失败时会取反）
            i = ~i;

            //2.若目标插入位置是待删除位，则复用待删除位
            if (i < mSize && mValues[i] == DELETED) {
                mKeys[i] = key;
                mValues[i] = value;
                return;
            }

            //当键数组满了且存在待删除位时，压缩数组
            if (mGarbage && mSize >= mKeys.length) {
                gc();

                // Search again because indices may have changed.
                //压缩数组产生了数组挪动，索引会改变，所以重新查找一遍以获新目标插入位置
                i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
            }

            //3.插入到新的位置
            mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
            mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
            //有效键值对+1
            mSize++;
        }
    }
}

存值时通过二分查找确定存入位置，如果每次存入的键都是新的，那每次查找必然失败，但查找失败返回的索引值（取反后）正是遵守了原有序列顺序产生的应该被插入新键的位置。所以，刚才的疑问解决了：键数组是有序的，且是递增的。

在获得目标插入位置后，分为三种情况：

1. 新值覆盖旧值

在键数组中二分查找给定键值，如果返回的索引大于0，则表示找到相同的键。直接将新值赋值给值数组的对应位置，覆盖掉旧值。

2. 复用待删除位

如果二分查找失败，表示给定键是一个新的键。如果新键对应的索引位是一个待删除位，则将新值存入该位置，以复用该待删除位。待删除位是啥？，看一下删除操作 SparseArray.delete()：

public class SparseArray<E> implements Cloneable {
    ...
    //用于标记某一位置的键值对已经被删除（假删）
    private static final Object DELETED = new Object();
    
    /**
     * Removes the mapping from the specified key, if there was any.
     * 删除键值对
     */
    public void delete(int key) {
        //获得待删除位置
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i >= 0) {
            //将值数组中对应位置标记为待删除
            if (mValues[i] != DELETED) {
                mValues[i] = DELETED;
                //打开回收开关
                mGarbage = true;
            }
        }
    }
}

删除键值对时并没有真正的删除，而是标记为删除（此处并没有将mSize--），但标记的后果是数组会变得“稀疏”，即有效键值对不是一个挨着一个，他们会被待删除键值对分割。

想想也对，数组的删除操作是费时的，因为需要整体挪动数组覆盖待删除位。想必在未来的某个时刻应该会执行一次真正的删除。

所以存值的第二种情况是：当待删除位还没来得及真正删除时被复用了。

3. 插入新位置

当给定键是一个新的键且其对应的索引位已经被其他键值所占用了，SparseArray通过挪动数组来解决：

public final class GrowingArrayUtils {
    /**
     * Primitive int version of {@link #insert(Object[], int, int, Object)}.
     * 数组插入操作
     * @param array       数组
     * @param currentSize 当前有效元素个数
     * @param index 待插入位置
     * @param element 带插入元素
     * @return
     */
    public static int[] insert(int[] array, int currentSize, int index, int element) {
        assert currentSize <= array.length;

        //1.如果array数组可以容纳下一个新元素，则将给定元素插入到指定位置
        //并将原本该位置以后的元素整体往后平移一个位置
        if (currentSize + 1 <= array.length) {
            System.arraycopy(array, index, array, index + 1, currentSize - index);
            array[index] = element;
            return array;
        }

        //2.如果array容纳不下，则将数组扩容
        //构造比老数组更大的新数组
        int[] newArray = ArrayUtils.newUnpaddedIntArray(growSize(currentSize));
        //将老数组中待插入位置前的所有数据复制到新数组
        System.arraycopy(array, 0, newArray, 0, index);
        //将新元素插入到待插入位置
        newArray[index] = element;
        //将老数组中待插入位置后的所有元素复制到新数组
        System.arraycopy(array, index, newArray, index + 1, array.length - index);
        return newArray;
    }
    
    //扩容算法
    public static int growSize(int currentSize) {
        return currentSize <= 4 ? 8 : currentSize * 2;
    }
}

其实除了发生冲突，还有另一种情况（注释中第2点）：原本的键数组满了。这种情况下只能通过数组扩容才能存取新的值。

紧致

刚才跳过了一个步骤（它是“紧致”的关键），即将新的键值插入到数组之前，还有一个回收操作SparseArray.gc()，点进去看看：

public class SparseArray<E> implements Cloneable {
    //压缩键数组和值数组，将待删除索引位覆盖(将数组中的空隙填满)
    private void gc() {
        // Log.e("SparseArray", "gc start with " + mSize);

        int n = mSize;
        //待删除位索引
        int o = 0;
        int[] keys = mKeys;
        Object[] values = mValues;

        //遍历值数组
        for (int i = 0; i < n; i++) {
            Object val = values[i];

            if (val != DELETED) {
                if (i != o) {
                    //将后面的有效键值对往前挪覆盖待删除位
                    keys[o] = keys[i];
                    values[o] = val;
                    values[i] = null;
                }

                o++;
            }
        }

        //压缩完之后 标记不需要再被压缩 直到下次键值对被删除后出现了空位
        mGarbage = false;
        //有效键值对个数再一次符合事实（当发生删除操作后，mSize会偏大）
        mSize = o;
    }
}

一图胜千言：

假设当前SparseArray中有4个键值对，此时mSize=4，如下所示：

	0	1	2	3
mKeys	55	57	58	60
mValues	A	B	D	F

删除第二个后，此时mSize=4：

	0	1	2	3
mKeys	55	57	58	60	65
mValues	A	DELETE	D	F	G

执行SparseArray.gc()后，此时mSize=3：

	0	1	2	3
mKeys	55	58	60	60
mValues	A	D	F	null

虽然经过SparseArray.gc()后，数组末尾会出现两个相同的键，但这不会影响下一次二分查找，因为二分查找的终点是mSize-1。

“紧致”是相对于“稀疏”来说的，当从SparseArray删除键值对，SparseArray中存储键值对的两个数组就变得“稀疏”，“紧致”就是覆盖掉待删除键值对，让有效键值对重新挨个排在一起，这样二叉查找效率更高。

SparseArray.gc()中运用了两个索引，分别是i和o。如果把遍历值数组比作跑步，那这两个索引就是两个选手，它们同时从起点出发。不同的是，每当遇到待删除坑位，o就会停下来，而i总是一往直前，所以i一直领先于o，当i找到有效键值对时，会将其覆盖到o的位置，然后o才向前进一步。通过这样的算法，就实现了将数组后面的内容往前挪覆盖掉待删位。

所以，明明很紧致，却偏偏要取一个稀疏的名字，这是一个充满矛盾的类。

总结

SparseArray用于存放键值对，键是int，值是Object。
SparseArray用两个长度相等的数组分别存储键和值，同一个键值对所在两个数组中的索引相等。
SparseArray比HashMap访问速度更慢，因为二分查找速度慢于散列定位。
SparseArray比HashMap更节省空间，因为不需要创建额外的Entry存放键值对。
SparseArray中存放键的数组是递增序列。
SparseArray删除元素时并不会真正删除，而是标记为待删除元素，在合适的时候会将后面的元素往前挪覆盖掉待删除元素。待删除元素在没有被覆盖前有可能被复用。

内存优化：充满矛盾的SparseArray

取值

存值

1. 新值覆盖旧值

2. 复用待删除位

3. 插入新位置

紧致

总结

热门文章

最新文章

相关电子书

热门

活动广场

任务中心

开发者评测

高校计划

乘风者计划

训练营

阿里云MVP

话题

直播

下载

镜像站

技术资料

插件

内存优化：充满矛盾的SparseArray

取值

存值

1. 新值覆盖旧值

2. 复用待删除位

3. 插入新位置

紧致

总结

热门文章

最新文章

相关电子书