How WeakHashMap Garbage Collection Works, Explained

2023-09-08 09:16:53 Author: 半夏_2021


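Introduction to WeakHashMap

WeakHashMap is used in much the same way as HashMap. The difference is that a HashMap key holds a strong reference to the actual object: as long as the HashMap itself is not destroyed, the objects referenced by its keys cannot be garbage collected, and the HashMap never removes the corresponding key-value pairs on its own.

A WeakHashMap key, by contrast, holds only a weak reference to the actual object. If the object referenced by a key is not referenced by any other strong reference variable, it may be garbage collected, and the WeakHashMap may then automatically remove the corresponding key-value pair.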

Data structure of WeakHashMap

Class definition


public class WeakHashMap<K,V>
    extends AbstractMap<K,V>
    implements Map<K,V> {
}

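Because GC reclaims keys that have no strong references, a WeakHashMap is not expected to hold very many elements.

Accordingly, its storage structure is just an array plus linked lists (there is no red-black tree as in the Java 8 HashMap).

Fields and constants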

    // Default initial capacity is 16
    private static final int DEFAULT_INITIAL_CAPACITY = 16;
    // Maximum capacity is 2 to the power of 30
    private static final int MAXIMUM_CAPACITY = 1 << 30;
    // Default load factor
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;
    // Buckets
    Entry<K,V>[] table;
    // Number of elements
    private int size;
    // Resize threshold, equal to capacity * loadFactor
    private int threshold;
    // Load factor
    private final float loadFactor;
    /**
     * Reference queue; when a weak key is cleared, its Entry is added to this queue
     */
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Modification count
    int modCount;

The Entry inner class

The Entry inner class has no key field of its own, because the key is stored in the Reference class.

private static class Entry<K,V> extends WeakReference<Object> implements Map.Entry<K,V> {
        V value;
        final int hash;
        Entry<K,V> next;
        /**
         * Creates new entry.
         */
        Entry(Object key, V value,
              ReferenceQueue<Object> queue,
              int hash, Entry<K,V> next) {
            // Pass the key and the reference queue up to WeakReference (and on to Reference)
            super(key, queue);
            this.value = value;
            this.hash  = hash;
            this.next  = next;
        }
}
public class WeakReference<T> extends Reference<T> {
    // Call the Reference constructor to initialize the key and the reference queue
    public WeakReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }
}
public abstract class Reference<T> {
    // Where the key is actually stored
    private T referent;      
    // Reference queue
    volatile ReferenceQueue<? super T> queue;
    Reference(T referent) {
        this(referent, null);
    }
    Reference(T referent, ReferenceQueue<? super T> queue) {
        this.referent = referent;
        this.queue = (queue == null) ? ReferenceQueue.NULL : queue;
    }
}

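From the Entry constructor we can see that key and queue are ultimately passed to the Reference constructor. The key becomes the referent field of Reference, which GC treats specially: once no strong reference to it remains, it is cleared at the next GC.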
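To make the special treatment of the referent concrete, here is a minimal sketch using WeakReference directly, without any map. The class name WeakReferentDemo and the helper clearedAfterGc are made up for this illustration, and System.gc() is only a request, so the result depends on the JVM honouring it.

```java
import java.lang.ref.WeakReference;

public class WeakReferentDemo {
    // Hypothetical helper: requests GC until the reference is cleared
    // or the attempts run out. System.gc() is only a hint to the JVM.
    static boolean clearedAfterGc(WeakReference<?> ref) {
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc();
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        Object strong = new Object();
        // Referent is still reachable through the local variable "strong"
        WeakReference<Object> held = new WeakReference<>(strong);
        // Referent has no strong reference once the constructor returns
        WeakReference<Object> unheld = new WeakReference<>(new Object());

        System.out.println("unheld cleared: " + clearedAfterGc(unheld));
        System.out.println("held still set: " + (held.get() == strong));
    }
}
```

On a typical HotSpot JVM the first line reports true, while the second is always true because the local variable keeps the referent strongly reachable.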

Constructors

public WeakHashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal Initial Capacity: "+
                initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal Load factor: "+
                loadFactor);
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;
    table = newTable(capacity);
    this.loadFactor = loadFactor;
    threshold = (int)(capacity * loadFactor);
}
public WeakHashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
public WeakHashMap() {
    this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
}
public WeakHashMap(Map<? extends K, ? extends V> m) {
    this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1,
            DEFAULT_INITIAL_CAPACITY),
            DEFAULT_LOAD_FACTOR);
    putAll(m);
}

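The constructors are much like HashMap's: the initial capacity is the smallest power of two greater than or equal to the requested capacity, and the resize threshold is capacity * loadFactor.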
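The rounding rule can be checked in isolation. The sketch below copies the loop from the constructor into a standalone class; CapacitySketch, roundUpToPowerOfTwo and thresholdFor are names invented for this example and are not part of the JDK.

```java
public class CapacitySketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same loop as in the constructor: smallest power of two >= initialCapacity
    static int roundUpToPowerOfTwo(int initialCapacity) {
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Resize threshold as computed by the constructor
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        for (int requested : new int[] {1, 16, 17, 100}) {
            int capacity = roundUpToPowerOfTwo(requested);
            System.out.println(requested + " -> capacity " + capacity
                    + ", threshold " + thresholdFor(capacity, 0.75f));
        }
        // 1 -> capacity 1, threshold 0
        // 16 -> capacity 16, threshold 12
        // 17 -> capacity 32, threshold 24
        // 100 -> capacity 128, threshold 96
    }
}
```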

put(K key, V value)

    public V put(K key, V value) {
        // If key is null, substitute the placeholder object
        Object k = maskNull(key);
        // Compute the hash of the key
        int h = hash(k);
        // Get the bucket array
        Entry<K,V>[] tab = getTable();
        // Compute which bucket the element belongs to: h & (length-1)
        int i = indexFor(h, tab.length);
        // Walk the linked list of that bucket
        for (Entry<K,V> e = tab[i]; e != null; e = e.next) {
            if (h == e.hash && eq(k, e.get())) {
                // If the element is found, replace the old value and return it
                V oldValue = e.value;
                if (value != oldValue)
                    e.value = value;
                return oldValue;
            }
        }
        modCount++;
        // If not found, insert the new entry at the head of the list
        Entry<K,V> e = tab[i];
        tab[i] = new Entry<>(k, value, queue, h, e);
        // If the count reaches the threshold after insertion, double the number of buckets
        if (++size >= threshold)
            resize(tab.length * 2);
        return null;
    }

(1) Compute the hash;

In HashMap, a null key simply gets a hash of 0:

static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

WeakHashMap instead replaces a null key with a placeholder object and computes the hash from that:

    private static final Object NULL_KEY = new Object();
    private static Object maskNull(Object key) {
        return (key == null) ? NULL_KEY : key;
    }

In addition, HashMap's hash uses only one XOR, whereas this one uses four:

    final int hash(Object k) {
        int h = k.hashCode();
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

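(2) Compute which bucket the key falls into;

(3) Walk the linked list of that bucket;

(4) If the element is found, replace the old value with the new one;

(5) If it is not found, insert the new element at the head of the list;

(6) If the number of elements reaches the resize threshold, double the capacity.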
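Steps (1) and (2) can be tried out on their own. The sketch below copies the two hash functions quoted above into a standalone class. The body of indexFor is an assumption based on the "h & (length-1)" comment in put(), and the class name HashSketch is made up for this example.

```java
public class HashSketch {
    // HashMap's hash as quoted above: null maps to 0, one XOR otherwise
    static int hashMapHash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // WeakHashMap's hash as quoted above: four XORs, key must not be null
    static int weakHashMapHash(Object k) {
        int h = k.hashCode();
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Assumed body of indexFor; works because length is a power of two
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        for (String key : new String[] {"alpha", "beta", "gamma"}) {
            int h = weakHashMapHash(key);
            System.out.println(key + ": hash=" + h
                    + ", bucket=" + indexFor(h, length));
        }
    }
}
```

Because the table length is always a power of two, the mask h & (length-1) keeps only the low bits, which is why both maps mix the high bits of hashCode() into the low ones first.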

resize(int newCapacity)

    void resize(int newCapacity) {
        // Get the old table; getTable() expunges stale entries first
        Entry<K,V>[] oldTable = getTable();
        // Old capacity
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
        // New table
        Entry<K,V>[] newTable = newTable(newCapacity);
        // Move the elements from the old table to the new one
        transfer(oldTable, newTable);
        table = newTable;
        // If the element count is at least half the threshold, keep the new table
        // and compute the new threshold
        if (size >= threshold / 2) {
            threshold = (int)(newCapacity * loadFactor);
        } else {
            // Otherwise move the elements back and keep using the old table.
            // transfer drops stale entries, so the count may have shrunk enough
            // that no resize is needed after all
            expungeStaleEntries();
            transfer(newTable, oldTable);
            table = oldTable;
        }
    }
    private void transfer(Entry<K,V>[] src, Entry<K,V>[] dest) {
        // Walk the source table
        for (int j = 0; j < src.length; ++j) {
            Entry<K,V> e = src[j];
            src[j] = null;
            while (e != null) {
                Entry<K,V> next = e.next;
                Object key = e.get();
                // A null key means GC has cleared it, so drop the whole Entry
                if (key == null) {
                    e.next = null;  // Help GC
                    e.value = null; //  "   "
                    size--;
                } else {
                    // Otherwise compute its position in the destination table
                    // and put it at the head of that bucket's list
                    int i = indexFor(e.hash, dest.length);
                    e.next = dest[i];
                    dest[i] = e;
                }
                e = next;
            }
        }
    }

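(1) Check whether the old capacity has already reached the maximum capacity;

(2) Create a new table and move all elements into it;

(3) If the element count after the move is less than half the resize threshold, move the elements back and keep using the old table, since no resize is needed;

(4) Otherwise use the new table and compute the new resize threshold;

(5) Entries whose key is null are dropped during the move, so size can shrink.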
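The way transfer filters out stale entries can be modelled without any real weak references. In the simplified sketch below, Node is a hypothetical stand-in for Entry in which a null key plays the role of a key already cleared by GC; TransferSketch, keysIn and the sample data are likewise assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TransferSketch {
    // Hypothetical stand-in for WeakHashMap.Entry; key == null models a cleared key
    static final class Node {
        final String key;
        final int hash;
        Node next;
        Node(String key, int hash, Node next) {
            this.key = key;
            this.hash = hash;
            this.next = next;
        }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Moves live nodes from src to dest (head insertion) and returns
    // how many stale nodes were dropped along the way
    static int transfer(Node[] src, Node[] dest) {
        int dropped = 0;
        for (int j = 0; j < src.length; j++) {
            Node e = src[j];
            src[j] = null;
            while (e != null) {
                Node next = e.next;
                if (e.key == null) {
                    e.next = null;
                    dropped++;
                } else {
                    int i = indexFor(e.hash, dest.length);
                    e.next = dest[i];
                    dest[i] = e;
                }
                e = next;
            }
        }
        return dropped;
    }

    // Collects the keys bucket by bucket, for printing
    static List<String> keysIn(Node[] table) {
        List<String> keys = new ArrayList<>();
        for (Node head : table)
            for (Node e = head; e != null; e = e.next)
                keys.add(e.key);
        return keys;
    }

    public static void main(String[] args) {
        Node[] oldTable = new Node[2];
        // Bucket 1 holds two live nodes and one stale node in between
        oldTable[1] = new Node("a", 1, new Node(null, 5, new Node("b", 3, null)));
        Node[] newTable = new Node[4];
        int dropped = transfer(oldTable, newTable);
        System.out.println("dropped=" + dropped + " live=" + keysIn(newTable));
        // prints: dropped=1 live=[a, b]
    }
}
```

In the real class the dropped count is subtracted from size, which is what makes the "move back to the old table" branch of resize possible.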

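How garbage collection works

WeakHashMap sets the value of an entry to null once its key is no longer referenced, so that GC can reclaim the stored value as well.

Consider the following example: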

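import java.util.WeakHashMap;

public class WeakHashMapTest {
    public static void main(String[] args) {
        House seller1 = new House("Listing from seller 1");
        SellerInfo sellerInfo1 = new SellerInfo();
        House seller2 = new House("Listing from seller 2");
        SellerInfo sellerInfo2 = new SellerInfo();
        WeakHashMap<House,SellerInfo> weakHashMap = new WeakHashMap<>();
        // With a HashMap, each key would be a strong reference to its House object
        weakHashMap.put(seller1,sellerInfo1);
        weakHashMap.put(seller2,sellerInfo2);
        System.out.println("weakHashMap before null,size="+weakHashMap.size());
        seller1 = null;
        System.gc();               // only a request; the JVM may not collect right away
        System.runFinalization();
        // With a HashMap, size would still be 2
        System.out.println("weakHashMap after null, size = "+weakHashMap.size());
        System.out.println(weakHashMap);
    }
}
// Minimal House class, assumed here because the original listing omits it
class House {
    private final String name;
    House(String name) { this.name = name; }
    @Override public String toString() { return name; }
}
class SellerInfo{}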

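The final size is usually 1. Why? After seller1 is set to null, no strong reference to the first House remains, so the GC requested by System.gc() can reclaim it and the map drops its entry. That raises a related question: if we store null itself as a key, why is that entry not reclaimed?

To answer that, look at the source of put (this listing appears to come from an older JDK than the one shown earlier):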

public V put(K key, V value) {
        K k = (K) maskNull(key); // this is the line to focus on
        int h = HashMap.hash(k.hashCode());
        Entry[] tab = getTable();
        int i = indexFor(h, tab.length);
        for (Entry<K,V> e = tab[i]; e != null; e = e.next) {
            if (h == e.hash && eq(k, e.get())) {
                V oldValue = e.value;
                if (value != oldValue)
                    e.value = value;
                return oldValue;
            }
        }
        modCount++;
        Entry<K,V> e = tab[i];
        tab[i] = new Entry<K,V>(k, value, queue, h, e);
        if (++size >= threshold)
            resize(tab.length * 2);
        return null;
    }

Focus on this line: K k = (K) maskNull(key);

    private static final Object NULL_KEY = new Object();
    private static Object maskNull(Object key) {
        return (key == null) ? NULL_KEY : key;
    }

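If key is null, maskNull returns the static NULL_KEY, which is a plain Object. So when WeakHashMap stores null as a key, it actually stores its own static Object; what is stored is not null. Because NULL_KEY is held by a static field, it always remains strongly reachable and is never collected.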
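This behaviour is easy to observe from the outside. The following minimal sketch only uses the public WeakHashMap API; the class name NullKeyDemo and the helper nullKeySurvivesGc are made up for the example.

```java
import java.util.WeakHashMap;

public class NullKeyDemo {
    // Stores a mapping under the null key, requests GC, and checks
    // that the mapping is still there afterwards
    static boolean nullKeySurvivesGc() {
        WeakHashMap<Object, String> map = new WeakHashMap<>();
        map.put(null, "value for null key");
        System.gc();
        return map.containsKey(null)
                && "value for null key".equals(map.get(null));
    }

    public static void main(String[] args) {
        System.out.println("null key survives GC: " + nullKeySurvivesGc());
        // prints: null key survives GC: true
    }
}
```

The result does not depend on whether the JVM actually runs a collection, since the placeholder behind the null key is always strongly reachable.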

So how is WeakHashMap tied to WeakReference?

private static class Entry<K,V> extends WeakReference<Object> implements Map.Entry<K,V> {
        V value;
        final int hash;
        Entry<K,V> next;
        /**
         * Creates new entry.
         */
        Entry(Object key, V value,
              ReferenceQueue<Object> queue,
              int hash, Entry<K,V> next) {
            super(key, queue);
            this.value = value;
            this.hash  = hash;
            this.next  = next;
        }
}

WeakHashMap's Entry extends WeakReference, so every Entry is itself a WeakReference. The Entry constructor calls super(key, queue), which is the two-argument constructor here:

public class WeakReference<T> extends Reference<T> {
    public WeakReference(T referent) {
        super(referent);
    }
    public WeakReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }
}

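It takes two arguments, a key and a queue. The key is the key stored in the WeakHashMap, and the queue is the ReferenceQueue created by the WeakHashMap. What is the ReferenceQueue for?

When GC processes an object that still has a weak reference associated with it, the WeakReference object is linked to the pending reference of the Reference class, and the Reference Handler thread then inserts that reference into the ReferenceQueue.

In other words, when the key of an Entry is reclaimed by GC, that Entry is placed on the ReferenceQueue. By draining the queue, WeakHashMap learns which keys have been cleared by GC, so the queue acts as a notification mechanism.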
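The notification role of the queue can be seen with a bare WeakReference, which is roughly what each Entry is. In this minimal sketch the class name ReferenceQueueDemo and the helper awaitEnqueued are invented for illustration; the retry loop is there because both GC and the enqueueing happen asynchronously.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class ReferenceQueueDemo {
    // Hypothetical helper: requests GC and waits for a reference to arrive
    // on the queue; returns null if nothing shows up in time
    static Reference<?> awaitEnqueued(ReferenceQueue<Object> queue) {
        try {
            for (int i = 0; i < 50; i++) {
                System.gc();                           // only a hint to the JVM
                Reference<?> r = queue.remove(100);    // wait up to 100 ms
                if (r != null)
                    return r;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return null;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        // The referent has no strong reference, like a key nobody holds any more
        WeakReference<Object> ref = new WeakReference<>(new Object(), queue);

        Reference<?> enqueued = awaitEnqueued(queue);
        System.out.println("same reference object: " + (enqueued == ref));
        System.out.println("referent after clearing: " + ref.get());
    }
}
```

What comes out of the queue is the reference object itself, not the referent. That is why WeakHashMap can cast the polled object back to Entry and locate the bucket through its stored hash even though the key is already gone.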

So when does the map actually clean up the entries whose keys are no longer referenced?

Methods such as put(), get() and remove() all call getTable(), whose source is:

    private Entry<K,V>[] getTable() {
        expungeStaleEntries();
        return table;
    }

So they all end up calling expungeStaleEntries(). Here is its source:

    private void expungeStaleEntries() {
        for (Object x; (x = queue.poll()) != null; ) {
            synchronized (queue) {
                @SuppressWarnings("unchecked")
                    Entry<K,V> e = (Entry<K,V>) x;
                int i = indexFor(e.hash, table.length);
                Entry<K,V> prev = table[i];
                Entry<K,V> p = prev;
                while (p != null) {
                    Entry<K,V> next = p.next;
                    if (p == e) {
                        if (prev == e)
                            table[i] = next;
                        else
                            prev.next = next;
                        // Must not null out e.next;
                        // stale entries may be in use by a HashIterator
                        e.value = null; // Help GC
                        size--;
                        break;
                    }
                    prev = p;
                    p = next;
                }
            }
        }
    }

The queue in the code above is the field defined earlier:

private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

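Each call to expungeStaleEntries() polls the reference queue for entries whose keys have been cleared. For each one it finds the entry in table, unlinks it from its bucket's list, sets its value to null so that GC can reclaim it, and decrements size. Note that e.next is deliberately not set to null, because, as the comment in the code says, a stale entry may still be in use by a HashIterator.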

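Summary

(1) WeakHashMap uses an array plus linked lists as its storage structure;

(2) Keys in a WeakHashMap are weak references and are cleared by GC once no strong reference to them remains;

(3) Operations on the map that go through getTable() or expungeStaleEntries() remove the entries of cleared keys;

(4) When using String keys, the key must be created with new String() for the entry to expire, because string literals are interned and stay strongly reachable. The same applies to the wrapper types of primitives, whose cached values are never collected;

(5) WeakHashMap is commonly used as a cache.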
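Point (4) can be illustrated with a short experiment. This is a minimal sketch rather than a guaranteed outcome: StringKeyDemo and sizeAfterGc are names made up here, and the loop retries because System.gc() is only a request and stale entries are enqueued asynchronously.

```java
import java.util.WeakHashMap;

public class StringKeyDemo {
    // Puts one literal key and one new String() key into the map, then
    // requests GC until the map shrinks or the attempts run out
    static int sizeAfterGc() {
        WeakHashMap<String, Integer> map = new WeakHashMap<>();
        map.put("literal", 1);               // interned, stays strongly reachable
        map.put(new String("temporary"), 2); // fresh object, no other strong reference
        for (int i = 0; i < 50 && map.size() > 1; i++) {
            System.gc();
            try {
                Thread.sleep(20);            // give the enqueueing a moment
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return map.size();                   // size() expunges stale entries first
    }

    public static void main(String[] args) {
        System.out.println("size after GC: " + sizeAfterGc());
    }
}
```

On a typical JVM this reports a size of 1: the entry keyed by the new String object disappears, while the one keyed by the literal stays for as long as the class is loaded.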
