Java1.8版本:https://blog.csdn.net/LawssssCat/article/details/103214258


说明

hash(Object k)

hashmap理所当然先知道hash值怎么计算的。

然后,讨论HashMap的<mark>基本功能实现</mark>:
<mark>Part 1 - 添加数据</mark>

源码(1.7)

 	/** * Retrieve object hash code and applies a supplemental hash function to the * result hash, which defends against poor quality hash functions. (接收k的hash值,并且根据收到的hash值再计算一次hash值,可以抵抗一些质量很差的hash值被传入。) * This is critical because HashMap uses power-of-two length hash tables, * that otherwise encounter collisions for hashCodes that do not differ * in lower bits. (这是经常被诟病的,因为HashMap用了二的幂次作为长度,而如果用一些无法分辨低位的hashcode会遇到很多碰撞。) * Note: Null keys always map to hash 0, thus index 0. */
    final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();



Part 1 - 添加功能 - 实现

put(K key, V value) - 添加

源码:(1.7)

    /** * Associates the specified value with the specified key in this map. * (把key和value关联进this.map) * * If the map previously contained a mapping for the key, the old * value is replaced. * (如果map中已经包含了key映射,那么旧的value就会被替代) * * @param key key with which the specified value is to be associated * @param value value to be associated with the specified key * @return the previous value associated with <tt>key</tt>, or * <tt>null</tt> if there was no mapping for <tt>key</tt>. * (A <tt>null</tt> return can also indicate that the map * previously associated <tt>null</tt> with <tt>key</tt>.) * * return:如果没有存在key映射,就返回null,否则就返回之前的映射值。 * (返回null的情况还包含,存在key映射,但是key映射值为null的去情况) */
    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {//初始化操作
            inflateTable(threshold);//
        }
        //处理异常操作
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key);//计算hash值
        
        int i = indexFor(hash, table.length);//转换成散列表中的下标
        //遍历下标中链表。
        //找到链表中key相同的value,覆盖,退出方法。
        //找不到相同的key,跳出循环,创建
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
		//修改次数+1, 与线程安全相关,不是分析重点
        modCount++;
        //跳进函数(下面分析,代码在下面),用户添加链表节点。
        addEntry(hash, key, value, i);
        return null;
    }


inflateTable(int toSize) 、 threshold - 初始化

  • inflateTable(int toSize):用于初始化/扩充散列表
  • threshold:

源码(1.7)

 	/** * Inflates the table. */
    private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        //找到一个2的幂次,这个数>=toSize
        //容量必须是2的幂次,原因后面讲。
        int capacity = roundUpToPowerOf2(toSize);

		//容量阈值不能草果最大的容量MAXIMUM_CAPACITY+1;
        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        //创建新表
        table = new Entry[capacity];

        initHashSeedAsNeeded(capacity); //先不讨论
    }

	/** * The next size value at which to resize (capacity * load factor). (threshold存放下一个量表容量阈值,threshold=capacity*loadfactor) * @serial */
    // If table == EMPTY_TABLE then this is the initial capacity at which the
    // table will be created when inflated.
// (如果是个空表,那么threshold就是表被创建时候的初始容量值)
    int threshold;
    
  1. roundUpToPowerOf2 的分析在:https://blog.csdn.net/LawssssCat/article/details/103210688


addEntry() - 判断是否需要扩容

  • bucketIndex - 散列表中的下标
  • threshold - 散列表容量阈值

源码(1.7)

	/** * Adds a new entry with the specified key, value and hash code to * the specified bucket. (往桶里面添加新的Entry-"链表节点",节点包括key、value、hashCode) * It is the responsibility of this * method to resize the table if appropriate. (适当的时候重置散列表的容量,是这个方法的职责) * Subclass overrides this to alter the behavior of put method. (待翻译...) */
     
    void addEntry(int hash, K key, V value, int bucketIndex) {
    	//扩容操作:
    	// 如果当前数据容量大小大于等于容量阈值
    	//并且,删列表下标位置有值
    	//那么,才会扩容
        if ((size >= threshold) && (null != table[bucketIndex])) {
   //这里会有误解=》size>=threshold不一定会扩容 
            resize(2 * table.length);//更新容量阈值
        	//再算一次 hash 值 
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

		//创建Entry链表节点操作
		//放入散列表的桶中,并且放在链表头部。
        createEntry(hash, key, value, bucketIndex);
    }


resize() - 要扩容

  • newCapacity - 新容量
	/** * Rehashes the contents of this map into a new array with a * larger capacity. () * This method is called automatically when the * number of keys in this map reaches its threshold. * * If current capacity is MAXIMUM_CAPACITY, this method does not * resize the map, but sets threshold to Integer.MAX_VALUE. * This has the effect of preventing future calls. * * @param newCapacity the new capacity, MUST be a power of two; * must be greater than current capacity unless current * capacity is MAXIMUM_CAPACITY (in which case value * is irrelevant). */
    void resize(int newCapacity) {
    	//弄个索引指向扩容前的散列表
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length; //旧散列表的长度

		//一个判断:对数据超标的判断(先跳过)
		//如果,添加数据前,旧散列表就到了最大的容量属性值
		// Note: 最大的容量 => 1 << 30
		//那么,就把阈值更新为 最大整数值 (1 << 31) - 1 ; //变成奇数了,最大了,不会变了。没关系。
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

		//创建一个空的散列表
        Entry[] newTable = new Entry[newCapacity];
        //把oldTable数据转移到newTable
        //先不管initHashSeedAsNeeded(newCapacity),把它看为 false , 也就是说不rehash
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        //HashMap 成员属性 table 指向新的 散列表
        table = newTable;
        
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }


transfer() - 旧散列表数据移动到新表中

源码(1.7)

	/** * Transfers all entries from current table to newTable. */
    void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;//获取新散列的数量
        //遍历获得旧的散列中的每个头部节点
        for (Entry<K,V> e : table) {
        	//只要节点存在
            while(null != e) {
            	//先存下链表的下一个节点
                Entry<K,V> next = e.next;
                //第二个形参为true时候,才要rehash值
                if (rehash) {
           // 这里会有误解 , 不一定会跳入这个判断,进行rehash
                    e.hash = null == e.key ? 0 : hash(e.key);
                }

				//根据新的容量newCapacity算出散列表中的下标
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
    }


	 /** * Returns index for hash code h. */
    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }

更新散列表下标的操作:
e.g.

<mark>扩容前</mark>:oldCapacity=16(假设)

<mark>扩容后</mark>:oldCapacity=32

<mark>对于这个hashCode,扩容后,在散列表的下标为原下标往左移动原容量的位数</mark>
(也有一种情况,为下标不变)
这样,就能让散列表的元素变得更分散一点!



createEntry() - 处理是否扩容问题后 - 创建Entry链表节点

源码(1.7)

	/** * Like addEntry except that this version is used when creating entries * as part of Map construction or "pseudo-construction" (cloning, * deserialization). * This version needn't worry about resizing the table. (这个版本不需要关心散列表resize的事情) * Subclass overrides this to alter the behavior of HashMap(Map), * clone, and readObject. */
    void createEntry(int hash, K key, V value, int bucketIndex) {
    	//定位到散列表bucketIndex下标的的链表的头节点
        Entry<K,V> e = table[bucketIndex];
        //创建一个Entry节点指向头节点,并且把这个节点更新为头结点
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        //数据量的记录+1
        size++;
    }