文章目录
Java1.8版本:https://blog.csdn.net/LawssssCat/article/details/103214258
说明
hash(Object k)
hashmap理所当然先知道hash值怎么计算的。
然后,讨论HashMap的<mark>基本功能实现</mark>:
<mark>Part 1 - 添加数据</mark>
源码(1.7)
/** * Retrieve object hash code and applies a supplemental hash function to the * result hash, which defends against poor quality hash functions. (接收k的hash值,并且根据收到的hash值再计算一次hash值,可以抵抗一些质量很差的hash值被传入。) * This is critical because HashMap uses power-of-two length hash tables, * that otherwise encounter collisions for hashCodes that do not differ * in lower bits. (这是经常被诟病的,因为HashMap用了二的幂次作为长度,而如果用一些无法分辨低位的hashcode会遇到很多碰撞。) * Note: Null keys always map to hash 0, thus index 0. */
final int hash(Object k) {
int h = hashSeed;
if (0 != h && k instanceof String) {
return sun.misc.Hashing.stringHash32((String) k);
}
h ^= k.hashCode();
Part 1 - 添加功能 - 实现
put(K key, V value) - 添加
源码:(1.7)
/** * Associates the specified value with the specified key in this map. * (把key和value关联进this.map) * * If the map previously contained a mapping for the key, the old * value is replaced. * (如果map中已经包含了key映射,那么旧的value就会被替代) * * @param key key with which the specified value is to be associated * @param value value to be associated with the specified key * @return the previous value associated with <tt>key</tt>, or * <tt>null</tt> if there was no mapping for <tt>key</tt>. * (A <tt>null</tt> return can also indicate that the map * previously associated <tt>null</tt> with <tt>key</tt>.) * * return:如果没有存在key映射,就返回null,否则就返回之前的映射值。 * (返回null的情况还包含,存在key映射,但是key映射值为null的去情况) */
public V put(K key, V value) {
if (table == EMPTY_TABLE) {//初始化操作
inflateTable(threshold);//
}
//处理异常操作
if (key == null)
return putForNullKey(value);
int hash = hash(key);//计算hash值
int i = indexFor(hash, table.length);//转换成散列表中的下标
//遍历下标中链表。
//找到链表中key相同的value,覆盖,退出方法。
//找不到相同的key,跳出循环,创建
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
//修改次数+1, 与线程安全相关,不是分析重点
modCount++;
//跳进函数(下面分析,代码在下面),用户添加链表节点。
addEntry(hash, key, value, i);
return null;
}
inflateTable(int toSize) 、 threshold - 初始化
- inflateTable(int toSize):用于初始化/扩充散列表
- threshold:
源码(1.7)
/** * Inflates the table. */
private void inflateTable(int toSize) {
// Find a power of 2 >= toSize
//找到一个2的幂次,这个数>=toSize
//容量必须是2的幂次,原因后面讲。
int capacity = roundUpToPowerOf2(toSize);
//容量阈值不能草果最大的容量MAXIMUM_CAPACITY+1;
threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
//创建新表
table = new Entry[capacity];
initHashSeedAsNeeded(capacity); //先不讨论
}
/** * The next size value at which to resize (capacity * load factor). (threshold存放下一个量表容量阈值,threshold=capacity*loadfactor) * @serial */
// If table == EMPTY_TABLE then this is the initial capacity at which the
// table will be created when inflated.
// (如果是个空表,那么threshold就是表被创建时候的初始容量值)
int threshold;
- roundUpToPowerOf2 的分析在:https://blog.csdn.net/LawssssCat/article/details/103210688
addEntry() - 判断是否需要扩容
- bucketIndex - 散列表中的下标
- threshold - 散列表容量阈值
源码(1.7)
/** * Adds a new entry with the specified key, value and hash code to * the specified bucket. (往桶里面添加新的Entry-"链表节点",节点包括key、value、hashCode) * It is the responsibility of this * method to resize the table if appropriate. (适当的时候重置散列表的容量,是这个方法的职责) * Subclass overrides this to alter the behavior of put method. (待翻译...) */
void addEntry(int hash, K key, V value, int bucketIndex) {
//扩容操作:
// 如果当前数据容量大小大于等于容量阈值
//并且,删列表下标位置有值
//那么,才会扩容
if ((size >= threshold) && (null != table[bucketIndex])) {
//这里会有误解=》size>=threshold不一定会扩容
resize(2 * table.length);//更新容量阈值
//再算一次 hash 值
hash = (null != key) ? hash(key) : 0;
bucketIndex = indexFor(hash, table.length);
}
//创建Entry链表节点操作
//放入散列表的桶中,并且放在链表头部。
createEntry(hash, key, value, bucketIndex);
}
resize() - 要扩容
- newCapacity - 新容量
/** * Rehashes the contents of this map into a new array with a * larger capacity. () * This method is called automatically when the * number of keys in this map reaches its threshold. * * If current capacity is MAXIMUM_CAPACITY, this method does not * resize the map, but sets threshold to Integer.MAX_VALUE. * This has the effect of preventing future calls. * * @param newCapacity the new capacity, MUST be a power of two; * must be greater than current capacity unless current * capacity is MAXIMUM_CAPACITY (in which case value * is irrelevant). */
void resize(int newCapacity) {
//弄个索引指向扩容前的散列表
Entry[] oldTable = table;
int oldCapacity = oldTable.length; //旧散列表的长度
//一个判断:对数据超标的判断(先跳过)
//如果,添加数据前,旧散列表就到了最大的容量属性值
// Note: 最大的容量 => 1 << 30
//那么,就把阈值更新为 最大整数值 (1 << 31) - 1 ; //变成奇数了,最大了,不会变了。没关系。
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}
//创建一个空的散列表
Entry[] newTable = new Entry[newCapacity];
//把oldTable数据转移到newTable
//先不管initHashSeedAsNeeded(newCapacity),把它看为 false , 也就是说不rehash
transfer(newTable, initHashSeedAsNeeded(newCapacity));
//HashMap 成员属性 table 指向新的 散列表
table = newTable;
threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
transfer() - 旧散列表数据移动到新表中
源码(1.7)
/** * Transfers all entries from current table to newTable. */
void transfer(Entry[] newTable, boolean rehash) {
int newCapacity = newTable.length;//获取新散列的数量
//遍历获得旧的散列中的每个头部节点
for (Entry<K,V> e : table) {
//只要节点存在
while(null != e) {
//先存下链表的下一个节点
Entry<K,V> next = e.next;
//第二个形参为true时候,才要rehash值
if (rehash) {
// 这里会有误解 , 不一定会跳入这个判断,进行rehash
e.hash = null == e.key ? 0 : hash(e.key);
}
//根据新的容量newCapacity算出散列表中的下标
int i = indexFor(e.hash, newCapacity);
e.next = newTable[i];
newTable[i] = e;
e = next;
}
}
}
/** * Returns index for hash code h. */
static int indexFor(int h, int length) {
// assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
return h & (length-1);
}
更新散列表下标的操作:
e.g.
<mark>扩容前</mark>:oldCapacity=16(假设)
<mark>扩容后</mark>:oldCapacity=32
<mark>对于这个hashCode,扩容后,在散列表的下标为原下标往左移动原容量的位数</mark>
(也有一种情况,为下标不变)
这样,就能让散列表的元素变得更分散一点!
createEntry() - 处理是否扩容问题后 - 创建Entry链表节点
源码(1.7)
/** * Like addEntry except that this version is used when creating entries * as part of Map construction or "pseudo-construction" (cloning, * deserialization). * This version needn't worry about resizing the table. (这个版本不需要关心散列表resize的事情) * Subclass overrides this to alter the behavior of HashMap(Map), * clone, and readObject. */
void createEntry(int hash, K key, V value, int bucketIndex) {
//定位到散列表bucketIndex下标的的链表的头节点
Entry<K,V> e = table[bucketIndex];
//创建一个Entry节点指向头节点,并且把这个节点更新为头结点
table[bucketIndex] = new Entry<>(hash, key, value, e);
//数据量的记录+1
size++;
}