数据结构---哈希表

一、哈希表

1、哈希表（Hash table）

哈希表（Hash table），也叫散列表，是根据关键码值(Key value)而直接进行访问的数据结构。也就是说，它通过把关键码值映射到表中一个位置来访问记录，以加快查找的速度。这个映射函数叫做哈希（散列）函数，存放记录的数组叫做散列表。哈希表的核心思想就是使用哈希函数将键映射到存储桶。

2、哈希函数

哈希函数能使对一个数据序列的访问更加迅速有效，通过哈希函数，将关键字的集合映射到某个地址集合上，数据元素将被更快定位。
- 直接寻址法
- 数字分析法
- 平方取中法
- 折叠法
- 随机数法
- 除留余数法（常用）：n mod size

3、结构：

数组+链表+红黑树
- 当我们插入一个新的键时，哈希函数将决定该键应该分配到哪个桶中，并将该键存储在相应的桶中；
- 当我们想要搜索一个键时，哈希表将使用相同的哈希函数来查找对应的桶，并只在特定的桶中进行搜索。

4、哈希表的优缺点

优势：它可以提供非常快速的插入-删除-查找操作，无论多少数据，插入和删除值需要接近常量的时间。在查找方面，哈希表的速度比树还要快，基本可以瞬间查找到想要的元素。
不足：哈希表中的数据是没有顺序概念的，所以不能以一种固定的方式（比如从小到大）来遍历其中的元素。在数据处理顺序敏感的问题时，选择哈希表并不是个好的处理方法。同时，哈希表中的key 是不允许重复的，在重复性非常高的数据中，哈希表也不是个好的选择。

5、基于哈希表的HashMap和HashTable的比较

HashTable

底层数组+链表实现，无论key还是value都不能为null，线程安全，实现线程安全的方式是在修改数据时锁住整个HashTable，效率低，ConcurrentHashMap做了相关优化
初始size为11，扩容：newsize = olesize*2+1
计算index的方法：index = (hash & 0x7FFFFFFF) % tab.length Hashtable同样是基于哈希表实现的，同样每个元素是一个key-value对，其内部也是通过单链表解决冲突问题，容量不足（超过了阀值）时，同样会自动增长。

HashMap

底层数组+链表+红黑树实现，可以存储null键和null值，线程不安全
初始size为16，扩容：newsize = oldsize*2，size一定为2的n次幂
扩容针对整个Map，每次扩容时，原来数组中的元素依次重新计算存放位置，并重新插入
插入元素后才判断该不该扩容，有可能无效扩容（插入后如果扩容，如果没有再次插入，就会产生无效扩容）
当Map中元素总数超过Entry数组的75%，触发扩容操作，为了减少链表长度，元素分配更均匀
计算index方法：index = hash & (tab.length – 1)
HashMap是基于哈希表实现的，每一个元素是一个key(数据类型必须一致)-value对，其内部通过单链表解决冲突问题，容量不足（超过了阀值）时，同样会自动增长。
HashMap是非线程安全的，只是用于单线程环境下，多线程环境下可以采用concurrent并发包下的concurrentHashMap。
HashMap 实现了Serializable接口，因此它支持序列化，实现了Cloneable接口，能被克隆。

6、哈希表经典面试题

链接：面试题链接

二、手写哈希表

情景: google公司上机题, 有一个公司，当有新员工来报到时，要求将该员工的信息加入（ID，姓名，年龄，地址），当输入员工的ID时，要求查找到该员工的所有信息
要求：不使用数据库，速度越快越好====》言外之意，用hash表来完成

public class MicroSoft {
	public static void main(String[] args) {
		Hashtable h=new Hashtable(5);
//		int id, String name, String sex, int age, int phone
		Emp e1=new Emp(1,"tom","男",23,119);
		Emp e2=new Emp(2,"jack","男",24,120);
		Emp e3=new Emp(6,"lucy","女",25,121);
		Emp e4=new Emp(8,"lulu","女",26,122);
		
		h.add(e1);
		h.add(e2);
		h.add(e3);
		h.add(e4);
		
		h.list();
		System.out.println("**************");
		h.findEmpByid(2);
		System.out.println("**************");
		h.delete(6);
		h.list();
		
	}

}

class Hashtable{
	private EmpLinkedList[] emplist;
	private int size;
	
	public Hashtable(int size){
		this.size=size;
		emplist=new EmpLinkedList[size];
		for(int i=0;i<size;i++) {
			emplist[i]=new EmpLinkedList();
		}
	}
	
	//核心，散列函数
	public int hashFun(int id) {
		return id % size;
	}
	
	//增加
	public void add(Emp emp) {
		int empno=hashFun(emp.id);
		emplist[empno].add(emp);
	}
	
	//删除
	public void delete(int id) {
		int empno=hashFun(id);
		emplist[empno].delete(id);;
	}
	
	//根据id查看
	public void findEmpByid(int id) {
		int empno=hashFun(id);
		Emp emp=emplist[empno].findEmpById(id);
		if(emp!=null) {
			System.out.println("在第"+(empno+1)+"条链表中找到员工id="+id);
		}
		else {
			System.out.println("未找到该员工");
		}
	}
	
	//查看全部
	public void list() {
		for(int i=0;i<size;i++) {
			emplist[i].list(i);
		}
	}
	
	
}

//创建Emp的链表
class EmpLinkedList{
	private Emp head;
	
	//添加
	public void add(Emp emp) {
		//因为头节点是包含数据的，所以添加第一个节点的时候，我们让head=emp
		if(head==null) {
			head=emp;
			return;
		}
		//找到链表的最后一个节点
		Emp temp=head;
		while(true){
			//如果到了最后就结束循环
			if(temp.next==null) {
				break;
			}
			//后移
			temp=temp.next;
		}
		//找到最后位置，添加数据
		temp.next=emp;
	}
	
	//删除
	public void delete(int id) {
		//如果链表为空，则直接返回
		if(head==null) {
			System.out.println("该链表为空");
			return;
		}
		//定义辅助指针找到对应位置，做删除操作
		Emp temp=head;
		while(true) {
			//只有一个节点的话
			if(temp.id==id) {
				head=head.next;
				break;
			}
			//有多个节点
			if(temp.next.id==id) {
				temp.next=temp.next.next;
				break;
			}
			if(temp.next==null) {
				return;
			}
			temp=temp.next;
			
		}
	}
	
	
//	根据id查看
	public Emp findEmpById(int id) {
		if(head==null) {
			System.out.println("链表为空");
			return null;
		}
		
		Emp temp=head;
		while(true) {
			if(temp.id==id) {
				break;
			}
			if(temp.next==null) {
				break;
			}
			temp=temp.next;
		}
		return temp;
	}
	
//	显示全部
	public void list(int id) {
		if(head==null) {
			System.out.println("第"+(id+1)+"个链表为空");
			return;
		}
		
		System.out.print("第"+(id+1)+"个链表的信息为：");
		Emp temp=head;
		while(true) {
			System.out.print("  --->"+temp.id+"\t"+temp.name+"\t"+temp.sex+"\t"+temp.age+"\t"+temp.phone);
			if(temp.next==null) {
				temp=null;
				break;
			}
			temp=temp.next;
		}
		System.out.println();
	}
}

//创建一个员工类，相当于一个节点
class Emp{
	public int id;
	public String name;
	public String sex;
	public int age;
	public int phone;
	
	public Emp next;

	
	
	public Emp(int id, String name, String sex, int age, int phone) {
		super();
		this.id = id;
		this.name = name;
		this.sex = sex;
		this.age = age;
		this.phone = phone;
	}



	public Emp(int id, String name, String sex, int age, int phone, Emp next) {
		super();
		this.id = id;
		this.name = name;
		this.sex = sex;
		this.age = age;
		this.phone = phone;
		this.next = next;
	}
	
}

三、对应习题

1、两数之和

题目描述

给出一个整型数组 numbers 和一个目标值 target，请在数组中找出两个加起来等于目标值的数的下标，返回的下标按升序排列。（注：返回的数组下标从1开始算起，保证target一定可以由数组里面2个数字相加得到）
分析思路

遍历数组元素 number，遍历时检测 target - number 是否在HashMap中 , 如果不在 , 则加入到哈希表中，将 target - number 的值作为 HashMap 集合的 Key 键 , 将该 number 的索引作为 Value 值 ;
时间复杂度：O(n) 一次遍历hash索引查找时间复杂度为O(1)
空间复杂度：O(n) 申请了n大小的map空间
核心代码：

     public int[] twoSum (int[] numbers, int target) {
       HashMap<Integer, Integer> map = new HashMap<>();
        for(int i = 0; i< numbers.length; i++) {
            int num = target - numbers[i];
            //如果hashmap中包含target - numbers[i]，则找到这两个数，返回下标
            if(map.containsKey(num)) {  
            //注意：返回的下标要从1开始
                return new int[] {map.get(num)+1,i+1};   
            }
            //将numbers[]数组中的值和下标存入hashmap中
            map.put(numbers[i], i);        
        }
        return null;
    }

2、数组中只出现一次的两个数字

描述：一个整型数组里除了两个数字只出现一次，其他的数字都出现了两次。请写程序找出这两个只出现一次的数字，输出时按非降序排列。
分析把数组里的每个元素作为哈希表的键存储，出现次数作为值，最后通过键取出值进行遍历比较，排序后输出出现次数为1的两个数字
时间复杂度：O(n)，其中n为数组长度，两次单独的遍历数组每个元素
空间复杂度：O(n)，哈希表的长度应该为(n−2)/2
核心代码

    public int[] FindNumsAppearOnce (int[] array) {
        // write code here
        HashMap<Integer,Integer> h=new HashMap();
        int[] arr=new int[2];
        int j=0;
        for(int i=0;i<array.length;i++){
            if(h.containsKey(array[i]))
                h.put(array[i],h.get(array[i])+1);
            else
                h.put(array[i],1);
        }
        
        for(int i:h.keySet()){
            if(h.get(i)==1){
               arr[j++]=i;
            }
                
        }
        
        if(arr[0]>arr[1]){
            int temp=arr[0];
            arr[0]=arr[1];
            arr[1]=temp;
        }
        
        return arr;
    }
}

3、缺失的第一个正整数

描述：给定一个无重复元素的整数数组nums，请你找出其中没有出现的最小的正整数
空间复杂度 O(n)
时间复杂度 O(n)
思路分析：使用哈希表记录数组中出现的每个数字，从1开始找到哈希表中第一个没有出现的正整数
核心代码：

import java.util.*;
public class Solution {
    public int minNumberDisappeared (int[] nums) {
        HashMap<Integer, Integer> mp = new HashMap<Integer, Integer>(); 
        //哈希表记录数组中出现的每个数字
        for(int i = 0; i < nums.length; i++) 
            mp.put(nums[i], 1);
        int res = 1;
        //从1开始找到哈希表中第一个没有出现的正整数
        while(mp.containsKey(res)) 
            res++;
        return res;
    }
}

进阶：空间复杂度 O(1)，时间复杂度 O(n)
核心代码：

import java.util.*;
public class Solution {
    public int minNumberDisappeared (int[] nums) {
        // write code here
         //HashMap<Integer, Integer> mp = new HashMap<Integer, Integer>(); 
        //遍历数组
        Arrays.sort(nums);
        if(nums[0]>1){
            return 1;
        }
        for(int i = 0; i < nums.length-1; i++){ 
            
            if(nums[i+1]-nums[i]>1){
                if(nums[i]<0 && nums[i+1]>=2 ){
                    return 1;
                }
                 if(nums[i]>0 && nums[i+1]>0 ){
                    return nums[i]+1;
                }
            }
        }
        return nums[nums.length-1]+1;
    }
}

4、三数之和

描述：给出一个有n个元素的数组S，S中是否有元素a,b,c满足a+b+c=0？找出数组S中所有满足条件的三元组。注意：三元组（a、b、c）中的元素可以按任意顺序排列。解集中不能包含重复的三元组。
思路分析：三数之和和两数之和类似，先固定一个数，然后两次循环找和为相反数的两个数
时间复杂度：O(N^2),N为数组长度，两层循环开销
空间复杂度：O(1), 常数空间
核心代码

import java.util.*;
public class Solution {
   public static ArrayList<ArrayList<Integer>> threeSum(int[] nums) {
    int length = nums.length;
    Arrays.sort(nums);
    HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
    for ( int i=0; i<length; i++ ) {
        map.put(nums[i], i);
    }
    ArrayList<ArrayList<Integer>> AII = new ArrayList<ArrayList<Integer>>();
    for ( int a=0; a<length; a++ ) {
        if ( a==0 || nums[a-1]!=nums[a] ) {
            for ( int b=a+1; b<length; b++ ) {
                if ( b==a+1 || nums[b-1]!=nums[b] ) {
                    int c = -(nums[a]+nums[b]);
                    if ( map.containsKey(c) && map.get(c)>b) {
                        ArrayList<Integer> AI = new ArrayList<Integer>();
                        AI.add(nums[a]);AI.add(nums[b]);AI.add(c);
                        AII.add(AI);
                    }   
                }
            }
        }
    }
    return AII;
}
}

5、最长回文串

描述：给定一个包含大写字母和小写字母的字符串，找到通过这些字母构造成的最长的回文串。在构造过程中，请注意区分大小写。比如 "Aa" 不能当做一个回文字符串。注意: 假设字符串的长度不会超过 1010。
思路：用一个HashMap来统计字符出现的次数，然后遍历，判断如果是偶数就累加字母出现的次数，如果是奇数就减一让他变成偶数再累加，最后就得到答案res，但是还没大功告成，因为中点插进一个字母，他还是对称的，所以最后要判断一下累加的长度是否等于原来的字符串长度，再决定要不要再加上一个字母的长度。
代码：

public int longestPalindrome(String s) {
    HashMap<Character, Integer> map = new HashMap<>();
    //统计字符出现的次数
    for (char c : s.toCharArray()) {
        map.put(c, map.getOrDefault(c, 0) + 1);
    }
    int res = 0;
    for (Character key : map.keySet()) {
        Integer val = map.get(key);
        //如果是奇数次，减一成为偶数，再累加
        if (val % 2 != 0) {
            res += (val - 1);
        } else {
            //如果是偶数次，直接累加
            res += val;
        }
    }
    int length = s.length();
    return res == length ? length : (res + 1);
}

优化：使用数组来代替HashMap，以此提高代码的执行效率。我们只需要一个长度为128的数组来统计字符出现的次数，代替HashMap。其他逻辑不变即可。

public int longestPalindrome(String s) {
    int[] hash = new int[128];
    for (char c : s.toCharArray()) {
        hash[c - 'A']++;
    }
    int res = 0;
    for (int count : hash) {
        if (count % 2 != 0) {
            res += (count - 1);
        } else {
            res += count;
        }
    }
    int length = s.length();
    return res == length ? length : (res + 1);
}