自己实现的HashTable
自己实现的HashTable,主要是学习一下Hash的冲突检测以及冲突解决方式,没有详细的测试,希望大家多多指教。
import java.util.Locale;/** * Implement a hash table to store strings (i.e. objects of the Java String * class) * * @author LiYazhou * @version 1.0 */public class HashTable {/** * The initialize size of this hash table. */private final static int INITIAL_SIZE = 50;/** * The internal array to hold the elements */private String[] elements;/** * The real number of this hast table */private int capacity;/** * The no argument constructor to initialize the HashTable to size 101 */public HashTable() {this(INITIAL_SIZE);}/** * Creates a hash table to store n items, the size of the underlying array * should be a prime number, roughly 2 * n * * @param n */public HashTable(int n) {int size = 2 * n;elements = new String[getNearestPrime(size)];}/** * Inserts the parameter into the hash table, return true if the parameter * was inserted * * @param value * The value to be inserted. * @return Whether or not the value to be inserted. */public boolean insert(String value) {if (loadFactor() >= 0.75) {reHash();}return insert(elements, value);}/** * Inserts the parameter into the hash table, return true if the parameter * was inserted * * @param strArray * To insert the value to the strArray array * @param value * The value to be inserted. * @return Whether or not the value to be inserted. */private boolean insert(String[] strArray, String value) {int hash_key = hash(value, strArray.length);int increment = 1;int probe_counter = 0;int hash_mid = (strArray.length + 1) / 2;while (true) {// Check for overflowif (probe_counter > hash_mid) {return false;}String tmp = strArray[hash_key];// Empty slotif (tmp == null) {break;// Position already taken, duplicate keys are not allowed} else if (tmp.equals(value)) {return false;// Handles collision using h+i^2} else {hash_key = (hash_key + increment) % strArray.length;increment += 2;probe_counter++;}}strArray[hash_key] = value;capacity++;return true;}/** * return true if the given string is in the table, false otherwise * * @param str * The value to be checked in this hash table * @return Whether or not the value is in this hash table. */public boolean lookUp(String str) {int hash_key = hash(str, elements.length);int increment = 1;int probe_counter = 0;int hash_mid = (elements.length + 1) / 2;while (true) {// Overflow encountered if the probe is bigger than hash_size/2if (probe_counter > hash_mid) {return false;}String tmp = elements[hash_key];// Record empty for the keyif (tmp == null) {break;} else if (tmp.equals(str)) {return true;} else {hash_key = (hash_key + increment) % elements.length;increment += 2;probe_counter++;}}return false;}/** * Returns the size of the hash table (the size of the underlying array) * * @return The size of the hash table */public int length() {return elements.length;}/** * Returns the number of strings currently stored in the hash table * * @return The number of strings currently stored in the hash table */public int getNum() {return capacity;}/** * The current load factor of this hash table. * * @return The current load factor of this hash table. */private float loadFactor() {return (float) capacity / elements.length;}/** * This method is to get the next nearest prime number after size * * @param size * The base which you want to find * @return The next nearest prime number after size */private int getNearestPrime(int size) {boolean isPrime = checkPrime(size);while (!isPrime) {size++;isPrime = checkPrime(size);}return size;}/** * This is the hash function to return a hash code according to the key * * @param word * The key which you want to generate the hash code * @param size * The size of current array's size * @return The hash code for the give word */private int hash(String word, int size) {word = word.toLowerCase(Locale.ENGLISH);int hashValue = 0;for (int i = 0; i < word.length(); i++) {hashValue = ((hashValue * 32) + (word.charAt(i) - 'a')) % size;}if(hashValue < 0){hashValue = hashValue + size;}return hashValue;}/** * This method is to implement the rehash function when the load factor is * greater than the initial load factor. */private void reHash() {capacity = 0;String[] newArray = new String[getNearestPrime(2 * elements.length)];for (int i = 0; i < elements.length; i++) {if (elements[i] != null) {insert(newArray, elements[i]);}}elements = newArray;}/** * This method is to check whether this number is a prime value or not * * @param n * The value which you want to check whether it is a prime value * @return Whether the given value is a prime value. */private boolean checkPrime(int n) {if (n < 1) {return false;}if (n % 2 == 0) {return false;}for (int i = 3; i * i <= n; i += 2) {if (n % i == 0) {return false;}}return true;}} 1 楼 robertliudeqiang 2010-03-06 asialee 写道[size=medium]/**
* Inserts the parameter into the hash table, return true if the parameter
* was inserted
*
* @param value
* The value to be inserted.
* @return Whether or not the value to be inserted.
*/
public boolean insert(String value) {
if (loadFactor() >= 0.75) {
reHash();
}
return insert(elements, value);
}
如果是实现HashTable, ms方法应该是:
public boolean insert(String key, String value); 2 楼 asialee 2010-03-06 robertliudeqiang 写道asialee 写道[size=medium]
/**
* Inserts the parameter into the hash table, return true if the parameter
* was inserted
*
* @param value
* The value to be inserted.
* @return Whether or not the value to be inserted.
*/
public boolean insert(String value) {
if (loadFactor() >= 0.75) {
reHash();
}
return insert(elements, value);
}
如果是实现HashTable, ms方法应该是:
public boolean insert(String key, String value);
这个是这样的,JDK的HashTable是用的key的hash值查找的,它里面存储的是Entry,我是直接根据要插入的值生成Hash值的, 实际上查找的时候是直接查找值的。 3 楼 asialee 2010-03-06 另外,现在这个只是支持String,我主要是想看看冲突检测和解决的。 4 楼 captmjc 2010-03-06 1, 你说你这个只支出String的,Object[]或者泛型不就可以了
2, 这个貌似实现的是HashSet的类似功能,而不是HashTable的键值对 5 楼 robertliudeqiang 2010-03-06 楼上说的对,你这个实现的类似HashSet。在JDK里,HashSet是基于HashMap实现的。
对程序的个人观点:
1 你使用的是“开放寻址法”,个人觉得没有JDK里面HashMap用的bucket的结构有效率,主要原因是当loadFactor比较高的时候冲突会比较多。
2 private boolean checkPrime(int n)
可能有时候并不需要真的找一个素数,特别是当n很大的时候,只需要找一个可能的大素数就行了,应该有相应算法。 6 楼 asialee 2010-03-06 robertliudeqiang 写道楼上说的对,你这个实现的类似HashSet。在JDK里,HashSet是基于HashMap实现的。
对程序的个人观点:
1 你使用的是“开放寻址法”,个人觉得没有JDK里面HashMap用的bucket的结构有效率,主要原因是当loadFactor比较高的时候冲突会比较多。
2 private boolean checkPrime(int n)
可能有时候并不需要真的找一个素数,特别是当n很大的时候,只需要找一个可能的大素数就行了,应该有相应算法。
非常感谢,看来我真的有修改了,应该叫做HashSet,还有,我没有去看JDK里面的HashSet的实现,对它检测冲突的方法也不是很了解,建议给我扫盲一下。 7 楼 captmjc 2010-03-07 另外,对于checkPrime,建议lz google "prime sieve",并以此为基础,实现准备好前N个Prime (http://primes.utm.edu/lists/small/1000.txt,10000.txt ...)以加快速度 8 楼 captmjc 2010-03-07 asialee 写道robertliudeqiang 写道楼上说的对,你这个实现的类似HashSet。在JDK里,HashSet是基于HashMap实现的。
对程序的个人观点:
1 你使用的是“开放寻址法”,个人觉得没有JDK里面HashMap用的bucket的结构有效率,主要原因是当loadFactor比较高的时候冲突会比较多。
2 private boolean checkPrime(int n)
可能有时候并不需要真的找一个素数,特别是当n很大的时候,只需要找一个可能的大素数就行了,应该有相应算法。
非常感谢,看来我真的有修改了,应该叫做HashSet,还有,我没有去看JDK里面的HashSet的实现,对它检测冲突的方法也不是很了解,建议给我扫盲一下。
HashSet其实利用了HashMap 9 楼 asialee 2010-03-07 captmjc 写道另外,对于checkPrime,建议lz google "prime sieve",并以此为基础,实现准备好前N个Prime (http://primes.utm.edu/lists/small/1000.txt,10000.txt ...)以加快速度
非常感谢 10 楼 yegaofei 2010-04-01 借楼主的帖子问个问题:
当我们遍历Hashtable的key集合时,常用Enumeration的nextElement()方法,我看了下jdk的代码,nextElement()方法是以从后往前的顺序遍历的,而且注释写的是为了更快速的遍历。
我想问一下,为什么从后往前的速度会比较快?