Reading-notes

Hashtables

The basic idea of a hashtable is the ability to store the key into this data structure, and quickly retrieve the value. This is done through what we call a hash. A hash is the ability to encode the key that will eventually map to a specific location in the data structure that we can look at directly to retrieve the value

hash

Hashtables are a data structure that utilize key value pairs. This means every Node or Bucket has both a key, and a value.

example :

["Greenwood:98103", "Downtown:98101", "Alki Beach:98116", "Bainbridge Island:98110", ...]

Structure

1- Hashing (hash code turns a key into an integer)

heir output is determined only by their input. Hash codes should never have randomness to them. The same key should always produce the same hash code.

Collisions

A collision occurs when more than one key hashes to the same index in an array. As mentioned earlier, a “perfect hash” will never have any collisions. To put this into perspective, the worst possible hash is one that hashes every single key to the same exact index of an array. The more keys you have hashed to a specific index, the more key/value pair combos you can potentially have.

If two keys ever ultimately resolved to the same index, then two calls to .Add(key, val) with different keys would overwrite each other.

Collisions are solved by changing the initial state of the buckets. Instead of starting them all as null we can initialize a LinkedList in each one! Now if two keys resolve to the same index in the array then their key/value pairs can be stored as a node in a linked list. Each index in the array is called a “bucket” because it can store multiple key/value pairs.

If we didn’t store the key, the bucket would look like this. Accessing .get(“Pioneer Square”) or .get(“Alki Beach”) would hash the keys and still lead to bucket 92, but it would be impossible to tell which of the zip code values there to return.

Hash maps do this to store values:

Hash maps do this to read value:

Bucket Sizes

It’s possible to compute the “load factor” of a hash table. The load factor tells us something about how full the hash table is. A hash table can start with only a few buckets, calculate it’s own load factor, recognize when it gets too full and automatically grow and add more buckets to itself to accommodate more data.

Internal Methods

send the key to the GetHash method. Once you determine the index of where it should be placed, go to that index Check if something exists at that index already, if it doesn’t, add it with the key/value pair. If something does exist, add the new key/value pair to the data structure within that bucket.

more details