Introduction
You already know the magic behind hash maps. Now it’s time to write your own implementation!
Limitation
Before we get started, we need to lay down some ground rules. Ruby’s dynamic nature of arrays allows us to insert and retrieve indexes that are outside our array size range. Example: if we create an array of size 16 to represent our buckets, nothing stops us from storing items at index 500. This defeats the purpose of limiting storage size in hash maps, so we need to enforce some restrictions.
Use the following snippet whenever you access a bucket through an index. We want to raise an error if we try to access an out-of-bounds index:
raise IndexError if index.negative? || index >= @buckets.length
Assignment
Start by creating a HashMap class. It should have at least two variables for load factor and capacity. For a load factor of 0.75 you should have an initial capacity of size 16. Then proceed to create the following methods:
-
#hash(key)takes a key and produces a hash code with it. We already implemented a fairly goodhashfunction in the previous lesson. As a reminder:def hash(key) hash_code = 0 prime_number = 31 key.each_char { |char| hash_code = prime_number * hash_code + char.ord } hash_code endYou are free to use that, or you can conduct your own research on hashing algorithms. Beware, this is a deep, deep rabbit hole.
You might find yourself confusing keys with hash codes while accessing key-value pairs later. We would like to stress that the key is what your
hashfunction will take as an input. In a way, we could say that the key is important for us only inside thehashfunction, as we never access a bucket directly with the key. Instead, we always do so with the hash code.Limiting key types to strings
In the real world, hash maps can accommodate various data types as keys, including numbers, strings, or objects. However, for this project, we will only handle keys of type
string. -
#set(key, value)takes two arguments, the first is a key and the second is a value that is assigned to this key. If a key already exists, then the old value is overwritten or we can say that we update the key’s value (e.g.Carlosis our key but it is called twice: once with valueI am the old value., and once with valueI am the new value.. From the logic stated above,Carlosshould contain only the latter value).In the meantime, a collision is when TWO DIFFERENT keys sit inside the same bucket, because they generate the same hash code (e.g.
RamaandSitaare both hashed to3, so3becomes a location forRamaANDSita. However, we know that it is the collision. It means we should find a way how to resolve it — how to deal with collisions, which was mentioned in the previous lesson).- Remember to grow your buckets size when it needs to, by calculating if your bucket has reached the
load factor. Some of the methods in this assignment that are mentioned later could be reused to help you handle that growth logic more easily. So you may want to hold onto implementing your growing functionality just for now. However, the reason why we mention it with#setis because it’s important to grow buckets exactly when they are being expanded.
- Remember to grow your buckets size when it needs to, by calculating if your bucket has reached the
-
#get(key)takes one argument as a key and returns the value that is assigned to this key. If key is not found, returnnil. -
#has?(key)takes a key as an argument and returnstrueorfalsebased on whether or not the key is in the hash map. -
#remove(key)takes a key as an argument. If the given key is in the hash map, it should remove the entry with that key and return the deleted entry’s value. If the key isn’t in the hash map, it should returnnil. -
#lengthreturns the number of stored keys in the hash map. -
#clearremoves all entries in the hash map. -
#keysreturns an array containing all the keys inside the hash map. -
#valuesreturns an array containing all the values. -
#entriesreturns an array that contains eachkey, valuepair. Example:[[first_key, first_value], [second_key, second_value]]
Remember that our hash map does not preserve insertion order when you are retrieving your hash map’s data. It is normal and expected for keys and values to appear out of the order you inserted them in.
Test Your Hash Map
-
Create a new Ruby file.
-
Create a new instance of your hash map and set the load factor to be
0.75.test = HashMap.new -
Populate your hash map using the
#set(key, value)method by copying the following:test.set('apple', 'red') test.set('banana', 'yellow') test.set('carrot', 'orange') test.set('dog', 'brown') test.set('elephant', 'gray') test.set('frog', 'green') test.set('grape', 'purple') test.set('hat', 'black') test.set('ice cream', 'white') test.set('jacket', 'blue') test.set('kite', 'pink') test.set('lion', 'golden') -
After populating your hash map with the data above, your hash map’s current load levels should now be at
0.75(full capacity). -
Now with a full hash map, try overwriting a few nodes using
#set(key, value). This should only overwrite the existingvaluesof your nodes and not add new ones, so#lengthshould still return the same value andcapacityshould remain the same. -
After that, populate your hash map with the last node below. This will make your load levels exceed your
load factor, triggering your hash map’s growth functionality and doubling itscapacity:test.set('moon', 'silver') -
If you have implemented your hash map correctly, the load levels of your expanded hash map should drop well below your load factor, and the entries should be spread evenly among the expanded buckets.
-
With your new hash map, try overwriting a few nodes using
#set(key, value). Again, this should only overwrite existingvaluesof your nodes. -
Test the other methods of your hash map, such as
#get(key),#has?(key),#remove(key),#length,#clear,#keys,#values, and#entries, to check if they are still working as expected after expanding your hash map.
Extra Credit
- Create a class
HashSetthat behaves the same as aHashMapbut only containskeyswith novalues.