# Hash Map

## Overview
Maps keys to values using a hash function. Python's dict is a production-ready hashmap.
| Operation | Average | Worst |
|---|---|---|
| Get | O(1) | O(n) |
| Set | O(1) | O(n) |
| Delete | O(1) | O(n) |
| Search | O(1) | O(n) |
Worst case happens only with extreme hash collisions — rare in practice.
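To see the worst case concretely, here is a hedged sketch: a key type with a deliberately constant `__hash__` (the `BadKey` class is hypothetical) forces every entry into the same bucket, so lookups degrade toward a linear scan while still returning correct results.

```python
# Every BadKey hashes to the same value, so dict operations on these
# keys degrade from O(1) average toward O(n) probing.
class BadKey:
    def __init__(self, n):
        self.n = n

    def __hash__(self):
        return 42  # constant hash: maximal collisions

    def __eq__(self, other):
        return isinstance(other, BadKey) and self.n == other.n

d = {BadKey(i): i for i in range(100)}
assert d[BadKey(7)] == 7  # still correct, just slower
```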
## Properties
- Fast Lookups — O(1) average for get/set/delete
- Unordered — (Python dicts are insertion-ordered since 3.7, but don't rely on sort order)
- No Range Queries — can't find "largest 10 keys" efficiently (use BST for that)
- Hashable Keys only — strings, ints, tuples ✅ | lists, dicts ❌
- Dynamic resizing — Python's dict grows its table once it passes a load-factor threshold (about two-thirds full in CPython)
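The hashability rule in the list above can be checked directly (a quick sketch):

```python
d = {}
d[(1, 2)] = "tuple ok"  # tuples of hashable items are valid keys
try:
    d[[1, 2]] = "list fails"  # lists are mutable, hence unhashable
except TypeError as exc:
    print(exc)  # unhashable type: 'list'
```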
## Python dict Usage
```python
# Basic operations
d = {}
d["key"] = "value"              # set
val = d.get("key", "fallback")  # get with a default value (no KeyError)
del d["key"]                    # delete
"key" in d                      # O(1) membership check
```
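A few more everyday operations worth knowing (a short sketch; the keys and values are arbitrary):

```python
d = {"a": 1, "b": 2}
d.setdefault("c", 0)        # insert only if the key is missing
removed = d.pop("b", None)  # delete without risking KeyError
inverted = {v: k for k, v in d.items()}  # comprehension: swap keys and values
```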
```python
# Frequency counter (the most common pattern)
from collections import Counter

count = Counter([1, 2, 2, 3, 3, 3])  # Counter({3: 3, 2: 2, 1: 1})
count.most_common(2)                 # [(3, 3), (2, 2)]
```
```python
# defaultdict avoids KeyError for missing keys
from collections import defaultdict

graph = defaultdict(list)
graph["a"].append("b")  # creates graph["a"] = [] on first access, then appends
```
```python
# Group-by pattern: anagrams share a sorted-letter key
from collections import defaultdict

words = ["eat", "tea", "tan", "ate", "nat"]  # example input
groups = defaultdict(list)
for word in words:
    key = "".join(sorted(word))  # anagram key
    groups[key].append(word)
# groups == {"aet": ["eat", "tea", "ate"], "ant": ["tan", "nat"]}
```
## Custom HashMap Implementation
```python
class HashMap:
    def __init__(self, size=16):
        self.size = size
        self.table = [None] * size  # each slot holds a (key, value) pair or None

    def _hash(self, key):
        # Toy hash: sum of character codes (real hash functions distribute better)
        return sum(ord(c) for c in str(key)) % self.size

    def set(self, key, value):
        idx = self._hash(key)
        # Linear probing for collision resolution; the probe count is bounded
        # so a full table raises instead of looping forever (no resizing here)
        for _ in range(self.size):
            if self.table[idx] is None or self.table[idx][0] == key:
                self.table[idx] = (key, value)
                return
            idx = (idx + 1) % self.size
        raise RuntimeError("HashMap is full")

    def get(self, key):
        idx = self._hash(key)
        for _ in range(self.size):
            if self.table[idx] is None:
                break  # empty slot: key was never inserted
            if self.table[idx][0] == key:
                return self.table[idx][1]
            idx = (idx + 1) % self.size
        raise KeyError(key)
```
## Collision Resolution
| Method | How | Pros | Cons |
|---|---|---|---|
| Linear Probing | Try next slot sequentially | Cache friendly | Clustering |
| Chaining | Each slot holds a linked list | Simple | Pointer overhead |
| Double Hashing | Second hash for step size | Fewer clusters | Complex |
Python uses a variant of open addressing with pseudo-random probing.
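For contrast with the linear-probing class above, chaining can be sketched as follows (a minimal illustration, not production code):

```python
class ChainedHashMap:
    """Each bucket holds a list of (key, value) pairs (separate chaining)."""

    def __init__(self, size=16):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing key
                return
        bucket.append((key, value))       # new key: extend the chain

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)
```

Collisions simply extend a bucket's list, so correctness never depends on finding an empty slot, at the cost of the pointer overhead noted in the table.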
## Interview Patterns
```python
# Two Sum — classic O(n) hashmap solution
def two_sum(nums, target):
    seen = {}  # value -> index, for values scanned so far
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i
    return []  # no pair sums to target
```
```python
# Sliding window + hashmap (last-seen index per character)
def length_of_longest_substring(s):
    char_idx = {}        # character -> index of its most recent occurrence
    left = max_len = 0   # current window is s[left:right + 1]
    for right, ch in enumerate(s):
        if ch in char_idx and char_idx[ch] >= left:
            left = char_idx[ch] + 1  # jump left edge past the duplicate
        char_idx[ch] = right
        max_len = max(max_len, right - left + 1)
    return max_len
```