Back to Notes

Hash Map

Overview

Maps keys to values using a hash function. Python's dict is a production-ready hashmap.

OperationAverageWorst
GetO(1)O(n)
SetO(1)O(n)
DeleteO(1)O(n)
SearchO(1)O(n)

Worst case happens only with extreme hash collisions — rare in practice.


Properties

  • Fast Lookups — O(1) average for get/set/delete
  • Unordered — (Python dicts are insertion-ordered since 3.7, but don't rely on sort order)
  • No Range Queries — can't find "largest 10 keys" efficiently (use BST for that)
  • Hashable Keys only — strings, ints, tuples ✅ | lists, dicts ❌
  • Dynamic resizing — Python dict doubles in size when load factor exceeded

Python dict Usage

# Basic operations
d = {}
d["key"] = "value"          # set
val = d.get("key", default) # get with default (no KeyError)
del d["key"]                # delete
"key" in d                  # O(1) membership check

# Frequency counter (most common pattern)
from collections import Counter
count = Counter([1, 2, 2, 3, 3, 3])  # {3:3, 2:2, 1:1}
count.most_common(2)                  # [(3,3), (2,2)]

# Default dict (avoids KeyError for missing keys)
from collections import defaultdict
graph = defaultdict(list)
graph["a"].append("b")      # no KeyError even if "a" not in graph

# Group by pattern
from collections import defaultdict
groups = defaultdict(list)
for word in words:
    key = "".join(sorted(word))   # anagram key
    groups[key].append(word)

Custom HashMap Implementation

class HashMap:
    def __init__(self, size=16):
        self.size = size
        self.table = [None] * size

    def _hash(self, key):
        return sum(ord(c) for c in str(key)) % self.size

    def set(self, key, value):
        idx = self._hash(key)
        # Linear probing for collision resolution
        while self.table[idx] is not None and self.table[idx][0] != key:
            idx = (idx + 1) % self.size
        self.table[idx] = (key, value)

    def get(self, key):
        idx = self._hash(key)
        while self.table[idx] is not None:
            if self.table[idx][0] == key:
                return self.table[idx][1]
            idx = (idx + 1) % self.size
        raise KeyError(key)

Collision Resolution

MethodHowProsCons
Linear ProbingTry next slot sequentiallyCache friendlyClustering
ChainingEach slot holds a linked listSimplePointer overhead
Double HashingSecond hash for step sizeFewer clustersComplex

Python uses a variant of open addressing with pseudo-random probing.


Interview Patterns

# Two Sum — classic O(n) hashmap solution
def two_sum(nums, target):
    seen = {}
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i

# Sliding window + hashmap (character frequency)
from collections import defaultdict
def length_of_longest_substring(s):
    char_idx = {}
    left = max_len = 0
    for right, ch in enumerate(s):
        if ch in char_idx and char_idx[ch] >= left:
            left = char_idx[ch] + 1
        char_idx[ch] = right
        max_len = max(max_len, right - left + 1)
    return max_len