
Core Data Structures for Programming Interviews

July 20, 2025

Core Data Structures

Array

Definition: A data structure consisting of a collection of elements, each identified by at least one array index or key. Elements are stored in contiguous memory locations.

Time Complexity:

  • Access: O(1)
  • Search: O(N)
  • Insertion (at end): O(1) (amortized for dynamic arrays)
  • Insertion (in middle): O(N)
  • Deletion (from end): O(1)
  • Deletion (from middle): O(N)

Space Complexity: O(N) for N elements.

Python Example: This example demonstrates basic list creation, access, and modification.

# Python's list serves as a dynamic array
my_list = [10, 20, 30, 40, 50]

# Access element by index (O(1))
print(f"Element at index 2: {my_list[2]}")
# Expected Output: Element at index 2: 30

# Append an element (O(1) amortized)
my_list.append(60)
print(f"List after appending 60: {my_list}")
# Expected Output: List after appending 60: [10, 20, 30, 40, 50, 60]

# Remove an element (O(N))
my_list.remove(30)
print(f"List after removing 30: {my_list}")
# Expected Output: List after removing 30: [10, 20, 40, 50, 60]

Java Example: This example demonstrates basic array creation (using ArrayList), access, and modification.

import java.util.ArrayList;
import java.util.Arrays;

public class ArrayExample {
    public static void main(String[] args) {
        // Java's ArrayList serves as a dynamic array.
        // We initialize it with values from a static array.
        ArrayList<Integer> myList = new ArrayList<>(Arrays.asList(10, 20, 30, 40, 50));

        // Access element by index (O(1))
        System.out.println("Element at index 2: " + myList.get(2));
        // Expected Output: Element at index 2: 30

        // Append an element (O(1) amortized)
        myList.add(60);
        System.out.println("List after appending 60: " + myList);
        // Expected Output: List after appending 60: [10, 20, 30, 40, 50, 60]

        // Remove an element by value (O(N))
        // We use Integer.valueOf() to remove the object 30, not the element at index 30.
        myList.remove(Integer.valueOf(30));
        System.out.println("List after removing 30: " + myList);
        // Expected Output: List after removing 30: [10, 20, 40, 50, 60]
    }
}

String

Definition: A sequence of characters. In many languages, strings are immutable, meaning they cannot be changed after creation.

Time Complexity: Varies by language implementation. For immutable strings:

  • Access: O(1)
  • Search (substring): O(N⋅M) for a naive search, where N and M are the lengths of the string and the substring.
  • Append/Concatenate: O(N) because a new string must be created.

Space Complexity: O(N) for a string of length N.

Python Example: This example shows basic string operations like slicing and searching.

my_string = "hello world"

# Slicing to get a substring
substring = my_string[6:11]
print(f"Substring: {substring}")
# Expected Output: Substring: world

# Check for existence
if "world" in my_string:
    print("Found 'world'")
    # Expected Output: Found 'world'
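Because strings are immutable, concatenating in a loop creates a new string on every iteration, which is O(N²) overall. `str.join` builds the result in a single pass. A small sketch of the difference:

```python
words = ["data", "structures", "rock"]

# Repeated += creates a brand-new string each time (O(N^2) overall)
slow = ""
for w in words:
    slow += w

# str.join allocates the result once (O(N) overall)
fast = "".join(words)
print(fast)
# Expected Output: datastructuresrock
```

For building strings piece by piece, collecting parts in a list and joining at the end is the idiomatic Python pattern.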

Linked List

Definition: A linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next.

Time Complexity:

  • Access/Search: O(N)
  • Insertion/Deletion (at beginning): O(1)
  • Insertion/Deletion (at end): O(N) without a tail pointer, O(1) with a tail pointer.

Space Complexity: O(N) for N elements.

Python Example: This example defines a ListNode and shows how to traverse it.

class ListNode:
    def __init__(self, value=0, next=None):
        self.value = value
        self.next = next

# Create a linked list: 1 -> 2 -> 3
head = ListNode(1, ListNode(2, ListNode(3)))

# Traverse the linked list and print its contents
current = head
while current:
    print(current.value, end=" -> ")
    current = current.next
print("None")
# Expected Output: 1 -> 2 -> 3 -> None
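The O(1) insertion at the beginning follows directly from the structure: the new node's `next` pointer is set to the current head, and the new node becomes the head — no traversal required. A minimal sketch (the `ListNode` class is repeated so the snippet runs on its own):

```python
class ListNode:
    def __init__(self, value=0, next=None):
        self.value = value
        self.next = next

def push_front(head, value):
    # New node points at the old head; the new node becomes the head (O(1))
    return ListNode(value, head)

head = ListNode(2, ListNode(3))
head = push_front(head, 1)

# Collect values to verify the order
values = []
current = head
while current:
    values.append(current.value)
    current = current.next
print(values)
# Expected Output: [1, 2, 3]
```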

Hash Table (Hash Map / Set)

Definition: A data structure that maps keys to values using a hash function to compute an index into an array of buckets or slots.

Time Complexity:

  • Insert, Delete, Search: O(1) on average, O(N) in the worst case (due to hash collisions).

Space Complexity: O(N) to store N key-value pairs.

Python Example: This example uses a dictionary (dict) to count character frequencies.

# A dictionary is Python's implementation of a hash map
frequency_map = {}
text = "hello world"

for char in text:
    frequency_map[char] = frequency_map.get(char, 0) + 1

print(f"Character frequencies: {frequency_map}")
# Expected Output: Character frequencies: {'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1}

# A set is an implementation of a hash set for unique items
unique_chars = set(text)
print(f"Unique characters: {unique_chars}")
# Expected Output: Unique characters: {'r', 'w', 'd', 'o', 'l', ' ', 'h', 'e'} (order may vary)

Stack

Definition: A linear data structure that follows the Last-In, First-Out (LIFO) principle.

Time Complexity:

  • Push (insert), Pop (remove), Peek (view top): O(1)

Space Complexity: O(N) for N elements.

Python Example: This example uses a Python list as a stack.

stack = []
stack.append('A') # Push
stack.append('B')
stack.append('C')
print(f"Stack: {stack}")
# Expected Output: Stack: ['A', 'B', 'C']

top_item = stack.pop() # Pop
print(f"Popped item: {top_item}")
# Expected Output: Popped item: C
print(f"Stack after pop: {stack}")
# Expected Output: Stack after pop: ['A', 'B']

Queue

Definition: A linear data structure that follows the First-In, First-Out (FIFO) principle.

Time Complexity:

  • Enqueue (insert), Dequeue (remove), Peek (view front): O(1)

Space Complexity: O(N) for N elements.

Python Example: This example uses collections.deque for an efficient queue implementation.

from collections import deque

queue = deque()
queue.append('A') # Enqueue
queue.append('B')
queue.append('C')
print(f"Queue: {queue}")
# Expected Output: Queue: deque(['A', 'B', 'C'])

front_item = queue.popleft() # Dequeue
print(f"Dequeued item: {front_item}")
# Expected Output: Dequeued item: A
print(f"Queue after dequeue: {queue}")
# Expected Output: Queue after dequeue: deque(['B', 'C'])

Python Example (no imports): This example uses a list as a queue. Note that pop(0) is inefficient (O(N)).

queue = []
queue.append('A') # Enqueue
queue.append('B')
queue.append('C')
print(f"Queue: {queue}")
# Expected Output: Queue: ['A', 'B', 'C']

front_item = queue.pop(0) # Dequeue (inefficient)
print(f"Dequeued item: {front_item}")
# Expected Output: Dequeued item: A
print(f"Queue after dequeue: {queue}")
# Expected Output: Queue after dequeue: ['B', 'C']

Advanced Data Structures

Trees & Binary Search Trees

Definition: A hierarchical data structure with nodes connected by edges. A Binary Search Tree (BST) is a binary tree where each node's left subtree contains only nodes with values less than the node's value, and the right subtree contains only nodes with values greater than the node's value.

Time Complexity (BST):

  • Search, Insert, Delete: O(log N) average, O(N) worst case (for unbalanced trees)

Space Complexity: O(N) for N nodes.

Python Example: Basic BST implementation.

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def insert_bst(root, val):
    if not root:
        return TreeNode(val)
    
    if val < root.val:
        root.left = insert_bst(root.left, val)
    else:
        root.right = insert_bst(root.right, val)
    
    return root

# Create a BST: 5 -> 3,7 -> 1,4,6,8
root = TreeNode(5)
insert_bst(root, 3)
insert_bst(root, 7)
insert_bst(root, 1)
insert_bst(root, 4)
insert_bst(root, 6)
insert_bst(root, 8)
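Searching follows the same comparison logic as insertion, and an in-order traversal (left, node, right) visits a BST's values in sorted order — a handy way to verify a tree like the one built above. A sketch with the same tree shape, built directly for self-containment:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def search_bst(root, val):
    # Walk down: go left for smaller values, right for larger (O(log N) average)
    while root and root.val != val:
        root = root.left if val < root.val else root.right
    return root is not None

def inorder(root):
    # Left subtree, node, right subtree: yields BST values in sorted order
    return inorder(root.left) + [root.val] + inorder(root.right) if root else []

# Same tree as above: 5 at the root, 3 and 7 below, then 1, 4, 6, 8
root = TreeNode(5, TreeNode(3, TreeNode(1), TreeNode(4)),
                   TreeNode(7, TreeNode(6), TreeNode(8)))

print(search_bst(root, 4))   # Expected Output: True
print(search_bst(root, 9))   # Expected Output: False
print(inorder(root))         # Expected Output: [1, 3, 4, 5, 6, 7, 8]
```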

Heaps & Priority Queues

Definition: A complete binary tree that satisfies the heap property. In a min-heap, each parent node is smaller than or equal to its children.

Time Complexity:

  • Insert, Extract Min/Max: O(log N)
  • Get Min/Max: O(1)

Space Complexity: O(N) for N elements.

Python Example: Using Python's heapq module.

import heapq

# Min-heap (default)
min_heap = []
heapq.heappush(min_heap, 5)
heapq.heappush(min_heap, 3)
heapq.heappush(min_heap, 7)
heapq.heappush(min_heap, 1)

print(f"Min-heap: {min_heap}")
# Expected Output: Min-heap: [1, 3, 7, 5]

min_val = heapq.heappop(min_heap)
print(f"Popped minimum: {min_val}")
# Expected Output: Popped minimum: 1

# Max-heap (using negative values)
max_heap = []
for num in [5, 3, 7, 1]:
    heapq.heappush(max_heap, -num)

print(f"Max-heap (negative values): {max_heap}")
# Expected Output: Max-heap (negative values): [-7, -3, -5, -1]

max_val = -heapq.heappop(max_heap)
print(f"Popped maximum: {max_val}")
# Expected Output: Popped maximum: 7

Graphs

Definition: A collection of nodes (vertices) connected by edges. Can be directed or undirected, weighted or unweighted.

Time Complexity:

  • BFS/DFS: O(V+E)
  • Shortest path (Dijkstra): O((V + E) log V) with a binary heap

Space Complexity: O(V+E) for adjacency list representation.

Python Example: Graph representation and basic operations.

from collections import defaultdict

class Graph:
    def __init__(self):
        self.graph = defaultdict(list)
    
    def add_edge(self, u, v):
        self.graph[u].append(v)
    
    def dfs(self, start, visited=None):
        if visited is None:
            visited = set()
        
        visited.add(start)
        print(start, end=" ")
        
        for neighbor in self.graph[start]:
            if neighbor not in visited:
                self.dfs(neighbor, visited)

# Create a graph: 0 -> 1,2; 1 -> 2; 2 -> 0,3; 3 -> 3
g = Graph()
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 2)
g.add_edge(2, 0)
g.add_edge(2, 3)
g.add_edge(3, 3)

print("DFS starting from vertex 2:")
g.dfs(2)
# Expected Output:
# DFS starting from vertex 2:
# 2 0 1 3
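BFS is the other O(V+E) traversal: it uses a queue instead of recursion and visits vertices level by level rather than branch by branch. A sketch on the same edges as the DFS example above:

```python
from collections import deque, defaultdict

def bfs(graph, start):
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()          # FIFO: O(1) with deque
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

# Same edges as the DFS example above
graph = defaultdict(list)
for u, v in [(0, 1), (0, 2), (1, 2), (2, 0), (2, 3), (3, 3)]:
    graph[u].append(v)

print(bfs(graph, 2))
# Expected Output: [2, 0, 3, 1]
```

Note the difference from DFS: starting at 2, BFS exhausts both of 2's neighbors (0 and 3) before moving on to 1.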

Union-Find (Disjoint Set)

Definition: A data structure that tracks a set of elements partitioned into a number of disjoint (non-overlapping) subsets.

Time Complexity:

  • Union, Find: O(α(N)) amortized (where α is the inverse Ackermann function, practically constant)

Space Complexity: O(N) for N elements.

Python Example: Union-Find implementation with path compression and union by rank.

class UnionFind:
    def __init__(self, size):
        self.parent = list(range(size))
        self.rank = [0] * size
    
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]
    
    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
        
        if root_x == root_y:
            return
        
        # Union by rank
        if self.rank[root_x] < self.rank[root_y]:
            self.parent[root_x] = root_y
        elif self.rank[root_x] > self.rank[root_y]:
            self.parent[root_y] = root_x
        else:
            self.parent[root_y] = root_x
            self.rank[root_x] += 1
    
    def connected(self, x, y):
        return self.find(x) == self.find(y)

# Example usage
uf = UnionFind(5)
uf.union(0, 1)
uf.union(2, 3)
uf.union(1, 2)

print(f"Are 0 and 3 connected? {uf.connected(0, 3)}")
# Expected Output: Are 0 and 3 connected? True
print(f"Are 0 and 4 connected? {uf.connected(0, 4)}")
# Expected Output: Are 0 and 4 connected? False

Interview Preparation Strategy

Arrays & Strings (Sequences)

Concepts: indexing, slicing, mutability, two-pointer techniques, sliding window, in-place vs. extra space.

Why: Almost every problem can be reduced to manipulating a sequence in one way or another.

Key Problems:

  • Two Sum [#1] (array + hash-table)
  • Best Time to Buy and Sell Stock [#121] (array, one-pass)
  • Valid Anagram [#242] (string, hash-table)
  • Longest Substring Without Repeating Characters [#3] (string, sliding window)
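As an illustration of the sliding-window pattern, here is a sketch of Longest Substring Without Repeating Characters [#3]: the window grows on the right, and the left edge jumps forward whenever a repeat appears inside the window.

```python
def longest_unique_substring(s):
    last_seen = {}   # char -> most recent index where it appeared
    left = 0
    best = 0
    for right, char in enumerate(s):
        # If char repeats inside the current window, move left past its last position
        if char in last_seen and last_seen[char] >= left:
            left = last_seen[char] + 1
        last_seen[char] = right
        best = max(best, right - left + 1)
    return best

print(longest_unique_substring("abcabcbb"))
# Expected Output: 3
```

Each index enters and leaves the window at most once, so the whole scan is O(N).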

Hash Tables (Dictionaries & Sets)

Concepts: O(1) average lookup/insertion, handling collisions (conceptually), using defaultdict or Counter.

Why: They let you trade extra memory for constant-time lookup—crucial for frequency problems, "seen-before" checks, grouping.

Key Problems:

  • Contains Duplicate [#217]
  • Intersection of Two Arrays II [#350]
  • Group Anagrams [#49] (hash + string sort or char-count signature)
  • Top K Frequent Elements [#347]
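`Counter` wraps the frequency-map idiom shown earlier in one line, and its `most_common` method covers the simplest form of the Top K Frequent pattern:

```python
from collections import Counter

nums = [1, 1, 1, 2, 2, 3]
counts = Counter(nums)          # hash map from value to frequency
print(counts.most_common(2))    # the 2 most frequent (value, count) pairs
# Expected Output: [(1, 3), (2, 2)]
```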

Linked Lists

Concepts: Node vs. pointer, singly vs. doubly, cycle detection (Floyd's Tortoise & Hare), in-place reversal, merge two sorted lists.

Why: Exercises pointer manipulation, often hidden edge cases (null checks, off-by-one).

Key Problems:

  • Reverse Linked List [#206]
  • Merge Two Sorted Lists [#21]
  • Linked List Cycle [#141]
  • Remove Nth Node From End of List [#19]
  • Add Two Numbers [#2] (medium)

Stacks & Queues (Linear Data Structures)

Concepts: LIFO vs. FIFO, using Python's collections.deque, monotonic-stack pattern, implementing with arrays/linked lists.

Why: Parentheses matching, "next greater element," level-order tree traversal (BFS), sliding window (deque).

Key Problems:

  • Valid Parentheses [#20]
  • Min Stack [#155]
  • Decode String [#394] (stack + string manipulation)
  • Sliding Window Maximum [#239] (deque, hard-level pattern)
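The monotonic-stack pattern mentioned above, sketched on "next greater element": the stack holds indices of elements still waiting for a greater value, and each new element pops off everything smaller than itself.

```python
def next_greater(nums):
    result = [-1] * len(nums)
    stack = []  # indices of elements still waiting for a greater value
    for i, num in enumerate(nums):
        # Resolve every waiting index whose value is smaller than num
        while stack and nums[stack[-1]] < num:
            result[stack.pop()] = num
        stack.append(i)
    return result

print(next_greater([2, 1, 4, 3]))
# Expected Output: [4, 4, -1, -1]
```

Each index is pushed and popped at most once, so the pass is O(N) despite the nested loop.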

Trees & Tries

Concepts:

  • Binary Tree vs. BST: properties, recursion (preorder/inorder/postorder), level-order (BFS).
  • Trie (Prefix Tree): insertion/traversal, storing strings for efficient prefix lookups.

Why: Most "medium" questions revolve around tree recursion or BFS/DFS. Tries appear in word/auto-complete problems.

Key Problems:

  • Maximum Depth of Binary Tree [#104]
  • Validate Binary Search Tree [#98]
  • Binary Tree Level Order Traversal [#102]
  • Serialize and Deserialize Binary Tree [#297] (hard)
  • Implement Trie (Prefix Tree) [#208]

Heaps & Priority Queues

Concepts: Min-heap vs. max-heap, using Python's heapq (default is min-heap), "kth largest" pattern, merging k sorted lists.

Why: Efficiently track min/max of a stream, "top-k" problems, Dijkstra's algorithm.

Key Problems:

  • Merge k Sorted Lists [#23]
  • Find Median from Data Stream [#295]
  • Kth Largest Element in an Array [#215]
  • Sliding Window Median [#480] (hard)

Graphs (Adjacency List/Matrix)

Concepts: Directed vs. undirected, weighted vs. unweighted, BFS vs. DFS, connected components, cycle detection (directed/undirected), topological sort, Dijkstra's shortest path, Union-Find (Disjoint Set) for connectivity.

Why: Many "medium" and "hard" questions fall under shortest path, scheduling (DAG), or connectivity/union-find.

Key Problems:

  • Number of Islands [#200] (DFS/BFS grid)
  • Clone Graph [#133]
  • Course Schedule [#207] (DAG + topological sort)
  • Minimum Height Trees [#310] (tree + BFS)
  • Word Ladder [#127] (BFS + bidirectional BFS)

Union-Find (Disjoint Set)

Concepts: Path compression, union by rank, connectivity queries, cycle detection in undirected graphs.

Why: Faster than DFS/BFS for repeated connectivity checks, commonly used in Kruskal's MST or dynamic island-count problems.

Key Problems:

  • Number of Connected Components in an Undirected Graph [#323]
  • Redundant Connection [#684]
  • Number of Islands II [#305] (hard)

Advanced Structures (Optional but Nice to Know)

Segment Tree / Fenwick Tree (Binary Indexed Tree)

Concepts: Range sum/maximum queries with point updates.

Key Problems:

  • Range Sum Query – Mutable [#307]
  • Count of Smaller Numbers After Self [#315] (uses BIT or merge-sort)

LRU Cache (combines a hash map with a doubly linked list).

Key Problems:

  • LRU Cache [#146]
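A sketch of the hash + linked list combination: Python's `OrderedDict` provides both the hash lookup and the recency ordering, so it can stand in for the manual doubly linked list that an interview answer would usually spell out.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # preserves insertion/recency order

    def get(self, key):
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)  # mark as most recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used

cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))   # Expected Output: 1
cache.put(3, 3)       # evicts key 2, the least recently used
print(cache.get(2))   # Expected Output: -1
```

In an interview you would typically be asked to build the doubly linked list by hand; this version shows the trade-off the structure is making.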

Remember: The goal isn't just to memorize implementations, but to understand the underlying principles and trade-offs that make each data structure suitable for specific problems.

For a comprehensive guide to algorithms and algorithmic techniques, check out Core Algorithms for Programming Interviews.