LeetCode 981 - Time Based Key-Value Store

The problem asks us to design a data structure that behaves like a key-value store, but with an important twist: the same key can have multiple values over time.

LeetCode Problem 981

Difficulty: 🟡 Medium
Topics: Hash Table, String, Binary Search, Design

Solution

Problem Understanding

The problem asks us to design a data structure that behaves like a key-value store, but with an important twist: the same key can have multiple values over time. Every value is associated with a timestamp, and later queries must return the most recent value that existed at or before a requested timestamp.

In a normal hash map, each key points to exactly one value. In this problem, a single key can accumulate many historical versions. For example, if we store:

  • "foo" -> "bar" at timestamp 1
  • "foo" -> "bar2" at timestamp 4

then querying "foo" at timestamp 5 should return "bar2" because it is the latest value that existed before or at time 5. Querying "foo" at timestamp 3 should return "bar" because "bar2" did not exist yet.

The input consists of two operations:

  • set(key, value, timestamp) stores a value for a key at a specific timestamp.
  • get(key, timestamp) retrieves the value associated with the greatest timestamp less than or equal to the requested timestamp.

The constraints are extremely important for determining the correct solution strategy. There can be up to 2 * 10^5 operations, so inefficient lookups will become too slow. The timestamps provided to set are guaranteed to be strictly increasing for each insertion globally, which gives us an important ordering property we can exploit.

The problem guarantees:

  • Keys and values are short strings.
  • Timestamps are positive integers.
  • Timestamps inserted through set are strictly increasing.

That final guarantee is the key insight because it means values for a key are naturally stored in sorted timestamp order without requiring explicit sorting.

Several edge cases are important:

  • Querying a key that was never inserted should return "".
  • Querying before the first timestamp for a key should return "".
  • Querying exactly at a stored timestamp should return the exact matching value.
  • Querying between two timestamps should return the closest earlier timestamp.
  • Multiple values for the same key must preserve historical ordering.

Approaches

Brute Force Approach

The most direct solution is to store every (timestamp, value) pair for each key in a list. During a get operation, we scan the entire list for that key and find the largest timestamp less than or equal to the requested timestamp.

This approach is correct because we explicitly examine every stored timestamp and choose the best valid candidate. However, it becomes inefficient when many values exist for the same key.

Suppose one key has n stored timestamps. A single get operation would require scanning all n entries. Since the problem allows up to 2 * 10^5 operations, repeated linear scans could degrade to quadratic behavior in the worst case.

Optimal Approach

The key observation is that timestamps are inserted in strictly increasing order. That means for every key, the timestamps are already sorted naturally as we append new entries.

Whenever data is sorted and we need to find the largest value less than or equal to a target, binary search becomes the ideal tool.

We store, for each key, a list of (timestamp, value) pairs in chronological order. During get, instead of scanning linearly, we binary search for the rightmost timestamp less than or equal to the requested timestamp.

This reduces lookup time dramatically from linear time to logarithmic time.

Approach Time Complexity Space Complexity Notes
Brute Force set: O(1), get: O(n) O(n) Linearly scans all timestamps for a key
Optimal set: O(1), get: O(log n) O(n) Uses binary search on sorted timestamps

Algorithm Walkthrough

  1. Initialize a hash map where each key maps to a list of (timestamp, value) pairs.

The hash map gives constant-time access to the history of any key. Each list stores the chronological history for that key. 2. During set(key, value, timestamp), append (timestamp, value) to the key's list.

Because timestamps are guaranteed to be strictly increasing, appending automatically keeps the list sorted. No extra sorting is needed. 3. During get(key, timestamp), first check whether the key exists.

If the key does not exist, return "" immediately because no historical values are available. 4. Retrieve the list of timestamp-value pairs for the key.

Since the timestamps are sorted, we can perform binary search efficiently. 5. Use binary search to locate the rightmost timestamp less than or equal to the requested timestamp.

The goal is not merely to find an exact match. We need the latest valid historical value. Binary search helps us efficiently narrow the search space. 6. If a valid timestamp is found, return its associated value.

Otherwise, return "".

Why it works

The correctness depends on one invariant: for every key, timestamps are stored in strictly increasing order. Because of this ordering, binary search can safely eliminate half the remaining search space at each step while preserving the guarantee that the rightmost valid timestamp will eventually be found.

The algorithm always returns the value associated with the largest timestamp less than or equal to the requested timestamp, exactly matching the problem definition.

Python Solution

from collections import defaultdict
from typing import List, Tuple

class TimeMap:

    def __init__(self):
        self.store: defaultdict[str, List[Tuple[int, str]]] = defaultdict(list)

    def set(self, key: str, value: str, timestamp: int) -> None:
        self.store[key].append((timestamp, value))

    def get(self, key: str, timestamp: int) -> str:
        if key not in self.store:
            return ""

        values = self.store[key]

        left = 0
        right = len(values) - 1

        answer = ""

        while left <= right:
            mid = (left + right) // 2

            current_timestamp, current_value = values[mid]

            if current_timestamp <= timestamp:
                answer = current_value
                left = mid + 1
            else:
                right = mid - 1

        return answer

# Your TimeMap object will be instantiated and called as such:
# obj = TimeMap()
# obj.set(key, value, timestamp)
# param_2 = obj.get(key, timestamp)

The implementation begins by creating a defaultdict where each key maps to a list of timestamp-value pairs. This structure allows efficient grouping of all historical values for a key.

The set method is extremely efficient because timestamps arrive in sorted order. We simply append the new pair to the list.

The get method first checks whether the key exists. If not, it returns an empty string immediately.

The binary search maintains two pointers, left and right. Whenever the midpoint timestamp is less than or equal to the requested timestamp, it becomes a valid candidate answer. However, we continue searching to the right because there may be a newer valid timestamp.

If the midpoint timestamp is too large, we discard the right half and continue searching left.

At the end of the search, answer contains the value associated with the largest valid timestamp.

Go Solution

package main

type Pair struct {
	timestamp int
	value     string
}

type TimeMap struct {
	store map[string][]Pair
}

func Constructor() TimeMap {
	return TimeMap{
		store: make(map[string][]Pair),
	}
}

func (this *TimeMap) Set(key string, value string, timestamp int) {
	this.store[key] = append(this.store[key], Pair{
		timestamp: timestamp,
		value:     value,
	})
}

func (this *TimeMap) Get(key string, timestamp int) string {
	values, exists := this.store[key]

	if !exists {
		return ""
	}

	left := 0
	right := len(values) - 1

	answer := ""

	for left <= right {
		mid := (left + right) / 2

		current := values[mid]

		if current.timestamp <= timestamp {
			answer = current.value
			left = mid + 1
		} else {
			right = mid - 1
		}
	}

	return answer
}

/**
 * Your TimeMap object will be instantiated and called as such:
 * obj := Constructor();
 * obj.Set(key,value,timestamp);
 * param_2 := obj.Get(timestamp);
 */

The Go implementation follows the same algorithmic structure as the Python version. One notable difference is that Go requires an explicit Pair struct to store timestamps and values together.

Go maps return two values during lookup, the value and a boolean indicating whether the key exists. This allows us to distinguish between a missing key and an empty slice.

Slices in Go naturally support append operations efficiently, making them a good fit for storing chronologically ordered timestamp-value pairs.

Worked Examples

Example 1

Input:

["TimeMap", "set", "get", "get", "set", "get", "get"]
[[], ["foo", "bar", 1], ["foo", 1], ["foo", 3], ["foo", "bar2", 4], ["foo", 4], ["foo", 5]]

Step-by-step execution

Operation Internal Store Result
set("foo", "bar", 1) {"foo": [(1, "bar")]} -
get("foo", 1) unchanged "bar"
get("foo", 3) unchanged "bar"
set("foo", "bar2", 4) {"foo": [(1, "bar"), (4, "bar2")]} -
get("foo", 4) unchanged "bar2"
get("foo", 5) unchanged "bar2"

Binary search trace for get("foo", 3)

Stored list:

[(1, "bar"), (4, "bar2")]

Search target: 3

Left Right Mid Timestamp at Mid Action
0 1 0 1 Valid candidate, move right
1 1 1 4 Too large, move left

Final answer: "bar"

Complexity Analysis

Measure Complexity Explanation
Time set: O(1), get: O(log n) Binary search is used during retrieval
Space O(n) All timestamp-value pairs must be stored

The set operation only appends to a list, making it constant time. The get operation performs binary search on the timestamp list for a key, which requires logarithmic time.

The space complexity is linear because every inserted value must remain stored to support historical queries.

Test Cases

# Provided example
tm = TimeMap()
tm.set("foo", "bar", 1)
assert tm.get("foo", 1) == "bar"          # exact timestamp
assert tm.get("foo", 3) == "bar"          # closest earlier timestamp

tm.set("foo", "bar2", 4)
assert tm.get("foo", 4) == "bar2"         # exact newer timestamp
assert tm.get("foo", 5) == "bar2"         # future timestamp

# Missing key
tm = TimeMap()
assert tm.get("missing", 10) == ""        # nonexistent key

# Query before first timestamp
tm = TimeMap()
tm.set("a", "x", 5)
assert tm.get("a", 1) == ""               # no earlier timestamp exists

# Multiple updates
tm = TimeMap()
tm.set("k", "v1", 1)
tm.set("k", "v2", 2)
tm.set("k", "v3", 3)

assert tm.get("k", 1) == "v1"             # first value
assert tm.get("k", 2) == "v2"             # middle value
assert tm.get("k", 3) == "v3"             # latest exact match
assert tm.get("k", 100) == "v3"           # latest historical value

# Different keys
tm = TimeMap()
tm.set("a", "apple", 1)
tm.set("b", "banana", 2)

assert tm.get("a", 2) == "apple"          # independent key lookup
assert tm.get("b", 2) == "banana"

# Large timestamp gap
tm = TimeMap()
tm.set("gap", "start", 1)
tm.set("gap", "end", 10000000)

assert tm.get("gap", 5000000) == "start"  # large search range
assert tm.get("gap", 10000000) == "end"
Test Why
Provided example Validates standard behavior
Missing key Ensures nonexistent keys return empty string
Query before first timestamp Verifies lower-bound edge handling
Multiple updates Tests binary search correctness
Different keys Confirms hash map separation
Large timestamp gap Ensures timestamps themselves do not affect correctness

Edge Cases

One important edge case occurs when querying a key that has never been inserted. A naive implementation might attempt to binary search an empty list or access missing data, causing runtime errors. The implementation avoids this by checking whether the key exists in the hash map before searching.

Another critical edge case is querying a timestamp earlier than the first stored timestamp. For example, if the earliest value for a key is stored at timestamp 5, then querying timestamp 2 should return an empty string. The binary search naturally handles this because no valid timestamp less than or equal to the target is ever found, so the answer variable remains empty.

A third subtle edge case involves querying between two stored timestamps. Suppose values exist at timestamps 1 and 10, and we query timestamp 7. The algorithm must return the value from timestamp 1, not fail because there is no exact match. This is why the binary search searches for the rightmost timestamp less than or equal to the target instead of only checking for equality.

Another potential source of bugs is maintaining sorted order. Many binary search solutions fail if the underlying data is not sorted. This problem guarantees strictly increasing timestamps for set operations, allowing append-only insertion while preserving sorted order automatically.