LeetCode 635 - Design Log Storage System

Difficulty: 🟡 Medium
Topics: Hash Table, String, Design, Ordered Set

Solution

Problem Understanding

The problem is asking us to design a log storage system that supports two primary operations: storing logs with unique IDs and timestamps, and retrieving logs based on a timestamp range and granularity. Each timestamp is represented as a string in the format YYYY:MM:DD:HH:MM:SS, which is zero-padded and guarantees lexicographic ordering aligns with chronological order.

The put operation simply stores a log with its ID and timestamp. The retrieve operation is more nuanced: given a start timestamp, an end timestamp, and a granularity, we must return all log IDs whose timestamps fall within the inclusive range, considering only the parts of the timestamp up to the specified granularity. For example, if the granularity is Day, we ignore Hour, Minute, and Second when comparing timestamps.

The constraints ensure the number of logs and operations is modest (at most 500), so we do not need extreme optimization for massive datasets. Each ID is unique, the years are between 2000 and 2017, and the timestamps are well-formed. Edge cases include querying ranges that exactly match log timestamps, using the largest granularity Year, or the smallest granularity Second, and timestamps at the edges of the valid range.

Approaches

The brute-force approach stores logs in a simple list of (id, timestamp) pairs. For retrieval, we iterate over the entire list, slice the timestamps according to the granularity, and include IDs that fall within the range. This works correctly because we can directly compare timestamp strings lexicographically after slicing, but it is inefficient if the number of logs grows large because retrieval scans all entries.

The key insight for optimization is that timestamps are naturally lexicographically sortable in their string form, and the number of distinct granularity levels is fixed and small. We can therefore maintain a simple list and perform slicing on the timestamp for granularity comparisons. Due to the constraints, a more sophisticated structure like a tree or index is unnecessary. The optimization here is minimal but ensures correctness while leveraging the lexicographic property of timestamp strings.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n) per retrieve	O(n)	Store all logs in a list; compare each timestamp at query time
Optimal	O(n) per retrieve	O(n)	Same as brute force due to small constraints, but lexicographic string slicing ensures correctness and simplicity

Algorithm Walkthrough

Initialize the LogSystem object with an empty list to store logs.
In put(id, timestamp), append the tuple (id, timestamp) to the list.
In retrieve(start, end, granularity), map each granularity to its corresponding index in the timestamp string: Year → 4, Month → 7, Day → 10, Hour → 13, Minute → 16, Second → 19.
Slice the start and end timestamps according to the granularity index. This effectively ignores the less significant parts of the timestamp.
Iterate through all stored logs and slice their timestamps similarly. If the sliced log timestamp falls within the sliced start and end timestamps (inclusive), add the log ID to the result list.
Return the list of IDs that meet the criteria.

Why it works: The lexicographic property of zero-padded timestamps ensures that string comparisons correctly reflect chronological order. By slicing timestamps according to granularity, we effectively truncate less significant units and include logs in the requested range. This guarantees correctness without complex data structures.

Python Solution

from typing import List

class LogSystem:

    def __init__(self):
        self.logs = []
        self.granularity_index = {
            "Year": 4,
            "Month": 7,
            "Day": 10,
            "Hour": 13,
            "Minute": 16,
            "Second": 19
        }

    def put(self, id: int, timestamp: str) -> None:
        self.logs.append((id, timestamp))

    def retrieve(self, start: str, end: str, granularity: str) -> List[int]:
        idx = self.granularity_index[granularity]
        start_slice, end_slice = start[:idx], end[:idx]
        result = []
        for log_id, timestamp in self.logs:
            ts_slice = timestamp[:idx]
            if start_slice <= ts_slice <= end_slice:
                result.append(log_id)
        return result

The Python implementation uses a list to store logs and a dictionary to map granularities to the correct slicing indices. The retrieve method slices both the query and log timestamps and compares them lexicographically, appending IDs to the result list if they fall within the specified range.

Go Solution

type LogSystem struct {
	logs             [][2]string
	granularityIndex map[string]int
}

func Constructor() LogSystem {
	return LogSystem{
		logs: make([][2]string, 0),
		granularityIndex: map[string]int{
			"Year":  4,
			"Month": 7,
			"Day":   10,
			"Hour":  13,
			"Minute": 16,
			"Second": 19,
		},
	}
}

func (this *LogSystem) Put(id int, timestamp string) {
	this.logs = append(this.logs, [2]string{fmt.Sprint(id), timestamp})
}

func (this *LogSystem) Retrieve(start string, end string, granularity string) []int {
	idx := this.granularityIndex[granularity]
	startSlice := start[:idx]
	endSlice := end[:idx]
	result := []int{}
	for _, log := range this.logs {
		tsSlice := log[1][:idx]
		if startSlice <= tsSlice && tsSlice <= endSlice {
			id, _ := strconv.Atoi(log[0])
			result = append(result, id)
		}
	}
	return result
}

In Go, we store logs as a slice of [2]string pairs to hold ID and timestamp. We use string slicing for timestamp comparison, convert IDs back to integers when appending to the result slice, and maintain a granularity index for consistency.

Worked Examples

Example 1:

put(1, "2017:01:01:23:59:59") → logs = [(1, "2017:01:01:23:59:59")]
put(2, "2017:01:01:22:59:59") → logs = [(1, ...), (2, ...)]
put(3, "2016:01:01:00:00:00") → logs = [(1, ...), (2, ...), (3, ...)]
retrieve("2016:01:01:01:01:01", "2017:01:01:23:00:00", "Year") slices to "2016" and "2017", all logs fall within this, result = [3, 2, 1].
retrieve("2016:01:01:01:01:01", "2017:01:01:23:00:00", "Hour") slices to "2016:01:01:01" and "2017:01:01:23". Only logs 2 and 1 fall in this range, result = [2, 1].

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n) per retrieve	We iterate through all stored logs for each retrieve call
Space	O(n)	Store all logs in a list; result list may store all IDs in worst case

The complexity is acceptable because n ≤ 500. Slicing timestamps and comparing strings are O(1) operations due to fixed-length strings.

Test Cases

logSystem = LogSystem()
logSystem.put(1, "2017:01:01:23:59:59")  # store log 1
logSystem.put(2, "2017:01:01:22:59:59")  # store log 2
logSystem.put(3, "2016:01:01:00:00:00")  # store log 3

assert logSystem.retrieve("2016:01:01:01:01:01", "2017:01:01:23:00:00", "Year") == [3, 2, 1]  # Year granularity
assert logSystem.retrieve("2016:01:01:01:01:01", "2017:01:01:23:00:00", "Hour") == [2, 1]    # Hour granularity

# Boundary cases
logSystem.put(4, "2000:01:01:00:00:00")
assert logSystem.retrieve("2000:01:01:00:00:00", "2000:01:01:00:00:00", "Second") == [4]  # exact match

logSystem.put(5, "2017:12:31:23:59:59")
assert logSystem.retrieve("2017:01:01:00:00:00", "2017:12:31:23:59:59", "Day") == [1, 2, 5]  # range by day

# No logs match
assert logSystem.retrieve("2015