LeetCode 2273 - Find Resultant Array After Removing Anagrams

This problem asks us to repeatedly remove words from an array when two adjacent words are anagrams of each other. We are given a 0-indexed string array words, where every string contains only lowercase English letters.

LeetCode Problem 2273

Difficulty: 🟢 Easy
Topics: Array, Hash Table, String, Sorting

Solution

Problem Understanding

This problem asks us to repeatedly remove words from an array when two adjacent words are anagrams of each other.

We are given a 0-indexed string array words, where every string contains only lowercase English letters. During each operation, we may choose an index i such that the word at words[i] is an anagram of the word immediately before it, words[i - 1]. When this condition is true, we remove words[i] from the array.

The process continues until no adjacent pair of words forms an anagram relationship.

The important detail is that the problem guarantees that the final result is independent of the order of deletions. This means we do not need to simulate every possible deletion sequence. Instead, we only need to determine which words survive in the final array.

In simpler terms, we want to scan through the words and keep a word only if it is not an anagram of the most recently kept word.

The input consists of:

  • words, a list of lowercase strings.

The expected output is:

  • A new list containing the words that remain after repeatedly removing adjacent anagrams.

The constraints are very small:

  • 1 <= words.length <= 100
  • 1 <= words[i].length <= 10

Since both the number of words and word length are small, even somewhat inefficient approaches would work. However, we should still aim for a clean and efficient solution.

Several edge cases are important to think about upfront. Consecutive chains of anagrams can be tricky because deleting one word changes adjacency. For example, ["abba","baba","bbaa"] should reduce to only ["abba"], since all words are anagrams of each other. Another case is when no removals happen at all, such as ["a","b","c"]. We also need to correctly handle repeated identical words because identical strings are also anagrams, for example ["cd","cd"].

Approaches

Brute Force Approach

A straightforward solution is to repeatedly simulate the deletion process.

We could repeatedly scan the array and look for adjacent anagrams. Whenever we find one, we delete the second word and restart the scan. To determine whether two words are anagrams, we sort their characters and compare the sorted versions.

This approach is correct because it exactly follows the problem statement, repeatedly removing adjacent anagrams until no valid move remains.

However, it is inefficient because every deletion shifts array elements, and repeated rescanning increases the total work. In the worst case, we may repeatedly traverse nearly the entire list after each deletion.

Key Insight for an Optimal Solution

The key observation is that we only care about the most recently kept word.

Suppose we already built the resultant array. When processing a new word:

  • If it is an anagram of the most recently kept word, it would eventually be deleted.
  • Otherwise, it survives and becomes part of the final result.

Because the final answer is independent of deletion order, we do not need to simulate removals. Instead, we can greedily build the answer from left to right.

To efficiently compare words, we compute a canonical representation for each word. The simplest choice is the sorted version of the string. Two words are anagrams if and only if their sorted forms are identical.

Approach Time Complexity Space Complexity Notes
Brute Force O(n² · k log k) O(n) Repeatedly removes adjacent anagrams and rescans
Optimal O(n · k log k) O(n) Single pass, compare only with previously kept word

Here:

  • n is the number of words.
  • k is the maximum word length.

Since k <= 10, sorting each word is extremely cheap.

Algorithm Walkthrough

  1. Create an empty result list to store the words that survive.
  2. Iterate through each word in words from left to right.
  3. For the current word, compute its sorted representation. This acts as its anagram signature. For example:
  • "abba" becomes "aabb"
  • "baba" becomes "aabb"
  1. If the result list is empty, add the current word because there is no previous word to compare against.
  2. Otherwise, compare the current word's signature with the signature of the most recently kept word.
  • If they match, the current word is an anagram of the previous kept word, so skip it.
  • If they differ, append the word to the result list.
  1. Continue until all words have been processed.
  2. Return the result list.

Why it works

The core invariant is that the result list always contains the correct final words for the portion of the array processed so far.

Whenever a word is an anagram of the most recently kept word, it would eventually be removed by the problem's operation rule, so skipping it is safe. If it is not an anagram, it cannot be removed by any earlier word, so it must remain. Since the final outcome is independent of deletion order, greedily building the result produces the same final array.

Python Solution

from typing import List

class Solution:
    def removeAnagrams(self, words: List[str]) -> List[str]:
        def get_signature(word: str) -> str:
            return "".join(sorted(word))

        result = []
        previous_signature = ""

        for word in words:
            current_signature = get_signature(word)

            if current_signature != previous_signature:
                result.append(word)
                previous_signature = current_signature

        return result

The implementation begins by defining a helper function get_signature, which converts a word into its sorted character form. Since anagrams share the same sorted representation, this gives us an easy comparison mechanism.

We then initialize an empty result list and a previous_signature variable. This variable stores the signature of the most recently kept word.

As we iterate through each word, we compute its signature. If it differs from the previous kept signature, the word survives and is appended to the result. We also update previous_signature.

If the signature matches, the word is an anagram of the previous kept word and would eventually be removed, so we skip it.

The implementation follows the exact greedy logic described in the algorithm walkthrough and processes the input in a single pass.

Go Solution

import (
	"sort"
)

func removeAnagrams(words []string) []string {
	getSignature := func(word string) string {
		chars := []rune(word)

		sort.Slice(chars, func(i, j int) bool {
			return chars[i] < chars[j]
		})

		return string(chars)
	}

	result := []string{}
	prevSignature := ""

	for _, word := range words {
		currentSignature := getSignature(word)

		if currentSignature != prevSignature {
			result = append(result, word)
			prevSignature = currentSignature
		}
	}

	return result
}

The Go implementation follows the same logic as the Python version. Since Go does not have a built-in string sorting function, we convert the word into a rune slice and sort it using sort.Slice.

The result is stored in a slice, and prevSignature keeps track of the last surviving word's anagram signature.

Because the input contains only lowercase English letters and very short strings, there are no concerns about integer overflow or memory pressure. Empty and nil slices are naturally handled by Go, and appending works correctly even when the result starts empty.

Worked Examples

Example 1

Input:

["abba","baba","bbaa","cd","cd"]

We process each word one at a time.

Step Current Word Signature Previous Signature Action Result
1 "abba" "aabb" "" Keep ["abba"]
2 "baba" "aabb" "aabb" Remove ["abba"]
3 "bbaa" "aabb" "aabb" Remove ["abba"]
4 "cd" "cd" "aabb" Keep ["abba", "cd"]
5 "cd" "cd" "cd" Remove ["abba", "cd"]

Final answer:

["abba","cd"]

Example 2

Input:

["a","b","c","d","e"]
Step Current Word Signature Previous Signature Action Result
1 "a" "a" "" Keep ["a"]
2 "b" "b" "a" Keep ["a", "b"]
3 "c" "c" "b" Keep ["a", "b", "c"]
4 "d" "d" "c" Keep ["a", "b", "c", "d"]
5 "e" "e" "d" Keep ["a", "b", "c", "d", "e"]

No adjacent anagrams exist, so every word remains.

Final answer:

["a","b","c","d","e"]

Complexity Analysis

Measure Complexity Explanation
Time O(n · k log k) Each word is sorted once
Space O(n) Result list plus temporary sorted strings

The algorithm processes every word exactly once. For each word, we sort its characters, which costs O(k log k) where k is the word length. Since there are n words, the total time complexity becomes O(n · k log k).

The space complexity is O(n) because the result list may contain every word in the worst case. Additional temporary storage for sorting is proportional to the word length.

Test Cases

class Solution:
    def removeAnagrams(self, words):
        def get_signature(word):
            return "".join(sorted(word))

        result = []
        previous_signature = ""

        for word in words:
            current_signature = get_signature(word)

            if current_signature != previous_signature:
                result.append(word)
                previous_signature = current_signature

        return result

solution = Solution()

assert solution.removeAnagrams(
    ["abba", "baba", "bbaa", "cd", "cd"]
) == ["abba", "cd"]  # Provided example with multiple removals

assert solution.removeAnagrams(
    ["a", "b", "c", "d", "e"]
) == ["a", "b", "c", "d", "e"]  # No removals occur

assert solution.removeAnagrams(
    ["abba"]
) == ["abba"]  # Single word input

assert solution.removeAnagrams(
    ["ab", "ba", "ab", "ba"]
) == ["ab"]  # Entire chain collapses

assert solution.removeAnagrams(
    ["abc", "def", "fed", "ghi"]
) == ["abc", "def", "ghi"]  # One adjacent anagram pair

assert solution.removeAnagrams(
    ["aa", "aa", "aa"]
) == ["aa"]  # Identical words are anagrams

assert solution.removeAnagrams(
    ["abc", "xyz", "zyx", "abc"]
) == ["abc", "xyz", "abc"]  # Later non-adjacent anagram survives

assert solution.removeAnagrams(
    ["ab", "cd", "ef"]
) == ["ab", "cd", "ef"]  # Completely distinct words
Test Why
["abba","baba","bbaa","cd","cd"] Validates repeated removals and chaining
["a","b","c","d","e"] Ensures no removals happen
["abba"] Tests minimum input size
["ab","ba","ab","ba"] Verifies consecutive anagram chain handling
["abc","def","fed","ghi"] Tests a single removable pair
["aa","aa","aa"] Confirms identical strings count as anagrams
["abc","xyz","zyx","abc"] Ensures only adjacent anagrams matter
["ab","cd","ef"] Tests fully distinct inputs

Edge Cases

Single Word Input

When the input contains only one word, there are no adjacent elements to compare. A careless implementation might incorrectly assume at least two words exist and attempt an invalid comparison.

For example:

["hello"]

The implementation handles this naturally because the first word is always added to the result list.

Long Chain of Consecutive Anagrams

A sequence where many consecutive words are anagrams can be tricky because repeated deletions change adjacency.

For example:

["ab", "ba", "ab", "ba"]

A naive simulation may become complicated due to repeated list modifications. The greedy approach handles this cleanly by keeping only the first word and skipping every subsequent word whose signature matches the previous kept word.

Non-Adjacent Anagrams

The problem only removes adjacent anagrams, not all anagrams in the array.

For example:

["abc", "xyz", "zyx", "abc"]

Even though the first and last words are anagrams, they are not adjacent. After removing "zyx", the last "abc" is compared only with "xyz" and survives. The implementation correctly compares only against the most recently kept word.