LeetCode 831 - Masking Personal Information

LeetCode Problem 831

Difficulty: 🟡 Medium
Topics: String

Solution

Problem Understanding

The task is to mask a personal information string s, which is either an email address or a phone number. The goal is to hide the sensitive characters while keeping just enough of the original to recognize which account it belongs to.

For emails, we need to lowercase all characters and replace the middle letters of the name (everything except the first and last character) with exactly 5 asterisks "*****". The domain stays in lowercase without modification. Even if the name is very short (like two characters), the middle must still be replaced by 5 asterisks.
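The email rule can be sketched in a few lines. This is a minimal illustration that assumes a valid address with exactly one '@'; the helper name mask_email is hypothetical and is not the signature LeetCode expects.

```python
def mask_email(s: str) -> str:
    """Mask a valid email address (hypothetical helper for illustration)."""
    # Lowercase everything first, then split once at the '@' separator.
    name, _, domain = s.lower().partition("@")
    # The middle is always exactly five asterisks, whatever the name length.
    return f"{name[0]}*****{name[-1]}@{domain}"


print(mask_email("AB@qq.com"))  # a*****b@qq.com
```

Because the asterisk count is fixed, a two-character name and a twenty-character name produce masked names of the same length, so the output does not reveal how long the original name was.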

For phone numbers, we first remove all non-digit characters. Then we separate the local number (the last 10 digits) from the country code (the remaining leading digits, up to 3). The local number is always masked as "***-***-XXXX", where XXXX is its last four digits. If a country code is present, the result is prefixed with "+", one asterisk per country code digit, and a "-"; a two-digit country code, for example, yields "+**-***-***-XXXX".
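To make the split between country code and local number concrete, here is a small sketch. It assumes a valid phone string with 10 to 13 digits, and the helper name split_phone is hypothetical.

```python
from typing import Tuple


def split_phone(s: str) -> Tuple[str, str]:
    """Return (country_code, local_number) for a valid phone string."""
    # Keep only the digits; separators such as '+', '(', ')', '-' and ' ' are dropped.
    digits = "".join(c for c in s if c.isdigit())
    # Everything before the last 10 digits is the country code (possibly empty).
    return digits[:-10], digits[-10:]


country, local = split_phone("+1(234)567-8901")
print(repr(country), local)  # '1' 2345678901
```

With exactly 10 digits the first slice is empty, which is how the "no country code" case falls out without a separate branch.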

Constraints guarantee that the input is valid, meaning we do not need to validate formats. The input length is limited to 40 characters for emails and 20 characters for phone numbers, so efficiency is not a major concern. Edge cases often arise from very short email names, minimal phone numbers (10 digits), or maximum country codes (3 digits).

Approaches

The brute-force approach would be to manually check each character and mask them according to the rules. For emails, we could iterate through the string until the '@' symbol, collect the first and last characters, and replace the middle with 5 asterisks. For phone numbers, we could iterate through the string, remove non-digit characters, determine country code length, and then build the final string. While this works correctly, it involves multiple iterations and unnecessary string manipulations, which can be simplified.
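For comparison, the character-by-character idea described above might look roughly like the following. This is only one way to write it, assuming valid input; the function name mask_pii_brute_force is made up for this sketch.

```python
def mask_pii_brute_force(s: str) -> str:
    """Character-by-character masking; assumes s is a valid email or phone."""
    # Pass 1: look for '@' to decide which kind of input this is.
    at = -1
    for i, ch in enumerate(s):
        if ch == "@":
            at = i
            break

    out = []
    if at != -1:
        # Email: first char of the name, five asterisks, last char of the name.
        out.append(s[0].lower())
        out.append("*****")
        out.append(s[at - 1].lower())
        # Copy '@' and the domain, lowercasing one character at a time.
        for ch in s[at:]:
            out.append(ch.lower())
        return "".join(out)

    # Phone: count the digits, then collect the last four in a second pass.
    total = 0
    for ch in s:
        if ch.isdigit():
            total += 1
    last_four = []
    seen = 0
    for ch in s:
        if ch.isdigit():
            seen += 1
            if seen > total - 4:
                last_four.append(ch)

    # Any digits beyond the 10 local ones belong to the country code.
    extra = total - 10
    if extra > 0:
        out.append("+")
        out.extend("*" for _ in range(extra))
        out.append("-")
    out.append("***-***-")
    out.extend(last_four)
    return "".join(out)
```

The result is the same as with the slicing-based version, but the explicit counters and repeated passes are what the built-in string operations replace.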

The optimal approach leverages direct string operations and slicing. For emails, we can split by '@', lowercase both parts, and concatenate first character + "*****" + last character with the domain. For phone numbers, we can use a list comprehension to filter digits, compute the country code length, and construct the masked string using formatted string interpolation. This reduces unnecessary loops and intermediate structures.

| Approach | Time Complexity | Space Complexity | Notes |
| --- | --- | --- | --- |
| Brute Force | O(n) | O(n) | Iterates character by character, constructs string manually |
| Optimal | O(n) | O(n) | Uses split, lowercasing, slicing, and string formatting for concise masking |

Algorithm Walkthrough

  1. Check the input type: if s contains '@', it is an email; otherwise, it is a phone number.

  2. Email masking:

     - Split the string into name and domain using the '@' separator.

     - Convert both name and domain to lowercase.

     - Construct the masked name as the first character, 5 asterisks, and the last character.

     - Concatenate the masked name with '@' and the domain.

  3. Phone number masking:

     - Remove all non-digit characters from s.

     - Identify the local number as the last 10 digits.

     - Identify the country code as all remaining leading digits.

     - Construct the country code prefix as "+" followed by one asterisk per country code digit and a "-", or as an empty string if there is no country code.

     - Concatenate the country code prefix, "***-***-", and the last 4 digits of the local number.

Why it works: The algorithm correctly separates emails from phone numbers, applies the respective masking rules precisely, and guarantees that sensitive parts are obscured. String operations like slicing and formatting ensure that the output matches the exact required pattern.
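The required output patterns can also be written down as regular expressions, which gives a quick sanity check for any implementation. The sketch below is an illustration only: the names MASKED_EMAIL, MASKED_PHONE, and looks_masked are hypothetical, and the email pattern assumes a domain made of letters with a single '.' in the middle.

```python
import re

# Assumed shape: one letter, five asterisks, one letter, '@', letters '.' letters.
MASKED_EMAIL = re.compile(r"[a-z]\*{5}[a-z]@[a-z]+\.[a-z]+")
# Optional "+", 1 to 3 asterisks and "-", then "***-***-" and four digits.
MASKED_PHONE = re.compile(r"(\+\*{1,3}-)?\*{3}-\*{3}-\d{4}")


def looks_masked(result: str) -> bool:
    """Return True if result has the shape of a masked email or phone number."""
    return bool(MASKED_EMAIL.fullmatch(result) or MASKED_PHONE.fullmatch(result))


print(looks_masked("+**-***-***-9012"))  # True
print(looks_masked("--7890"))            # False
```

A check like this only validates the shape of the output; it does not confirm that the visible characters came from the right positions of the input.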

Python Solution

class Solution:
    def maskPII(self, s: str) -> str:
        if '@' in s:
            # Email masking
            name, domain = s.split('@')
            name = name.lower()
            domain = domain.lower()
            masked_name = name[0] + '*****' + name[-1]
            return masked_name + '@' + domain
        else:
            # Phone number masking
            digits = [c for c in s if c.isdigit()]
            local = ''.join(digits[-10:])
            country_len = len(digits) - 10
            country_prefix = '+' + '*' * country_len + '-' if country_len else ''
            return f"{country_prefix}***-***-{local[-4:]}"

The Python implementation uses straightforward string operations. For emails, split and lower simplify the process. For phone numbers, a list comprehension efficiently filters digits, and string formatting handles the mask. This ensures clarity and correctness.
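As a design alternative, the digit filtering could be done with a regular expression instead of a list comprehension. The variant below is a sketch under the same valid-input assumption; mask_pii_regex is a hypothetical standalone function, not the LeetCode method signature.

```python
import re


def mask_pii_regex(s: str) -> str:
    """Variant of the masking logic that strips non-digits with re.sub."""
    if "@" in s:
        name, domain = s.lower().split("@")
        return f"{name[0]}*****{name[-1]}@{domain}"
    # \D matches any non-digit character, so only the digits remain.
    digits = re.sub(r"\D", "", s)
    country_len = len(digits) - 10
    prefix = f"+{'*' * country_len}-" if country_len else ""
    return f"{prefix}***-***-{digits[-4:]}"


print(mask_pii_regex("+12-345-678-9012"))  # +**-***-***-9012
```

Working with a digit string rather than a list of characters removes the join step; which form reads better is largely a matter of taste at these input sizes.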

Go Solution

import (
    "strings"
    "unicode"
)

func maskPII(s string) string {
    if strings.Contains(s, "@") {
        // Email masking
        parts := strings.Split(s, "@")
        name := strings.ToLower(parts[0])
        domain := strings.ToLower(parts[1])
        maskedName := string(name[0]) + "*****" + string(name[len(name)-1])
        return maskedName + "@" + domain
    } else {
        // Phone number masking
        digits := []rune{}
        for _, c := range s {
            if unicode.IsDigit(c) {
                digits = append(digits, c)
            }
        }
        local := string(digits[len(digits)-10:])
        countryLen := len(digits) - 10
        countryPrefix := ""
        if countryLen > 0 {
            countryPrefix = "+" + strings.Repeat("*", countryLen) + "-"
        }
        return countryPrefix + "***-***-" + local[len(local)-4:]
    }
}

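In Go, the phone branch collects the digits into a slice of rune while ranging over the string, and the email branch indexes bytes directly, which is safe because valid input is ASCII. The masking logic mirrors the Python version: strings.Repeat builds the asterisks, and unicode.IsDigit filters the numeric characters.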

Worked Examples

Example 1: "LeetCode@LeetCode.com"

| Step | Action | Value |
| --- | --- | --- |
| 1 | Detect email | contains '@' |
| 2 | Split name/domain | name="LeetCode", domain="LeetCode.com" |
| 3 | Lowercase | name="leetcode", domain="leetcode.com" |
| 4 | Mask name | masked_name="l*****e" |
| 5 | Concatenate | result="l*****e@leetcode.com" |

Example 2: "AB@qq.com"

| Step | Action | Value |
| --- | --- | --- |
| 1 | Detect email | contains '@' |
| 2 | Split name/domain | name="AB", domain="qq.com" |
| 3 | Lowercase | name="ab", domain="qq.com" |
| 4 | Mask name | masked_name="a*****b" |
| 5 | Concatenate | result="a*****b@qq.com" |

Example 3: "1(234)567-890"

| Step | Action | Value |
| --- | --- | --- |
| 1 | Detect phone | no '@' |
| 2 | Extract digits | digits="1234567890" |
| 3 | Local/CC split | local="1234567890", country_len=0 |
| 4 | Build mask | result="***-***-7890" |

Complexity Analysis

| Measure | Complexity | Explanation |
| --- | --- | --- |
| Time | O(n) | A constant number of linear passes over the input: splitting and lowercasing for emails, digit filtering for phone numbers |
| Space | O(n) | The filtered digits and the masked output take additional space proportional to the input size |

The linear complexity is acceptable given input length constraints (<=40 for emails, <=20 for phone numbers).

Test Cases

# provided examples
assert Solution().maskPII("LeetCode@LeetCode.com") == "l*****e@leetcode.com"
assert Solution().maskPII("AB@qq.com") == "a*****b@qq.com"
assert Solution().maskPII("1(234)567-890") == "***-***-7890"

# additional edge cases
assert Solution().maskPII("[email protected]") == "a*****[email protected]" # minimal name
assert Solution().maskPII("+1(234)567-8901") == "+*-***-***-8901" # country code 1 digit
assert Solution().maskPII("+12-345-678-9012") == "+**-***-***-9012" # country code 2 digits
assert Solution().maskPII("+123 456 789 0123") == "+***-***-***-0123" # country code 3 digits
assert Solution().maskPII("1234567890") == "***-***-7890" # phone only digits, no separators

| Test | Why |
| --- | --- |
| "LeetCode@LeetCode.com" | Typical email masking |
| "AB@qq.com" | Short email name |
| "1(234)567-890" | Phone number without country code |
| "[email protected]" | Minimal email name |
| "+1(234)567-8901" | Phone number with 1-digit country code |