mail@pastecode.io avatar
2 months ago
1.1 kB
How to find if two strings are the same anagrams in linear time (and later can group them)

Option 1) If the anagram has a limited charset (for example, only lowercase letters), and you know the maximum possible length of the string (if it's not too long, e.g., max length = 10000), then you can create a unique integer for each string like this:

charset_size = 26 # number of lowercase letters in English ABC
string = "hello"
unique_int = 0

exponent = 0

foreach letter in string:
    unique_int += charset_size^exponent * (ord(letter) - ord('a') + 1)

Then you can use this unique_int as a dict key. If string match to the same key/unique_int, they are in the same group

Option 2) Count each character in the string and turn the list into hashable key like tuple which can be used as the key for the dictionary
import Counter from collections # not sure if this is correct, but there is a Counter class
counts = Counter(string)
counts_tuple = tuple(counts)

So this would look like this: tuple('h': 1, 'e': 1, 'l': 2, 'o': 1)
This tuple can be used as the dict key to identify groups
Leave a Comment