Reputation: 73638
I have a string which needs to be sorted based on the sort_fmt
. Ex: If the string is 'abdcdfs' & the sort_fmt is 'dacg'. Upon sort, the output should be 'ddacfbs'. As you see, there might be characters in input string that are not present in the order string and vice-versa. The characters of input string that are not present in the order string should come at the end of the output string in any order.
Here's what I have written. It works, It's O(n*m) algo. I was wondering is there are better & shorter ways to do this? Maybe use itertools
?
def sort_str(s, sort_fmt):
sorted_str = ''
str_hash = dict()
# O(n)
for ch in s:
if ch in str_hash:
str_hash[ch] += 1
else:
str_hash[ch] = 1
# O(m) + O(1) where m<=n
for ch in sort_fmt:
if ch in str_hash:
cnt = str_hash[ch]
sorted_str += cnt * ch
# O(n)
for ch in s:
if ch not in sort_fmt:
sorted_str += ch
return sorted_str
if __name__ == '__main__':
print sort_str('abdcdfs', 'dacg')
Upvotes: 2
Views: 179
Reputation: 838376
You are trying to implement counting sort which is indeed O(n) under certain conditions. However your implementation has two bugs near the end which mean that the actual time complexity of your implementation is O(n2 + n*m):
for ch in s:
if ch not in sort_fmt: # <--- "in" requires a linear search. O(n*m)
sorted_str += ch # <--- Ouch! Concatenation! O(n^2)
in
on a string is linear in the length of the string, and you are doing this in a loop.Try this instead. It requires Python 2.7 or newer because of the use of collections.Counter
, but the Counter
can easily be replaced with a defaultdict
for older versions of Python):
from collections import Counter
def sort_str(s, sort_fmt):
counter = Counter(s)
d = set(sort_fmt)
result = ''.join(c * counter[c] for c in sort_fmt)
result += ''.join(c for c in s if c not in d)
return result
if __name__ == '__main__':
print sort_str('abdcdfs', 'dacg')
Here's a more concise way to get the result you want if you drop the requirement that it should be O(n):
>>> d = dict((v,k) for (k,v) in enumerate('dacg'))
>>> sorted('abdcdfs', key = lambda c:d.get(c, len(d)))
['d', 'd', 'a', 'c', 'b', 'f', 's']
Upvotes: 6
Reputation: 15172
I am not sure about the complexity of sorted. This works
def sort_str(s, frmt):
l = len(frmt)
return sorted(s, key = lambda x: frmt.index(x) if x in frmt else l)
Upvotes: 0