Reputation:
Say I have a string in alphabetical order, based on the amount of times that a letter repeats.
There are 3 B's, so they go at the start, 2 A's and 2 D's, so the A's go in front of the D's because they are in alphabetical order, and 1 C. Another example would be CCCCAAABBDDAB.
Note that there can be 4 letters in the middle somewhere (i.e. CCCC), as there could be 2 pairs of 2 letters.
However, let's say I can only have n letters in a row. For example, if n = 3 in the second example, then I would have to omit one "C" from the first substring of 4 C's, because there can only be a maximum of 3 of the same letters in a row.
Another example would be the string "CCCDDDAABC"; if n = 2, I would have to remove one C and one D to get the string CCDDAABC
Example input/output:
How can I do this with Python? Thanks in advance!
This is what I have right now, although I'm not sure if it's on the right track. Here, z is the length of the string.
for k in range(z+1):
if final_string[k] == final_string[k+1] == final_string[k+2] == final_string[k+3]:
final_string = final_string.translate({ord(final_string[k]): None})
return final_string
Upvotes: 2
Views: 911
Reputation: 552
from itertools import groupby
n = 2
def rem(string):
out = "".join(["".join(list(g)[:n]) for _, g in groupby(string)])
print(out)
So this is the entire code for your question.
s = "AABBCCDDEEE"
s2 = "AAAABBBDDDDDDD"
s3 = "CCCCAAABBDDABBB"
s4 = "AAAAAAAA"
z = "AAABBCCCCDE"
With following test:
AABBCCDDEE
AABBDD
CCAABBDDABB
AA
AABBCCDE
Upvotes: 1
Reputation: 15962
Ok, based on your comment, you're either pre-sorting the string or it doesn't need to be sorted by the function you're trying to create. You can do this more easily with itertools.groupby()
:
import itertools
def max_seq(text, n=1):
result = []
for k, g in itertools.groupby(text):
result.extend(list(g)[:n])
return ''.join(result)
max_seq('AAABBCCCCDE', 2)
# 'AABBCCDE'
max_seq('EEEEEFFFFGGG', 4)
# 'EEEEFFFFGGG'
max_seq('XXYYZZ')
# 'XYZ'
max_seq('CCCDDDAABC', 2)
# 'CCDDAABC'
In each group g
, it's expanded and then sliced until n
elements (the [:n]
part) so you get each letter at most n
times in a row. If the same letter appears elsewhere, it's treated as an independent sequence when counting n
in a row.
Edit: Here's a shorter version, which may also perform better for very long strings. And while we're using itertools
, this one additionally utilises itertools.chain.from_iterable()
to create the flattened list of letters. And since each of these is a generator, it's only evaluated/expanded at the last line:
import itertools
def max_seq(text, n=1):
sequences = (list(g)[:n] for _, g in itertools.groupby(text))
letters = itertools.chain.from_iterable(sequences)
return ''.join(letters)
Upvotes: 1
Reputation: 4670
Here's my solution:
def snip_string(string, n):
list_string = list(string)
list_string.sort()
chars = set(string)
for char in chars:
while list_string.count(char) > n:
list_string.remove(char)
return ''.join(list_string)
Calling the function with various values for n
gives the following output:
>>> string = "AAAABBBCCCDDD"
>>> snip_string(string, 1)
'ABCD'
>>> snip_string(string, 2)
'AABBCCDD'
>>> snip_string(string, 3)
'AAABBBCCCDDD'
>>>
Here is the updated version of my solution, which only removes characters if the group of repeated characters exceeds n
.
import itertools
def snip_string(string, n):
groups = [list(g) for k, g in itertools.groupby(string)]
string_list = []
for group in groups:
while len(group) > n:
del group[-1]
string_list.extend(group)
return ''.join(string_list)
Output:
>>> string = "DDDAABBBBCCABCDE"
>>> snip_string(string, 3)
'DDDAABBBCCABCDE'
Upvotes: 1
Reputation:
hello = "hello frrriend"
def replacing() -> str:
global hello
j = 0
for i in hello:
if j == 0:
pass
else:
if i == prev:
hello = hello.replace(i, "")
prev = i
prev = i
j += 1
return hello
replacing()
looks a bit primal but i think it works, thats what i came up with on the go anyways , hope it helps :D
Upvotes: 1