Reputation: 388
I have been using .translate(), but I would like to remove every character from a string that is not one of the characters in a list. I know I'm not using "not" correctly below; it is only there to show what I mean. The string is
MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAXPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEAXX
and I am trying to remove the Xs.
aminoacids = ['A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y']
contentjoined.translate(None, not(aminoacids))
Upvotes: 4
Views: 5893
Reputation: 618
Not a fancy solution, but you can build a new string, adding each character only if it is in the list:
newString = ""
for ch in oldString:
    # keep the character only if it is a known amino acid
    if ch in aminoacids:
        newString += ch
Upvotes: -1
Reputation: 21
Is there a specific reason you're using translate() instead of strip()?
If you know there's a specific character you're looking to remove, strip() is a much easier way, though it only removes that character from the start and end of the string:
long_string = "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAXPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEAXX"
print long_string.strip('X')
Output: MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAXPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA
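Note that the X in the middle of the output above is still there. As a minimal sketch of the difference (the short sequence here is made up for illustration, not the one from the question), strip() only trims the ends, while replace() removes every occurrence:

```python
# Shortened, made-up sequence used only for illustration.
seq = "XMDVAXPQEAXX"

stripped = seq.strip('X')        # removes X only from the start and end
replaced = seq.replace('X', '')  # removes every X, wherever it occurs

print(stripped)  # MDVAXPQEA
print(replaced)  # MDVAPQEA
```

replace() is only enough when the unwanted character is known in advance; filtering against the list of allowed characters covers the general case.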
Upvotes: 0
Reputation: 107287
You can use a list comprehension to collect the characters that are not in the list, and pass them to translate() as the characters to delete:
>>> contentjoined.translate(None,''.join([i for i in contentjoined if i not in aminoacids]))
'MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA'
Or you can use set.difference:
>>> contentjoined.translate(None,''.join(set(contentjoined).difference(aminoacids)))
'MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA'
But you can do this job with a simple list comprehension and join:
>>> ''.join([i for i in contentjoined if i in aminoacids])
'MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA'
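The translate(None, ...) calls above rely on the Python 2 str.translate signature. On Python 3 the deletion set is instead passed through str.maketrans. The following is a rough sketch of that variant, assuming Python 3; keep_only is a hypothetical helper name, and the short test sequence is made up for illustration:

```python
# Same 20 letters as the aminoacids list in the question, stored as a set.
aminoacids = set('ACDEFGHIKLMNPQRSTVWY')

def keep_only(sequence, allowed=aminoacids):
    """Return sequence with every character not in allowed removed.

    Hypothetical helper, not part of the original answers.
    """
    # Characters that occur in the sequence but are not allowed.
    unwanted = ''.join(set(sequence) - set(allowed))
    # Python 3: the third argument of str.maketrans lists characters to delete.
    return sequence.translate(str.maketrans('', '', unwanted))

print(keep_only("GAXPQEGILEXX"))  # GAPQEGILE
```

The join-based version shown above works unchanged on both Python 2 and Python 3, so it is probably the simpler choice unless translate() is specifically wanted.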
Upvotes: 5