mkpappu

Reputation: 388

Remove character from string if it's not in a list?

I have been using .translate(), but I would like to remove a character from a string if it is not one of the characters in a list. I don't think I'm using "not" correctly; the code below is mainly there to show what I mean (sorry, bad way, I know). The string is

MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAXPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEAXX 

and I am trying to remove the Xs.

aminoacids = ['A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y']
contentjoined.translate(None, not(aminoacids))

Upvotes: 4

Views: 5893

Answers (3)

ljk

Reputation: 618

Not a fancy solution, but you can try adding each character to a new string if it's in the list:

newString = ""
for ch in oldString:  # oldString is the original sequence
  if ch in aminoacids:
    newString += ch
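
A self-contained version of that loop might look like the sketch below. It assumes Python 3; `keep_allowed` and `AMINO_ACIDS` are names made up for this illustration, and the short input string is invented test data rather than the sequence from the question.

```python
# Hypothetical helper, not part of the original answer.
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")  # the 20 letters from the question's list

def keep_allowed(sequence, allowed=AMINO_ACIDS):
    """Return sequence with every character not in allowed removed."""
    kept = []
    for ch in sequence:
        if ch in allowed:
            kept.append(ch)
    return "".join(kept)

print(keep_allowed("GAXPQEAXX"))  # prints GAPQEA
```

Collecting the characters in a list and joining once avoids repeated string concatenation, and the frozenset keeps each membership test cheap.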

Upvotes: -1

ReallyGrimm

Reputation: 21

Is there a specific reason you're using translate() instead of strip()?

If you know there's a specific character you're looking to remove, strip() is a much easier way, although it only removes that character from the start and end of the string:

long_string = "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAXPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEAXX" 
print(long_string.strip('X'))

Output: MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAXPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA
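
Note that the X in the middle of that output (...GAXPQEG...) is still there. A minimal sketch of the difference between strip() and replace(), assuming Python 3 and a shortened, made-up sample string:

```python
sample = "XGAXPQEAXX"  # made-up sample, not the full sequence

# strip() only trims matching characters from the two ends of the string.
ends_trimmed = sample.strip("X")
print(ends_trimmed)  # GAXPQEA

# replace() removes every occurrence, including the ones in the middle.
all_removed = sample.replace("X", "")
print(all_removed)  # GAPQEA
```

replace() handles one known character; for "anything not in the list", a membership-based filter is still needed.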

Upvotes: 0

Kasravnd

Reputation: 107287

You can use a list comprehension to collect the characters that are not in the list:

>>> contentjoined.translate(None,''.join([i for i in contentjoined if i not in aminoacids]))
'MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA'
>>> 

Or you can use set.difference:

>>> contentjoined.translate(None,''.join(set(contentjoined).difference(aminoacids)))
'MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA'
>>> 

But you can do the whole job with a simple list comprehension and join:

>>> ''.join([i for i in contentjoined if i in aminoacids])
'MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA'
>>> 
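
The translate(None, ...) calls above rely on the Python 2 signature of str.translate. On Python 3, a rough equivalent would build a deletion table with str.maketrans. The sketch below assumes Python 3, and the short contentjoined value is a made-up stand-in for the real sequence:

```python
aminoacids = ['A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y']
contentjoined = "GAXPQEGILXX"  # made-up stand-in for the full sequence

# Characters that occur in the string but are not in the allowed list.
unwanted = ''.join(sorted(set(contentjoined).difference(aminoacids)))

# The third argument of str.maketrans lists the characters to delete.
table = str.maketrans('', '', unwanted)
cleaned = contentjoined.translate(table)
print(cleaned)  # GAPQEGIL
```

The list comprehension with join works unchanged on both Python 2 and Python 3, so it is probably the simpler choice if portability matters.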

Upvotes: 5
