Souvik Ray

Reputation: 3018

Regex to remove certain strings from a comma separated string

I have a string which looks like below

string = "NO PICK: hey there, hey you,NO PICK:hey there you, haha"

Now I want to remove any string that contains NO PICK: from the comma separated strings such that the end result looks like this

string = "hey you, haha"

I know how to remove the NO PICK: from the entire string itself by doing something like this

import re
string = string.replace("NO PICK:", "")
print(string)

But I do not know how to build a regex to remove entire substrings containing the match while keeping the other comma separated strings intact.

Note: I am using pandas to join values of certain columns that have these strings and remove NO PICK: from them.

Here is my example

cc = [i for i in df.columns if i.startswith("Data")]
df[c] = df[cc].astype('unicode').apply(','.join, axis=1)

Here the value of df[cc] should not contain those strings that contain NO PICK:

Upvotes: 1

Views: 1294

Answers (2)

Wiktor Stribiżew

Reputation: 626870

In pandas, join the columns into a single Series first, then use

df[cc].astype(str).apply(','.join, axis=1).str.replace(r'NO PICK:[^,]*,*', '', regex=True).str.strip()

The regex is NO PICK:[^,]*,*:

  • NO PICK: - a literal text
  • [^,]* - zero or more chars other than a comma
  • ,* - zero or more commas.

The .str.strip() will remove redundant leading/trailing whitespace.
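
To see what the pattern does outside pandas, here is a minimal sketch with the standard re module. The helper name drop_no_pick is just an illustration, not part of any library. It also trims a comma that can be left dangling when the removed item was the last one, which the pattern alone does not handle.

```python
import re

# Same pattern as above: the marker, everything up to the next comma, then any commas.
NO_PICK = re.compile(r'NO PICK:[^,]*,*')

def drop_no_pick(text):
    """Remove every 'NO PICK:...' item from a comma separated string (sketch)."""
    cleaned = NO_PICK.sub('', text)
    # Trim whitespace and any comma left dangling when the removed item was last.
    return cleaned.strip(' ,')

print(drop_no_pick("NO PICK: hey there, hey you,NO PICK:hey there you, haha"))
print(drop_no_pick("hey you, NO PICK: bye"))
```

Note that the spacing between the surviving items is kept as it was in the input, so normalise it separately if that matters.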

If you just work with strings, you may use

string = "NO PICK: hey there, hey you,NO PICK:hey there you, haha"
print( ', '.join([x.strip() for x in string.split(",") if "NO PICK:" not in x]).strip() )

See the Python demo

Notes:

  • string.split(",") splits the string at commas
  • if "NO PICK:" not in x discards all items with NO PICK: in them
  • x.strip() strips the leading/trailing whitespace from the "valid" splits
  • ', '.join(...).strip() joins the "valid" items and removes any leading/trailing whitespace
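
If you need this per row, the same split/filter/join steps can be wrapped in a small function. This is only a sketch: the name join_without_no_pick and the marker parameter are my own, and it assumes each row is an iterable of cell values.

```python
def join_without_no_pick(values, marker="NO PICK:"):
    """Join cell values with commas, dropping any item that contains the marker."""
    # Join first, so a cell that itself holds several comma separated items is handled too.
    joined = ",".join(str(v) for v in values)
    # Split back into items and trim the whitespace around each one.
    items = (item.strip() for item in joined.split(","))
    # Keep non-empty items that do not contain the marker.
    return ", ".join(item for item in items if item and marker not in item)

row = ["NO PICK: hey there", " hey you", "NO PICK:hey there you", " haha"]
print(join_without_no_pick(row))
```

With a DataFrame, it should be possible to pass it to df[cc].astype(str).apply(join_without_no_pick, axis=1), although I have not tested that against your data.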

Upvotes: 2

Bipul singh kashyap

Reputation: 605

You can split the string on commas and check each piece for NO PICK. If a piece does not contain NO PICK, keep it in a list, and finally join the list with ','.

import re
value = "NO PICK: hey there, hey you,NO PICK:hey there you, haha"
parts = value.split(',')
# keep only the pieces without NO PICK, trimming the stray spaces around them
kept = [v.strip() for v in parts if not re.search('NO PICK', v)]
print(', '.join(kept))

Upvotes: 1
