may

Reputation: 1185

Removing multiple characters and joining in pandas columns

I am trying to reformat the strings below, dropping the parentheses ( and ):

My_name (1)
Your_name (2)

Desired output:

My_name_ID_1
Your_name_ID_2

This is a column of my dataframe. I tried replacing, but could only remove one character at a time, and I would also like to join the parts afterward.

Can I replace both characters and join in a single step?

Upvotes: 2

Views: 150

Answers (2)

anky

Reputation: 75100

You can also use:

s.str.replace(r"\(.*\)", "", regex=True).str.strip() + "_ID_" + s.str.replace(r'[^(]*\(|\)[^)]*', '', regex=True)

However, the answer by @user3483203 is better. :)
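To see what each half of that expression contributes, here is a small sketch that evaluates the two pieces separately. It assumes pandas 2.x (hence `regex=True`) and a hypothetical Series `s` built from the question's sample values; `name` and `num` are illustrative variable names, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's column
s = pd.Series(["My_name (1)", "Your_name (2)"])

# Left half: drop the parenthesised part, then trim the trailing space
name = s.str.replace(r"\(.*\)", "", regex=True).str.strip()

# Right half: drop everything up to "(" and everything from ")" onward
num = s.str.replace(r"[^(]*\(|\)[^)]*", "", regex=True)

# Join the two pieces with the "_ID_" separator
result = name + "_ID_" + num
print(result.tolist())
```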

Upvotes: 2

user3483203

Reputation: 51165

You can use a regular expression with str.replace:

s.str.replace(r'(\w+)\s+\(([^\)])\)', r'\1_ID_\2', regex=True)

0      My_name_ID_1
1    Your_name_ID_2
Name: 0, dtype: object

If you'd like to be less explicit, an alternative is:

s.str.replace(r'\s+\(([^\)])\)', r'_ID_\1', regex=True)
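For reference, a self-contained sketch that runs both patterns side by side. It assumes pandas 2.x, where `str.replace` treats the pattern as a literal string unless `regex=True` is passed, and a made-up Series `s` holding the two sample values from the question; `explicit` and `short` are illustrative names.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's column
s = pd.Series(["My_name (1)", "Your_name (2)"])

# Explicit form: capture the name and the ID, then rebuild the string
explicit = s.str.replace(r"(\w+)\s+\(([^\)])\)", r"\1_ID_\2", regex=True)

# Shorter form: leave the name alone and rewrite only the " (N)" suffix
short = s.str.replace(r"\s+\(([^\)])\)", r"_ID_\1", regex=True)

print(explicit.tolist())
print(short.tolist())
```

Both calls should give the same result on this data; the shorter form simply avoids capturing text that is copied through unchanged.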


Regex Explanation

(                          # start of capture group 1
  \w+                      # one or more word characters (letters, digits, _)
)                          # end of group 1
\s+                        # one or more whitespace characters
\(                         # a literal (
(                          # start of capture group 2
  [^\)]                    # exactly one character that is not )
)                          # end of group 2
\)                         # a literal )
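The annotated layout above is close to what Python's `re.VERBOSE` flag accepts directly, so the breakdown can be checked without pandas. This is a minimal sketch using only the standard library; `PATTERN`, `add_id`, and the sample strings are illustrative names, not part of the answer. One thing it makes visible: `[^\)]` has no quantifier, so it matches exactly one character, and a multi-digit ID such as `(12)` is left unchanged.

```python
import re

# Same pattern as in the answer, written with inline comments
PATTERN = re.compile(
    r"""
    (\w+)      # group 1: one or more word characters (letters, digits, _)
    \s+        # one or more whitespace characters
    \(         # a literal (
    ([^\)])    # group 2: exactly one character that is not )
    \)         # a literal )
    """,
    re.VERBOSE,
)


def add_id(text: str) -> str:
    """Rewrite 'name (N)' as 'name_ID_N'; non-matching text is returned as-is."""
    return PATTERN.sub(r"\1_ID_\2", text)


print(add_id("My_name (1)"))   # single-character ID: rewritten
print(add_id("Name (12)"))     # two-character ID: pattern does not match
```

If longer IDs are expected, changing `[^\)]` to `[^\)]+` would be one way to accept them.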

Upvotes: 2
