aaaaa0a

Reputation: 137

How to replace dataframe-element with another dataframe-element

The DataFrame I have prepared is as follows:

Index and Title              Index
1 aa aa aaaa                 1
1.2 bb bbbb bb bbbb bb b     1.2
1.2.3 ccc cc c ccccc cccccc  1.2.3
2 dddd d d dd ddd            2

The DataFrame I want is as follows:

Index and Title              Index  Title
1 aa aa aaaa                 1      aa aa aaaa
1.2 bb bbbb bb bbbb bb b     1.2    bb bbbb bb bbbb bb b
1.2.3 ccc cc c ccccc cccccc  1.2.3  ccc cc c ccccc cccccc
2 dddd d d dd ddd            2      dddd d d dd ddd

I tried the following code:

df['Title'] = df['Index and Title'].str.replace(df['Index'] + ' ','')

However, it failed with this error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

What should I do in this case?

Upvotes: 2

Views: 111

Answers (3)

RavinderSingh13

Reputation: 133680

Based only on the samples shown, this can be handled with the Pandas str.extract function. Please try the following.

df["Title"] = df["Index and Title"].str.extract(r'^\d+(?:(?:\.\d+){1,})?\s+(\D+)$', expand=True)

Or, if the title itself may contain digits, try the following:

df["Title"] = df["Index and Title"].str.extract(r'^\d+(?:(?:\.\d+){1,})?\s+(.*)$', expand=True)

Output of df will be as follows:

               Index and Title  Index                  Title
0                 1 aa aa aaaa      1             aa aa aaaa
1     1.2 bb bbbb bb bbbb bb b    1.2   bb bbbb bb bbbb bb b
2  1.2.3 ccc cc c ccccc cccccc  1.2.3  ccc cc c ccccc cccccc
3            2 dddd d d dd ddd      2        dddd d d dd ddd

Explanation: a detailed breakdown of the first regex above.

^\d+(?:(?:\.\d+){1,})?  ##Match the leading digits in the "Index and Title" column, optionally followed by one or more groups of a dot and digits.
\s+                     ##Match one or more whitespace characters.
(\D+)$                  ##First capturing group: all non-digit characters up to the end of the value.
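
For anyone who wants to try this end to end, here is a minimal sketch. It assumes the sample data is rebuilt by hand as below and that the Index column holds strings. The pattern is a slightly shortened form of the second regex (`(?:\.\d+)*` in place of the optional `{1,}` group), and `expand=False` is used so that the single capturing group comes back as a Series.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about how df was created).
df = pd.DataFrame({
    "Index and Title": [
        "1 aa aa aaaa",
        "1.2 bb bbbb bb bbbb bb b",
        "1.2.3 ccc cc c ccccc cccccc",
        "2 dddd d d dd ddd",
    ],
    "Index": ["1", "1.2", "1.2.3", "2"],
})

# Leading digits, any number of ".digits" groups, whitespace, then capture the rest.
pattern = r"^\d+(?:\.\d+)*\s+(.*)$"

# expand=False returns a Series because there is exactly one capturing group.
df["Title"] = df["Index and Title"].str.extract(pattern, expand=False)
print(df)
```

Rows that do not match the pattern would get NaN in the Title column, so it may be worth checking for missing values on real data.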

Upvotes: 2

Corralien

Reputation: 120489

>>> df["Title"] = df["Index and Title"].str.split(n=0).str[1:].str.join(" ")
>>> df
               Index and Title  Index                  Title
0                 1 aa aa aaaa      1             aa aa aaaa
1     1.2 bb bbbb bb bbbb bb b    1.2   bb bbbb bb bbbb bb b
2  1.2.3 ccc cc c ccccc cccccc  1.2.3  ccc cc c ccccc cccccc
3            2 dddd d d dd ddd      2        dddd d d dd ddd
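
A possible variant, shown only as a sketch: in pandas, `n=0` places no limit on the number of splits, so the line above splits on every run of whitespace and rejoins the pieces with single spaces. Splitting once with `n=1` and taking the second piece should give the same result for the sample data while keeping the original spacing inside the title. The last row below is hypothetical and is there only to show that difference.

```python
import pandas as pd

df = pd.DataFrame({
    "Index and Title": [
        "1 aa aa aaaa",
        "1.2 bb bbbb bb bbbb bb b",
        "1.2.3 ccc cc c ccccc cccccc",
        "2 dddd d d dd ddd",
        "3 ee  ff",  # hypothetical row with a double space, not in the question
    ],
})

# Split only at the first run of whitespace: piece 0 is the index, piece 1 is the title.
df["Title"] = df["Index and Title"].str.split(n=1).str[1]
print(df)
```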

Upvotes: 2

jezrael

Reputation: 863291

If you need to remove the value of the Index column from the combined column, use a lambda function with axis=1:

df['Title'] = df.apply(lambda x: x['Index and Title'].replace(x['Index'],''), axis=1).str.strip()

If you only need the letters and spaces (without using the Index column at all), use Series.str.extract with Series.str.strip:

df['Title'] = df['Index and Title'].str.extract('([a-zA-Z ]+)', expand=False).str.strip()
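
As background on the error in the question: Series.str.replace takes one pattern for the whole column, not a per-row Series, which is why the work has to be done row by row. The sketch below is one way to do that while removing the index only where it appears as a prefix. It assumes the Index column can be converted to strings, and `drop_prefix` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

df = pd.DataFrame({
    "Index and Title": [
        "1 aa aa aaaa",
        "1.2 bb bbbb bb bbbb bb b",
        "1.2.3 ccc cc c ccccc cccccc",
        "2 dddd d d dd ddd",
    ],
    "Index": ["1", "1.2", "1.2.3", "2"],
})


def drop_prefix(text: str, prefix: str) -> str:
    """Remove prefix from the start of text (hypothetical helper), then trim spaces."""
    if text.startswith(prefix):
        return text[len(prefix):].strip()
    return text


# Pair each combined value with its own index value and strip it from the front.
df["Title"] = [
    drop_prefix(text, idx)
    for text, idx in zip(df["Index and Title"], df["Index"].astype(str))
]
print(df)
```

Unlike a plain `replace`, this leaves later occurrences of the index text inside the title untouched.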

Upvotes: 0
