Vikram Karthic

Reputation: 478

Pandas Characters between two spaces

I have a DataFrame like the one below:

df = pd.DataFrame({'vals': [1, 2, 3, 4, 5], 'ids': [u'a iball is', u'aaa vcat ll', u'c cnut bb', u'fdfdf qbell l', 'bxyz zbat c']})

I am trying to replace the first character of the word between the first and second spaces with x in the ids column.

I want my DataFrame to look something like this:

df = pd.DataFrame({'vals': [1, 2, 3, 4, 5], 'ids': [u'a xball is', u'aaa xcat ll', u'c xnut bb', u'fdfdf xbell l', 'bxyz xbat c']})

Upvotes: 0

Views: 124

Answers (2)

Cute Panda

Reputation: 1498

Without the use of regex, this will work fine:

import pandas as pd
df = pd.DataFrame({'vals': [1, 2, 3, 4, 5], 'ids': [u'a iball is', u'aaa vcat ll', u'c cnut bb', u'fdfdf qbell l', 'bxyz zbat c']})
for idx, row in df.iterrows():
    parts = row['ids'].split()           # split the string on whitespace
    parts[1] = 'x' + parts[1][1:]        # swap the first character of the second word for 'x'
    df.at[idx, 'ids'] = " ".join(parts)  # write the result back to this row only
df
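
The same non-regex idea can also be sketched without an explicit loop by mapping a small helper over the column. The helper name `replace_second_word_initial` is hypothetical, and the sketch assumes words are separated by single spaces; values with nothing between the first and second space are left unchanged.

```python
import pandas as pd

def replace_second_word_initial(text, new_char="x"):
    """Replace the first character of the second space-separated word (hypothetical helper)."""
    parts = text.split(" ", 2)  # at most three pieces: first word, second word, remainder
    if len(parts) < 2 or not parts[1]:
        return text  # no second word to modify; leave the value as-is
    parts[1] = new_char + parts[1][1:]
    return " ".join(parts)

df = pd.DataFrame({'vals': [1, 2, 3, 4, 5],
                   'ids': ['a iball is', 'aaa vcat ll', 'c cnut bb', 'fdfdf qbell l', 'bxyz zbat c']})
df['ids'] = df['ids'].map(replace_second_word_initial)
```

Unlike `split()` followed by `join`, splitting on a single space with a limit keeps the rest of the string exactly as it was.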


Upvotes: 0

Umar.H

Reputation: 23099

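Use str.replace with capturing groups.

\1 refers back to the first capturing group: the first word at the start of the string plus the whitespace character that follows it.

^ anchors the match at the start of the string.

\w matches any word character [A-Za-z0-9_], and \s matches a single whitespace character.

+ is a greedy quantifier: it matches the previous token one or more times, as many times as possible.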

df['ids'].str.replace(r'(^\w+\s)(\w{1})', r'\1x', regex=True)

0       a xball is
1      aaa xcat ll
2        c xnut bb
3    fdfdf xbell l
4      bxyz xbat c
Name: ids, dtype: object
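
To see what each capturing group holds, here is a minimal sketch using the standard-library `re` module, followed by the same substitution assigned back to the column. It assumes a pandas version where `str.replace` needs `regex=True` to treat the pattern as a regular expression; the variable names are illustrative only.

```python
import re
import pandas as pd

# Group 1: first word plus one whitespace character; group 2: the next word character.
pattern = r'(^\w+\s)(\w)'

match = re.search(pattern, 'fdfdf qbell l')
first_group = match.group(1)   # 'fdfdf ' (kept via \1 in the replacement)
second_group = match.group(2)  # 'q' (dropped and replaced by 'x')

df = pd.DataFrame({'vals': [1, 2, 3, 4, 5],
                   'ids': ['a iball is', 'aaa vcat ll', 'c cnut bb', 'fdfdf qbell l', 'bxyz zbat c']})
# str.replace returns a new Series, so assign it back to keep the change.
df['ids'] = df['ids'].str.replace(pattern, r'\1x', regex=True)
```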

Upvotes: 2
