Sylvain

Reputation: 253

grab substring in pandas series

I have a dataframe df with X columns. I want to fill df['date'] and df['time'] with substrings taken from the column df['job.filename']. I tried converting the Series to a list and taking list[x:y] as the date, and I also tried:

for i, row in df.iterrows():
    df.set_value(i, 'time', row['job.filename'][-10:-4])
    df.set_value(i, 'date', row['job.filename'][21:27])

But this didn't work. Cheers

Upvotes: 1

Views: 624

Answers (1)

Vaishali

Reputation: 38415

I took your sample job.filename to create a dataframe and tried the following:

import pandas as pd

df = pd.DataFrame(['IMAT list 1-3609-0-20161214-092934.csv'])
# 0 is the column name here (job.filename in your case); expand=False returns a Series
df['date'] = df[0].str.extract(r'.*-\d+-(\d+)-\d+', expand=False)
df['time'] = df[0].str.extract(r'.*-\d+-\d+-(\d+)', expand=False)

You get:

                                        0      date    time
0  IMAT list 1-3609-0-20161214-092934.csv  20161214  092934

This regex will work only if every value follows the same pattern as the sample, i.e. the name ends with -<digits>-<date>-<time> before the extension.
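If some values might not follow the pattern, a variation worth considering is a single anchored regex with named groups, so that non-matching rows come back as NaN instead of a wrong substring. This is only a sketch: the second filename ('notes.txt') is made up for illustration, and the assumption that the date is always 8 digits and the time 6 digits right before ".csv" comes from the one sample in the question.

```python
import pandas as pd

# The first filename is the sample from the question; the second is a
# hypothetical value that does not follow the pattern.
df = pd.DataFrame({'job.filename': ['IMAT list 1-3609-0-20161214-092934.csv',
                                    'notes.txt']})

# Assumed layout: ...-<8-digit date>-<6-digit time>.csv at the end of the name.
pattern = r'-(?P<date>\d{8})-(?P<time>\d{6})\.csv$'

# With named groups, str.extract returns a DataFrame with columns 'date' and 'time'.
parts = df['job.filename'].str.extract(pattern)

df['date'] = parts['date']
df['time'] = parts['time']
print(df)
```

Rows where the pattern does not match get NaN in both columns, so you can find them afterwards with df['date'].isna().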

Upvotes: 1
