whytheq

Reputation: 35557

Add 3 columns to DataFrame via map

Is it possible to add 3 new columns to this little DataFrame in one map?

import pandas as pd

df = pd.DataFrame({'myDate':['2006-02-12'
                             ,'2007-07-20'
                             ,'2009-05-19']})

def convert_date(val):
    # the strings are in year-month-day order
    y, m, d = val.split('-')
    return int(d), int(y), int(m)

df[['day', 'year','month']] = df.myDate.map(convert_date)

Upvotes: 3

Views: 56

Answers (2)

jezrael

Reputation: 862481

You can convert the myDate column with pd.to_datetime and then read the parts from dt.year, dt.month and dt.day:

df['myDate'] = pd.to_datetime(df.myDate)

df['year'] = df.myDate.dt.year
df['month'] = df.myDate.dt.month
df['day'] = df.myDate.dt.day

print (df)
     myDate  year  month  day
0 2006-02-12  2006      2   12
1 2007-07-20  2007      7   20
2 2009-05-19  2009      5   19
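
The three assignments can also be collapsed into one statement with DataFrame.assign. A minimal sketch, assuming the same sample frame as in the question (the intermediate name dates is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'myDate': ['2006-02-12', '2007-07-20', '2009-05-19']})

# Parse the strings once, then derive all three columns in a single call.
dates = pd.to_datetime(df['myDate'])
df = df.assign(year=dates.dt.year, month=dates.dt.month, day=dates.dt.day)

print(df)
```

This leaves myDate as strings; assign the parsed dates back to df['myDate'] if you want a datetime column as well.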

If you want to keep your approach, return a pd.Series from the function, because otherwise each row yields a tuple, and change map to apply:

def convert_date(val):
    # the strings are in year-month-day order
    y, m, d = val.split('-')
    return pd.Series([int(d), int(y), int(m)])

df[['day', 'year','month']] = df.myDate.apply(convert_date)

print (df)
       myDate  day  year  month
0  2006-02-12   12  2006      2
1  2007-07-20   20  2007      7
2  2009-05-19   19  2009      5

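I also tried map with tuple unpacking, but the result comes out transposed, because unpacking a Series iterates over its rows, so each new column receives one row's whole tuple:

def convert_date(val):
    # the strings are in year-month-day order
    y, m, d = val.split('-')
    return int(d), int(y), int(m)

df['a'], df['b'], df['c'] = df.myDate.map(convert_date)
print (df)
       myDate     a     b     c
0  2006-02-12    12    20    19
1  2007-07-20  2006  2007  2009
2  2009-05-19     2     7     5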
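
If you do want to stay with map, one option is to expand the tuples yourself before assigning. This is only a sketch, assuming the sample frame from the question; the intermediate name parts is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'myDate': ['2006-02-12', '2007-07-20', '2009-05-19']})

def convert_date(val):
    # the strings are in year-month-day order
    y, m, d = val.split('-')
    return int(d), int(y), int(m)

# map gives a Series of 3-tuples; turn that into a 3-column DataFrame
# that shares the original index, then assign all columns at once.
parts = pd.DataFrame(df['myDate'].map(convert_date).tolist(),
                     index=df.index,
                     columns=['day', 'year', 'month'])
df[['day', 'year', 'month']] = parts

print(df)
```

Passing index=df.index keeps the rows aligned even when the frame does not have a default 0..n-1 index.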

Upvotes: 2

MaxU - stand with Ukraine

Reputation: 210832

You can use .str.split(). The pieces come back in year, month, day order, so name the new columns in that order:

In [11]: df[['year', 'month', 'day']] = df.myDate.str.split('-', expand=True).astype(int)

In [12]: df
Out[12]:
       myDate  year  month  day
0  2006-02-12  2006      2   12
1  2007-07-20  2007      7   20
2  2009-05-19  2009      5   19

Or use .str.extract() with named groups, which also names the resulting columns:

In [21]: df.myDate.str.extract(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', expand=True).astype(int)
Out[21]:
   year  month  day
0  2006      2   12
1  2007      7   20
2  2009      5   19
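
Note that .str.extract() returns a new DataFrame and does not modify df. To attach the result to the original frame, something like the following should work; a sketch assuming the sample frame from the question, with pattern and parts as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'myDate': ['2006-02-12', '2007-07-20', '2009-05-19']})

# Named groups become the column names of the extracted frame.
pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
parts = df['myDate'].str.extract(pattern, expand=True).astype(int)

# join aligns on the index, so each row gets its own year/month/day.
df = df.join(parts)

print(df)
```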

Upvotes: 2
