Reputation: 2304
I have this dataset:
df = pd.DataFrame()
df['year'] = [2011,2011,2011,2011,2011,2011,2011,2011,2011,2011,2011,2011]
df['month'] = [1,2,3,4,5,6,1,2,3,4,5,6]
df['after'] = [0,0,0,1,1,1,0,0,0,1,1,1]
df['campaign'] = [0,0,0,0,0,0,1,1,1,1,1,1]
df['sales'] = [10000,11000,12000,10500,10000,9500,7000,8000,5000,6000,6000,7000]
And I want a new column date, that combines year and month into year-month date. I tried:
df['my_month'] = df['year']*100 + df['month'] + 1
But I'm stuck on what to do next. Any help will be greatly appreciated.
Upvotes: 0
Views: 748
Reputation: 5918
If we need start date of the month then
df['date'] = pd.to_datetime(df.year.astype(str) + '-' + df.month.astype(str))
Sample Output
year month after campaign sales date
0 2011 1 0 0 10000 2011-01-01
1 2011 2 0 0 11000 2011-02-01
2 2011 3 0 0 12000 2011-03-01
3 2011 4 1 0 10500 2011-04-01
When year and month
format is required
pd.to_datetime(df.year.astype(str) + '-' + df.month.astype(str)).dt.to_period('M')
Sample output
year month after campaign sales date
0 2011 1 0 0 10000 2011-01
1 2011 2 0 0 11000 2011-02
2 2011 3 0 0 12000 2011-03
3 2011 4 1 0 10500 2011-04
Upvotes: 2
Reputation: 153
import pandas as pd
from datetime import date
def get_date(year, month):
return date(year, month, 1)
def create_dataframe():
df = pd.DataFrame()
df['year'] = [2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011]
df['month'] = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
df['after'] = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
df['campaign'] = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
df['sales'] = [10000, 11000, 12000, 10500, 10000, 9500, 7000, 8000, 5000, 6000, 6000, 7000]
df['my_month'] = df.apply(lambda x: get_date(x.year, x.month), axis=1)
print(df.to_string())
if __name__ == '__main__':
create_dataframe()
output
year month after campaign sales my_month
0 2011 1 0 0 10000 2011-01-01
1 2011 2 0 0 11000 2011-02-01
2 2011 3 0 0 12000 2011-03-01
3 2011 4 1 0 10500 2011-04-01
4 2011 5 1 0 10000 2011-05-01
5 2011 6 1 0 9500 2011-06-01
6 2011 1 0 1 7000 2011-01-01
7 2011 2 0 1 8000 2011-02-01
8 2011 3 0 1 5000 2011-03-01
9 2011 4 1 1 6000 2011-04-01
10 2011 5 1 1 6000 2011-05-01
11 2011 6 1 1 7000 2011-06-01
Upvotes: 1