Reputation: 67
I have a pandas dataframe called df1
ID| ACTIVITY | Date
1 | activity 1 | 2/04/2016
2 | activity 2 | 3/04/2015
3 | activity 3 | 7/05/2016
3 | activity 4 | 2/04/2016
4 | activity 3 | 2/04/2017
5 | activity 6 | 2/04/2015
5 | activity 2 | 2/04/2016
6 | activity 1 | 2/04/2018
i have too create two dataframes out of these one where activity1 is being done in year 2015 and other where activity 2 is being done in year 2015 i am using the following code
for index,rows in df1.iterrows():
if rows['Date'].year == 2015 :
if rows['ACTIVITY'] == 'activity 1' :
data1 = rows['ID','Date']
answer1 = answer1.append(data1)
if rows['ACTIVITY'] == 'activity 2' :
data2 = rows['ID','Date']
answer2 = answer2.append(data2)
this is giving following result
activity1
Index | 0
0 | Id1
1 | Date1
0 | ID2
1 | Date2
0 | ID3
1 | Date 3 corresponding to that id
similarly
activity2
Index | 0
0 | Id1
1 | Date1
0 | ID2
1 | Date2
0 | ID3
1 | Date 3 corresponding to that id
what i want is
activity1
ID | Date
ID1 | Date1
How to do this?
Upvotes: 1
Views: 484
Reputation: 862771
Use DataFrame.loc
for filter by conditions and columns names:
mask = (df['Date'].dt.year == 2015)
df1 = df.loc[mask & (df['ACTIVITY'] == 'activity 1'), ['ID','Date']]
df2 = df.loc[mask & (df['ACTIVITY'] == 'activity 2'), ['ID','Date']]
Your solution is not recommend, check this.
But is possible by:
df1['Date'] = pd.to_datetime(df1['Date'])
answer1, answer2 = pd.DataFrame(), pd.DataFrame()
for index,rows in df1.iterrows():
if rows['Date'].year == 2015 :
if rows['activity'] == 'activity 1' :
data1 = rows[['ID','Date']]
answer1 = answer1.append(data1)
if rows['activity'] == 'activity 2' :
data2 = rows[['ID','Date']]
answer2 = answer2.append(data2)
Upvotes: 1