Reputation: 363
I Need to iterate values in a dataframe .In a data frame where i have oldvalue and new value columns in dataframe . I want to
old value newvalue date casenumber
aab baa 1/1/2019 001
acb bca 2/2/2019 002
abc cba 1/7/2109 003
acd dca 2/8/2019 004
aab bca 2/23/2019 005
acb baa 4/6/2019 006
abc dca 4/9/2019 007
aab baa 1/23/2019 008
I want to iterate the values in oldvalue to know how many are passing from oldvalue to new value from aab to others.
expected output:-
jan feb march April
aab-baa 2 1
aab-bca 1
acb-bca 1
acb-baa 4
abc-cba 1
abc-dca 4
acd-dca 1
The code i used to get the output:
df = pd.read_excel(r"")
f8 = df[df['Old Value'] == 'aab'] every time i am changing the old value manually
f9 = f8[f8['New Value'] == 'baa']
f1 = f8[f8['New Value'] == 'bca']
f2 = f8[f8['New Value'] == 'cba']
f3 = f8[f8['New Value'] == 'dca']
f4 = f8[f8['New Value'] == 'abc']
d1 = pd.concat([f9, f1])
d2 = pd.concat([f2, f3])
d3 = pd.concat([d1, d2])
d4= pd.concat([d3, f4])
df10=d4[['Case Number','Old Value','New Value']]
f9= df10.set_index(["New Value", "Old Value"]).count(level="New Value")
df = pd.read_excel(r"")
f8 = df[df['Old Value'] == 'aab']
f9 = f8[f8['New Value'] == 'baa']
f1 = f8[f8['New Value'] == 'bca']
f2 = f8[f8['New Value'] == 'cba']
f3 = f8[f8['New Value'] == 'dca']
f4 = f8[f8['New Value'] == 'abc']
d1 = pd.concat([f9, f1])
d2 = pd.concat([f2, f3])
d3 = pd.concat([d1, d2])
d4= pd.concat([d3, f4])
df10=d4[['Case Number','Old Value','New Value']]
f9= df10.set_index(["New Value", "Old Value"]).count(level="New Value")
out put
jan feb march April
aab-baa 2 1
aab-bca 1
acb-bca 1
acb-baa 4
abc-cba 1
abc-dca 4
acd-dca 1
Upvotes: 1
Views: 275
Reputation: 862471
First convert date
s to datetimes, then reshape by crosstab
with Series.dt.month
for correct order, add DataFrame.reindex
for all missing months (if necessary), then convert columns to months and last convert MultiIndex
to first 2 columns:
df['date'] = pd.to_datetime(df['date'])
df = (pd.crosstab([df['old value'],df['newvalue']], df['date'].dt.month)
.reindex(columns=range(1, 13), fill_value=0)
.rename(columns = lambda x: pd.to_datetime(x, format='%m').strftime('%b'))
.reset_index()
.rename_axis(None, axis=1))
print (df)
old value newvalue Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov \
0 aab baa 2 0 0 0 0 0 0 0 0 0 0
1 aab bca 0 1 0 0 0 0 0 0 0 0 0
2 abc cba 1 0 0 0 0 0 0 0 0 0 0
3 abc dca 0 0 0 1 0 0 0 0 0 0 0
4 acb baa 0 0 0 1 0 0 0 0 0 0 0
5 acb bca 0 1 0 0 0 0 0 0 0 0 0
6 acd dca 0 1 0 0 0 0 0 0 0 0 0
Dec
0 0
1 0
2 0
3 0
4 0
5 0
6 0
Replace 0
to empty strings is possible, but get numeric with strings data and next processing should be problem:
df = (pd.crosstab([df['old value'],df['newvalue']], df['date'].dt.month)
.replace(0, '')
.reindex(columns=range(1, 13), fill_value='')
.rename(columns = lambda x: pd.to_datetime(x, format='%m').strftime('%b'))
.reset_index()
.rename_axis(None, axis=1))
print (df)
old value newvalue Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
0 aab baa 2
1 aab bca 1
2 abc cba 1
3 abc dca 1
4 acb baa 1
5 acb bca 1
6 acd dca 1
Upvotes: 1