Reputation: 605
i was trying to format the input data to form a format as mentioned below, i am not sure how to achieve how to do it as i am new to pandas
Input data in xlsx format:
which is parsed to form below data format
import pandas as pd
excel_file = 'sample.xlsx'
ip_data_read = pd.read_excel(excel_file)
ip_data_read["Test Case(ID / Brief Description)"] = ip_data_read["Test Case(ID / Brief Description)"].ffill()
out put print like this:
Test Case(ID / Brief Description) Action / Step Expected Result (Description)
TC_001 a b
TC_001 c d
TC_001 e NaN
TC_001 f NaN
TC_001 NaN g
TC_002 a11 qwerty
TC_002 qqq mnbghd
TC_002 NaN cccc12
TC_002 qwwqwq33 NaN
TC_003 a1 b1
TC_004 NaN c2
TC_005 NaN d2
TC_006 NaN s2
TC_007 a3 d4
TC_099 abcdefg NaN
TC_100 a b
TC_100 c d
TC_100 e NaN
TC_100 f NaN
TC_100 NaN g
TC_101 xx yy
TC_101 er gf
TC_101 vv i
i wanted the dataframe to be something like,
Test Case(ID / Brief Description) Steps
TC_001 a,b
c,d
e
f
g
TC_002 a11,qwerty
TC_002 qqq,mnbghd
TC_002 cccc12
TC_002 qwwqwq33
TC_003 a1,b1
TC_004 c2
TC_005 d2
TC_006 s2
TC_007 a3,d4
TC_099 abcdefg
TC_100 a,b
c,d
e
f
g
TC_101 xx ,yy
er ,gf
vv ,i
for example the value/steps associated with TC_001 should give me a string like
a,b
c,d
e
f
g
Upvotes: 1
Views: 55
Reputation: 862396
Use Series.str.cat
with replace missing values:
s = df.pop('Expected Result (Description)').fillna('')
df['Steps'] = df.pop('Action / Step').fillna('').str.cat(s, sep=',').str.strip(',')
If there is many columns:
df1 = (df.set_index('Test Case(ID / Brief Description)')
.apply(lambda x: ','.join(x.dropna()), axis=1)
.reset_index())
Sample:
df = pd.DataFrame({'Test Case(ID / Brief Description)':range(4),
'Action / Step':[np.nan, np.nan,'a', 'd'],
'Expected Result (Description)':['s', np.nan, np.nan,'m']})
print (df)
Test Case(ID / Brief Description) Action / Step \
0 0 NaN
1 1 NaN
2 2 a
3 3 d
Expected Result (Description)
0 s
1 NaN
2 NaN
3 m
df1 = (df.set_index('Test Case(ID / Brief Description)')
.apply(lambda x: ','.join(x.dropna()), axis=1)
.reset_index(name='Steps'))
print (df1)
Test Case(ID / Brief Description) Steps
0 0 s
1 1
2 2 a
3 3 d,m
Upvotes: 1