Reputation: 55
I want to combine the cells as follows.
Before:
| | test1 | test2 | test3 |
| -:|:----- |:----- | -----:|
| 0 | value | value | value |
| 1 | test4 | test5 |
| 2 | value | value |
| 3 | test6 | test7 | test8 |
| 4 | value | value | value |
| 5 | test9 | test0 |
| 6 | value | value |
After:
| | test1 | test2 | test3 | test4 | test5 | test6 | test7 | test8 | test9 | test0 |
| -:|:----- |:----- | ----- |:----- |:----- |:----- |:----- |:----- |:----- | -----:|
| 0 | value | value | value | value | value | value | value | value | value | value |
I want to change the cells using the Python code with the help of pandas. Please help me with this. Thank you.
Upvotes: 3
Views: 143
Reputation: 25239
Slice from row 1 and pre-process it with melt
to series s
. Slice even rows of s
for columns and odd rows for values and construct a new dataframe from these columns and values. Finally, join to row 0 of original df
Sample df:
test1 test2 test3
0 value1 value2 value3
1 test4 test5 NaN
2 value4 value5 NaN
3 test6 test7 test8
4 value6 value7 value8
5 test9 test0 NaN
6 value9 value0 NaN
s = df.iloc[1:].melt().dropna()['value']
cols = s[0::2].tolist()
vals = s[1::2].tolist()
df_final = df.iloc[[0]].join(pd.DataFrame([vals], columns=cols)).sort_index(1)
Out[140]:
test0 test1 test2 test3 test4 test5 test6 test7 test8 \
0 value0 value1 value2 value3 value4 value5 value6 value7 value8
test9
0 value9
Upvotes: 2
Reputation: 953
Here is something you can do.
import pandas as pd
df = pd.DataFrame({'test1 ':['15','test4','79', 'test6', '34', 'test9', '323'],
'test2 ':['78','test5','45', 'test7', '4', 'test10', '34'],
'test3 ':['8','','', 'test8', '56', '', '']})
print("Original Dataframe")
print(df)
df1 = pd.DataFrame()
col_names = []
col_names = df.iloc[1::2, :].to_numpy('str').tolist()
row_values = df.iloc[2::2, :].to_numpy('str').tolist()
col_names = [j for sub in col_names for j in sub if j!= '']
row_values = [j for sub in row_values for j in sub if j!= '']
df1 = pd.DataFrame([row_values], columns= col_names)
print("Dataframe 1")
print(df1)
df2 = df.iloc[[0, ]]
print("Dataframe 2")
print(df2)
df3 = pd.concat([df2, df1], axis=1)
print("Dataframe Result")
print(df3)
The df1
contains all the data except for the first one. The df2
contains only original columns and first row. Finally, you concatenate df1 and df2
to form df3
.
This gives you an output as:
Original Dataframe
test1 test2 test3
0 15 78 8
1 test4 test5
2 79 45
3 test6 test7 test8
4 34 4 56
5 test9 test10
6 323 34
Dataframe 1
test4 test5 test6 test7 test8 test9 test10
0 79 45 34 4 56 323 34
Dataframe 2
test1 test2 test3
0 15 78 8
Dataframe Result
test1 test2 test3 test4 test5 test6 test7 test8 test9 test10
0 15 78 8 79 45 34 4 56 323 34
Upvotes: 2