Reputation: 47
Row_Number | ColumnA | ColumnB |
---|---|---|
1 | Data 1 | A |
2 | Data A | |
3 | Data B | |
4 | Data 2 | B |
5 | Data C | |
6 | Data D | |
7 | Data E | |
8 | Data 3 | C |
this is how my data updated on a dataframe from another table, In this "Data 1" having a continuation as "Data A" and "Data B" and having only one data as on column B as "A", but in data frame all the "Data 1", "Data A", "Data B" are created in separate rows. Is there any operations are available to concatenate all the "Row_number" 1,2,3 to be appended on a single line and having only value as "A" in columnB?
Expected output is below:
Row_Number | ColumnA | ColumnB |
---|---|---|
1 | Data1,Data A, Data B | A |
2 | Data2,Data C, Data D,Data E | B |
3 | Data3 | C |
Thanks in Advance
I'm new to python and have tried the below code
# variable that stores the values: table_values
import pandas as pd
df=pd.DataFrame(table_values,columns=['ColumnA','ColumnB'])
for index, row in df.iterrows():
if df.loc[index,'ColumnB'] == '' & df.loc[index,'ColumnA'] != '':
df.loc[index-1, 'ColumnA'] = df.loc[index-1, 'ColumnA'] + df.loc[index, 'ColumnA']
print(df)'''
Upvotes: 0
Views: 290
Reputation: 8800
You could do a groupby
here. The simplest is to just use Column B
, since that already corresponds to the rows you want to group. So given this data:
import pandas as pd
df = pd.DataFrame({'ColumnA': ['Data1', 'DataA', 'DataB',
'Data2', 'DataC', 'DataD', 'DataE',
'Data3'],
'ColumnB': ['A', None, None,
'B', None, None, None,
'C']})
# ColumnA ColumnB
# 0 Data1 A
# 1 DataA None
# 2 DataB None
# 3 Data2 B
# 4 DataC None
# 5 DataD None
# 6 DataE None
# 7 Data3 C
Create something to group with, by filling the missing values:
grouper = df['ColumnB'].ffill()
# 0 A
# 1 A
# 2 A
# 3 B
# 4 B
# 5 B
# 6 B
# 7 C
# Name: ColumnB, dtype: object
Then aggregate to concat strings, following this post:
output = df['ColumnA'].groupby(grouper).apply(lambda x : ', '.join(x)).reset_index()
Final result:
ColumnB ColumnA
0 A Data1, DataA, DataB
1 B Data2, DataC, DataD, DataE
2 C Data3
Upvotes: 1