Reputation: 105
I want to add new rows to a pandas DataFrame regardless of the order and the number of columns in each new row.
As I add new rows, I want my data frame to look like the one below. Each row can have a different number of columns.
---- | 1 | 2 | 3 | 4
row1 | data | data |
row2 | data | data | data
row3 | data |
row4 | data | data | data | data
Upvotes: 0
Views: 183
Reputation: 60
In pandas you can concatenate a new row onto an existing data frame, even if the new row has a different number of columns, as shown below.
import pandas as pd
df = pd.DataFrame([list(range(5))])
new_row = pd.DataFrame([list(range(4))])
result = pd.concat([df, new_row], ignore_index=True, axis=0)  # returns a new frame; df itself is unchanged
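In the above code snippet, the pd.concat function stacks the two data frames on top of each other. Columns are aligned by label, so any column missing from one of the frames is filled with NaN in the corresponding rows. The argument ignore_index=True discards the original row labels and renumbers the rows of the result from 0 to n-1.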
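To make the alignment behaviour concrete, here is a minimal sketch that reuses the two frames from the snippet above and inspects the result. The variable name result is only for illustration.

```python
import pandas as pd

# Existing frame with five columns labelled 0..4
df = pd.DataFrame([list(range(5))])
# New row with only four columns labelled 0..3
new_row = pd.DataFrame([list(range(4))])

# concat aligns on column labels; ignore_index=True renumbers the rows 0..n-1
result = pd.concat([df, new_row], ignore_index=True, axis=0)

print(result.shape)         # (2, 5)
print(list(result.index))   # [0, 1]
print(result.isna().sum().sum())  # 1 (column 4 is missing for the shorter row)
```

Note that the column holding the missing value is converted to a float dtype, since NaN cannot be stored in an integer column by default.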
Upvotes: 0
Reputation: 4186
Building pandas DataFrames one row at a time is typically very slow. One solution is to first gather the data in a dictionary, and then turn it into a dataframe for further processing:
import pandas as pd

d = {
'att1': ['a', 'b'],
'att2': ['c', 'd', 'e'],
'att3': ['f'],
'att4': ['g', 'h', 'i', 'j'],
}
df = pd.DataFrame.from_dict(d, orient='index')
This results in df containing:
      0     1     2     3
att1  a     b  None  None
att2  c     d     e  None
att3  f  None  None  None
att4  g     h     i     j
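If the rows arrive one at a time, the same idea can be sketched as a loop that collects them in a plain dictionary and builds the frame only once at the end. The incoming list and the row labels below are hypothetical stand-ins for wherever the data actually comes from.

```python
import pandas as pd

# Hypothetical incoming rows: (label, values) pairs of varying length
incoming = [
    ("row1", ["a", "b"]),
    ("row2", ["c", "d", "e"]),
    ("row3", ["f"]),
    ("row4", ["g", "h", "i", "j"]),
]

rows = {}
for label, values in incoming:
    rows[label] = values  # cheap dict insert, no DataFrame work yet

# Build the frame once; shorter rows are padded with missing values
df = pd.DataFrame.from_dict(rows, orient="index")
print(df.shape)  # (4, 4)
```

Use pd.isna to test for the padded cells rather than comparing against None directly, since the missing-value marker can depend on the column dtype.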
Alternatively, and more in line with typical pandas formats, store the data in one long Series in which 'att1' is the index label for the values 'a' and 'b', and so on:
series = df.stack().reset_index(level=1, drop=True)
which allows for easy selection of various attributes:
series.loc[['att1', 'att3']]
returning:
att1 a
att1 b
att3 f
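The long Series can also be built straight from the dictionary, without going through the padded frame first. This is only a sketch of an alternative under that assumption, not the approach used above. It avoids depending on how stack() treats the padded missing values, which I believe can differ between pandas versions.

```python
import pandas as pd

d = {"att1": ["a", "b"], "att2": ["c", "d", "e"], "att3": ["f"]}

# Repeat each key once per value so index labels and values line up
index = [key for key, values in d.items() for _ in values]
data = [value for values in d.values() for value in values]
series = pd.Series(data, index=index)

# Selecting by a list of labels returns every entry carrying those labels
selected = series.loc[["att1", "att3"]]
print(list(selected.index))  # ['att1', 'att1', 'att3']
print(list(selected))        # ['a', 'b', 'f']
```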
Upvotes: 1