Reputation: 1300
I am facing an issue while adding a new row to the data set.
Here is the example DataFrame:
column_names = ['A','B','C']
items = [['a1','b1','c1'],['a2','b2']]
newDF = pd.DataFrame(items,columns=column_names)
print(newDF)
output:
A B C
0 a1 b1 c1
1 a2 b2 None
Since c2 was missing, it was replaced with None. This is fine and as expected.
Now, if I continue to add similar rows to this existing DataFrame, like this:
newDF.loc[len(newDF)] = ['a3','b3']
I get the error "cannot set a row with mismatched columns".
How can I add this additional row so that the missing c3 is automatically filled with None or NaN?
Upvotes: 15
Views: 42091
Reputation: 8816
What about just this:
>>> print(newDF)
A B C
0 a1 b1 c1
1 a2 b2 None
>>> newDF
A B C
0 a1 b1 c1
1 a2 b2 None
Just assign the new row at index 2, giving the values a3 and b3 plus an explicit NaN for the last column.
>>> import numpy as np
>>> newDF.loc[2] = ['a3','b3', np.nan]
>>> newDF
A B C
0 a1 b1 c1
1 a2 b2 None
2 a3 b3 NaN
Or, with the row held in a variable:
>>> row = ['a3','b3', np.nan]
>>> newDF.loc[2] = row
>>> newDF
A B C
0 a1 b1 c1
1 a2 b2 None
2 a3 b3 NaN
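The padding step above can also be automated. The following is a minimal sketch, not part of the original answer: add_padded_row is a hypothetical helper name, and it assumes the DataFrame has a default integer index so that len(df) is the next free label.

```python
import numpy as np
import pandas as pd

def add_padded_row(df, values, fill=np.nan):
    """Hypothetical helper: add `values` as a new row, padding missing trailing columns with `fill`."""
    missing = len(df.columns) - len(values)
    if missing < 0:
        raise ValueError("row has more values than the DataFrame has columns")
    # Assumes a default RangeIndex, so len(df) is an unused label
    df.loc[len(df)] = list(values) + [fill] * missing
    return df

newDF = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2']], columns=['A', 'B', 'C'])
add_padded_row(newDF, ['a3', 'b3'])
print(newDF)
```

Because the list is padded to the full column count before assignment, the "mismatched columns" error from the question should not occur.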
Another way: append a one-row DataFrame that contains values only for the columns you have (here A and B). The remaining column for that row becomes NaN.
>>> row
['a3', 'b3']
>>> newDF.append(pd.DataFrame([row],index=[2],columns=['A', 'B']))
A B C
0 a1 b1 c1
1 a2 b2 None
2 a3 b3 NaN
Upvotes: 4
Reputation: 3018
You can specify the new row as a dictionary and create a DataFrame from it.
new_entry = {'A': ['a3'], 'B': ['b3']}
new_entry_df=pd.DataFrame.from_dict(new_entry)
Now this can be appended to the original DataFrame:
newDF.append(new_entry_df)
A B C
0 a1 b1 c1
1 a2 b2 None
0 a3 b3 NaN
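Note that the appended row above keeps its own index label 0, so the result has a duplicated index. Also, DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0. A rough equivalent using pd.concat, sketched here under the assumption that a fresh 0..n-1 index is acceptable, would be:

```python
import pandas as pd

newDF = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2']], columns=['A', 'B', 'C'])
new_entry_df = pd.DataFrame({'A': ['a3'], 'B': ['b3']})

# pd.concat aligns on column names; column C is absent from the new
# frame, so it is filled with NaN for that row.
# ignore_index=True renumbers the rows instead of repeating label 0.
result = pd.concat([newDF, new_entry_df], ignore_index=True)
print(result)
```

Like append, pd.concat returns a new DataFrame and leaves newDF unchanged, so the result has to be assigned to a name.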
Upvotes: 3
Reputation: 78690
One option is DataFrame.append:
>>> new_row = ['a3', 'b3']
>>> newDF.append(pd.Series(new_row, index=newDF.columns[:len(new_row)]), ignore_index=True)
A B C
0 a1 b1 c1
1 a2 b2 None
2 a3 b3 NaN
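The same labelled Series can also be used with .loc to modify the DataFrame in place. This is only a sketch of a variation on the answer above: it reindexes the Series to all columns first, so the missing entries become NaN before assignment, and it assumes a default integer index.

```python
import pandas as pd

newDF = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2']], columns=['A', 'B', 'C'])
new_row = ['a3', 'b3']

# Label the values with the first len(new_row) column names,
# then reindex to the full column set; missing columns become NaN.
row = pd.Series(new_row, index=newDF.columns[:len(new_row)]).reindex(newDF.columns)

# Assumes a default RangeIndex, so len(newDF) is the next free label
newDF.loc[len(newDF)] = row
print(newDF)
```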
Upvotes: 10