Reputation: 3101
I'm trying to add a new row to the DataFrame with a specific index name, 'e'.
number variable values
a NaN bank true
b 3.0 shop false
c 0.5 market true
d NaN government true
I have tried the following but it's creating a new column instead of a new row.
new_row = [1.0, 'hotel', 'true']
df = df.append(new_row)
I still don't understand how to insert a row with a specific index. I'd be grateful for any suggestions.
Upvotes: 45
Views: 122819
Reputation: 19
DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=False) was deprecated in pandas 1.4.0 and removed in pandas 2.0.
Source: Pandas Documentation
The documentation recommends using pd.concat() instead.
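It would look like this if you wanted an empty row with only the added index name. Wrap the new row in a DataFrame rather than a Series, because concatenating a bare Series adds it as a new column:
df = pd.concat([df, pd.DataFrame(index=['New index label'], columns=df.columns)])
If you wanted to add data (a list with one value per column), use this:
df = pd.concat([df, pd.DataFrame([data], index=['New index label'], columns=df.columns)])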
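For reference, here is a minimal runnable sketch of the concat approach applied to the frame from the question. The way the frame is rebuilt and the name row_df are my own assumptions for illustration, not something from the pandas documentation:

```python
import pandas as pd

# Rebuild the frame from the question (construction is assumed for the demo).
df = pd.DataFrame(
    {
        "number": [float("nan"), 3.0, 0.5, float("nan")],
        "variable": ["bank", "shop", "market", "government"],
        "values": [True, False, True, True],
    },
    index=["a", "b", "c", "d"],
)

new_row = [1.0, "hotel", "true"]

# Wrap the list in a one-row DataFrame so the values line up with df's
# columns and the index carries the desired label 'e'.
row_df = pd.DataFrame([new_row], index=["e"], columns=df.columns)
df = pd.concat([df, row_df])

print(df)
```

Note that the string 'true' from the question turns the values column into object dtype; passing the boolean True would avoid mixing types.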
Hope that helps!
Upvotes: 1
Reputation: 965
df.loc['e', :] = [1.0, 'hotel', 'true']
should be the correct implementation in case of conflicting index and column names.
Upvotes: 1
Reputation: 30605
Convert the list to a DataFrame and append it. This also works if you want to add multiple rows at once:
df = df.append(pd.DataFrame([new_row],index=['e'],columns=df.columns))
Or, for a single row (thanks @Zero):
df = df.append(pd.Series(new_row, index=df.columns, name='e'))
Output:
number variable values
a NaN bank True
b 3.0 shop False
c 0.5 market True
d NaN government True
e 1.0 hotel true
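On pandas 2.0 and later, DataFrame.append is no longer available, so the multi-row idea can be sketched with pd.concat instead. This is only an illustration: the second row ('f') and the shortened starting frame are made up, and verify_integrity=True is an optional extra that makes concat raise if a label already exists:

```python
import pandas as pd

# A shortened version of the frame from the question (assumed for the demo).
df = pd.DataFrame(
    {"number": [3.0, 0.5], "variable": ["shop", "market"], "values": [False, True]},
    index=["b", "c"],
)

# Two rows to add; the second one ('f') is hypothetical.
new_rows = pd.DataFrame(
    [[1.0, "hotel", True], [2.5, "cafe", False]],
    index=["e", "f"],
    columns=df.columns,
)

# verify_integrity=True makes concat raise ValueError on duplicate index labels.
df = pd.concat([df, new_rows], verify_integrity=True)

print(df)
```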
Upvotes: 17
Reputation: 210842
You can use df.loc[_not_yet_existing_index_label_] = new_row.
Demo:
In [3]: df.loc['e'] = [1.0, 'hotel', 'true']
In [4]: df
Out[4]:
number variable values
a NaN bank True
b 3.0 shop False
c 0.5 market True
d NaN government True
e 1.0 hotel true
PS: with this method you can't add a row with an index label that already exists (a duplicate). In that case, the existing row with that label is updated instead.
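A small sketch of that add-versus-update behaviour, with a guard for cases where overwriting would be a mistake. The two-column frame and the extra values ('bakery', 'museum') are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {"number": [3.0, 0.5], "variable": ["shop", "market"]},
    index=["b", "c"],
)

df.loc["e"] = [1.0, "hotel"]   # 'e' is new: a row is added
df.loc["b"] = [9.0, "bakery"]  # 'b' exists: that row is overwritten

# Guard: only assign when the label is not already in the index.
label = "e"
if label in df.index:
    print(f"{label!r} already exists; skipping")
else:
    df.loc[label] = [7.0, "museum"]

print(df)
```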
UPDATE:
This might not work in recent Pandas/Python 3 if the index is a DatetimeIndex and the new row's index label doesn't exist yet.
It will work if we specify correct index value(s).
Demo (using pandas 0.23.4):
In [17]: ix = pd.date_range('2018-11-10 00:00:00', periods=4, freq='30min')
In [18]: df = pd.DataFrame(np.random.randint(100, size=(4,3)), columns=list('abc'), index=ix)
In [19]: df
Out[19]:
a b c
2018-11-10 00:00:00 77 64 90
2018-11-10 00:30:00 9 39 26
2018-11-10 01:00:00 63 93 72
2018-11-10 01:30:00 59 75 37
In [20]: df.loc[pd.to_datetime('2018-11-10 02:00:00')] = [100,100,100]
In [21]: df
Out[21]:
a b c
2018-11-10 00:00:00 77 64 90
2018-11-10 00:30:00 9 39 26
2018-11-10 01:00:00 63 93 72
2018-11-10 01:30:00 59 75 37
2018-11-10 02:00:00 100 100 100
In [22]: df.index
Out[22]: DatetimeIndex(['2018-11-10 00:00:00', '2018-11-10 00:30:00', '2018-11-10 01:00:00', '2018-11-10 01:30:00', '2018-11-10 02:00:00'], dtype='datetime64[ns]', freq=None)
Upvotes: 74
Reputation: 886
If it's the first row you need (assuming import pandas as pd):
df = pd.DataFrame(columns=['number', 'variable', 'values'])
df.loc['e', ['number', 'variable', 'values']] = [1.0, 'hotel', 'true']
Upvotes: 4