Reputation: 28437
I know that I can reset the index like so:
df.reset_index(inplace=True)
but this will start the index from 0. I want to start it from 1. How do I do that without creating any extra columns and by keeping the index/reset_index functionality and options? I do not want to create a new dataframe, so inplace=True should still apply.
Upvotes: 73
Views: 147114
Reputation: 393963
Just assign a new index array directly:
df.index = np.arange(1, len(df)+1)
Or, if the index is already 0-based, just:
df.index += 1
Example:
In [151]:
df = pd.DataFrame({'a': np.random.randn(5)})
df
Out[151]:
a
0 0.443638
1 0.037882
2 -0.210275
3 -0.344092
4 0.997045
In [152]:
df.index = np.arange(1, len(df)+1)
df
Out[152]:
a
1 0.443638
2 0.037882
3 -0.210275
4 -0.344092
5 0.997045
TIMINGS
For some reason I can't take timings on reset_index, but the following are timings on a 100,000-row df:
In [160]:
%timeit df.index = df.index + 1
The slowest run took 6.45 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 107 µs per loop
In [161]:
%timeit df.index = np.arange(1, len(df)+1)
10000 loops, best of 3: 154 µs per loop
So without the timing for reset_index I can't say definitively; however, it looks like just adding 1 to each index value will be faster if the index is already 0-based.
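If the index is not already 0-based (say, after filtering or sorting), one way to keep reset_index in the picture is to reset first and then shift. This is only a minimal sketch: it assumes the old index can be discarded (drop=True), and the sample frame is made up for illustration.

```python
import pandas as pd

# Hypothetical frame whose index is no longer contiguous.
df = pd.DataFrame({'a': [10, 20, 30]}, index=[7, 3, 9])

# Reset to the default 0-based index without adding a column...
df.reset_index(drop=True, inplace=True)
# ...then shift every label up by one so the index starts at 1.
df.index += 1

print(df.index.tolist())
```

No new dataframe is created here; both steps modify the existing df.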
Upvotes: 124
Reputation: 23051
One possibility is to simply increment each index value by 1 (which changes the index in-place).
df = pd.DataFrame({'col': [1, 2, 3]})
df.index += 1
Another is to assign a new range index that starts from 1, using set_axis().
df = pd.DataFrame({'col': [1, 2, 3]})
df = df.set_axis(range(1, len(df)+1))
In fact, since set_axis() assigns a new object to the index, i.e. resets the index, it can be used instead of reset_index().
It is especially useful if you need to make the index start from 1 in a pipeline (where assigning or incrementing the index wouldn't work).
df = pd.DataFrame({'col': [4, 1, 2, 3]})
df = (
df
.reset_index()
.set_axis(range(1, len(df)+1))
)
Or, if the dataframe's shape is modified in the pipeline (e.g. using query()), pipe() can be used so that the length is taken from the intermediate result rather than from the original dataframe.
df = pd.DataFrame({'col': [4, 1, 2, 3]})
df = (
df
.query('col > 2')
.pipe(lambda x: x.set_axis(range(1, len(x)+1)))
)
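To see why pipe() matters here, the sketch below (same sample data as above; the names mismatch and out are only for illustration) first tries len(df) from the original frame, which no longer matches the filtered length, and then uses the pipe() form.

```python
import pandas as pd

df = pd.DataFrame({'col': [4, 1, 2, 3]})

# len(df) is evaluated on the original 4-row frame, but query() leaves
# only 2 rows, so set_axis() rejects the 4 new labels.
try:
    df.query('col > 2').set_axis(range(1, len(df) + 1))
    mismatch = False
except ValueError:
    mismatch = True

# pipe() hands the filtered frame to the lambda, so len(x) is 2.
out = df.query('col > 2').pipe(lambda x: x.set_axis(range(1, len(x) + 1)))

print(mismatch, out.index.tolist())
```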
Upvotes: 4
Reputation: 185
For this, you can do the following (I created an example dataframe):
price_of_items = pd.DataFrame({
"Wired Keyboard":["$7","4.3","12000"],"Wireless Keyboard":["$13","4.6","14000"]
})
price_of_items.index += 1
Upvotes: 4
Reputation: 793
You can also specify the start value using a RangeIndex, which pandas supports.
Printing df.index shows the default: RangeIndex(start=0, stop=<number of rows>, step=1).
You can specify any start value like this (stop is exclusive, so the 600 here assumes a 599-row dataframe):
df.index = pd.RangeIndex(start=1, stop=600, step=1)
Refer: pandas.RangeIndex
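Rather than hard-coding stop, it can be derived from the dataframe's length. A small sketch, with a made-up three-row frame:

```python
import pandas as pd

df = pd.DataFrame({'col': ['a', 'b', 'c']})

# stop is exclusive, so len(df) + 1 yields the labels 1..len(df).
df.index = pd.RangeIndex(start=1, stop=len(df) + 1, step=1)

print(df.index)
```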
Upvotes: 11