Reputation: 55

How to index a pandas data frame starting at n?

Is it possible to start the index from n in a pandas dataframe?

I have some datasets saved as csv files, and would like to add an index column whose row numbers continue from where the numbering ended in the previous file.

For example, for the first file I'm using the following code, which works fine: the output csv file has rows numbered from 1 to 1048574, as expected:

yellow_jan['index'] = range(1, len(yellow_jan) + 1)

I would like to do the same for the yellow_feb file, but starting the row index at 1048575, and so on for later files.

Appreciate any help!

Upvotes: 1

Answers (3)

Roo

Reputation: 912

You can either reset the index at the end, or keep a running counter in a variable and pass it to the `arange` function as the starting value. After each file you read, increase the counter by that file's number of rows.
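A minimal sketch of the running-counter idea, assuming NumPy's `arange` is the function meant here. The two small frames and the `fare` column are hypothetical stand-ins for the real monthly files, which would normally come from `pd.read_csv`; `next_row` is just an illustrative name for the counter.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the monthly files; in practice each frame
# would be loaded with pd.read_csv(...).
yellow_jan = pd.DataFrame({"fare": [5.0, 7.5, 3.2]})
yellow_feb = pd.DataFrame({"fare": [4.1, 9.9]})

next_row = 1  # first row number to assign
for frame in (yellow_jan, yellow_feb):
    # Number this file's rows starting where the previous file stopped.
    frame["index"] = np.arange(next_row, next_row + len(frame))
    # Advance the counter by the number of rows just numbered.
    next_row += len(frame)
```

With these toy frames, the second file's numbering picks up right after the last row number of the first.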

Upvotes: 0

rpanai

Reputation: 13447

If you plan to concatenate the dataframes, you can just use

import pandas as pd
import numpy as np
df1 = pd.DataFrame({"a": np.arange(10)})
df2 = pd.DataFrame({"a": np.arange(10, 20)})
df = pd.concat([df1, df2], ignore_index=True)

Otherwise, shift the index of the second dataframe by the length of the first:

df2.index += len(df1)
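Since the question numbers rows from 1 rather than 0, here is a small sketch of how the concatenated frame could be shifted to 1-based numbering and split back into per-file pieces. The names `first_part` and `second_part` are illustrative, and the toy data mirrors the frames in this answer.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": np.arange(10)})
df2 = pd.DataFrame({"a": np.arange(10, 20)})

# ignore_index=True gives one continuous 0-based index over both frames.
df = pd.concat([df1, df2], ignore_index=True)

# Shift to 1-based numbering, as in the question.
df.index += 1

# Split back by position if each file should be written out separately.
first_part = df.iloc[:len(df1)]
second_part = df.iloc[len(df1):]
```

Here `second_part` keeps the continued numbering, so it can be saved on its own without renumbering.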

Upvotes: 1

frhyme

Reputation: 1036

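# Assumes df has exactly 10 rows; the range must match len(df).
df["new_index"] = range(10, 20)
# Use the new column as the index, so the row labels start at 10.
df = df.set_index("new_index")
df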
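For an arbitrary starting value, a similar result can probably be had without a helper column by assigning a `pd.RangeIndex` directly. This is only a sketch: `n` and the 10-row frame are hypothetical, and the index name is kept as `new_index` to match the snippet above.

```python
import pandas as pd

df = pd.DataFrame({"a": range(10)})

n = 10  # hypothetical starting value for the index
# Build an index n, n+1, ..., n+len(df)-1 sized to the frame itself.
df.index = pd.RangeIndex(start=n, stop=n + len(df), name="new_index")
```

Deriving the stop value from `len(df)` avoids hard-coding the end of the range for each file.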

Upvotes: 2
