Reputation: 55
Is it possible to start the index from n in a pandas dataframe?
I have several datasets saved as CSV files and would like to add an index column holding the row number, with each file's numbering continuing from where the previous file ended.
For example, for the first file I'm using the following code, which works fine: the output CSV file has rows numbered from 1 to 1048574, as expected:
yellow_jan['index'] = range(1, len(yellow_jan) + 1)
I would like to do the same for the yellow_feb file, but starting the row index at 1048575, and so on.
Appreciate any help!
Upvotes: 1
Views: 2422
Reputation: 912
You can either reset the index at the end, or keep a running offset in a variable and pass it to NumPy's `arange` function. Update the variable with the number of rows in each file as you read it.
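A minimal sketch of the running-offset idea, assuming the files are processed in order. The small in-memory frames and the `fare` column are hypothetical stand-ins for the real CSV files (which would normally come from `pd.read_csv`), and plain `range` is used here in place of `arange`, matching the code in the question:

```python
import pandas as pd

# Hypothetical stand-ins for the monthly files; real data would be
# loaded with pd.read_csv instead.
frames = {
    "yellow_jan": pd.DataFrame({"fare": [5.0, 7.5, 3.2]}),
    "yellow_feb": pd.DataFrame({"fare": [4.1, 9.9]}),
}

next_start = 1  # row number to assign to the first row of the next file
for name, frame in frames.items():
    # Number this file's rows starting where the previous file stopped.
    frame["index"] = range(next_start, next_start + len(frame))
    # Advance the offset by the number of rows just numbered.
    next_start += len(frame)
```

With the real data, `next_start` would be 1048575 after the January file, so February's numbering picks up from there.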
Upvotes: 0
Reputation: 13447
If your plan is to concatenate the DataFrames, you can just use `ignore_index=True`:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({"a": np.arange(10)})
df2 = pd.DataFrame({"a": np.arange(10,20)})
df = pd.concat([df1, df2],ignore_index=True)
Otherwise, shift the second DataFrame's index by the length of the first:
df2.index += len(df1)
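A small self-contained sketch of the index-shifting alternative, assuming the offset is meant to be the number of rows in the first frame (`df1`) and that both frames still have their default zero-based index:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": np.arange(10)})
df2 = pd.DataFrame({"a": np.arange(10, 20)})

# Shift df2's default index (0..9) so it continues after df1's last label.
df2.index += len(df1)

# A plain concat now keeps one continuous index without ignore_index.
combined = pd.concat([df1, df2])
```

To start numbering at 1 rather than 0, as in the question, the same shift can be applied to the first frame too (`df1.index += 1`), with the second frame then shifted by `len(df1) + 1`.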
Upvotes: 1