Pandas - Column sorting changes

Question

I am trying to merge a set of CSV files into one DataFrame. In the process I create a new column called Time_Created, which I want to have as the first column of the DataFrame.

df_v1 = pd.concat([pd.read_csv(f) for f in updatedfiles_1], sort=True)
cols = df_v1.columns.tolist()
print(cols)
cols.insert(0, cols.pop(cols.index('Time_Created')))
print(cols)  # <-- This shows the columns as expected
df_v1.to_csv('file.csv')

When I print the columns before saving to CSV, they are in the required order, but when I open the saved CSV the column order is different.

Given below is the sequence of columns in the source:

Name,Price,Quanity,Time_Created

Sequence that I am trying to sort into:

Time_Created,Name,Price,Quanity

Could anyone explain why the output file ends up with a different column order? Thanks.

jtweeder · Accepted Answer

I believe you are close. You never actually apply your modified order to the DataFrame: cols is a separate Python list, so rearranging it leaves df_v1 unchanged. You can do something like this and it should work.

df_v1 = pd.concat([pd.read_csv(f) for f in updatedfiles_1],
                  sort=True)
cols = df_v1.columns.tolist()
print(cols)
cols.insert(0, cols.pop(cols.index('Time_Created')))
print(cols)  # <-- This shows the columns as expected
df_v1[cols].to_csv('file.csv')  # <-- Here you tell it to write df_v1 in [cols] order

If the last print(cols) shows the order you want, you can simply index df_v1 with cols when writing it out with to_csv.
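For anyone who wants to try this without the original files, here is a minimal sketch of the same idea. The two in-memory CSV strings are made-up stand-ins for the files in updatedfiles_1, and index=False and ignore_index=True are additions that were not in the question; they only keep the row index out of the output.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the CSV files in updatedfiles_1 (not the real data).
csv_a = "Name,Price,Quanity,Time_Created\napple,1.0,3,2020-01-01\n"
csv_b = "Name,Price,Quanity,Time_Created\npear,2.5,1,2020-01-02\n"

df = pd.concat(
    [pd.read_csv(io.StringIO(text)) for text in (csv_a, csv_b)],
    sort=True,
    ignore_index=True,
)

# tolist() returns a plain Python list that is independent of df.
cols = df.columns.tolist()
cols.insert(0, cols.pop(cols.index("Time_Created")))

# Rearranging the list has not changed the DataFrame's own column order.
unchanged_order = df.columns.tolist()

# Indexing with the list returns a new DataFrame in the requested order.
reordered = df[cols]
header = reordered.to_csv(index=False).splitlines()[0]

# Alternative: leave df alone and pass the order to to_csv via its columns argument.
header_alt = df.to_csv(columns=cols, index=False).splitlines()[0]

print(unchanged_order)
print(header)
print(header_alt)
```

If the new order should stick for later steps too, assigning the result back (df = df[cols]) is the usual way to do it.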
