ixixxsx

Reputation: 43

I'm trying to organize my csv data by sorting it according to 2 columns

I based my code on earlier answers to this question. However, my output is not what I anticipated. I'm organizing a refined data set by its only 2 columns. This is the refined data set I'm working with, sp:

      ACC_TIME     COUNTY_NAME
978       0:01         Harford
952       0:01    Anne Arundel
995       0:01  Prince Georges
1059      0:01         Carroll
941       0:01  Prince Georges
...        ...             ...
17535     9:12       Frederick
17536     9:12       Frederick
17251     9:12    Anne Arundel
17507     9:12      Dorchester
18636     9:12       Frederick

sp is just df, with particular columns dropped. This is my code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import csv
from operator import itemgetter
from datetime import datetime
import operator



df=pd.read_csv("2012CarCrashes.csv")
df.drop(['ACC_TIME_CODE','ROAD', 'INTERSECT_ROAD','DIST_FROM_INTERSECT', 'CITY_NAME', 
         'DIST_DIRECTION', 'COUNTY_CODE', 'VEHICLE_COUNT', 'PROP_DEST', 
         'COLLISION_WITH_2', 'CASE_NUMBER', 'BARRACK'], axis=1,inplace=True) #--> inplace=True modifies df itself instead of returning a new DataFrame

df["ACC_DATE"]= pd.to_datetime(df["ACC_DATE"])  #-->converts datatype to datetime

df = df.sort_values('ACC_TIME') #-->sorts according to time of accident
.
.
.
.
sp =df.drop(['ACC_DATE','DAY_OF_WEEK','INJURY','COLLISION_WITH_1'],axis=1)

#Next, how can I organize the data by county and time of accidents? 

sp1 = sorted(sp, key=operator.itemgetter(0, 1))
print(sp1)

And this is the output I keep getting:

['ACC_TIME', 'COUNTY_NAME']

See, it just prints the title of both columns and nothing else.

What may I be doing wrong?

Upvotes: 0

Views: 25

Answers (1)

Mark Tolonen

Reputation: 177901

Use a DataFrame method such as sort_values() to sort the DataFrame. The built-in sorted() is not DataFrame-aware: iterating a DataFrame yields its column names, so sorted() only ever sees those labels. For example:

>>> import pandas as pd
>>> df = pd.DataFrame([[2,3,4],[1,3,5],[2,1,7]],columns=['A','B','C'])
>>> df
   A  B  C
0  2  3  4
1  1  3  5
2  2  1  7
>>> df = df.sort_values(['A','B'])
>>> df
   A  B  C
1  1  3  5
2  2  1  7
0  2  3  4
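
Applied to the question's columns, the same idea might look like the sketch below. The small sp frame is a hypothetical stand-in with illustrative values, not the real CSV. The sketch also assumes ACC_TIME is stored as "H:MM" strings, which the question's output suggests but does not confirm. If so, a plain sort compares the times as text, so "10:30" lands before "9:12", and a temporary numeric key (the MINUTES column here is made up for illustration) gives chronological order instead.

```python
import pandas as pd

# Hypothetical stand-in for the asker's `sp` frame; values are illustrative only.
sp = pd.DataFrame(
    {
        "ACC_TIME": ["9:12", "0:01", "10:30", "0:01", "9:12"],
        "COUNTY_NAME": ["Frederick", "Harford", "Frederick",
                        "Anne Arundel", "Anne Arundel"],
    }
)

# Iterating a DataFrame yields its column labels, which is all sorted() sees.
labels = list(sp)

# Sort by county first, then by time within each county.
# With string times this is a text comparison, so "10:30" precedes "9:12".
by_county_time = sp.sort_values(["COUNTY_NAME", "ACC_TIME"])

# Assumption: ACC_TIME is "H:MM" text. Convert it to minutes since midnight
# and sort on that temporary key to get chronological order.
parts = sp["ACC_TIME"].str.split(":", expand=True).astype(int)
sp_chrono = (
    sp.assign(MINUTES=parts[0] * 60 + parts[1])
    .sort_values(["COUNTY_NAME", "MINUTES"])
    .drop(columns="MINUTES")
)

print(labels)
print(by_county_time)
print(sp_chrono)
```

Swap the order of the column list if time should take priority over county.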

Upvotes: 1
