Luis

Reputation: 325

Separate dataset by day with pandas

I have a dataset that looks like this:

"2018-05-30 21:26:43",20.61129150,-100.40933971
"2018-05-30 21:26:43",20.61127415,-100.41146822
"2018-06-02 21:56:12",21.15633228,-100.93766080
"2018-06-05 22:57:40",20.59734201,-100.38091286
"2018-06-05 22:57:40",20.59875096,-100.37821426
"2018-06-06 20:56:22",20.61278120,-100.38446619
"2018-06-06 20:56:22",20.59865452,-100.37827264
"2018-06-06 21:57:15",20.59862012,-100.37817348
"2018-06-06 21:57:15",20.59864713,-100.37821263
"2018-06-06 21:57:15",20.59862915,-100.37825902
"2018-06-07 15:54:29",20.61280757,-100.39768857
"2018-06-07 15:54:29",20.61276216,-100.39769379

I want to separate my data into groups by day so I can calculate distances and come up with the average distance travelled per day.

I'm currently separating it by my date column like this:

import pandas as pd

col_names = ['date', 'latitude', 'longitude']
df = pd.read_csv('marco.csv', names=col_names, sep=',', skiprows=1)

# merge
m = df.reset_index().merge(df.reset_index(), on='date')

However, I would like to separate it by day so that I get indexes of

2018-05-30, 2018-06-05, 2018-06-06, 2018-06-07

How would I approach this problem?

Upvotes: 0

Views: 32

Answers (1)

killian95

Reputation: 813

As Yuca mentioned, groupby should do the trick. I would make a new column called "day" that contains just the date part of your timestamp, sort by date, group by "day", then calculate the distance travelled in each group.

import pandas as pd

a = pd.DataFrame(
    [["2018-05-30 21:26:43",20.61129150,-100.40933971],
    ["2018-05-30 21:26:43",20.61127415,-100.41146822],
    ["2018-06-02 21:56:12",21.15633228,-100.93766080],
    ["2018-06-05 22:57:40",20.59734201,-100.38091286]], 
    columns=['date', 'lat', 'lng'])

a['date'] = pd.to_datetime(a['date'])


a['day'] = a['date'].dt.date

b = a.groupby('day')

# Loop over the groups and do whatever calculation you need.
# Each item is a (key, sub-DataFrame) pair: the day and that day's rows.
for day, group in b:
    print(group['lat'].sum())
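The lat sum above is only a placeholder for the real calculation. As a rough sketch of the per-day distance the question asks for, the block below sums the haversine (great-circle) distance between consecutive points within each day. The helper names haversine_km and distance_per_day and the 6371 km Earth radius are assumptions made for illustration, not part of the original code, and haversine is just one reasonable choice of distance formula.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between points given in decimal degrees."""
    lat1, lng1, lat2, lng2 = map(np.radians, (lat1, lng1, lat2, lng2))
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(h))


def distance_per_day(df):
    """Sum the distances between consecutive points within each calendar day."""
    # Stable sort so rows sharing a timestamp keep their original order
    df = df.sort_values('date', kind='mergesort')
    day = df['date'].dt.date
    # Previous point within the same day; NaN for the first point of each day
    prev = df.groupby(day)[['lat', 'lng']].shift()
    step = haversine_km(prev['lat'], prev['lng'], df['lat'], df['lng'])
    # sum() skips the NaN steps, so a day with a single point totals 0
    return step.groupby(day).sum()


a = pd.DataFrame(
    [["2018-05-30 21:26:43", 20.61129150, -100.40933971],
     ["2018-05-30 21:26:43", 20.61127415, -100.41146822],
     ["2018-06-02 21:56:12", 21.15633228, -100.93766080],
     ["2018-06-05 22:57:40", 20.59734201, -100.38091286]],
    columns=['date', 'lat', 'lng'])
a['date'] = pd.to_datetime(a['date'])

daily = distance_per_day(a)
print(daily)         # km travelled on each day
print(daily.mean())  # average km per day
```

Days with only one point contribute a distance of 0 to the average. If those days should be excluded instead, filter them out before calling mean().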

Upvotes: 1
