Reputation: 1491
I have a relatively large (~300 MB) set of geolocation data, where the format is
Timestamp, id, type, x, y
With the following data types:
In[7]: df.dtypes
Out[7]:
Timestamp    datetime64[ns]
id                    int64
type                 object
X                     int64
Y                     int64
dtype: object
Each id corresponds to a particular user, and each person has several hundred points recorded across the day.
I want to create a plot showing where everyone is at a given second, so I need one point for every id. However, the data is somewhat sparse, and it's unlikely that any data point falls exactly on that second. I want to approximate the position by interpolating between the two closest points.
Between data points, I'm assuming people move linearly, so that if we know the location at 8:31:10 and 8:31:50, then at 8:31:30 they should be exactly halfway between the two locations, and at 8:31:11 they should be 1/40th of the way between the points (so interpolating as described here: Pandas data frame: resample with linear interpolation)
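To make the linear-movement assumption concrete, here is a minimal sketch of the interpolation between two known fixes. The function name `lerp_position`, the date, and the coordinates are hypothetical and chosen only for illustration; the times of day match the example above.

```python
import pandas as pd

def lerp_position(t, t0, p0, t1, p1):
    """Linearly interpolate an (x, y) position at time t between two fixes."""
    # Dividing one Timedelta by another yields a plain float fraction.
    frac = (t - t0) / (t1 - t0)
    return (p0[0] + frac * (p1[0] - p0[0]),
            p0[1] + frac * (p1[1] - p0[1]))

# Hypothetical fixes: the date and coordinates are made up for the example.
t0 = pd.Timestamp("2020-01-01 08:31:10")
t1 = pd.Timestamp("2020-01-01 08:31:50")
p0, p1 = (0, 0), (40, 80)

# 20 s into a 40 s gap: halfway between the two locations.
halfway = lerp_position(pd.Timestamp("2020-01-01 08:31:30"), t0, p0, t1, p1)
# 1 s into a 40 s gap: 1/40th of the way.
one_40th = lerp_position(pd.Timestamp("2020-01-01 08:31:11"), t0, p0, t1, p1)
```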
I'm thinking the basic process would be:
I know I can loop through each id with
for name, group in df.groupby('id'):
and plotting isn't a problem, but I'm not sure about the rest.
After a bit of searching I haven't found any good way to do this for a single value from each group. Other answers suggest using the resample and interpolate functions, but that will take way too long with the size of data I have, and does a lot of unnecessary calculations seeing as I only need one point.
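One possible way to get a single interpolated point per id without resampling is to call `np.interp` once per group at the target time. This is only a sketch under the assumptions stated in the question (columns `Timestamp`, `id`, `X`, `Y`); the helper name `positions_at` and the demo data are hypothetical. Note that `np.interp` clamps to the first or last recorded position when the target time falls outside a group's time range, which may or may not be the behaviour wanted here.

```python
import numpy as np
import pandas as pd

def positions_at(df, when):
    """Return one linearly interpolated (X, Y) row per id at time `when`."""
    when = pd.Timestamp(when)
    rows = {}
    # Sorting first keeps each group's rows in time order, as np.interp requires.
    for uid, g in df.sort_values("Timestamp").groupby("id"):
        # Seconds relative to the target time, so the target sits at 0.0.
        secs = (g["Timestamp"] - when).dt.total_seconds().to_numpy()
        rows[uid] = (np.interp(0.0, secs, g["X"].to_numpy()),
                     np.interp(0.0, secs, g["Y"].to_numpy()))
    return pd.DataFrame.from_dict(rows, orient="index", columns=["X", "Y"])

# Hypothetical demo data with two ids and two fixes each.
demo = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2020-01-01 08:31:10", "2020-01-01 08:31:50",
                                 "2020-01-01 08:31:00", "2020-01-01 08:32:00"]),
    "id": [1, 1, 2, 2],
    "type": ["a", "a", "a", "a"],
    "X": [0, 40, 100, 160],
    "Y": [0, 80, 0, 0],
})
snapshot = positions_at(demo, "2020-01-01 08:31:30")
```

The resulting frame has one row per id, so something like `snapshot.plot.scatter(x="X", y="Y")` should be enough to draw the snapshot. Each group costs a single interpolation rather than a full resample, though the Python-level loop over groups may still be slow with very many ids.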
Upvotes: 1
Views: 554
Reputation: 20080
It is not quite clear what you want, but let's start with something.
First, you probably need list of unique IDs, right?
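import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = ...  # load the data
tstep = ...  # time step, e.g. pd.Timedelta(seconds=1)
for uid in df['id'].unique():
    # subset df by id and sort the new frame by Timestamp
    df_id = df[df['id'] == uid].sort_values('Timestamp')
    tmin = df_id['Timestamp'].iloc[0]
    tmax = df_id['Timestamp'].iloc[-1]
    # range() cannot step over timestamps; pd.date_range can
    times = pd.date_range(tmin, tmax, freq=tstep)
    # np.interp needs numbers, so convert timestamps to int64 nanoseconds
    t_ns = df_id['Timestamp'].astype('int64')
    # interpolate X and Y at every time step
    xs = np.interp(times.astype('int64'), t_ns, df_id['X'])
    ys = np.interp(times.astype('int64'), t_ns, df_id['Y'])
    plt.plot(xs, ys)
plt.show()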
Does this look reasonable?
Upvotes: 1