Reputation: 11
I tried this code, but it raises an AttributeError.
from datetime import datetime
import numpy as np
import pandas as pd
import dask.dataframe as dd
from dask.base import compute

def dfWithTripTimes(df):
    startTime = datetime.now()
    # Pull the two timestamp columns into memory
    duration = df[["tpep_pickup_datetime", "tpep_dropoff_datetime"]].compute()
    # timeToUnix (defined elsewhere) converts a timestamp to Unix time
    pickup_time = [timeToUnix(pkup) for pkup in duration["tpep_pickup_datetime"].values]
    dropoff_time = [timeToUnix(drop) for drop in duration["tpep_dropoff_datetime"].values]
    # Trip duration in minutes (assuming timeToUnix returns seconds)
    trip_duration = (np.array(dropoff_time) - np.array(pickup_time)) / float(60)
    NewFrame = df[['passanger_count', 'trip_distance', 'pickup_longitude', 'pickup_latitude', 'dropoff_longitude', 'dropoff_latitude']]
    NewFrame["trip_duration"] = trip_duration
    NewFrame["pickup_time"] = pickup_time
    # Speed in distance units per hour
    NewFrame["speed"] = (NewFrame["trip_distance"] / NewFrame["trip_duration"]) * 60
    print("Time taken for creation of dataframe is {}".format(datetime.now() - startTime))
    return NewFrame

new_frame = dfWithTripTimes(data)
Upvotes: 1
Views: 3679
Reputation: 57319
Only Dask DataFrame objects have a .compute() method. The error you get is consistent with your dataframe being a pandas DataFrame instead. If you are using pandas, there is no need to call .compute().
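If the same function may receive either kind of dataframe, one option is to call .compute() only when the object actually has it. The sketch below is a minimal illustration under that assumption; `materialize` and the `trips` frame are hypothetical names made up for this example, not part of pandas or Dask.

```python
import pandas as pd


def materialize(df):
    """Return an in-memory pandas object for either kind of dataframe.

    Hypothetical helper: lazy Dask collections expose .compute(),
    while pandas objects are already in memory.
    """
    if hasattr(df, "compute"):
        return df.compute()  # Dask: run the task graph, get pandas back
    return df  # pandas: nothing to compute


# A plain pandas DataFrame has no .compute attribute ...
trips = pd.DataFrame({"trip_distance": [1.2, 3.4]})
has_compute = hasattr(trips, "compute")

# ... so the helper returns it unchanged instead of raising AttributeError.
result = materialize(trips)
```

Note that this duck-typing check would be fooled by a pandas DataFrame that happens to have a column named "compute"; an explicit isinstance check against dask.dataframe.DataFrame avoids that, at the cost of importing Dask.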
Upvotes: 3