Reputation: 2109
I am trying to use Pandas to sum the time (hours, minutes) of a series. The data comes from a TimeField
class PhoneRecord ( models.Model ):
et = models.TimeField ( null=True, blank=True )
In python I get the record and convert to a dataframe.
phone = PhoneRecord.objects.all()
df = read_frame ( phone )
df.et = df.et.fillna ( '00:00:00' ) # some records are blank
df [ "time" ] = pd.to_datetime(df.et, format = '%H:%M:%S', errors = 'coerce')
this gives me the following output.
0 00:00:35
1 00:00:29
2 00:00:00
3 00:00:00
4 00:00:37
......
When I try to sum
df.time.sum ()
I get errors like: unsupported operand type(s) for +: 'datetime.time' and 'datetime.time'
What do I need to do to be able to sum and average the data.
Thank you.
Upvotes: 1
Views: 882
Reputation: 190
time
objects into timedelta
objects which can then be summed together. (see the "print" line)from datetime import datetime, timedelta
import pandas as pd
phone = PhoneRecord.objects.all()
df = pd.DataFrame(list([i.__dict__ for i in phone])) # create pd.df from model query
df.et = df.et.fillna('00:00:00') # some records are blank
print(df.et)
print("SUM:", sum([datetime.combine(datetime.min, time) - datetime.min for time in df.et.tolist()], timedelta()))
You should get something like this:
0 00:00:20
1 00:00:20
2 00:00:50
3 00:00:30
4 00:00:20
SUM: 0:02:20
I had to change things a bit to get them to work on my end so hopefully, it is the same with you and your version of Pandas and Django. Hope this helps!
Upvotes: 2
Reputation: 1672
All computations, such as getting averages, counting etc. should be performed (if possible) using database engine. I don't know about underlying problem, but using pandas
on the server-side to get average values is definitely overkill. You need to look at aggregation facility of Django.
Also, you probably need to restructure the model. If you need to store duration of a phone conversation, you can use FloatField instead, e.g.
class PhoneRecord(models.Model):
duration = models.FloatField(blank=True, default=0.0, help_text=_('duration in seconds'))
# other fields...
# also, you can set up the duration field with `editable=False`, and
# calculate its value each time the record is created
In this case you can use Avg
:
from django.db.models import Avg
PhoneRecords.objects.all().aggregate(Avg('duration'))
and get something like this
{'duration_avg': 12.3}
Upvotes: 1