Reputation: 165
I am wondering if I can compute the average and store it along with my data. My code is this:
for file in sorted_files:
    df = process_file(file)
    for row, item in df.iterrows():
        data_dict = item.to_dict()
        mycol1.update_one(
            {"nsamples": {"$lt": 13}},
            {
                "$push": {"samples": data_dict},
                "$min": {"first": data_dict['timestamp1'], "minid13": data_dict['id13']},
                "$max": {"last": data_dict['timestamp1'], "maxid13": data_dict['id13']},
                "$inc": {"nsamples": 1, "totid13": data_dict['id13']}
            },
            upsert=True
        )
My data look like this:
{'_id': ObjectId('6068da8878fa2e568c42c7f1'),
'first': datetime.datetime(2018, 1, 24, 14, 5),
'last': datetime.datetime(2018, 1, 24, 15, 5),
'maxid13': 12.5,
'minid13': 7.5,
'nsamples': 13,
'samples': [{'c14': 'C',
'id1': 3758.0,
'id10': 0.0,
'id11': 274.0,
'id12': 0.0,
'id13': 7.5,
'id15': 0.0,
'id16': 73.0,
'id17': 0.0,
'id18': 0.342,
'id19': 6.3,
'id20': 1206.0,
'id21': 0.0,
'id22': 0.87,
'id23': 0.0,
'id6': 2.0,
'id7': -79.09,
'id8': 35.97,
'id9': 5.8,
'timestamp1': datetime.datetime(2018, 1, 24, 14, 5),
'timestamp2': datetime.datetime(2018, 1, 24, 9, 5)},
{'c14': 'C',
'id1': 3758.0,
'id10': 0.0,
'id11': 288.0,
'id12': 0.0,
'id13': 8.4,
'id15': 0.0,
'id16': 71.0,
'id17': 0.0,
'id18': 0.342,
'id19': 6.3,
'id20': 1207.0,
'id21': 0.0,
'id22': 0.69,
'id23': 0.0,
'id6': 2.0,
'id7': -79.09,
'id8': 35.97,
'id9': 6.2,
'timestamp1': datetime.datetime(2018, 1, 24, 14, 10),
'timestamp2': datetime.datetime(2018, 1, 24, 9, 10)},
.
.
.
.
I use totid13 for that purpose, but if I need to find the average across many documents, it is not very helpful.
I tried something like this:
for file in sorted_files:
    df = process_file(file)
    #df.reset_index(inplace=True) # Reset Index
    #data_dict = df.to_dict('records') # Convert to dictionary
    # row is the row number and item is what the row contains
    for row, item in df.iterrows():
        data_dict = item.to_dict()
        mycol1.update_one(
            {"nsamples": {"$lt": 13}},
            {
                "$push": {"samples": data_dict},
                "$min": {"first": data_dict['timestamp1'], "minid13": data_dict['id13']},
                "$max": {"last": data_dict['timestamp1'], "maxid13": data_dict['id13']},
                "$avg": {"avg_id13": data_dict['id13']},
                "$inc": {"nsamples": 1, "totid13": data_dict['id13']}
            },
            upsert=True
        )
But the output is:
pymongo.errors.WriteError: Unknown modifier: $avg. Expected a valid update modifier or pipeline-style update specified as an array, full error: {'index': 0, 'code': 9, 'errmsg': 'Unknown modifier: $avg. Expected a valid update modifier or pipeline-style update specified as an array'}
Thanks in advance!
Upvotes: 0
Views: 31
Reputation: 8844
$avg is not an update operator; it is only an aggregation operator.
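Since the documents in the question already store totid13 and nsamples, one way to get an average across many documents is to run it as an aggregation when reading. This is only a sketch: build_avg_pipeline is a made-up helper name, and it assumes every document has both fields as shown in the question.

```python
def build_avg_pipeline():
    """Pipeline for the overall id13 average across all bucket documents."""
    return [
        # Sum the per-document totals and sample counts over the collection.
        {"$group": {"_id": None,
                    "total": {"$sum": "$totid13"},
                    "count": {"$sum": "$nsamples"}}},
        # Divide the grand total by the grand count to get the average.
        {"$project": {"_id": 0,
                      "avg_id13": {"$divide": ["$total", "$count"]}}},
    ]

pipeline = build_avg_pipeline()
# With a live collection (assumed to be mycol1 from the question):
# result = list(mycol1.aggregate(pipeline))
```

A "$match" stage could be put in front of "$group" to restrict the average to a subset of documents, for example a range on "first".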
If you need the average, calculate that in pandas; you already have the data in pandas and it's what pandas is good at.
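A rough sketch of what that could look like, assuming the DataFrame has the timestamp1 and id13 columns from the question. build_buckets and the demo data are invented for illustration. Note that it cuts each DataFrame into chunks of 13 rows on its own, so unlike the update_one loop it will not top up a half-filled bucket left over from a previous file.

```python
import pandas as pd

BUCKET_SIZE = 13  # mirrors the {"nsamples": {"$lt": 13}} filter in the question

def build_buckets(df, size=BUCKET_SIZE):
    """Split df into consecutive chunks of `size` rows and summarise each."""
    docs = []
    for start in range(0, len(df), size):
        chunk = df.iloc[start:start + size]
        docs.append({
            "samples": chunk.to_dict("records"),
            "first": chunk["timestamp1"].min().to_pydatetime(),
            "last": chunk["timestamp1"].max().to_pydatetime(),
            "minid13": float(chunk["id13"].min()),
            "maxid13": float(chunk["id13"].max()),
            "totid13": float(chunk["id13"].sum()),
            "avg_id13": float(chunk["id13"].mean()),  # the average, done in pandas
            "nsamples": len(chunk),
        })
    return docs

# Tiny made-up frame to show the shape of the result (bucket size 2 for brevity).
demo = pd.DataFrame({
    "timestamp1": pd.date_range("2018-01-24 14:05", periods=3, freq="5min"),
    "id13": [7.5, 8.4, 12.5],
})
docs = build_buckets(demo, size=2)

# With a live collection (assumed to be mycol1 from the question):
# mycol1.insert_many(build_buckets(df))
```

This also replaces one update_one call per row with a single insert_many per file, which should mean fewer round trips to the server.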
Upvotes: 1