Reputation: 181
I have the following dictionary:
rts = {
    "PO1": {
        "congruent": {
            "rt": [0.647259, 0.720116, 0.562909, 0.538918, 0.633367],
            "correct": ["True", "True", "True", "True", "True", "False"]
        },
        "incongruent": {
            "rt": [0.647259, 0.720116, 0.562909, 0.538918, 0.633367],
            "correct": ["True", "True", "True", "True", "True", "False"]
        }
    },
    "PO2": {
        "congruent": {
            "rt": [0.647259, 0.720116, 0.562909, 0.538918, 0.633367],
            "correct": ["True", "True", "True", "True", "True", "False"]
        },
        "incongruent": {
            "rt": [0.647259, 0.720116, 0.562909, 0.538918, 0.633367],
            "correct": ["True", "True", "True", "True", "True", "False"]
        }
    }
}
Here is the code I have so far:
import csv
from pathlib import Path
import json
import numpy as np
from numpy import array

def main():
    rts = {}
    statsDict = {}
    data = Path('C:/Users/oli.warriner/Desktop/data(2)/data')
    for csvfile in data.glob('*.csv'):
        key = csvfile.stem
        with csvfile.open() as f:
            csv_reader = csv.reader(f)
            # Skip the header
            _ = next(csv_reader)
            rts[key] = {
                'congruent': {
                    'rt': [],
                    'correct': []
                },
                'incongruent': {
                    'rt': [],
                    'correct': []
                },
            }
            for tn, ctext, cname, condition, response, rt, correct in csv_reader:
                rts[key][condition]['rt'].append(float(rt))
                rts[key][condition]['correct'].append(correct)
    for k in rts:
        key = k
        statsDict[key] = {
            'congruent': {
                'mean': [],
                'stddev': [],
                'correct': []
            },
            'incongruent': {
                'mean': [],
                'stddev': [],
                'correct': []
            },
        }
        for n in rts[k]:
            for i in rts[key][n]
                array([rts[k] for k in rts]).mean()
                print(array)

if __name__ == "__main__":
    main()
I am reading a directory of CSV files to produce the "rts" dictionary shown above (the real one is much bigger; I have shortened it here).
What I now want to do is use the "rts" dictionary to populate "statsDict".
I need to loop through the "rts" dictionary and calculate the mean and standard deviation of the "rt" values under both "congruent" and "incongruent", for each key separately.
I then need to use the boolean values in "correct" to calculate the percentage of true values for each one.
I can loop through the first couple of layers of the dictionary, but now I am a little lost: I'm not sure how to get to the next layer down and start making the stats calculations I need.
I hope this is clear enough. Let me know if you have any questions. Thanks in advance!
Upvotes: 2
Views: 132
Reputation: 2528
Based on the example of rts given, you can construct a dictionary of statistics with this code fragment:
import statistics
import json

rts = { ... as given ... }

stats_dict = {}
for k in rts.keys():
    stats_dict[k] = {}
    for ck in rts[k].keys():
        stats_dict[k][ck] = {}
        stats_dict[k][ck]["mean"] = statistics.mean(rts[k][ck]["rt"])
        stats_dict[k][ck]["stdev"] = statistics.stdev(rts[k][ck]["rt"])
        stats_dict[k][ck]["true_percentage"] = len([x for x in rts[k][ck]["correct"] if x == "True"]) / len(rts[k][ck]["correct"])

print(json.dumps(stats_dict, indent=2))
You don't need numpy to calculate the statistics. The built-in statistics package is sufficient.
The code loops over the keys of rts and uses the same keys for the statistics dictionary stats_dict.
For the example data it prints:
{
  "PO1": {
    "congruent": {
      "mean": 0.6205138,
      "stdev": 0.07207165926839758,
      "true_percentage": 0.8333333333333334
    },
    "incongruent": {
      "mean": 0.6205138,
      "stdev": 0.07207165926839758,
      "true_percentage": 0.8333333333333334
    }
  },
  "PO2": {
    "congruent": {
      "mean": 0.6205138,
      "stdev": 0.07207165926839758,
      "true_percentage": 0.8333333333333334
    },
    "incongruent": {
      "mean": 0.6205138,
      "stdev": 0.07207165926839758,
      "true_percentage": 0.8333333333333334
    }
  }
}
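If the real data contains a participant with very few trials in one condition, the fragment above may fail: statistics.stdev raises StatisticsError for fewer than two values, and an empty "correct" list causes a division by zero. A minimal sketch of the same loop wrapped in a function with those guards is below. The function name summarize, the variable names, and the choice to return None for values that cannot be computed are my assumptions, not part of the question.

```python
import statistics


def summarize(rts):
    """Build per-participant, per-condition statistics from the rts dictionary.

    Hypothetical helper: returns None for any value that cannot be computed.
    """
    stats_dict = {}
    for participant, conditions in rts.items():
        stats_dict[participant] = {}
        for condition, trials in conditions.items():
            times = trials["rt"]
            correct = trials["correct"]
            # The CSV reader yields strings, so compare against "True".
            n_true = sum(1 for c in correct if c == "True")
            stats_dict[participant][condition] = {
                "mean": statistics.mean(times) if times else None,
                # Sample standard deviation needs at least two values.
                "stdev": statistics.stdev(times) if len(times) > 1 else None,
                "true_percentage": n_true / len(correct) if correct else None,
            }
    return stats_dict


# Made-up data, only to exercise the guards.
example = {
    "PO1": {
        "congruent": {"rt": [0.5, 0.7], "correct": ["True", "False"]},
        "incongruent": {"rt": [0.6], "correct": []},
    }
}
result = summarize(example)
print(result["PO1"]["congruent"])
print(result["PO1"]["incongruent"])
```

Note that statistics.stdev is the sample standard deviation. If you do use numpy, np.std defaults to the population standard deviation (ddof=0), so you would need np.std(values, ddof=1) to get matching numbers.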
Upvotes: 2