Reputation: 639
Hi, I am creating a pandas DataFrame from this piece of code:
for odfslogp_obj in odfslogs_plist:
    with zipfile.ZipFile(odfslogp_obj, mode='r') as z:
        for name in z.namelist():
            with z.open(name) as etest_zip:
                tdict = {}
                etestlines = [line.decode() for line in etest_zip]  # change lines from log files from binary to text
                regval_range_tup_list = list(zip([i for i, x in enumerate(etestlines) if 'Head' in x], [i for i, x in enumerate(etestlines) if 'BINS' in x]))  # get binval sections
                head_siteparam_tup_list = list(zip([x.split("=")[1].replace("(", '').replace(")", '').rstrip() for x in etestlines if 'Head' in x], [x.split(":")[2].rstrip() for x in etestlines if 'SITE:PARAM_SITE:' in x]))  # extract head and site:param values from bin val sections
                print(head_siteparam_tup_list)
                linesineed = [etestlines[range[0]:range[1]-1] for range in regval_range_tup_list]
                reglinecount = []
                regvals = []
                for head_site, loclist in zip(head_siteparam_tup_list, linesineed):
                    regvals_ext = [x for x in loclist if pattern.search(x)]
                    regvaltups_list = [tuple(x.split(":")[0:2]) for x in regvals_ext]
                    regvaldict = dict(regvaltups_list)
                    df = pd.DataFrame(data=regvaldict)
                    print(df)
A sample of the dictionary being used looks like this when printed:
{'1000': '1669.15', '10012': '-0.674219', '10013': '-0.260156', '1003': '9.5792', '1007': '11.9812', '1011': '27.888', '1012': '14.8333', '1014': '19.1812', '1015': '19.0396', '1024': '1352.66', '1025': '3247.63', '1026': '33.7434', '1027': '38.7566', '1030': '19.7548', '1031': '30.2201'}
As you can see, the values are all strings, so why is it giving me this error? And how do I fix it?
Upvotes: 0
Views: 263
Reputation: 639
I found what I was looking for by building a list of dictionaries instead of passing a single dictionary.
This not only let me collect the different dictionaries, it also streamlined the code and let me create a DataFrame from the aggregated list in exactly the layout I wanted:
for head_site, loclist in zip(head_siteparam_tup_list, linesineed):
    regvals_ext = [x for x in loclist if pattern.search(x)]
    # print(regvals_ext)
    regvaltups_list = [tuple(x.split(":")[0:2]) for x in regvals_ext]
    regvaldict = dict(regvaltups_list)
    regvaltupaggr_list.append(regvaldict)
regvalfile_df = pd.DataFrame(regvaltupaggr_list)
# print(regvalfile_df)
regvalfile_df.to_csv(r"C:\Users\sys_nsgprobeingestio\Documents\dozie\odfs\etest\filebinval.csv", index=False)
Output:
1000 10012 10013 1003 1007 1011 1012 1014 1015 1024 1025 1026 1027 ... 9717 9718 9722 9723 9724 9725 9726 9727 9728 9729 9730 9912 9913
0 1665.67 -0.678906 -0.267969 9.66017 12.0638 27.728 15.2347 19.9796 19.634 1352.33 3618.55 32.843 38.1179 ... 81.58 89.88 106.239 117.136 132.556 132.944 141.92 132.76 161.551 68.6192 67.325 -0.68125 -0.27031
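For anyone who wants to try the list-of-dictionaries approach without the log files, here is a minimal, self-contained sketch. The name `section_dicts` is hypothetical and stands in for `regvaltupaggr_list`; the two small dictionaries are cut down from the values shown above. The `pd.to_numeric` step at the end is an optional extra I am assuming may be useful, since the parsed values are strings.

```python
import pandas as pd

# Hypothetical stand-in for regvaltupaggr_list: one dict per parsed section.
# The values are a small subset of the ones printed in this thread.
section_dicts = [
    {"1000": "1669.15", "10012": "-0.674219", "10013": "-0.260156"},
    {"1000": "1665.67", "10012": "-0.678906", "10013": "-0.267969"},
]

# A list of dicts gives one row per dict and one column per key.
regvalfile_df = pd.DataFrame(section_dicts)

# Optional: the values are still strings here, so convert each column
# to numbers if arithmetic is needed later.
regvalfile_df = regvalfile_df.apply(pd.to_numeric)

print(regvalfile_df)
```

Because each dictionary becomes its own row, no explicit index has to be passed, which is why this form avoids the problem from the question.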
Upvotes: 0
Reputation: 1058
Check the orient parameter of .from_dict():
pd.DataFrame.from_dict(dic, orient='index')
Another option:
pd.DataFrame(list(dic.values()), index=list(dic.keys()))
Output:
0
1000 1669.15
10012 -0.674219
10013 -0.260156
...
Alternatively, if you do not want to have the keys as index:
pd.DataFrame(dic.items())
Output:
0 1
0 1000 1669.15
1 10012 -0.674219
2 10013 -0.260156
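To see the options side by side, here is a small sketch, assuming a three-entry `dic` cut down from the question's sample. As far as I know, passing a dictionary whose values are all scalars straight to `pd.DataFrame` raises a `ValueError` because pandas has no index to build rows from; the names `error_message`, `one_row`, and `one_col` are just illustrative.

```python
import pandas as pd

# Small subset of the dictionary from the question.
dic = {"1000": "1669.15", "10012": "-0.674219", "10013": "-0.260156"}

# A dict of scalar values with no index is rejected by the constructor.
try:
    pd.DataFrame(data=dic)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Wrapping the dict in a list gives a single row with the keys as columns.
one_row = pd.DataFrame([dic])

# orient='index' gives one row per key and a single column labelled 0.
one_col = pd.DataFrame.from_dict(dic, orient="index")

print(one_row)
print(one_col)
```

Which shape is preferable depends on whether the keys should end up as column labels (`one_row`) or as the index (`one_col`).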
Upvotes: 3