Reputation: 31
I have a list of 100 dictionaries, one per data point, as follows:
datapoint1 a:1 b:2 c:6
datapoint2 a:2 d:8 p:10
.....
datapoint100: c:9 d:1 z:12
I want to print a list to a file as follows:
a b c d ...... z
datapoint1 1 2 6 0 ...... 0
datapoint2 2 0 0 8 ...... 0
.........
.........
datapoint100 0 0 9 1 ...... 12
Note that a, b, c, ..., z are only placeholders. The real set of words is not known beforehand, so there may be 1,000 or 10,000 of them rather than 26, and a, b, ... will be replaced with real words such as 'my', 'hi', 'tote', etc.
I have been thinking of trying to do it as follows:
But this method seems complicated in Python. Is there a better way of doing it?
Upvotes: 0
Views: 240
Reputation: 2143
Program:
data_points = [
    {'a': 1, 'b': 2, 'c': 6},
    {'a': 2, 'd': 8, 'p': 10},
    {'c': 9, 'd': 1, 'z': 12},
    {'e': 3, 'f': 6, 'g': 3}
]
merged_data_points = {}
# collect, for each key, every value it takes across the data points
for data_point in data_points:
    for k, v in data_point.items():
        if k not in merged_data_points:
            merged_data_points[k] = []
        merged_data_points[k].append(v)
# print the merged datapoints (key order depends on the Python version)
print('{')
for k in merged_data_points:
    print(' {0}: {1},'.format(k, merged_data_points[k]))
print('}')
Output:
{
a: [1, 2],
c: [6, 9],
b: [2],
e: [3],
d: [8, 1],
g: [3],
f: [6],
p: [10],
z: [12],
}
Upvotes: 0
Reputation: 353019
If you don't care much about the fiddly bits of column alignment, this isn't too bad:
datapoints = [{'a': 1, 'b': 2, 'c': 6},
              {'a': 2, 'd': 8, 'p': 10},
              {'c': 9, 'd': 1, 'z': 12}]
# get all the keys ever seen
keys = sorted(set.union(*(set(dp) for dp in datapoints)))
# text mode ("w") so that writing str works on Python 3 as well as Python 2
with open("outfile.txt", "w") as fp:
    # write the header
    fp.write("{}\n".format(' '.join([" "] + keys)))
    # loop over each point, getting the values in order (or 0 if they're absent)
    for i, datapoint in enumerate(datapoints):
        out = '{} {}\n'.format(i, ' '.join(str(datapoint.get(k, 0)) for k in keys))
        fp.write(out)
produces
a b c d p z
0 1 2 6 0 0 0
1 2 0 0 8 10 0
2 0 0 9 1 0 12
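For what it's worth, the same two-pass idea (collect the keys, then write each row with 0 for anything absent) could probably also be written with the standard library's csv.DictWriter, whose restval argument supplies the filler for missing fields. This is only a sketch for Python 3: the helper name write_table and the "datapoint" label column are made up for illustration, and it assumes no real word in the data is literally "datapoint" or contains a space.

```python
import csv
import io

datapoints = [{'a': 1, 'b': 2, 'c': 6},
              {'a': 2, 'd': 8, 'p': 10},
              {'c': 9, 'd': 1, 'z': 12}]


def write_table(rows, fp, label_field="datapoint"):
    """Write rows as a space-separated table, using 0 for absent keys.

    write_table and label_field are hypothetical names for this sketch.
    """
    # first pass: collect every key that appears in any row
    keys = sorted(set().union(*rows))
    writer = csv.DictWriter(fp, fieldnames=[label_field] + keys,
                            restval=0, delimiter=" ", lineterminator="\n")
    writer.writeheader()
    # second pass: one line per row; restval fills in the missing columns
    for i, row in enumerate(rows, start=1):
        labelled = {label_field: "{}{}".format(label_field, i)}
        labelled.update(row)
        writer.writerow(labelled)


# write into an in-memory buffer here; an open text-mode file works the same way
buffer = io.StringIO()
write_table(datapoints, buffer)
table_text = buffer.getvalue()
print(table_text)
```

To write to disk instead, something like open("outfile.txt", "w", newline="") could be passed in place of the buffer.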
As mentioned in the comments, the pandas solution is pretty nice too:
>>> import pandas as pd
>>> df = pd.DataFrame(datapoints).fillna(0).astype(int)
>>> df
a b c d p z
0 1 2 6 0 0 0
1 2 0 0 8 10 0
2 0 0 9 1 0 12
>>> df.to_csv("outfile_pd.csv", sep=" ")
>>> !cat outfile_pd.csv
a b c d p z
0 1 2 6 0 0 0
1 2 0 0 8 10 0
2 0 0 9 1 0 12
If you really need the columns nicely aligned, then there are ways to do that too, but I never need them so I don't know much about them.
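One possible way to get aligned columns without any extra library, as a rough sketch rather than a recommendation: build every cell as a string first, measure the widest cell in each column, and pad with str.ljust / str.rjust. The function name format_aligned and the "datapointN" row labels are assumptions made for this example (Python 3).

```python
datapoints = [{'a': 1, 'b': 2, 'c': 6},
              {'a': 2, 'd': 8, 'p': 10},
              {'c': 9, 'd': 1, 'z': 12}]


def format_aligned(rows):
    """Return the table as text with every column padded to a common width.

    format_aligned is a hypothetical helper, not part of any library.
    """
    keys = sorted(set().union(*rows))
    # build the whole table as strings: a header line, then one line per row
    table = [[""] + keys]
    for i, row in enumerate(rows, start=1):
        table.append(["datapoint{}".format(i)] +
                     [str(row.get(k, 0)) for k in keys])
    # each column is as wide as its longest cell
    widths = [max(len(line[col]) for line in table)
              for col in range(len(keys) + 1)]
    lines = []
    for line in table:
        # left-align the label column, right-align the numeric columns
        cells = [line[0].ljust(widths[0])]
        cells += [cell.rjust(w) for cell, w in zip(line[1:], widths[1:])]
        lines.append(" ".join(cells).rstrip())
    return "\n".join(lines)


aligned = format_aligned(datapoints)
print(aligned)
```

Because this needs the widths before it can print anything, it holds the whole table in memory, which should be fine for 100 points even with 10,000 words. If pandas is already in use, df.to_string() produces aligned text as well.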
Upvotes: 1