Reputation: 6077
How to write below data in below table output format ?
list of dict
data=[{'date': datetime.date(2011, 2, 8), 'user': u'xxx', 'status': u'P'}, {'date': datetime.date(2011, 2, 8), 'user': u'yyy', 'status': u'P'}, {'date': datetime.date(2011, 2, 8), 'user': u'zzz', 'status': u'P'}, {'date': datetime.date(2011, 2, 9), 'user': u'xxx, 'status': u'P'}, {'date': datetime.date(2011, 2, 9), 'user': u'yyy', 'status': u'E'}, {'date': datetime.date(2011, 2, 9), 'user': u'zzz', 'status': u'E'}, {'date': datetime.date(2011, 2, 10), 'user': u'xxx', 'status': u'P'}, {'date': datetime.date(2011, 2, 10), 'user': u'yyy', 'status': u'P'}, {'date': datetime.date(2011, 2, 10), 'user': u'zzz', 'status': u'P'}]
Output format should be :
S.no user 2011-02-08 2011-02-09 2011-02-10 p-total E-total total
1 xxx p p p 3 0 3
2 yyy p E p 2 1 3
3 zzz p E E 1 2 3
Please need help
Upvotes: 1
Views: 7326
Reputation: 467301
I'm a bit conflicted about writing answers like these - it seems like just producing a complete solution with very little didactic value, but I've tried to make it as generally helpful as possible...
If I understand what you're trying to do correctly, you want to turn your data
into CSV where there is one row for each user. There are a range of dates and you want one column for each date - that column indicates the status of the user on that date. Then there are columns that produce the totals for each status across every date, and so on. The output you've quoted looks most like CSV with tabs as separators, although as eumiro points out, it isn't exactly. However, let's assume that you want to write tab-separated data. It's not clear from your question what should happen if you discover in data
that a user has two different statuses for one day, so let's check for that and throw an exception.
Note that everything in the last paragraph really should be in your question, along with the code from your best attempt so far.
So, using DictWriter
from the csv module is a reasonable idea, but to use that class you need a dictionary for each line which maps the column headings to values. So, you could iterate over everything in data
to produce a dictionary of dictionaries, mapping a user to the dictionary that represents the row for that user. You could do that with something like this:
from collections import defaultdict
import csv
from datetime import date
user_to_row = defaultdict(dict)
for d in data:
user = d['user']
status = d['status']
row_dict = user_to_row[user]
row_dict['user'] = user
date_string = str(d['date'])
if date_string in d and row_dict[date_string] != status:
raise Exception, "Contradiction: '%s' on '%s'" % (user,date_string)
row_dict[date_string] = status
# If a value isn't set in one of the total columns yet, set it to 0:
row_dict.setdefault('p-total',0)
row_dict.setdefault('E-total',0)
row_dict.setdefault('total',0)
# Make sure you increment the right column:
count_column = 'p-total' if (status == 'P') else 'E-total'
row_dict[count_column] += 1
# And increment the overall total column in any case:
row_dict['total'] += 1
You should check that you understand what's going on in there - try printing user_to_row
to check that you understand what is being produced.
Now you just need to loop over the values in the user_to_row
dictionary and output them with DictWriter. The thing to be careful with here is that you're not sure that there will be an entry for every date, so in this case I've just inserted Unknown
when a value is missing:
with open("hello.csv","w") as f:
# Create the headings:
headings = ['S.no']
headings += [str(date(2011,2,i)) for i in xrange(6,11)]
headings += ['user', 'date_format','p-total','E-total','total']
writer = csv.DictWriter(f, headings, delimiter="\t")
# The writeheader method only appeared in Python 2.7, so write the
# headings from a dictionary that maps each heading to itself:
writer.writerow(dict(zip(headings,headings)))
# Assume that S.no is just a row number...
sno = 1
for d in user_to_row.values():
d['S.no'] = sno
# Fill in any unknown values with 'Unknown':
for h in headings:
d.setdefault(h,'Unknown')
writer.writerow(d)
sno += 1
The documentation for the csv module should give you all the extra information you need to understand that part.
The output then looks like:
S.no 2011-02-06 2011-02-07 2011-02-08 2011-02-09 2011-02-10 user date_format p-total E-total total
1 Unknown Unknown P P P xxx Unknown 3 0 3
2 Unknown Unknown P E P yyy Unknown 2 1 3
3 Unknown Unknown P E P zzz Unknown 2 1 3
... which looks odd here because of the tabs, but would load into a spreadsheet correctly.
Upvotes: 18