Reputation: 1399
I have data in a MongoDB collection called "hello". The documents look like:
{
name: ...,
size: ...,
timestamp: ISODate("2013-01-09T21:04:12Z"),
data: { text:..., place:...},
other: ...
}
I would like to export the timestamp and the text from each document to a CSV file, with the timestamp in the first column and the text in the second.
I tried creating a new collection (hello2) where the documents only have the timestamp and the text.
data = db.hello
for i in data:
    try:
        connection.me.hello2.insert(i["data"]["text"], i["timestamp"])
    except:
        print "Unable", sys.exc_info()
I then wanted to use mongoexport:
mongoexport --db me --collection hello2 --csv --out /Dropbox/me/hello2.csv
But this is not working and I do not know how to proceed.
PS: I would also like to store only the time part of the ISODate in the CSV file, i.e. just 21:04:12 instead of ISODate("2013-01-09T21:04:12Z").
Thank you for your help.
Upvotes: 1
Views: 2332
Reputation: 214959
You can export directly from the original collection; there is no need for a temporary one:
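# pymongo 2.x syntax; pymongo 3+ renamed fields= to projection=
for r in db.hello.find(fields=['data', 'timestamp']):
    # text is nested under "data"; timestamp goes first, as the question asks
    print '"%s","%s"' % (r['timestamp'].strftime('%H:%M:%S'), r['data']['text'])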
or to write to a file:
# output is the path of the CSV file to write
with open(output, 'w') as fp:
    for r in db.hello.find(fields=['data', 'timestamp']):
        print >>fp, '"%s","%s"' % (r['timestamp'].strftime('%H:%M:%S'), r['data']['text'])
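The hand-built '"%s","%s"' format above does not escape quotes that occur inside the text. As a rough sketch of an alternative (written for Python 3, unlike the snippets above), the standard csv module can handle the quoting. The function name write_time_text_csv and the sample document are made up for illustration; the list stands in for whatever a pymongo cursor would yield.

```python
import csv
import io
from datetime import datetime


def write_time_text_csv(docs, fp):
    """Write one CSV row per document: time of day first, then the text."""
    writer = csv.writer(fp)
    for doc in docs:
        # strftime keeps only HH:MM:SS and drops the date part
        writer.writerow([doc["timestamp"].strftime("%H:%M:%S"), doc["data"]["text"]])


# Hypothetical sample data standing in for documents from the "hello" collection.
sample_docs = [
    {"timestamp": datetime(2013, 1, 9, 21, 4, 12), "data": {"text": 'say "hi", ok'}},
]

buf = io.StringIO()
write_time_text_csv(sample_docs, buf)
csv_text = buf.getvalue()
```

Against a real collection, the same function could presumably be called with db.hello.find({}, {"data.text": 1, "timestamp": 1}) as docs and a file opened with open(path, "w", newline="") as fp, where path is whatever output location you choose.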
To filter out duplicates and print only the most recent ones, split the process into two steps. First, accumulate the data in a dictionary:
recs = {}
for r in db.hello.find(fields=['data', 'timestamp']):
    text, time = r['data']['text'], r['timestamp']
    # keep only the latest timestamp seen for each text
    if text not in recs or recs[text] < time:
        recs[text] = time
and then output the dictionary content:
for text, time in recs.items():
    print '"%s","%s"' % (time.strftime('%H:%M:%S'), text)
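For reference, here is a small Python 3 sketch of the same two-step idea that runs without a database. The helper name latest_time_per_text and the sample documents are assumptions for illustration only; in practice docs would be the cursor returned by find.

```python
from datetime import datetime


def latest_time_per_text(docs):
    """Map each distinct text to the most recent timestamp seen for it."""
    latest = {}
    for doc in docs:
        text, ts = doc["data"]["text"], doc["timestamp"]
        # full datetimes are compared, so "most recent" takes the date into account
        if text not in latest or latest[text] < ts:
            latest[text] = ts
    return latest


# Hypothetical sample data: "a" appears twice, on two different days.
sample_docs = [
    {"timestamp": datetime(2013, 1, 9, 21, 4, 12), "data": {"text": "a"}},
    {"timestamp": datetime(2013, 1, 10, 8, 0, 0), "data": {"text": "a"}},
    {"timestamp": datetime(2013, 1, 9, 10, 30, 0), "data": {"text": "b"}},
]

latest = latest_time_per_text(sample_docs)
# (time of day, text) pairs, sorted so the output order is deterministic
rows = sorted((ts.strftime("%H:%M:%S"), text) for text, ts in latest.items())
```

Note that the comparison happens on the full timestamp before it is reduced to a time of day, so an entry from a later date wins even if its clock time is earlier.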
Upvotes: 2