Reputation: 353
I have a file that contains millions of records like this one:
{
"leagues" : [{
"tier" : "SILVER",
"entries" : [{
"playerOrTeamId" : "359",
"playerOrTeamName" : "TryHard",
"division" : "II",
"leaguePoints" : "63",
"wins" : "65"
}],
"id" : "359"
}],
"summonerId" : "359",
"region" : "euw",
"updatedAt" : "1412122432"
},
That is an example of the smallest kind of record. Some records have additional nested arrays that hold extra information related to the primary record. Example:
{
"summonerId" : "477",
"region" : "euw",
"leagues" : [{
"tier" : "GOLD",
"entries" : [{
"playerOrTeamId" : "477",
"playerOrTeamName" : "Alucard662545",
"division" : "V",
"leaguePoints" : "9",
"wins" : "128"
}]
}, {
"tier" : "SILVER",
"entries" : [{
"playerOrTeamId" : "TEAM-8d6a3640-2da8-11e2-99dc-782bcb4ce61a",
"playerOrTeamName" : "CAPCOMP BE",
"division" : "V",
"leaguePoints" : "0",
"wins" : "24"
}]
}, {
"tier" : "BRONZE",
"entries" : [{
"playerOrTeamId" : "TEAM-8d6a3640-2da8-11e2-99dc-782bcb4ce61a",
"playerOrTeamName" : "CAPCOMP BE",
"division" : "I",
"leaguePoints" : "55",
"wins" : "8"
}]
}],
"updatedAt" : "1410786559"
},
I have been literally pulling my hair out and have spent two days and nights trying to figure this out. The information is stored in MongoDB, and when I export it I can only get JSON. I need this data fully CSV-formatted. How can I convert a million records like these to CSV?
Upvotes: 0
Views: 89
Reputation: 34677
You have two options:
mongoexport is a utility that produces a JSON or CSV export of data stored in a MongoDB instance. Usage example:
mongoexport --db users --collection contacts --csv --fieldFile fields.txt --out /opt/backups/contacts.csv
which takes the fields listed in fields.txt (one field name per line, each line terminated by a newline) from the collection contacts
and writes them to /opt/backups/contacts.csv. Newer mongoexport releases use --type=csv in place of --csv.
Alternatively, read the documents in any programming language and write the CSV yourself. An example in Python follows:
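from pymongo import MongoClient
import csv

client = MongoClient()
db = client['test-database']
collection = db.test_collection

# Assumed top-level fields to export; nested values such as "leagues" need flattening first.
fields = ['summonerId', 'region', 'updatedAt']

# csv.writer needs an open file object, not a path
with open('/opt/backups/contacts.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(fields)  # header row
    # one CSV row per document returned by the query
    writer.writerows([doc.get(k, '') for k in fields] for doc in collection.find())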
This does the same thing as the mongoexport command. Hope that helps.
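Since the documents in the question nest "leagues" and "entries" arrays, a flat CSV probably needs one row per league entry, with the top-level fields repeated on each row. Here is a minimal sketch of that idea, assuming that row layout is what you want. The names COLUMNS, flatten and write_csv are made up for illustration, and the sample document is copied from the question.

```python
import csv
import io

# Assumed column layout: top-level fields, then the league tier, then entry fields.
TOP_FIELDS = ["summonerId", "region", "updatedAt"]
ENTRY_FIELDS = ["playerOrTeamId", "playerOrTeamName", "division",
                "leaguePoints", "wins"]
COLUMNS = TOP_FIELDS + ["tier"] + ENTRY_FIELDS


def flatten(doc):
    """Yield one flat dict per (league, entry) pair found in a document."""
    base = {k: doc.get(k, "") for k in TOP_FIELDS}
    for league in doc.get("leagues", []):
        for entry in league.get("entries", []):
            row = dict(base)
            row["tier"] = league.get("tier", "")
            for k in ENTRY_FIELDS:
                row[k] = entry.get(k, "")
            yield row


def write_csv(docs, out):
    """Write all flattened rows to the open text file `out`; return the row count."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    count = 0
    for doc in docs:
        for row in flatten(doc):
            writer.writerow(row)
            count += 1
    return count


# Sample document taken from the question.
sample = {
    "leagues": [{
        "tier": "SILVER",
        "entries": [{
            "playerOrTeamId": "359",
            "playerOrTeamName": "TryHard",
            "division": "II",
            "leaguePoints": "63",
            "wins": "65",
        }],
        "id": "359",
    }],
    "summonerId": "359",
    "region": "euw",
    "updatedAt": "1412122432",
}

buffer = io.StringIO()
rows_written = write_csv([sample], buffer)
```

To run this against the database instead of the sample, you could pass collection.find() as docs and a file opened with open(path, 'w', newline='') as out. The cursor is consumed one document at a time, so the whole collection should not need to fit in memory.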
Upvotes: 1