Reputation: 1426
I'm trying to use mongoimport to upsert data with string values in _id. Since the ids look like integers (even though they're in quotes), mongoimport treats them as integers and creates new records instead of upserting the existing records.
Command I'm running:
mongoimport --host localhost --db database --collection my_collection --type csv --file mydata.csv --headerline --upsert
Example data in mydata.csv:
{ "_id" : "0364", someField: "value" }
The result would be for mongo to insert a record like this: { "_id" : 364, someField: "value" }
instead of updating the record with _id "0364"
.
Does anyone know how to make it treat the _id
as strings?
Things that don't work:
{ "_id" : "0364" + "", someField: "value" }
Upvotes: 5
Views: 6495
Reputation: 3
I encountered the same issue.
I feel the simplest way is to convert the CSV file to a JSON file using an online tool and then import.
This is the tool I used:
http://www.convertcsv.com/csv-to-json.htm
It lets you wrap the integer values of your CSV file in double quotes for your JSON file.
If you have trouble importing this JSON file and encountering an error, just add --jsonArray to your import command. It will work for sure.
mongoimport --host localhost --db mydb -c mycollection --type json --jsonArray --file <file_path>
Upvotes: 0
Reputation: 9665
As an alternative to @Jesse, you can do something similar in the mongo console, e.g.
db.my_collection.find().forEach(function (obj) {
db.my_collection.remove({_id: obj._id); // remove the old one
obj._id = '' + obj._id; // change to string
db.my_collection.save(obj); // resave
});
For non _id
fields you can simply do:
db.my_collection.find().forEach(function (obj) {
obj.someField = '' + obj.someField; // change to string
db.my_collection.save(obj); // resave
});
Upvotes: 0
Reputation: 63
I was able to prefix the numeric string and that worked for me. Example:
00012345 was imported as 12345 (Type Int) string00012345 was imported as string00012345 (Type String)
My source was a SQL database so I just did
select 'string'+column as name
Of course, you also need to do a bit of post-processing to parse the string, but far less effort than converting a rather large tsv file to json.
I also added +1 to the jira link above for the enhancement.
Upvotes: 1
Reputation: 53573
Just encountered this same issue and discovered an alternative. You can force Mongo to use string types for non-string values by converting your CSV to JSON and quoting the field. For example, if your CSV looks like this:
key value
123 foo
abc bar
Then you'll get an integer field for key 123 and a string field for key abc. If you convert that to JSON, making sure that all the keys are quoted, and then use --type json
when you import, you'll end up with the desired behavior:
{
"123":"foo",
"abc":"bar"
}
Upvotes: 2
Reputation: 24007
Unfortunately there is not now a way to force number-like strings to be interpreted as strings:
https://jira.mongodb.org/browse/SERVER-3731
You could write a script in Python or some other language with which you're comfortable, along the lines of:
import csv, pymongo
connection = pymongo.Connection()
collection = connection.mydatabase.mycollection
reader = csv.DictReader(open('myfile.csv'))
for line in reader:
print '_id', line['_id']
upsert_fields = {
'_id': line['_id'],
'my_other_upsert_field': line['my_other_upsert_field']}
collection.update(upsert_fields, line, upsert=True, safe=True)
Upvotes: 2