Alessandro Ceccarelli

Reputation: 1945

Change the type of feature from string to float using PyMongo

I am connecting to a MongoDB using the following client:

client = MongoClient("mongodb+srv:...")

and I would like to convert the field "imdbRating" from string to float across the whole database.

How can I achieve that?

Upvotes: 0

Views: 1009

Answers (2)

Belly Buster

Reputation: 8844

NB: This answer passes an aggregation pipeline to update_many(), which requires MongoDB 4.2 or later.

First, note that MongoDB has no "float" type; your choices are double or decimal.

You can do the update in one line, with a further line to loop over all the collections.

for col in db.list_collection_names():
    db[col].update_many({'imdbRating': {'$exists': True, '$type': 'string'}}, [{'$set': {'imdbRating': { '$toDouble': '$imdbRating'}}}])

If you want a decimal instead of a double, replace $toDouble with $toDecimal.
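
One thing to watch for: $toDouble raises an error if a string cannot be parsed as a number (for example "N/A"). If your data might contain such values, a possible variant is the $convert operator with an onError fallback. The sketch below only builds the filter and pipeline as plain dictionaries; the helper name build_rating_update is made up for illustration, and keeping the original string on failure is an assumption about what you want.

```python
def build_rating_update(field="imdbRating", to="double"):
    """Return (filter, pipeline) suitable for update_many(); nothing is sent to a server here."""
    # Only touch documents where the field is currently stored as a string
    query = {field: {"$type": "string"}}
    pipeline = [{
        "$set": {
            field: {
                "$convert": {
                    "input": f"${field}",    # e.g. "$imdbRating"
                    "to": to,                # "double" or "decimal"
                    "onError": f"${field}",  # assumption: keep the original string if parsing fails
                }
            }
        }
    }]
    return query, pipeline


query, pipeline = build_rating_update()
print(query)
print(pipeline)
```

You would then call db[col].update_many(query, pipeline) inside the same loop over list_collection_names().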

Example with test data setup:

from pymongo import MongoClient
from bson.json_util import dumps

# Database connection and test data setup; 10 collections each with 10 records

db = MongoClient()['mydatabase']

for col in range(10):
    for record in range(10):
        db[f'col{col}'].insert_one({'Record': str(record), 'imdbRating': str(record)})

print('Before\n' + dumps(db.col9.find_one({}, {'_id': 0}), indent=4))

# Update all imdbRating to float from string in all collections

for col in db.list_collection_names():
    db[col].update_many({'imdbRating': {'$exists': True, '$type': 'string'}}, [{'$set': {'imdbRating': { '$toDouble': '$imdbRating'}}}])

print('After\n' + dumps(db.col9.find_one({}, {'_id': 0}), indent=4))

Gives:

Before
{
    "Record": "0",
    "imdbRating": "0"
}
After
{
    "Record": "0",
    "imdbRating": 0.0
}

Upvotes: 0

hhharsha36

Reputation: 3349

The script below will do the trick for a single collection.

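from pymongo import MongoClient

client = MongoClient()

col = client['<DB-Name>']['<Collection-Name>']

count = 0
# Only fetch documents where imdbRating is still stored as a string
for doc in col.find({"imdbRating": {"$type": "string"}}, {"imdbRating": 1}):
    # Note: float() raises ValueError for non-numeric strings such as "N/A"
    col.update_one({
        "_id": doc["_id"]
    }, {
        "$set": {
            "imdbRating": float(doc["imdbRating"])
        }
    })
    count += 1
    print("\r", count, end='')
print("\n\nDONE!!!")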
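
If some documents lack the field or hold values that are not numeric strings, the per-document conversion can be separated from the database calls so those cases are skipped instead of crashing the loop. This is only a sketch on plain dictionaries; plan_updates is a hypothetical helper name, and silently skipping unparsable values is an assumption.

```python
def plan_updates(docs, field="imdbRating"):
    """Yield (filter, update) pairs for documents whose field is a numeric string."""
    for doc in docs:
        value = doc.get(field)
        if not isinstance(value, str):
            continue  # field missing or already converted
        try:
            number = float(value)
        except ValueError:
            continue  # not a number (e.g. "N/A"); leave the document untouched
        yield {"_id": doc["_id"]}, {"$set": {field: number}}


# Stand-in documents; in practice these would come from col.find(...)
sample = [
    {"_id": 1, "imdbRating": "7.5"},
    {"_id": 2, "imdbRating": "N/A"},
    {"_id": 3},
    {"_id": 4, "imdbRating": 8.1},
]
for flt, upd in plan_updates(sample):
    print(flt, upd)
```

Each pair can be passed straight to col.update_one(flt, upd). For large collections, wrapping the pairs in pymongo.UpdateOne and sending them in batches with col.bulk_write() should cut down on round trips.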

Upvotes: 1
