Iteratively populate a nested dictionary in Python

Question

I am working on dynamically populating a nested dictionary with data from MongoDB. I am not that well-versed in using dictionaries, so please bear with me. I have checked over and over and tried different approaches, but I still keep on getting the same incorrect result.

The data I am trying to feed into the dictionary is not in a tuple, as I have seen in the questions I have checked, but in a collection from MongoDB.

This is what my collection fields look like:

new_crawl_130422_data.insert_one(
        {
        "database_url": proj_database_url,
        "database_project_id": proj_database_id,
        "projectname": proj_database_name,
        "version": version,
        "boost": boost,
        "content": content,
        "digest": digest,
        "title": title,
        "timestamp": timestamp,
        "url": website,
        "language": language
        }
)

The language field can hold different languages for a given project_id. So in essence, I have a number of records per project_id, and some of them are in different languages. What I am trying to do is create a nested dictionary with the project_id as the outer key and the different languages as the inner keys. So I should have something like:

{Project_id1: {'it': "text here in Italian if it exists in the collection", 'en': "text here in English if it exists", 'de': "text here in German if it exists"}}
{Project_id2: {'en': "text here in English if it exists in the collection", 'fr': "text here in French if it exists", 'de': "text here in German if it exists"}}

etc.

Hence, as it iterates through the records, it should pick a language and make that the key, and pick the 'content' as the value. Another aspect is that if there is already that language key in the dictionary, it should append the text with the matching language to the value. I don't know if this is too much for a dictionary?

So far, I have tried the following feeble attempts, and have gotten the same result, which is only the last record and language read (it's overwriting, not appending) and also, it's not concatenating the texts.

project_details = {}

for row in results:
    idProject = row[0]
    documents = mongo_db.new_collection_Eus.find(
       {"database_project_id": idProject},
       no_cursor_timeout=True).batch_size(100)

    for doc in documents:
        project_details[doc['database_project_id']] = {}

        [project_details[doc['database_project_id']][doc['language']]] = [doc['content']]

        for k,v in project_details[doc['database_project_id']].items():
            if k in [project_details[doc['database_project_id']]]:
                k[v] = project_details[doc['database_project_id']][doc['language']].append([doc['content']])

            else:
                [project_details[doc['database_project_id']][doc['language']]] = [doc['content']]

I also tried this:

for row in results:
    idProject = row[0]
    documents = mongo_db.new_collection_Eus.find(
       {"database_project_id": idProject},
       no_cursor_timeout=True).batch_size(100)

    for doc in documents:
        project_details[doc['database_project_id']] = {}

        if doc['language'] not in project_details[doc['database_project_id']].keys():

            project_details[doc['database_project_id']][doc['language']] = doc['content']
        else:
            
            project_details[doc['database_project_id']][doc['language']] = project_details[doc['database_project_id']][doc['language']] + ' ' + doc['content']

They both give the same result: only one language ends up in the dictionary, even though there are many languages in the records, and the text is not concatenated per language.

I have looked through these questions

Any help will be greatly appreciated, as I'm quite stuck on this.

Answers (1)
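
Both attempts in the question run `project_details[doc['database_project_id']] = {}` inside the inner loop, so the inner dictionary is reset for every document and only the last language survives. One possible fix, sketched below, is to never reset it and to let a nested `defaultdict` create the missing levels on demand. The `sample_docs` list and the `collect_by_language` helper are hypothetical stand-ins for illustration; in the real code the documents would come from the MongoDB cursor.

```python
from collections import defaultdict

# Hypothetical stand-in for the documents a MongoDB cursor would yield.
sample_docs = [
    {"database_project_id": 1, "language": "it", "content": "ciao"},
    {"database_project_id": 1, "language": "en", "content": "hello"},
    {"database_project_id": 1, "language": "en", "content": "world"},
    {"database_project_id": 2, "language": "de", "content": "hallo"},
]


def collect_by_language(documents):
    """Group content by project id, then by language (hypothetical helper)."""
    # The outer key is the project id, the inner key is the language, and
    # each inner value is a list that collects every matching content string.
    project_details = defaultdict(lambda: defaultdict(list))
    for doc in documents:
        # No reset here: missing keys are created on first access,
        # existing ones are appended to.
        project_details[doc["database_project_id"]][doc["language"]].append(
            doc["content"]
        )
    return project_details


project_details = collect_by_language(sample_docs)
```

Collecting the pieces in a list instead of concatenating strings right away keeps the loop simple; the pieces can be joined once at the end.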

Addendum: turn into a simple `dict` and join multipart content
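
A minimal sketch of that step, assuming the content was gathered in a nested `defaultdict` whose innermost values are lists of strings. The sample data, the `to_plain_dict` name and the `sep` parameter are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample: project id -> language -> list of content parts.
nested = defaultdict(lambda: defaultdict(list))
nested["Project_id1"]["en"].extend(["first part", "second part"])
nested["Project_id1"]["it"].append("testo")


def to_plain_dict(nested, sep=" "):
    """Return plain dicts with each list of parts joined into one string."""
    return {
        project_id: {lang: sep.join(parts) for lang, parts in by_lang.items()}
        for project_id, by_lang in nested.items()
    }


plain = to_plain_dict(nested)
```

The result no longer creates keys on lookup, so a missing project or language raises `KeyError` as with any ordinary `dict`.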
