Reputation: 1381
I am trying to create a list of dictionaries, but I am not able to push my dictionaries into a list. What mistake am I making?
How data (mongo_data) looks:
{
'url': 'https://goodreads.com/',
'variables': [{'key': 'Harry Potter', 'value': '10.0'},
{'key': 'Discovery of Witches', 'value': '8.5'},],
'vendor': 'Fantasy'
}
{
'url': 'https://goodreads.com/',
'variables': [{'key': 'Hunger Games', 'value': '10.0'},
{'key': 'Maze Runner', 'value': '5.5'},],
'vendor': 'Dystopia'
}
{
'url': 'https://kindle.com/',
'variables': [{'key': 'Twilight', 'value': '5.9'},
{'key': 'Lord of the Rings', 'value': '9.0'},],
'vendor': 'Fantasy'
}
{
'url': 'https://kindle.com/',
'variables': [{'key': 'The Handmaids Tale', 'value': '10.0'},
{'key': 'Divergent', 'value': '9.0'},],
'vendor': 'Fantasy'
}
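How I read the data from MongoDB:
for item in mongo_data:
    url = item['url']
    genre = item['vendor']      # the genre is stored under 'vendor' in the data above
    books = item['variables']   # the key/value pairs are stored under 'variables'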
My code:
url_array = []
url_array.append(url)
unique_urls = set(url_array)
searches = []
main_dict = {}
searches.append(main_dict)
results = []
for url in list(unique_urls):
    book_vals = {}
    main_dict['url'] = url
    main_dict['results'] = [book_vals]
    results.append(book_vals)
    book_vals['genre'] = genre
    book_vals['data'] = books
My Result:
{
"searches": [
{
"url": "http://goodreads.com",
"results": [
{
"genre": "Fantasy",
"data": [
{
"name": "Harry Potter",
"value": "10.0"
},
{
"name": "Discovery of Witches",
"value": "8.5"
}
]
}
]
},
{
"url": "http://goodreads.com",
"results": [
{
"genre": "Dystopia",
"data": [
{
"name": "Hunger Games",
"value": "10.0"
},
{
"name": "Maze Runner",
"value": "5.5"
}
]
}
]
},
{
"url": "http://kindle.com",
"results": [
{
"genre": "Fantasy",
"data": [
{
"name": "Twilight",
"value": "5.9"
},
{
"name": "Lord of the Rings",
"value": "9.0"
}
]
}
]
},
{
"url": "http://kindle.com",
"results": [
{
"genre": "Dystopia",
"data": [
{
"name": "The Handmaids Tale",
"value": "10.0"
},
{
"name": "Divergent",
"value": "9.0"
}
]
}
]
}
]
}
Everything is being added to the searches array.
But I need them to be grouped by first the url in the main_dict and then again the results to be grouped by genre
Expected results:
{
'searches': [
{
'url': 'http://goodreads.com',
'results': [
{
'genre': 'Fantasy',
'data': [
{
'key': 'Harry Potter',
'value': '10.0'
}, {
'key': 'Discovery of Witches',
'value': '8.5'
}
]
}, {
'genre': 'Dystopia',
'data': [{
'key': 'Hunger Games',
'value': '10.0'
}, {
'key': 'Maze Runner',
'value': '5.5'
}
]
}
]
} ,
{
'url': 'http://kindle.com',
'results': [
{
'genre': 'Fantasy',
'data': [
{
'key': 'Twilight',
'value': '5.9'
}, {
'key': 'Lord of the Rings',
'value': '9.0'
}
]
}, {
'genre': 'Dystopia',
'data': [{
'key': 'The Handmaids Tale',
'value': '10.0'
}, {
'key': 'Divergent',
'value': '9.0'
}
]
}
]
}
]
}
Sorry for any data structural issues.
Upvotes: 0
Views: 1034
Reputation: 4070
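Try the following. The key is to use groupby to group items with the same URL together. Note that groupby only groups consecutive items, so the data must already be sorted by URL (or be sorted first).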
mongo_data = [{
'url': 'https://goodreads.com/',
'variables': [{'key': 'Harry Potter', 'value': '10.0'},
{'key': 'Discovery of Witches', 'value': '8.5'},],
'vendor': 'Fantasy'
},{
'url': 'https://goodreads.com/',
'variables': [{'key': 'Hunger Games', 'value': '10.0'},
{'key': 'Maze Runner', 'value': '5.5'},],
'vendor': 'Dystopia'
},{
'url': 'https://kindle.com/',
'variables': [{'key': 'Twilight', 'value': '5.9'},
{'key': 'Lord of the Rings', 'value': '9.0'},],
'vendor': 'Fantasy'
},{
'url': 'https://kindle.com/',
'variables': [{'key': 'The Handmaids Tale', 'value': '10.0'},
{'key': 'Divergent', 'value': '9.0'},],
'vendor': 'Fantasy'
}]
from itertools import groupby
from operator import itemgetter
import json

get_url = itemgetter('url')
searches = []
# groupby only merges adjacent items, so sort by URL first
for key, group in groupby(sorted(mongo_data, key=get_url), key=get_url):
    search = {}
    search["url"] = key
    search["results"] = [{"genre": result["vendor"], "data": result["variables"]} for result in group]
    searches.append(search)
print(json.dumps(searches))
Output
[
{
"url": "https://goodreads.com/",
"results": [
{
"genre": "Fantasy",
"data": [
{
"key": "Harry Potter",
"value": "10.0"
},
{
"key": "Discovery of Witches",
"value": "8.5"
}
]
},
{
"genre": "Dystopia",
"data": [
{
"key": "Hunger Games",
"value": "10.0"
},
{
"key": "Maze Runner",
"value": "5.5"
}
]
}
]
},
{
"url": "https://kindle.com/",
"results": [
{
"genre": "Fantasy",
"data": [
{
"key": "Twilight",
"value": "5.9"
},
{
"key": "Lord of the Rings",
"value": "9.0"
}
]
},
{
"genre": "Fantasy",
"data": [
{
"key": "The Handmaids Tale",
"value": "10.0"
},
{
"key": "Divergent",
"value": "9.0"
}
]
}
]
}
]
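Note that the loop above emits one results entry per document, so two documents that share both URL and genre stay separate (see the two "Fantasy" entries under kindle.com in the output). If you also want those merged, here is one possible sketch. It assumes documents shaped like the ones in the question; the sample list is trimmed from that data and deliberately left unsorted, and the function name group_by_url_and_genre is made up for illustration.

```python
import json

# Sample documents in the shape shown in the question (trimmed, unsorted).
mongo_data = [
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Twilight", "value": "5.9"}]},
    {"url": "https://goodreads.com/", "vendor": "Dystopia",
     "variables": [{"key": "Hunger Games", "value": "10.0"}]},
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Divergent", "value": "9.0"}]},
]


def group_by_url_and_genre(documents):
    """Return one entry per URL, with one result per genre under it."""
    grouped = {}  # url -> genre -> combined list of key/value pairs
    for doc in documents:
        genres = grouped.setdefault(doc["url"], {})
        genres.setdefault(doc["vendor"], []).extend(doc["variables"])
    # Dicts keep insertion order (Python 3.7+), so URLs and genres
    # come out in the order they were first seen.
    return [
        {
            "url": url,
            "results": [
                {"genre": genre, "data": data} for genre, data in genres.items()
            ],
        }
        for url, genres in grouped.items()
    ]


searches = group_by_url_and_genre(mongo_data)
print(json.dumps({"searches": searches}, indent=2))
```

Because this uses plain dictionaries rather than groupby, the input does not need to be sorted first.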
Upvotes: 1
Reputation: 9997
So, if this is your code, it doesn't make a ton of sense. (I'm assuming that for some reason you didn't share your actual code?)
url_array = []
url_array.append(url)
# so- your url_array only has one url?
unique_urls = set(url_array)
searches = []
main_dict = {}
searches.append(main_dict)
# searches will only contain one dict?
results = []
for url in list(unique_urls):
    book_vals = {}
    main_dict['url'] = url
    # as written, you would be over-writing the values in 'main_dict' every time
    main_dict['results'] = [book_vals]
    results.append(book_vals)
    book_vals['genre'] = genre
    book_vals['data'] = books
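To see the overwriting concretely, here is a minimal sketch. The hard-coded URL list is an assumption for illustration, not your real data, and the names urls and fixed are made up.

```python
# Hypothetical stand-in for the unique URLs collected from the data.
urls = ["https://goodreads.com/", "https://kindle.com/"]

# Pattern from the question: one dict is appended once, then mutated.
searches = []
main_dict = {}
searches.append(main_dict)      # the list holds a reference to this one dict
for url in urls:
    main_dict["url"] = url      # replaces the previous URL in place

print(len(searches), searches[0]["url"])

# Creating a fresh dict inside the loop keeps one entry per URL.
fixed = []
for url in urls:
    fixed.append({"url": url, "results": []})

print(len(fixed))
```

The first list ends up with a single entry holding whichever URL was written last, while the second has one entry per URL.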
Instead, let me make some more general points about this problem. You said:
But I need them to be grouped by first the url in the main_dict and then again the results to be grouped by genre
If we want to take your search results and group them twice, this is how I would do it.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class SearchResult:
    url: str
    title: str
    genre: str

# url -> genre -> list of titles; missing keys are created on first access
searches = defaultdict(lambda: defaultdict(list))
# search_data is assumed to be an iterable of SearchResult objects
for search in search_data:
    searches[search.url][search.genre].append(search.title)
The basic idea is that when you group things, you use a dictionary. To group searches by URL, you keep a dict that maps each URL to a collection of results. Since you want the grouping nested, use a dict of URLs to a dict of genres to a list of titles.
The defaultdict is just a convenience that creates each record on first access, instead of checking whether it exists and adding an empty object when necessary.
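To show what that convenience saves, here is a small side-by-side sketch. The (url, genre, title) rows are a hypothetical flattening of the question's data, and the names rows, manual, auto and auto_plain are made up for illustration.

```python
from collections import defaultdict

# Hypothetical (url, genre, title) rows, using titles from the question.
rows = [
    ("https://goodreads.com/", "Fantasy", "Harry Potter"),
    ("https://goodreads.com/", "Dystopia", "Hunger Games"),
    ("https://kindle.com/", "Fantasy", "Twilight"),
    ("https://goodreads.com/", "Fantasy", "Discovery of Witches"),
]

# Manual version: create each missing level before using it.
manual = {}
for url, genre, title in rows:
    if url not in manual:
        manual[url] = {}
    if genre not in manual[url]:
        manual[url][genre] = []
    manual[url][genre].append(title)

# defaultdict version: missing levels are created on first access.
auto = defaultdict(lambda: defaultdict(list))
for url, genre, title in rows:
    auto[url][genre].append(title)

# Convert back to plain dicts for comparison or JSON output.
auto_plain = {url: dict(genres) for url, genres in auto.items()}
print(auto_plain == manual)
```

Both loops build the same nested structure; the defaultdict version simply drops the two existence checks.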
Upvotes: 0