Reputation: 2889
The following data representation:
[
{u'0xbd4f1cc0da707c5712651b659b86766ec6f25af5e388fc82474523339dd1da37': u'90000'},
{u'0x05a04a7bb2500087c14bc89eb6a49cd4c5afcac63270aff2d4508e610f606eed': u'40000'},
{u'0xc3f68d46b9e462110e4897a41b573a10fef72747fd4c9e8413eb2e4cba0af9b5': u'21000'},
{u'0x79dcc6ab82b2024a0d4135d4fa3a5cd62ab740f28fffa3fc4dfdb8b00430baab': u'158971'},
{u'0x034c9e7f28f136188ebb2a2630c26183b3df90c387490159b411cf7326764341': u'21000'},
{u'0xffda7269775dcd710565c5e0289a2254c195e006f34cafc80c4a3c89f479606e': u'1000000'},
{u'0x90ca439b7daa648fafee829d145adefa1dc17c064f43db77f573da873b641f19': u'90000'},
{u'0x7cba9f140ab0b3ec360e0a55c06f75b51c83b2e97662736523c26259a730007f': u'40000'},
{u'0x92dedff7dab405220c473aefd12e2e41d260d2dff7816c26005f78d92254aba2': u'21000'},
{u'0x0abe75e40a954d4d355e25e4498f3580e7d029769897d4187c323080a0be0fdd': u'21000'},
{u'0x22c2b6490900b21d67ca56066e127fa57c0af973b5d166ca1a4bf52fcb6cf81c': u'90000'},
{u'0x8570106b0385caf729a17593326db1afe0d75e3f8c6daef25cd4a0499a873a6f': u'90000'},
{u'0x8adfe7fc3cf0eb34bb56c59fa3dc4fdd3ec3f3514c0100fef800f065219b7707': u'40000'},
{u'0x8b0fe2b7727664a14406e7377732caed94315b026b37577e2d9d258253067553': u'21000'},
{u'0x244b29b60c696f4ab07c36342344fe6116890f8056b4abc9f734f7a197c93341': u'50000'},
{u'0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63': u'121000'},
{u'0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26': u'121000'}
]
Is generated from this loop:
dict_hash_gas = list()
for line in inpt:
    resource = json.loads(line)
    dict_hash_gas.append({resource['first']: resource['second']})
Based on data that looks, more or less, like so:
{"first":"A","second":"1","third":"2"}
{"first":"B","second":"1","third":"2"}
{"first":"C","second":"2","third":"2"}
{"first":"D","second":"3","third":"2"}
{"first":"E","second":"3","third":"2"}
{"first":"F","second":"3","third":"2"}
I've tried to find the maximum value of the second value in each dict, i.e.
{"first":"A","second":"LOOKING_FOR_MAX"}
How can I access all of the second values (the ones that look like u'90000') from that set of nested dictionaries, record and output the max and the min?
To precisely define terms: In the example up top, i.e.:
{u'0xbd4f1cc0da707c5712651b659b86766ec6f25af5e388fc82474523339dd1da37': u'90000'},
{u'0x05a04a7bb2500087c14bc89eb6a49cd4c5afcac63270aff2d4508e610f606eed': u'40000'},
{u'0xc3f68d46b9e462110e4897a41b573a10fef72747fd4c9e8413eb2e4cba0af9b5': u'21000'},
I'd like to search on the basis of u'90000', u'40000' and u'21000' - that's what I mean by "second" value.
The selection of max I'd like to make would be on the basis of the number alone, so in that case u'90000'.
EDIT:
Trying to call it in the following way, I generated the error reproduced below:
def _main():
    with open('transactions000000000029.json', 'rb') as inpt:
        dict_hash_gas = list()
        for line in inpt:
            resource = json.loads(line)
            dict_hash_gas.append({resource['hash']: resource['gas']})
    pairs = list(_as_pairs(dict_hash_gas))
    if pairs:
        # Avoid a ValueError from min() and max() if the list is empty.
        print(min(pairs, key=lambda pair: pair.value))
        print(max(pairs, key=lambda pair: pair.value))
Upvotes: 0
Views: 409
Reputation: 3956
Once you have your data in a tractable form, it's a one-liner.
In this case, since those dictionaries are obviously records of some sort, the ideal data type is either a custom class or a collections.namedtuple.
I went with the namedtuple, since all the values are atomic and immutable.
(Also, it comes with many handy features like decent __str__ and __hash__ methods, and it's more efficient too.)
All of the effort below is in _as_pairs, which generates immutable key-value pairs from that frustrating list of one-item dictionaries.
It also converts the stringified integers (value) into the actual integers you wish they were.
After that, using the data is easy.
import collections

# FIXME: Use more descriptive names than "Pair", "key", and "value".
Pair = collections.namedtuple('Pair', ['key', 'value'])

def _as_pairs(pairs):
    for pair in pairs:
        # TODO: Verify the dict contains exactly one item?
        for k, v in pair.items():
            # Should the `key` string also be an integer?
            #yield Pair(key=int(k, base=16), value=int(v))
            yield Pair(key=k, value=int(v))

def _main():
    # Abbreviated below, but contains the same inputs as your example.
    dict_hash_gas = [
        ...,
        {u'0xffda...606e': u'1000000'},
        {u'0x90ca...1f19': u'90000'},
        ...,
    ]
    pairs = list(_as_pairs(dict_hash_gas))
    if pairs:
        # Avoid a ValueError from min() and max() if the list is empty.
        print(min(pairs, key=lambda pair: pair.value))
        print(max(pairs, key=lambda pair: pair.value))

if '__main__' == __name__:
    _main()
Output (Python 3):
Pair(key='0xc3f6...f9b5', value=21000)
Pair(key='0xffda...606e', value=1000000)
I've included a couple of suggestions in the comments:
Is it important that those dictionaries have exactly one item each?
Should those hexadecimal strings (which I called key) also be converted into integers?
I can't tell what you're using this for, so I can't answer either of those questions.
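If the answer to either question turns out to be "yes", something along these lines might work. This is only a sketch: as_checked_pairs and its keys_as_int flag are names I made up for illustration, and the short sample keys are placeholders rather than your real hashes.

```python
import collections

Pair = collections.namedtuple('Pair', ['key', 'value'])

def as_checked_pairs(dicts, keys_as_int=False):
    """Yield one Pair per one-item dict; raise ValueError on anything else.

    Hypothetical helper, not part of any library.
    """
    for index, d in enumerate(dicts):
        if len(d) != 1:
            raise ValueError(
                'entry %d has %d items, expected exactly 1' % (index, len(d)))
        # Unpack the single (key, value) item of the dict.
        (k, v), = d.items()
        if keys_as_int:
            # int() accepts the '0x' prefix when the base is 16.
            k = int(k, 16)
        yield Pair(key=k, value=int(v))

# Placeholder data: short made-up keys, not real transaction hashes.
sample = [{'0x0a': '90000'}, {'0xff': '21000'}]
pairs = list(as_checked_pairs(sample, keys_as_int=True))
print(max(pairs, key=lambda pair: pair.value))
```

Raising on a malformed entry is a design choice; depending on your data you might prefer to skip or log such entries instead.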
Upvotes: 1
Reputation: 3454
Are you constrained to using dictionaries here? A list of tuples might be simpler to use:
dict_hash_gas = list()
for line in inpt:
    resource = json.loads(line)
    dict_hash_gas.append((resource['first'], resource['second']))

sorted_data = sorted(dict_hash_gas, key=lambda x: int(x[1]))
minimum = sorted_data[0]
maximum = sorted_data[-1]
yields:
('0xc3f68d46b9e462110e4897a41b573a10fef72747fd4c9e8413eb2e4cba0af9b5', '21000')
for the minimum and
('0xffda7269775dcd710565c5e0289a2254c195e006f34cafc80c4a3c89f479606e', '1000000')
for the maximum.
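If you only need the two extremes and not the full ordering, min() and max() with the same kind of key function should do it in a single pass each, without sorting. A minimal sketch, assuming the same (hash, gas) tuple layout; the short keys and the gas helper are made up for illustration:

```python
# Placeholder records: made-up short keys, gas values kept as strings.
records = [
    ('0xaa', '90000'),
    ('0xbb', '21000'),
    ('0xcc', '1000000'),
]

def gas(record):
    """Return the gas value of a (hash, gas) tuple as an integer."""
    return int(record[1])

minimum = min(records, key=gas)
maximum = max(records, key=gas)
print(minimum)
print(maximum)

# Comparing the raw strings instead would be lexicographic,
# so '90000' would sort above '1000000'.
string_max = max(records, key=lambda record: record[1])
```

The int() conversion in the key is the important part: without it the comparison is done on text, which is why the maximum can come out wrong.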
Edit to show example using collections.namedtuple:
from collections import namedtuple

DataItem = namedtuple('DataItem', ['first', 'second'])

dict_hash_gas = list()
for line in inpt:
    resource = json.loads(line)
    dict_hash_gas.append(DataItem(resource['first'], resource['second']))

# sorted() returns a new list, so keep the result.
sorted_data = sorted(dict_hash_gas, key=lambda x: int(x.second))
Upvotes: 0