Reputation: 1833
I have a Python dictionary as follows:
dict = {4:0.65,8:1.23,3:0.43}
I would like to convert this to a Python list, using each key as the index into the list. The desired result would be:
listLength = 10
plist = [0,0,0,0.43,0.65,0,0,0,1.23,0]
I know how to do this with a loop, but that is not Pythonic and it is not fast. What is the most Pythonic way to do this without using a loop?
I specifically need the approach with the best performance.
Upvotes: 0
Views: 80
Reputation: 53039
For larger data sets you can gain some speed by using np.fromiter directly on the key and value iterators instead of creating lists first.
Create a test case:
>>> import numpy as np
>>> from timeit import timeit
>>> d = dict(zip(np.random.randint(1, 10, 1_000_000).cumsum(), np.arange(1_000_000.)))
>>> out = np.zeros(10_000_000)
Define the fromiter method:
>>> def use_iter():
...     k, v = (np.fromiter(w, dtype=t, count=len(d)) for w, t in [(d.keys(), int), (d.values(), float)])
...     out[k] = v
...     return out
and the list method for reference:
>>> def use_list():
...     out[list(d.keys())] = list(d.values())
...     return out
and time them
>>> timeit(use_iter, number=100)
4.2583943260106025
>>> timeit(use_list, number=100)
17.10310926999955
Also, check correctness
>>> np.all(use_list() == use_iter())
True
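For a quick sanity check, the same idea applied to the small dictionary from the question might look like the sketch below. The names list_length and plist come from the question; using np.intp as the index dtype is my own choice here and not something the benchmark above relies on.

```python
import numpy as np

d = {4: 0.65, 8: 1.23, 3: 0.43}
list_length = 10

# count=len(d) tells NumPy the final size up front, so it can preallocate
keys = np.fromiter(d.keys(), dtype=np.intp, count=len(d))
values = np.fromiter(d.values(), dtype=float, count=len(d))

out = np.zeros(list_length)
out[keys] = values      # scatter each value to the index given by its key
plist = out.tolist()    # convert back to a plain Python list if needed
print(plist)
```

Note that a key equal to or larger than list_length would raise an IndexError in the assignment, so the output array has to be sized for the largest key.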
Upvotes: 1
Reputation: 539
You can just iterate over the dictionary and place the values into a list. The check makes sure that each key is within the specified list length.
plist = [0] * length
for key, val in d.items():
    if 0 <= key < length:
        plist[key] = val
If you want the list to be just big enough to hold the largest key, use this instead (note the + 1, since indices start at 0):
max_key = max(d, key=int)
plist = [0] * (max_key + 1)
for key, val in d.items():
    plist[key] = val
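Both variants can be folded into one small helper. This is only a sketch: dict_to_list and its length and fill parameters are names I made up for illustration, and it assumes the keys are integers.

```python
def dict_to_list(d, length=None, fill=0):
    """Place each value of d at the index given by its key.

    With length=None the list is sized to hold the largest key.
    Keys outside range(length) are skipped instead of raising IndexError.
    """
    if length is None:
        # default=-1 makes an empty dict produce an empty list
        length = max(d, default=-1) + 1
    out = [fill] * length
    for key, val in d.items():
        if 0 <= key < length:
            out[key] = val
    return out


d = {4: 0.65, 8: 1.23, 3: 0.43}
print(dict_to_list(d, 10))  # fixed length of 10
print(dict_to_list(d))      # length derived from the largest key
```

The loop only touches len(d) entries, so the cost of filling does not depend on how sparse the list is.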
Upvotes: 0
Reputation: 51165
Using numpy and numpy indexing is going to be the most performant solution:
out = np.zeros(10)
out[list(d.keys())] = list(d.values())
array([0. , 0. , 0. , 0.43, 0.65, 0. , 0. , 0. , 1.23, 0. ])
Performance, since you asked:
k = np.random.randint(1, 100000, 10000)
v = np.random.rand(10000)
d = dict(zip(k, v))
In [119]: %%timeit
...: out = np.zeros(100000)
...: out[list(d.keys())] = list(d.values())
...:
...:
1.86 ms ± 13.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [120]: %timeit [d.get(i, 0) for i in range(100000)]
17.4 ms ± 231 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [121]: %timeit pd.Series(d).reindex(range(100000),fill_value=0).tolist()
9.77 ms ± 148 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
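If you want to reproduce this kind of comparison without IPython, a rough harness with the standard timeit module could look like the following. It is a sketch under my own assumptions: the function names fill_loop and comprehension are made up, it only compares two pure-Python variants, and the numbers will differ from the ones above depending on machine and data.

```python
import random
from timeit import timeit

random.seed(0)  # fixed seed so repeated runs use the same data
n = 100_000
d = {random.randrange(n): random.random() for _ in range(10_000)}


def fill_loop():
    # Preallocate, then visit only the keys that exist
    out = [0] * n
    for k, v in d.items():
        out[k] = v
    return out


def comprehension():
    # Visit every index and look each one up in the dict
    return [d.get(i, 0) for i in range(n)]


# Make sure both variants agree before timing them
assert fill_loop() == comprehension()

for fn in (fill_loop, comprehension):
    print(fn.__name__, timeit(fn, number=20))
```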
Upvotes: 2
Reputation: 61910
You could do something like this:
list_length = 10
d = {4: 0.65, 8: 1.23, 3: 0.43}
plist = [d.get(i, 0) for i in range(list_length)]
print(plist)
Output
[0, 0, 0, 0.43, 0.65, 0, 0, 0, 1.23, 0]
Note: Don't use the name dict for your own variables; doing so shadows the built-in name dict.
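A couple of details of the d.get approach are worth seeing in action. The variable names with_none and short below are just for illustration.

```python
d = {4: 0.65, 8: 1.23, 3: 0.43}

# The second argument of d.get is the value used for missing indices
print([d.get(i, 0) for i in range(10)])

# Without a default, d.get returns None for indices that are not keys
with_none = [d.get(i) for i in range(10)]

# Keys at or beyond the chosen length are never looked up, so 8 is dropped
short = [d.get(i, 0) for i in range(5)]
print(short)  # [0, 0, 0, 0.43, 0.65]
```

So the list length has to be chosen with the largest key in mind, otherwise entries are silently lost.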
Upvotes: 0
Reputation: 323316
Since you tagged pandas, here is a solution using reindex:
pd.Series(d).reindex(range(10),fill_value=0).tolist()
Out[369]: [0.0, 0.0, 0.0, 0.43, 0.65, 0.0, 0.0, 0.0, 1.23, 0.0]
Upvotes: 3
Reputation: 4606
Using a list comprehension:
lst = [d[i] if i in d else 0 for i in range(10)]
print(lst)
# [0, 0, 0, 0.43, 0.65, 0, 0, 0, 1.23, 0]
Expanded:
lst = []
for i in range(10):
    if i in d:
        lst.append(d[i])
    else:
        lst.append(0)
Upvotes: 0
Reputation: 419
Avoid shadowing the built-in dict. Use some other name instead.
dict_ = {4:0.65,8:1.23,3:0.43}
length = max(dict_) + 1 # Get number of entries needed
list_ = [0] * length # Initialize a list of zeroes
for i in dict_:
    list_[i] = dict_[i]
Upvotes: 0