Reputation: 310
In my DataFrame I have a column whose values are lists of dicts. When I do
data.stations.apply(lambda x: x)[5]
the output is:
[{'id': 245855,
'outlets': [{'connector': 13, 'id': 514162, 'power': 0},
{'connector': 3, 'id': 514161, 'power': 0},
{'connector': 7, 'id': 514160, 'power': 0}]},
{'id': 245856,
'outlets': [{'connector': 13, 'id': 514165, 'power': 0},
{'connector': 3, 'id': 514164, 'power': 0},
{'connector': 7, 'id': 514163, 'power': 0}]},
{'id': 245857,
'outlets': [{'connector': 13, 'id': 514168, 'power': 0},
{'connector': 3, 'id': 514167, 'power': 0},
{'connector': 7, 'id': 514166, 'power': 0}]}]
So it looks like 3 dicts in a list.
When I do
data.stations.apply(lambda x: x[0])[5]
It does what it should:
{'id': 245855,
'outlets': [{'connector': 13, 'id': 514162, 'power': 0},
{'connector': 3, 'id': 514161, 'power': 0},
{'connector': 7, 'id': 514160, 'power': 0}]}
However, when I choose the second or third element, it doesn't work:
data.stations.apply(lambda x: x[1])[5]
This gives an error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-118-1210ba659690> in <module>()
----> 1 data.stations.apply(lambda x: x[1])[5]
~\AppData\Local\Continuum\Anaconda3\envs\geo2\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
2549 else:
2550 values = self.asobject
-> 2551 mapped = lib.map_infer(values, f, convert=convert_dtype)
2552
2553 if len(mapped) and isinstance(mapped[0], Series):
pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()
<ipython-input-118-1210ba659690> in <lambda>(x)
----> 1 data.stations.apply(lambda x: x[1])[5]
IndexError: list index out of range
Why? It should just give me the second element.
Upvotes: 1
Views: 44
Reputation: 30605
The reason is probably simple: the lists in the rows may not all be the same length. Let's consider an example:
import pandas as pd
data = pd.DataFrame({'stations':[[{'1':2,'3':4},{'1':2,'3':4},{'1':2,'3':4}],
[{'1':2,'3':4},{'1':2,'3':4}],
[{'1':2,'3':4}],
[{'1':2,'3':4},{'1':2,'3':4},{'1':2,'3':4}]]
})
stations
0 [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, ...
1 [{'1': 2, '3': 4}, {'1': 2, '3': 4}]
2 [{'1': 2, '3': 4}]
3 [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, ...
If you do:
data['stations'].apply(lambda x: x[0])[3]
You will get:
{'1': 2, '3': 4}
But if you do:
data['stations'].apply(lambda x: x[1])[3]
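You will get IndexError: list index out of range
because apply runs the lambda on every row before [3] picks out the result, and the third row (index 2) has only one element in its list. In your data, row 5 has three dicts, but some other row must have fewer than two. Hope that clears up your doubt.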
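To find the offending rows and still pull out the second element where it exists, something like the following sketch might help. It assumes every value in the column is a list; the sample data and the `nth_or_none` helper are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Hypothetical sample data: lists of different lengths, as in the example above.
data = pd.DataFrame({'stations': [
    [{'id': 1}, {'id': 2}, {'id': 3}],
    [{'id': 4}, {'id': 5}],
    [{'id': 6}],
    [{'id': 7}, {'id': 8}, {'id': 9}],
]})

# Length of the list in each row.
lengths = data['stations'].apply(len)

# Index labels of the rows too short to have a second element.
short_rows = lengths[lengths < 2].index.tolist()

def nth_or_none(items, n):
    """Return items[n] if the list is long enough, otherwise None."""
    return items[n] if len(items) > n else None

# Safe version of apply(lambda x: x[1]): short rows yield None instead of raising.
second = data['stations'].apply(lambda x: nth_or_none(x, 1))

print(short_rows)
print(second[3])
```

Here `short_rows` tells you which rows caused the IndexError, and `second[3]` can be read without the lambda failing on a shorter row.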
Upvotes: 3