MikolajM

Reputation: 310

Accessing any element other than the first in a list doesn't work

In my DataFrame I have a column whose values are lists of dicts. When I run

data.stations.apply(lambda x: x)[5]

the output is:

[{'id': 245855,
  'outlets': [{'connector': 13, 'id': 514162, 'power': 0},
   {'connector': 3, 'id': 514161, 'power': 0},
   {'connector': 7, 'id': 514160, 'power': 0}]},
 {'id': 245856,
  'outlets': [{'connector': 13, 'id': 514165, 'power': 0},
   {'connector': 3, 'id': 514164, 'power': 0},
   {'connector': 7, 'id': 514163, 'power': 0}]},
 {'id': 245857,
  'outlets': [{'connector': 13, 'id': 514168, 'power': 0},
   {'connector': 3, 'id': 514167, 'power': 0},
   {'connector': 7, 'id': 514166, 'power': 0}]}]

So it looks like 3 dicts in a list.

When I do

data.stations.apply(lambda x: x[0] )[5]

It does what it should:

{'id': 245855,
 'outlets': [{'connector': 13, 'id': 514162, 'power': 0},
  {'connector': 3, 'id': 514161, 'power': 0},
  {'connector': 7, 'id': 514160, 'power': 0}]}

However, when I choose the second or third element, it doesn't work:

data.stations.apply(lambda x: x[1])[5]

This gives an error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-118-1210ba659690> in <module>()
----> 1 data.stations.apply(lambda x: x[1])[5]

~\AppData\Local\Continuum\Anaconda3\envs\geo2\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
   2549             else:
   2550                 values = self.asobject
-> 2551                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   2552 
   2553         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()

<ipython-input-118-1210ba659690> in <lambda>(x)
----> 1 data.stations.apply(lambda x: x[1])[5]

IndexError: list index out of range

Why? It should just give me the second element.

Upvotes: 1

Views: 44

Answers (1)

Bharath M Shetty

Reputation: 30605

The reason might simply be that the lists in different rows are not all the same length. Let's consider an example:

import pandas as pd

data = pd.DataFrame({'stations': [[{'1':2,'3':4},{'1':2,'3':4},{'1':2,'3':4}],
                                  [{'1':2,'3':4},{'1':2,'3':4}],
                                  [{'1':2,'3':4}],
                                  [{'1':2,'3':4},{'1':2,'3':4},{'1':2,'3':4}]]
                     })

                                         stations
0  [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, ...
1               [{'1': 2, '3': 4}, {'1': 2, '3': 4}]
2                                 [{'1': 2, '3': 4}]
3  [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, ...

If you do:

data['stations'].apply(lambda x: x[0])[3]

You will get:

{'1': 2, '3': 4}

But if you do:

data['stations'].apply(lambda x: x[1])[3]

You will get IndexError: list index out of range. apply calls the lambda on every row before [3] picks out a result, and the third row (index 2) holds a list with only one element, so x[1] fails there. Hope this clears up your doubt.
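To check whether this is what is happening in your own data, you could look at the list lengths per row and then index defensively. This is only a rough sketch on the toy frame above; nth_or_none is a hypothetical helper name I made up, not something pandas provides.

```python
import pandas as pd

# Same toy data as in the example above.
data = pd.DataFrame({'stations': [
    [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, '3': 4}],
    [{'1': 2, '3': 4}, {'1': 2, '3': 4}],
    [{'1': 2, '3': 4}],
    [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, '3': 4}],
]})

# Length of the list in each row; any row shorter than 2 makes x[1] fail.
lengths = data['stations'].apply(len)
short_rows = lengths[lengths < 2].index.tolist()

def nth_or_none(items, n):
    """Hypothetical helper: return items[n], or None if the list is too short."""
    return items[n] if len(items) > n else None

# Second element of each row's list, with None where there is no second element.
second = data['stations'].apply(lambda x: nth_or_none(x, 1))
```

Here short_rows lists the row labels that would raise the error, and second[3] returns the second dict of row 3 without apply failing on the shorter row.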

Upvotes: 3
