Pandas Series apply lambda: NoneType found, but there are only str and list in the series

EDIT: jezrael had the right answer for the question I asked below. Unfortunately for me, I asked the wrong question. As it turns out, the problem was that the lists of strings in the DataFrame column contained None elements, which is where the error was coming from. Please see the answer I have added for the code I used to fix this.

SECOND EDIT: jezrael has updated his answer to do what I did, but more succinctly, in a lambda expression.


I have a DataFrame. I select one of its columns and call apply on it, passing a lambda expression that contains a conditional (if/else) expression. I understand that at this point the column is treated as a Series.

The column is made up of strings and lists of strings. I want to convert each list to a plain string by joining its elements and replacing the list with the resulting string, so that the DataFrame column contains only strings.

Relevant code:

raw_data.address = raw_data.address.fillna('')

At this point I have looped through the entire address column and added the type of every value to a set. The only elements in that set are str and list.

raw_data.address.apply(lambda x: x if type(x) == str else ' '.join(x))

and

raw_data.address.apply(lambda x: x if isinstance(x, str) else ' '.join(x))

do not work.

This is the error message in both cases:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-32-5e2dce775d20> in <module>
----> 1 raw_data.address.apply(lambda x: x if type(x) == str else ' '.join(x))

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3589             else:
   3590                 values = self.astype(object).values
-> 3591                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   3592 
   3593         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-32-5e2dce775d20> in <lambda>(x)
----> 1 raw_data.address.apply(lambda x: x if type(x) == str else ' '.join(x))

TypeError: sequence item 0: expected str instance, NoneType found

I don't understand why this doesn't work. My understanding is that the syntax is correct.

Upvotes: 1

Views: 2356

Answers (2)

As it turns out, the problem was that the lists in the DataFrame column themselves contained None elements. To solve this, instead of passing a lambda to apply, I wrote a normal function that uses the built-in filter function to remove the None values from the lists:

def make_strings(thing):
    if isinstance(thing, list):
        return ' '.join(filter(None, thing))
    else:
        return str(thing)
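
For completeness, here is a minimal sketch of how the function above might be passed to apply. The sample DataFrame is made up for illustration and only stands in for the real raw_data:

```python
import pandas as pd

def make_strings(thing):
    # Join list elements, skipping None (and any other falsy items)
    if isinstance(thing, list):
        return ' '.join(filter(None, thing))
    else:
        return str(thing)

# Hypothetical sample data, not the real raw_data
raw_data = pd.DataFrame({'address': [['a', 'b', None], 'c', '']})

# Replace the column with the cleaned-up strings
raw_data['address'] = raw_data['address'].apply(make_strings)
print(raw_data['address'].tolist())
```

Note that filter(None, thing) drops every falsy item, so empty strings inside a list are removed along with the None values.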

Upvotes: 0

jezrael

Reputation: 862771

Check whether each value is a list and, if it is, remove the None values before joining:

import pandas as pd

raw_data = pd.DataFrame({'address':[['a', 'b', None], 'c']})
print(raw_data)
        address
0  [a, b, None]
1             c

raw_data.address = (raw_data.address
                            .apply(lambda x: ' '.join(filter(None, x))
                                             if isinstance(x, list)
                                             else x))
print(raw_data)
  address
0     a b
1       c
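
The type check on the column itself only sees str and list, which is why the None values went unnoticed. A minimal sketch of one way to look inside the lists as well, using a made-up list of values in place of the real column (iterating over raw_data.address directly should behave the same way):

```python
# Hypothetical values standing in for raw_data.address
values = [['a', 'b', None], 'c', '']

# Types of the top-level values: only str and list show up here
outer_types = {type(v) for v in values}

# Types of the items inside each list: this is where NoneType appears
inner_types = {type(item)
               for v in values if isinstance(v, list)
               for item in v}

print(outer_types)
print(inner_types)
```

If inner_types contains NoneType, ' '.join will raise the TypeError shown in the question unless those items are filtered out first.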

Upvotes: 1
