JohnE

Reputation: 30404

Map vs applymap when passing a dictionary

I thought I understood map vs applymap pretty well, but am having a problem (see here for additional background, if interested).

A simple example:

import pandas as pd

df  = pd.DataFrame( [[1,2],[1,1]] )
dct = { 1:'python', 2:'gator' }

df[0].map( lambda x: x+90 )
df.applymap( lambda x: x+90 )

That works as expected -- both operate on an elementwise basis, map on a series, applymap on a dataframe (explained very well here btw).

If I use a dictionary rather than a lambda, map still works fine:

df[0].map( dct )

0    python
1    python

but not applymap:

df.applymap( dct )
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-100-7872ff604851> in <module>()
----> 1 df.applymap( dct )

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in applymap(self, func)
   3856                 x = lib.map_infer(_values_from_object(x), f)
   3857             return lib.map_infer(_values_from_object(x), func)
-> 3858         return self.apply(infer)
   3859 
   3860     #----------------------------------------------------------------------

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   3687                     if reduce is None:
   3688                         reduce = True
-> 3689                     return self._apply_standard(f, axis, reduce=reduce)
   3690             else:
   3691                 return self._apply_broadcast(f, axis)

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in _apply_standard(self, func, axis, ignore_failures, reduce)
   3777             try:
   3778                 for i, v in enumerate(series_gen):
-> 3779                     results[i] = func(v)
   3780                     keys.append(v.name)
   3781             except Exception as e:

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in infer(x)
   3855                 f = com.i8_boxer(x)
   3856                 x = lib.map_infer(_values_from_object(x), f)
-> 3857             return lib.map_infer(_values_from_object(x), func)
   3858         return self.apply(infer)
   3859 

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\lib.pyd in pandas.lib.map_infer (pandas\lib.c:56990)()

TypeError: ("'dict' object is not callable", u'occurred at index 0')

So, my question is why don't map and applymap work in an analogous manner here? Is it a bug with applymap, or am I doing something wrong?

Edit to add: I have discovered that I can work around this fairly easily with this:

df.applymap( lambda x: dct[x] )

        0       1
0  python   gator
1  python  python

Or better yet via this answer which requires no lambda.

df.applymap( dct.get )

So that is pretty much exactly equivalent, right? Must be something with how applymap parses the syntax and I guess the explicit form of a function/method works better than a dictionary. Anyway, I guess now there is no practical problem remaining here but am still interested in what is going on here if anyone wants to answer.
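On the "pretty much exactly equivalent" point: the two workarounds agree for keys that exist in the dictionary but differ for keys that do not. The plain-Python sketch below illustrates this; the names lookup_strict and lookup_lenient are made up for the example and are not part of pandas.

```python
dct = {1: "python", 2: "gator"}

# Hypothetical names, for illustration only.
lookup_strict = lambda x: dct[x]   # raises KeyError for a missing key
lookup_lenient = dct.get           # returns None for a missing key

# Both agree when the key is present.
print(lookup_strict(1), lookup_lenient(1))

# They differ when the key is absent.
print(lookup_lenient(3))
try:
    lookup_strict(3)
except KeyError as exc:
    print("KeyError:", exc)
```

So if the frame may contain values that are not in the dictionary, the two forms are not interchangeable.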

Upvotes: 10

Views: 5990

Answers (1)

Data_addict

Reputation: 320

It is true that both .applymap() and .map() work element-wise. However, .applymap() does not call .map() on each column; in effect it calls .apply() on each column instead.

So when you call df.applymap(dct), what happens is effectively df[0].apply(dct), not df[0].map(dct).
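If the goal is a dictionary lookup over the whole frame, one possible approach (a sketch, not the only option) is to run Series.map on each column yourself. It reuses the df and dct from the question; result is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 1]])
dct = {1: "python", 2: "gator"}

# Series.map accepts a dict, so mapping each column separately
# performs the element-wise lookup across the whole DataFrame.
result = df.apply(lambda col: col.map(dct))
print(result)
```

Here DataFrame.apply passes each column to the lambda as a Series, and the dict handling is left to Series.map.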

Here is the difference between these two Series methods:

.map() accepts a Series, a dict, or a function (any callable, so methods like dict.get work too) as its first argument, whereas .apply() only accepts a function (or any other callable) as its first argument.

.map() contains an if statement that checks whether the first argument is a dict, a Series, or a function, and acts accordingly. When you pass a function to .map(), it does the same thing as .apply().

But .apply() has no such checks, so it cannot handle a dictionary or a Series. It only knows how to work with a callable.
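The dispatch described above can be sketched in plain Python. This is only an illustration of the idea, not pandas's actual implementation: map_like and apply_like are hypothetical stand-ins, the Series case is left out, and missing keys give None here where pandas would give NaN.

```python
def map_like(values, arg):
    """Hypothetical stand-in for Series.map: dispatch on the argument type."""
    if isinstance(arg, dict):
        # Dict case: look each value up (None for missing keys in this sketch).
        return [arg.get(v) for v in values]
    if callable(arg):
        # Function case: call it on each value.
        return [arg(v) for v in values]
    raise TypeError("arg must be a dict or a callable")


def apply_like(values, func):
    """Hypothetical stand-in for Series.apply: always calls func directly."""
    return [func(v) for v in values]


dct = {1: "python", 2: "gator"}
print(map_like([1, 1], dct))                  # dict is looked up
print(map_like([1, 2], lambda x: x + 90))     # function is called

try:
    apply_like([1, 1], dct)                   # dict is called, which fails
except TypeError as exc:
    print("TypeError:", exc)
```

The last call fails with "'dict' object is not callable", the same message that appears at the bottom of the traceback in the question.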

When you call .apply() or .map() with a function, both end up calling lib.map_infer(), which appears to behave like Python's built-in map() function (but I was unable to find the source code, so I'm not completely sure).

Calling map(dct, df[0]) gives the same error as df.applymap(dct), and df[0].apply(dct) gives that error too. (In Python 3, map() is lazy, so the error only appears once the result is consumed, e.g. with list(map(dct, df[0])).)

Now you may ask: why use .apply() instead of .map(), if .map() does the same thing when called with a function and can also take a dict or a Series?

Because .apply() can return a DataFrame if the function you pass to it returns a Series.

import pandas

ser = pandas.Series([1,2,3,4,5], index=range(5))

ser_map = ser.map(lambda x : pandas.Series([x]*5, index=range(5)))
type(ser_map)
pandas.core.series.Series

ser_app = ser.apply(lambda x : pandas.Series([x]*5, index=range(5)))
type(ser_app)
pandas.core.frame.DataFrame

Upvotes: 7
