Reputation: 1572
SEE UPDATE AT THE END FOR A MUCH CLEARER DESCRIPTION.
According to the DataFrame.apply documentation (http://pandas.pydata.org/pandas-docs/version/0.18.1/generated/pandas.DataFrame.apply.html), you can pass extra arguments through to the applied function, but the same is not true of applymap: http://pandas.pydata.org/pandas-docs/version/0.18.1/generated/pandas.DataFrame.applymap.html#pandas.DataFrame.applymap
I want to apply an elementwise function f(a, i), where a is the element and i is a manually entered argument. I need this because I will call df.applymap(f) in a loop for i in some_list.
To give an example of what I want, say I have a DataFrame df, where each element is a numpy.ndarray. I want to extract the i-th element of each ndarray and form a new DataFrame from them. So I define my f:
def f(a, i):
    return a[i]
Then I could write a loop that returns the i-th element of each np.ndarray contained in df:
for i in some_series:
    b[i] = df.applymap(f, i=i)
so that in each iteration, it would pass my value of i into the function f.
I realise it would all have been easier if I had used MultiIndexing for df, but for now this is what I'm working with. Is there a way to do what I want within pandas? I would ideally like to avoid for-looping through all the columns in df, and I don't see why applymap doesn't take keyword arguments while apply does.
Also, the way I currently understand it (I may be wrong), when I use df.apply it would give me the i-th element of each row/column, instead of the i-th element of each ndarray contained in df.
UPDATE:
So I just realised I could split df into Series and then use pd.Series.apply, which could do what I want. Let me generate some data to show what I mean:
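import numpy as np
import pandas as pd

def f(a, i):
    return a[i]

b = pd.Series(index=range(10), dtype=object)
for i in b.index:
    b[i] = np.random.rand(5)  # each cell holds a length-5 ndarray
b.apply(f, args=(1,))  # element at index 1 of each ndarray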
This does exactly what I expect and want. However, trying the same with a DataFrame:
b = pd.DataFrame(index=range(4), columns=range(4), dtype=object)
for i in b.index:
    for col in b.columns:
        b.loc[i, col] = np.random.rand(10)
b.apply(f, args=(1,))
gives me ValueError: Shape of passed values is (4, 10), indices imply (4, 4).
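A likely reading of that error, offered as an assumption rather than a confirmed diagnosis: DataFrame.apply hands f a whole column, so f(a, 1) picks the cell at label 1 of that column (a length-10 ndarray) instead of indexing into each array. Building on the Series.apply observation above, a minimal sketch is to apply per column and let Series.apply do the per-cell work. The object-array construction and the names cells, rng and second are illustrative choices, not part of the original post.

```python
import numpy as np
import pandas as pd

def f(a, i):
    return a[i]

# Build a 4x4 frame whose cells are length-10 ndarrays.
# Filling a 2-D object array first avoids cell-by-cell assignment into the frame.
rng = np.random.default_rng(0)
cells = np.empty((4, 4), dtype=object)
for r in range(4):
    for c in range(4):
        cells[r, c] = rng.random(10)
b = pd.DataFrame(cells)

# The outer apply passes each column as a Series;
# the inner Series.apply calls f once per cell, forwarding i=1 through args.
second = b.apply(lambda col: col.apply(f, args=(1,)))
```

Here second is a 4x4 DataFrame holding the element at index 1 of every ndarray in b.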
Upvotes: 1
Views: 9565
Reputation: 1437
This solution stores the argument in a closure: outer takes the argument and returns a one-parameter function that applymap can call on each cell.
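def f(cell, argument):
    """Do something with cell value and argument"""
    return output  # placeholder for whatever f computes

def outer(argument):
    # Capture `argument` so that `inner` needs only the cell value
    def inner(cell):
        return f(cell, argument)
    return inner

argument = ...
df.applymap(func=outer(argument))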
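As a concrete sketch of the same idea applied to the question's data, functools.partial from the standard library can stand in for the hand-written outer/inner pair. The sample frame, the names cells, elementwise and results, and the fallback between DataFrame.map and applymap are assumptions added for illustration; newer pandas releases rename applymap to DataFrame.map, so the sketch picks whichever the installed version provides.

```python
from functools import partial

import numpy as np
import pandas as pd

def f(a, i):
    return a[i]

# Small 2x2 frame where every cell is a length-3 ndarray.
cells = np.empty((2, 2), dtype=object)
for r in range(2):
    for c in range(2):
        cells[r, c] = np.arange(3) + 10 * (2 * r + c)
df = pd.DataFrame(cells, columns=["x", "y"])

# Use DataFrame.map when it exists, otherwise fall back to applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

# partial(f, i=i) is a one-argument callable, just like outer(argument).
results = {i: elementwise(partial(f, i=i)) for i in range(3)}
```

Each results[i] is a DataFrame with the same shape as df that holds the i-th element of every ndarray, which is the loop the question describes.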
Upvotes: 2
Reputation: 121
For a single column, you can use Series.map with a lambda that forwards the extra argument:
def matchValue(value, dictionary):
    return dictionary[value]

a = {'first': 1, 'second': 2}
b = {'first': 10, 'second': 20}
df['column'] = df['column'].map(lambda x: matchValue(x, a))
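A self-contained version of the snippet above, looping over both dictionaries so that the extra argument changes on each pass. The sample frame and the names lookups and mapped are hypothetical, added only to make the example runnable.

```python
import pandas as pd

def matchValue(value, dictionary):
    return dictionary[value]

a = {'first': 1, 'second': 2}
b = {'first': 10, 'second': 20}

# Hypothetical sample data: one column of dictionary keys.
df = pd.DataFrame({'column': ['first', 'second', 'first']})

lookups = {'a': a, 'b': b}
mapped = {}
for name, dictionary in lookups.items():
    # The lambda forwards the current dictionary as the second argument.
    mapped[name] = df['column'].map(lambda x: matchValue(x, dictionary))
```

Because map runs immediately inside the loop body, each lambda sees the dictionary of its own iteration.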
Upvotes: 3