Reputation: 154
Maybe a very naive question, but I am stuck on this: pandas.DataFrame.apply accepts a function as its argument.
# define function for further usage
def get_string(df):
    string_input = []
    for id, field in enumerate(df.index):
        string_input.append('<field id="{0}" column="{1}">{2}</field>'.format(id, field, df[field]))
    return '\n'.join(string_input)
If I apply it to df, I get a perfectly formatted string output, as wanted:
global_1 = '\n'.join(df.apply(get_string, axis=1))
output:
<field id="0" column="xxx">49998.0</field>
<field id="1" column="xxx">37492.0</field>
<field id="2" column="xxx">12029.0</field>
But why don't I have to pass the global df to get_string() explicitly, i.e. get_string(df), like this:
global_1 = '\n'.join(df.apply(get_string(df), axis=1))
And what if I have more global input parameters? I have Googled this for a while, but it is still not clear to me. Could anyone give me an illustrative explanation of how it works? Thank you for any assistance.
Upvotes: 1
Views: 244
Reputation: 93161
You are confusing df the global variable with df the local variable.
The get_string function defines a parameter called df, which shadows any variable of the same name from outer scopes. You pass the function itself to apply, and apply calls it for you. With axis=1, the df that get_string sees is each row (a Series) of the dataframe you called apply upon, not the global df. You can try it with different dataframes:
import pandas as pd

df = pd.DataFrame({'a': ['Lorem', 'Ipsum']})
x = pd.DataFrame({'b': ['Hello', 'World']})
y = pd.DataFrame({'c': ['Goodbye', 'World']})
global_1 = '\n'.join(df.apply(get_string, axis=1))
global_2 = '\n'.join(x.apply(get_string, axis=1))
global_3 = '\n'.join(y.apply(get_string, axis=1))
print(global_1)
print(global_2)
print(global_3)
Result:
# From the global `df`
<field id="0" column="a">Lorem</field>
<field id="0" column="a">Ipsum</field>
# From x
<field id="0" column="b">Hello</field>
<field id="0" column="b">World</field>
# From y
<field id="0" column="c">Goodbye</field>
<field id="0" column="c">World</field>
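To make the mechanism more concrete, here is a minimal sketch of what apply does with the function you hand it, and one way to pass extra parameters. Writing get_string(df) would call the function immediately and give apply its return value instead of the function. The parameter name row, the extra prefix parameter, and the two-column sample frame are my own assumptions for illustration; they are not part of the original code.

```python
import pandas as pd


def get_string(row, prefix="field"):
    # With axis=1, `row` is one row of the frame (a pandas Series).
    # `prefix` is a hypothetical extra parameter, added only for illustration.
    parts = []
    for i, column in enumerate(row.index):
        parts.append('<{0} id="{1}" column="{2}">{3}</{0}>'.format(prefix, i, column, row[column]))
    return '\n'.join(parts)


df = pd.DataFrame({'a': ['Lorem', 'Ipsum'], 'b': ['Hello', 'World']})

# apply receives the function object and calls it once per row.
via_apply = df.apply(get_string, axis=1)

# Roughly the same thing written as an explicit loop over the rows.
via_loop = [get_string(row) for _, row in df.iterrows()]

# Extra arguments are forwarded by apply, either positionally through
# `args` or as keyword arguments.
with_args = df.apply(get_string, axis=1, args=("item",))
with_kwargs = df.apply(get_string, axis=1, prefix="item")

print('\n'.join(via_apply))
print('\n'.join(with_args))
```

So if your function needs more inputs, add them as parameters and pass them through args or as keyword arguments, rather than relying on globals.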
Upvotes: 1