Reputation: 1371
In the following code:
import pandas as pd
import sqlite3
import math
import numpy
con = sqlite3.connect(r'C:\Python34\factbook.db')
facts = pd.read_sql_query('select * from facts;', con)
facts.dropna(inplace=True)
facts = facts[facts['area_land']!=0][:]
facts = facts[facts['population']!=0][:]
facts.reset_index(drop=True, inplace=True)
def pop_50(name):
    pop = facts[facts['name'] == name]['population']
    perc = facts[facts['name'] == name]['population_growth']
    new_pop = pop*(math.e**(35*perc))
    return new_pop
x=pd.Series(data=facts['name'])
z = x.apply(pop_50)
x is a Series:
0 Afghanistan
1 Albania
2 Algeria
3 Andorra
4 Angola
5 Antigua and Barbuda
6 Argentina
7 Armenia
and so on...
But z isn't. Here is a link for seeing what it is (a DataFrame): https://www.scribd.com/document/357697929/Doc1
I can't understand why. The pop_50 function returns a single result (I tested it), so why is z a DataFrame? How can pop_50 return a Series? It takes a row (where facts['name'] == name) and, from it, a single value (under the population column), then calls it pop. It then does the same for perc. new_pop is a mathematical combination of two single values, so it is a single value as well, and the function returns just that, doesn't it?
Thank you.
Upvotes: 3
Views: 1676
Reputation: 294358
pop_50 returns a pd.Series. x.apply(pop_50) calls the function pop_50 for every row of x, with the value of that row being passed to pop_50 as the argument name. So for the first row in x, you return a series. And again for the second row. You end up with a series of series... which is a dataframe. Moreover, the index of x will be the columns of your result.
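To see this on a small scale, here is a minimal sketch. It assumes a made-up three-row stand-in for the facts table (the names and numbers are hypothetical, not taken from factbook.db) and reuses the pop_50 from the question unchanged:

```python
import math
import pandas as pd

# Hypothetical stand-in for the question's `facts` table; values are made up.
facts = pd.DataFrame({
    "name": ["A", "B", "C"],
    "population": [100, 200, 300],
    "population_growth": [0.01, 0.02, 0.0],
})

def pop_50(name):
    # Boolean-mask selection yields a Series, even when only one row matches.
    pop = facts[facts["name"] == name]["population"]
    perc = facts[facts["name"] == name]["population_growth"]
    return pop * (math.e ** (35 * perc))

single = pop_50("A")             # a one-element Series, not a scalar
z = facts["name"].apply(pop_50)  # one Series per row, stacked into a DataFrame

print(type(single).__name__)
print(type(z).__name__)
print(z.shape)
```

The one-element Series prints much like a single number, which is probably why the function looked like it returned a scalar when tested by hand.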
Try this instead:
facts2 = facts.set_index('name')
def pop_50(name):
    pop = facts2.at[name, 'population']
    perc = facts2.at[name, 'population_growth']
    new_pop = pop*(math.e**(35*perc))
    return new_pop
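As a rough check of the .at version, the sketch below runs it on a hypothetical three-row table (made-up values). It assumes every name is unique, since .at only returns a scalar when the label identifies a single row:

```python
import math
import pandas as pd

# Hypothetical stand-in for `facts`; values are made up and names are unique.
facts = pd.DataFrame({
    "name": ["A", "B", "C"],
    "population": [100, 200, 300],
    "population_growth": [0.01, 0.02, 0.0],
})

# Index by name so a (row label, column label) pair addresses one cell.
facts2 = facts.set_index("name")

def pop_50(name):
    # .at returns the scalar stored in a single cell.
    pop = facts2.at[name, "population"]
    perc = facts2.at[name, "population_growth"]
    return pop * (math.e ** (35 * perc))

z = facts["name"].apply(pop_50)  # scalars in, so the result is a Series

print(type(z).__name__)
```

Because pop_50 now returns a plain number for each name, apply has nothing to expand into columns.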
You can also use pd.Series.squeeze:
def pop_50(name):
    pop = facts[facts['name'] == name]['population'].squeeze()
    perc = facts[facts['name'] == name]['population_growth'].squeeze()
    new_pop = pop*(math.e**(35*perc))
    return new_pop
If for whatever reason you can't change pop_50, wrap it in a lambda:
z = x.apply(lambda name: pop_50(name).squeeze())
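The following sketch illustrates what squeeze does in the lambda wrapper, again on a hypothetical three-row table with made-up values. One caveat worth knowing: squeeze only collapses a Series to a scalar when it has exactly one element, so this relies on each name matching a single row:

```python
import math
import pandas as pd

# Hypothetical stand-in for `facts`; values are made up and names are unique.
facts = pd.DataFrame({
    "name": ["A", "B", "C"],
    "population": [100, 200, 300],
    "population_growth": [0.01, 0.02, 0.0],
})

def pop_50(name):
    # Unchanged from the question: returns a one-element Series.
    pop = facts[facts["name"] == name]["population"]
    perc = facts[facts["name"] == name]["population_growth"]
    return pop * (math.e ** (35 * perc))

x = facts["name"]
# squeeze() turns each one-element Series into a scalar before apply collects it.
z = x.apply(lambda name: pop_50(name).squeeze())

# A Series with more than one element is returned unchanged by squeeze().
unchanged = pd.Series([1, 2]).squeeze()

print(type(z).__name__)
print(type(unchanged).__name__)
```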
Upvotes: 1