Reputation: 1592
I have a table of population sizes for two countries in different years that looks like this:
year pop1 pop2
0 0 1.000000e+08 1.000000e+08
1 1 9.620000e+07 9.970000e+07
2 2 9.254440e+07 9.940090e+07
3 3 8.902771e+07 9.910270e+07
4 4 8.564466e+07 9.880539e+07
The table has information only about the first 300 years.
I'm trying to create a function that tells me in which year the population has decreased/increased by a factor of 2, 10 and 100. With help from jezrael, I have created the following function:
def find_year(df,init_pop,multiplier):
    pop_size=init_pop*multiplier
    pop1_values=df['pop1'].unique().tolist()
    pop2_values=df['pop2'].unique().tolist()
    s=pd.DataFrame(df.set_index('year').sub(pop_size).abs().idxmin(),columns =[multiplier])
    return(s)
The problem is that the growth rate is different in each population, so if I check when the population has decreased by a factor of 10 or 100, one of the populations only reaches that point after more than 300 years. In that case my function doesn't work and returns 300, since that is the maximum year in the table.
For example, here are the results when I check for the population decreasing to 0.1 and 0.01 of its initial size (the same year for pop2, which is wrong):
find_year(data,100000000,0.1)
>>> 0.1
pop1 59
pop2 300
find_year(data,100000000,0.01)
>>> 0.01
pop1 119
pop2 300
To fix this, I added conditions inside the function that check whether
pop_size=init_pop*multiplier
is present in one of the columns of the original df, and if not, leave it out of the results. I tried to do it this way:
def find_year(df,init_pop,multiplier):
    pop_size=init_pop*multiplier
    pop1_values=df['pop1'].unique().tolist()
    pop2_values=df['pop2'].unique().tolist()
    if pop_size.isin(pop1_values) & pop_size.isin(pop2_values):
        s=pd.DataFrame(df.set_index('year').sub(pop_size).abs().idxmin(),columns =[multiplier])
        return(s)
    elif pop_size.isin(pop1_values) & ~pop_size.isin(pop2_values):
        s=pd.DataFrame(df.set_index('year').sub(pop_size).abs().idxmin(),columns =[multiplier])
        s.drop('pop2',axis=0,inplace=True)
        return(s)
    elif pop_size.isin(pop2_values) & ~pop_size.isin(pop1_values):
        s=pd.DataFrame(df.set_index('year').sub(pop_size).abs().idxmin(),columns =[multiplier])
        s.drop('pop1',axis=0,inplace=True)
        return(s)
But when I run it I get:
AttributeError: 'float' object has no attribute 'isin'
I have also tried changing the numbers to integers, but I still got the same error, just with "int" instead of "float". I don't understand why this happens.
My end goal: to add a condition so that if the variable "pop_size" is not in column pop1 or pop2 of the original table, the corresponding row (the one that says it takes 300 years) is removed from the results df.
edit: how to produce my data:
import numpy as np
import pandas as pd

def growth(p,r,t):
    time=np.arange(0,t+1,1)
    res=[p]
    for t in time:
        p=p*r
        res.append(p)
    return(pd.DataFrame(list(zip(time, res)),columns =['year','pop']))

country1=growth(100000000,0.962,300)
country2=growth(100000000,0.997,300)
data=pd.concat([country1, country2['pop']],axis=1)
data.columns=['year','pop1','pop2']
data
Upvotes: 0
Views: 1058
Reputation: 893
You are getting the error because neither int nor float has an isin attribute: pop_size is a plain Python number, while isin is defined on pandas objects such as Series and DataFrame. Here is something that might solve your problem:
def find_year(df, init_pop, multiplier):
    pop_size = init_pop*multiplier
    pop1_values = df['pop1'].unique().tolist()
    pop2_values = df['pop2'].unique().tolist()
    s = pd.DataFrame(df.set_index('year').sub(
        pop_size).abs().idxmin(), columns=[multiplier])
    if pop_size in pop1_values and pop_size not in pop2_values:
        s.drop('pop2', axis=0, inplace=True)
    elif pop_size in pop2_values and pop_size not in pop1_values:
        s.drop('pop1', axis=0, inplace=True)
    return s
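One caveat: an exact membership test on floats will rarely match, because the computed populations almost never equal init_pop*multiplier exactly. If the real goal is to drop a population that never gets to the target within the table, a range check may be closer to what you want. Below is a minimal sketch under that assumption; the `reached` mask and the vectorized rebuild of `data` are my own illustration, not part of the original code.

```python
import numpy as np
import pandas as pd


def find_year(df, init_pop, multiplier):
    """Year in which each population is closest to init_pop * multiplier.

    Columns that never reach the target within the table are left out.
    """
    pop_size = init_pop * multiplier
    pops = df.set_index('year')
    # Assumption: a column "reaches" the target if the target lies
    # between that column's smallest and largest value.
    reached = (pops.min() <= pop_size) & (pop_size <= pops.max())
    closest = pops.loc[:, reached].sub(pop_size).abs().idxmin()
    return closest.to_frame(name=multiplier)


# Rebuild the question's data in vectorized form: years 0..300.
years = np.arange(0, 301)
data = pd.DataFrame({
    'year': years,
    'pop1': 100000000 * 0.962 ** years,
    'pop2': 100000000 * 0.997 ** years,
})

print(find_year(data, 100000000, 0.1))
print(find_year(data, 100000000, 0.01))
```

With this data, pop2 only falls to roughly 4.1e7 by year 300, so it is dropped for both 0.1 and 0.01, while pop1 still reports 59 and 119 as in the question.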
Upvotes: 1