Reputation: 209
This question is closely related to my earlier question (here).
Sorry that I have to ask again!
The code below runs and delivers the correct results, but it is again rather slow (4 minutes for 80K rows). I am having trouble using the pandas Series class for concrete values. Can someone recommend how I can classify those columns instead?
I could not find the relevant information in the documentation:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html
# p_test_SOLL_test_D10
for x in range(0, len(tableContent[6])):
    var = tableContent[6].loc[x, ('p_test_LAENGE')]
    if float(tableContent[6].loc[x, ('p_test_LAENGE')]) >= 100.0:
        tableContent[6].loc[x, ('p_test_LAENGE')] = 'yes'
    elif (float(tableContent[6].loc[x, ('p_test_LAENGE')]) < 30.0 and float(tableContent[6].loc[x, ('p_test_LAENGE')]) >= 10):
        tableContent[6].loc[x, ('p_test_LAENGE')] = 'yes2'
    elif (float(tableContent[6].loc[x, ('p_test_LAENGE')]) < 10.0 and float(tableContent[6].loc[x, ('p_test_LAENGE')]) >= 5):
        tableContent[6].loc[x, ('p_test_LAENGE')] = 'yes3'
    else:
        tableContent[6].loc[x, ('p_test_LAENGE')] = 'no'
print(tableContent[6]['p_test_LAENGE'])
if tableContent[6]['p_test_LAENGE'].astype(float) >= 100.0:
    tableContent[6]['p_test_LAENGE'] = 'yes'
elif (tableContent[6]['p_test_LAENGE'].astype(float) < 30.0 and tableContent[6]['p_test_LAENGE'].astype(float) >= 10):
    tableContent[6]['p_test_LAENGE'] = 'yes1'
elif (tableContent[6]['p_test_LAENGE'].astype(float) < 10.0 and tableContent[6]['p_test_LAENGE'].astype(float) >= 5):
    tableContent[6]['p_test_LAENGE'] = 'yes2'
else:
    tableContent[6]['p_test_LAENGE'] = 'no'
print(tableContent[6]['p_test_LAENGE'])
Upvotes: 0
Views: 45
Reputation: 1344
I do not have your df to test, so you need to modify the following code.
Assume that the min of df is greater than 10e-7 while the max of df is less than 10e7.
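import pandas as pd
# right=False gives left-closed bins [a, b), matching the >= comparisons in the question
bins = [10e-7, 5, 10, 30, 100, 10e7]
labels = ['no', 'yes2', 'yes1', 'no', 'yes']
# ordered=False (pandas >= 1.1) is required because the label 'no' appears twice
df['p_test_LAENGE_class'] = pd.cut(df['p_test_LAENGE'].astype(float), bins=bins, right=False, labels=labels, ordered=False)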
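If you would rather keep the shape of your if/elif chain, a different technique that might work is numpy.select, which evaluates a list of boolean Series in order and takes the first match per row. Note that a plain `if` or `and` on a whole Series raises a ValueError because the truth value of a Series is ambiguous, so the conditions have to be combined element-wise with `&`. The following is only a sketch: the sample frame and the output column name p_test_LAENGE_class are made up for illustration, and I assume the values are stored as strings, as your float() calls suggest.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for tableContent[6];
# the values are strings here, which is an assumption.
df = pd.DataFrame({'p_test_LAENGE': ['150', '20', '7', '50', '2', '100', '10', '5']})

# Convert the whole column once instead of calling float() per cell.
length = df['p_test_LAENGE'].astype(float)

# Conditions are checked in order, like an if/elif chain; the first match wins.
# Element-wise & replaces the Python keyword `and`.
conditions = [
    length >= 100.0,
    (length >= 10.0) & (length < 30.0),
    (length >= 5.0) & (length < 10.0),
]
choices = ['yes', 'yes1', 'yes2']

# Rows matching none of the conditions (including 30 <= value < 100) get 'no'.
df['p_test_LAENGE_class'] = np.select(conditions, choices, default='no')
print(df)
```

Writing the result to a new column keeps the original numbers available; assign back to p_test_LAENGE instead if you really want to overwrite them.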
Hope this will help you
Upvotes: 1