Reputation: 49
I have a dataframe of URLs and verified URLs, and I want to add a column with the Levenshtein ratio, which compares the two URLs in each row.
Here is an example of my pandas dataframe:
    url                      url_ok2
13  10hanover.org/           NaN
15  111140.cevadosite.com/   aerorealestate.net/
42  18brownlow.com/          18brownlow.com:443/
57  1granary.com/            1granary.com/journal/
61  1rs.org.uk/              1rs.io/
79  2020visionnetwork.eu/    network.crowdhelix.com/
Here is my script:
import Levenshtein as lev
to_test['lev_ratio'] = None
for i in range(to_test.shape[0]):
    to_test.iloc[i]['lev_ratio'] = lev.ratio(str(to_test.iloc[i].url), str(to_test.iloc[i].url_ok2))
But the values are not replaced. Here is the dataframe after running the script:
    url                      url_ok2                  lev_ratio
13  10hanover.org/           NaN                      None
15  111140.cevadosite.com/   aerorealestate.net/      None
42  18brownlow.com/          18brownlow.com:443/      None
57  1granary.com/            1granary.com/journal/    None
61  1rs.org.uk/              1rs.io/                  None
79  2020visionnetwork.eu/    network.crowdhelix.com/  None
However, when I evaluate the expression on its own for a single row, it gives the expected value. For example, lev.ratio(str(to_test.iloc[0].url), str(to_test.iloc[0].url_ok2))
returns
0.45454545454545453
How can I replace the values in lev_ratio column for each row?
Upvotes: 0
Views: 32
Reputation: 73
Try using .apply on the DataFrame (df here stands for your dataframe, to_test in the question):
df['lev_ratio'] = df.apply(lambda x: lev.ratio(str(x.url),str(x.url_ok2)), axis=1)
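The loop in the question most likely fails because to_test.iloc[i] returns a temporary row object, so to_test.iloc[i]['lev_ratio'] = ... writes to that temporary copy rather than to the dataframe itself (chained assignment). Below is a minimal sketch of two ways around it, using a small made-up subset of the question's data. The ratio helper is an assumption for illustration: it uses Levenshtein.ratio when the package is installed and otherwise falls back to difflib.SequenceMatcher, which is a similar but not identical similarity measure.

```python
import pandas as pd

try:
    import Levenshtein as lev
    ratio = lev.ratio
except ImportError:
    # Stand-in (assumption): difflib's ratio is a comparable similarity
    # score in [0, 1], but it is not the same as Levenshtein.ratio.
    from difflib import SequenceMatcher

    def ratio(a, b):
        return SequenceMatcher(None, a, b).ratio()

# Small illustrative subset of the dataframe from the question.
to_test = pd.DataFrame(
    {
        "url": ["10hanover.org/", "18brownlow.com/", "1rs.org.uk/"],
        "url_ok2": [float("nan"), "18brownlow.com:443/", "1rs.io/"],
    },
    index=[13, 42, 61],
)

# Option 1: row-wise apply; the result is assigned to the column in one step.
to_test["lev_ratio"] = to_test.apply(
    lambda row: ratio(str(row["url"]), str(row["url_ok2"])), axis=1
)

# Option 2: keep the explicit loop, but write through a single .at call
# so the assignment targets the dataframe itself, not a temporary row.
to_test["lev_ratio_loop"] = float("nan")
for idx in to_test.index:
    to_test.at[idx, "lev_ratio_loop"] = ratio(
        str(to_test.at[idx, "url"]), str(to_test.at[idx, "url_ok2"])
    )

print(to_test)
```

Both columns should hold the same values. Note that str() turns a missing url_ok2 into the string 'nan', so those rows still get a (not very meaningful) ratio; you may prefer to skip or mask them.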
Upvotes: 1