Reputation: 2232
Applying functions to dataframe
I currently have the following dataframe:
Data
url visitors
http://somedomain.com 200000
http://someotherdomain.com 150000
http://somenewdomain.com 11000
For every row in the dataframe, I would like to apply two functions to the url column and write the results into two separate columns, 'meta' and 'content'.
Functions:
def metacrawler(url):
    ...
    return data
def contentcrawler(url):
    ...
    return data
# Counter
progress = 0
Loop
for index, row in data.iterrows():
    print(str(progress), " out of ", str(len(data)))
    print('Starting meta crawling.')
    row['meta'] = metacrawler(row["url"])
    print('Starting content crawling.')
    row['content'] = contentcrawler(row["url"])
    print('Complete.')
    progress += 1
However, when I aborted the process after a few iterations, I found that no data had been written to the dataframe. No columns had been created either.
What did I do wrong?
Solution
def func(row):
    print("Crawling Meta")
    meta = metacrawler(row["url"])
    print("Crawling Content")
    content = contentcrawler(row["url"])
    # Return both results as a tuple so they can be expanded into two columns
    return meta, content
data[['meta', 'content']] = data.apply(func, axis=1, result_type='expand')
Upvotes: 0
Views: 272
Reputation: 1105
You can use the .apply() function (see the docs) with result_type='expand':
In [3]: df = pd.DataFrame({'one':[1,2,3,4], 'two':[5,6,7,8]})
In [4]: df.apply(lambda x: (sum(x), sum(x)), axis=1, result_type='expand')
Out[4]:
    0   1
0   6   6
1   8   8
2  10  10
3  12  12
In [5]: df[['new', 'etc']] = df.apply(lambda x: (sum(x), sum(x)), axis=1, result_type='expand')
In [6]: df
Out[6]:
   one  two  new  etc
0    1    5    6    6
1    2    6    8    8
2    3    7   10   10
3    4    8   12   12
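As for why the original loop wrote nothing: iterrows() yields each row as a separate Series, so assigning to row['meta'] only changes that temporary object and never touches the dataframe. Here is a minimal sketch of the difference. The two crawler functions are hypothetical stand-ins that just return strings, and pre-creating the columns with None is my own choice, not something from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the question's crawlers (the real ones are not shown)
def metacrawler(url):
    return "meta:" + url

def contentcrawler(url):
    return "content:" + url

data = pd.DataFrame({
    "url": ["http://somedomain.com", "http://someotherdomain.com"],
    "visitors": [200000, 150000],
})

# Assigning to the Series yielded by iterrows() does not modify `data`
for index, row in data.iterrows():
    row["meta"] = metacrawler(row["url"])

columns_after_loop = list(data.columns)  # no 'meta' column was created

# Writing through the dataframe itself (here with .at) does persist
data["meta"] = None
data["content"] = None
for index, row in data.iterrows():
    data.at[index, "meta"] = metacrawler(row["url"])
    data.at[index, "content"] = contentcrawler(row["url"])
```

If you really want to keep the explicit loop, writing via data.at[index, column] is one way to do it, but apply with result_type='expand' is shorter.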
Edit: If you want to show progress, define the applied function separately, e.g.:
def func(row):
    print(row)
    return sum(row), sum(row)
In [3]: df = pd.DataFrame({'one':[1,2,3,4], 'two':[5,6,7,8]})
In [4]: df.apply(func, axis=1, result_type='expand')
Out[4]:
    0   1
0   6   6
1   8   8
2  10  10
3  12  12
In [5]: df[['new', 'etc']] = df.apply(func, axis=1, result_type='expand')
In [6]: df
Out[6]:
   one  two  new  etc
0    1    5    6    6
1    2    6    8    8
2    3    7   10   10
3    4    8   12   12
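Applied to the crawling case from the question, the same idea might look like the sketch below. The crawler functions are again hypothetical stubs, and using row.name as the progress counter is an assumption that only holds when the dataframe has the default 0..n-1 index.

```python
import pandas as pd

# Hypothetical stubs; replace with the real crawlers
def metacrawler(url):
    return "meta:" + url

def contentcrawler(url):
    return "content:" + url

data = pd.DataFrame({
    "url": ["http://somedomain.com", "http://someotherdomain.com",
            "http://somenewdomain.com"],
    "visitors": [200000, 150000, 11000],
})

total = len(data)

def crawl(row):
    # With axis=1, row.name is the row's index label
    print(f"{row.name + 1} out of {total}")
    return metacrawler(row["url"]), contentcrawler(row["url"])

# The returned tuples are expanded into two columns and assigned by position
data[["meta", "content"]] = data.apply(crawl, axis=1, result_type="expand")
```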
Upvotes: 2