Reputation: 41
I am running a function that scrapes some data from a website and writes it into a pandas DataFrame. I am using Selenium and geckodriver.
...code...
first_names = driver.find_elements_by_class_name('first-name')
first_names = [name.text for name in first_names]
last_names = driver.find_elements_by_class_name('last-name')
last_names = [name.text for name in last_names]
commit_status = driver.find_elements_by_class_name('school-name')
commit_status = [commit.text for commit in commit_status]
#error is happening below
athlete['commit_school'] = athlete['commit'].str.replace(r'\d+', '', regex=True).str.replace('/', '').str.replace('VERBAL', '').str.replace('SIGNED', '')
athlete['first'] = athlete['first'].str.title()
athlete['last'] = athlete['last'].str.title()
...code...
I then call this function in a loop to scrape similar data from different states' webpages. Sometimes it returns the data on the page normally and continues to the next state, while other times I get: AttributeError: Can only use .str accessor with string values! ...and the code breaks. What confuses me is that the failures seem arbitrary: sometimes I make it through 1/4 of the loop, and sometimes 3/4.
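A minimal sketch of how this error can arise, assuming that on some pages the scraped column ends up holding no text at all. The data below is hypothetical, not taken from the real site: a column of only missing values is stored as floats, and the .str accessor refuses to work on it.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one scrape returned text, another returned only missing values.
good = pd.DataFrame({"first": ["john", "mary"]})
bad = pd.DataFrame({"first": [np.nan, np.nan]})

# Compare how pandas stores the two columns.
print(good["first"].dtype)
print(bad["first"].dtype)

# The text column works with the .str accessor.
titled = good["first"].str.title()

# The all-missing column is numeric, so accessing .str raises AttributeError.
raised = False
try:
    bad["first"].str.title()
except AttributeError as exc:
    raised = True
    print(exc)
```

Printing the dtype of each column just before the failing line is one way to check whether this is what happens on the states that break.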
My first attempt at a fix was a try/except, but I am not sure whether I am doing it right or whether it is the best approach:
athlete['state'] = state_all[:length:]
athlete['hs'] = hs_all[:length:]
athlete['commit'] = commit_status[:length:]
try:
    athlete['commit_school'] = athlete['commit'].str.replace(r'\d+', '', regex=True).str.replace('/', '').str.replace('VERBAL', '').str.replace('SIGNED', '')
    athlete['first'] = athlete['first'].str.title()
    athlete['last'] = athlete['last'].str.title()
except AttributeError:
    pass
athlete['list'] = 'Rivals_' + year + '_' + list_state
athlete['home'] = profile_page[:length:]
The error is raised inside that try block, but I think that as soon as one line raises, the rest of the block is skipped.
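A small sketch that checks this suspicion, using a made-up two-row DataFrame rather than the real scraped data. When the first statement in the try block raises, Python jumps straight to the except clause, so the later statements never run.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'commit' holds only missing values, 'first' holds text.
athlete = pd.DataFrame({
    "commit": [np.nan, np.nan],
    "first": ["john", "mary"],
})

try:
    # Raises AttributeError because 'commit' is not a string column.
    athlete["commit_school"] = athlete["commit"].str.replace("VERBAL", "")
    # Never reached: control has already moved to the except clause.
    athlete["first"] = athlete["first"].str.title()
except AttributeError:
    pass

print(athlete["first"].tolist())            # names are still lower-case
print("commit_school" in athlete.columns)   # the new column was never created
```

So with a single try block around all three lines, one bad column silently leaves the other columns unprocessed as well.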
Upvotes: 0
Views: 375
Reputation: 16683
Does the code below, where I add .astype('str') before each .str call, solve it? You probably have a column whose values are not all strings (a mixed or non-string dtype).
athlete['commit_school'] = athlete['commit'].astype('str').str.replace(r'\d+', '', regex=True).str.replace('/', '').str.replace('VERBAL', '').str.replace('SIGNED', '')
athlete['first'] = athlete['first'].astype('str').str.title()
athlete['last'] = athlete['last'].astype('str').str.title()
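Here is a runnable sketch of the same idea on made-up data, so the values and the combined regex pattern are assumptions for illustration only. It casts each column to str before using .str, and it folds the four replace calls into one pattern. The trailing .str.strip() is an optional extra, not part of the code above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'last' contains only missing values, which would
# normally make .str raise AttributeError.
athlete = pd.DataFrame({
    "commit": ["Alabama 10/12 VERBAL", np.nan],
    "first": ["john", np.nan],
    "last": [np.nan, np.nan],
})

# One pattern removes digits, slashes, and the two status words in a single pass.
pattern = r"\d+|/|VERBAL|SIGNED"

commit = athlete["commit"].astype(str)
athlete["commit_school"] = commit.str.replace(pattern, "", regex=True).str.strip()
athlete["first"] = athlete["first"].astype(str).str.title()
athlete["last"] = athlete["last"].astype(str).str.title()  # no error after the cast

print(athlete)
```

One caveat: depending on the pandas version, the cast can turn missing values into the literal text 'nan', so it may be worth checking the output for such rows before saving it.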
Upvotes: 1