Reputation: 433
I'm trying to extract text from a tag whose href contains a certain string. Below is part of my sample code:
Experience = soup.find_all(id='background-experience-container')

Exp = {}

for element in Experience:
    Exp['Experience'] = {}

for element in Experience:
    role = element.find(href=re.compile("title").get_text()
    Exp['Experience']["Role"] = role

for element in Experience:
    company = element.find(href=re.compile("exp-company-name").get_text()
    Exp['Experience']['Company'] = company
It doesn't like the syntax I've used for Exp['outer_key']['inner_key'] = value; it raises a SyntaxError.
I'm trying to build a nested dict (a dict of dicts) that contains info on role and company. I'll also look to include dates for each, but haven't got that far yet.
Can anyone spot any glaringly obvious mistakes in my code?
Really appreciate any help with this!
Upvotes: 1
Views: 195
Reputation: 142631
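The SyntaxError in the question does not come from the nested-key assignment. Both element.find( lines are missing the closing parenthesis after re.compile(...), so Python keeps reading into the following line and, depending on the version, may report the error there. A minimal sketch to check this without BeautifulSoup, using the built-in compile(), which only parses the source and never runs it; the helper name parses is made up for illustration:

```python
# The line from the question: the closing parenthesis of re.compile(...) is missing.
broken = 'role = element.find(href=re.compile("title").get_text()'
# The same line with the parenthesis restored.
fixed = 'role = element.find(href=re.compile("title")).get_text()'


def parses(source):
    """Return True if `source` is syntactically valid Python (it is not executed)."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


broken_ok = parses(broken)
fixed_ok = parses(fixed)
print(broken_ok, fixed_ok)  # False True
```

Because nothing is executed, the names element and re do not need to exist for this check.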
find_all can return many values (even if you search by id), so it is better to use a list to keep all of them: Exp = [].
Experience = soup.find_all(id='background-experience-container')

# create empty list
Exp = []

for element in Experience:
    # create empty dictionary
    dic = {}
    # add elements to dictionary
    dic['Role'] = element.find(href=re.compile("title")).get_text()
    dic['Company'] = element.find(href=re.compile("exp-company-name")).get_text()
    # add dictionary to list
    Exp.append(dic)

# display
print(Exp[0]['Role'])
print(Exp[0]['Company'])
print(Exp[1]['Role'])
print(Exp[1]['Company'])

# or
for x in Exp:
    print(x['Role'])
    print(x['Company'])
If you are sure that find_all gives you only one element (and you need the key 'Experience'), then you can do:
Experience = soup.find_all(id='background-experience-container')

# create main dictionary
Exp = {}

for element in Experience:
    # create empty dictionary
    dic = {}
    # add elements to dictionary
    dic['Role'] = element.find(href=re.compile("title")).get_text()
    dic['Company'] = element.find(href=re.compile("exp-company-name")).get_text()
    # add dictionary to main dictionary
    Exp['Experience'] = dic

# display
print(Exp['Experience']['Role'])
print(Exp['Experience']['Company'])
or
Experience = soup.find_all(id='background-experience-container')

# create main dictionary
Exp = {}

for element in Experience:
    # build the inner dictionary in one step (note the comma between the items)
    Exp['Experience'] = {
        'Role': element.find(href=re.compile("title")).get_text(),
        'Company': element.find(href=re.compile("exp-company-name")).get_text()
    }

# display
print(Exp['Experience']['Role'])
print(Exp['Experience']['Company'])
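One caveat with the dictionary variants: if find_all ever returns more than one element, each pass of the loop overwrites Exp['Experience'], so only the last entry survives. The sketch below illustrates this with made-up (role, company) tuples standing in for the scraped values; no HTML is parsed, and the names scraped, as_dict and as_list are purely hypothetical:

```python
# Stand-in data: (role, company) pairs as they might come from two containers.
scraped = [("Engineer", "Acme"), ("Analyst", "Initech")]

# Variant 1: one dictionary under a fixed key; each iteration replaces the previous value.
as_dict = {}
for role, company in scraped:
    as_dict["Experience"] = {"Role": role, "Company": company}

# Variant 2: a list keeps one dictionary per entry.
as_list = []
for role, company in scraped:
    as_list.append({"Role": role, "Company": company})

print(as_dict)       # {'Experience': {'Role': 'Analyst', 'Company': 'Initech'}}
print(len(as_list))  # 2
```

So the single-key dictionary is only safe when exactly one container is expected; otherwise the list version is probably the better fit.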
Upvotes: 1