Reputation: 83
I want to collect the detailed recommendation paragraphs that a person has received on their LinkedIn profile, for example on this page:
https://www.linkedin.com/in/teddunning/details/recommendations/
(This link can be viewed after logging in to any LinkedIn account.)
Here is my best attempt:
    for index, row in df2.iterrows():
        linkedin = row['LinkedIn Website']
        current_url = f'{linkedin}/details/recommendations/'
        driver.get(current_url)
        time.sleep(random.uniform(2,3))
        descriptions=driver.find_elements_by_xpath("//*[@class='display-flex align-items-center t-14 t-normal t-black']")
        s=0
        for description in descriptions:
            s+=1
            print(description.text)
            df2.loc[index, f'RecDescription_{str(s)}'] = description.text
The URLs I scraped into df2 are all similar to the example link above. The code finds nothing: the "descriptions" variable comes back empty.
My question is: which element should I use to find the detailed recommendation content under the "Received" tab? Thank you very much!
Upvotes: 0
Views: 94
Reputation: 594
First, get the direct parent of the paragraphs. You can locate it by XPath, class, or id, whichever fits best. Then call Your_Parent.find_elements(by=By.XPATH, value='./child::*')
and loop over the result to get all the paragraphs.
The code below selects all the paragraphs. I have not yet looked into separating them by post, but here is what I have so far:
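    from selenium.webdriver.common.by import By

    # Each matching div is assumed to wrap one recommendation paragraph.
    parents_of_paragraphs = driver.find_elements(By.CSS_SELECTOR, "div.display-flex.align-items-center.t-14.t-normal.t-black")
    text_total = ""
    for element in parents_of_paragraphs:
        # Take the first direct child of the div, which holds the text.
        paragraph = element.find_element(by=By.XPATH, value='./child::*')
        text_total += f"{paragraph.text}\n"
    print(text_total)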
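If you want one entry per recommendation instead of a single concatenated string, something along these lines might work. It is only a rough sketch: the helper names texts_per_parent and as_columns are made up for illustration, and it assumes each matched parent div corresponds to exactly one recommendation, which I have not verified against LinkedIn's current markup. The helpers only rely on the elements having a find_elements method and a text attribute, so they can be tried without a browser.

```python
from typing import Dict, Iterable, List

# Selenium's By.XPATH is the plain string "xpath"; it is spelled out here
# so the helpers can be exercised without importing selenium.
XPATH = "xpath"


def texts_per_parent(parents: Iterable) -> List[List[str]]:
    """Return one list of non-empty child texts for each parent element."""
    grouped = []
    for parent in parents:
        # find_elements (plural) returns an empty list when there are no
        # children, whereas find_element would raise an exception.
        children = parent.find_elements(XPATH, "./child::*")
        texts = [child.text.strip() for child in children if child.text.strip()]
        grouped.append(texts)
    return grouped


def as_columns(grouped: List[List[str]]) -> Dict[str, str]:
    """Map each non-empty group to a RecDescription_N key, numbered from 1."""
    non_empty = [" ".join(texts) for texts in grouped if texts]
    return {f"RecDescription_{i}": text for i, text in enumerate(non_empty, start=1)}
```

With a real driver you would pass in the parents_of_paragraphs list from above, then write the result back inside the question's loop, e.g. for column, text in as_columns(texts_per_parent(parents_of_paragraphs)).items(): df2.loc[index, column] = text.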
Upvotes: 1