Reputation:
I want to scrape a company's country (for example, United States) from its profile page on Yahoo Finance. The link is:
https://finance.yahoo.com/quote/AAPL/profile?p=AAPL
I tried the code below but can't extract the country. I am new to scraping data and would appreciate it if you could help me with it.
My code:
import requests
from lxml import html
xp = "//span[text()='Sector']/following-sibling::span[1]"
symbol = 'AAPL'
url = 'https://finance.yahoo.com/quote/' + symbol + '/profile?p=' + symbol
page = requests.get(url)
tree = html.fromstring(page.content)
d = {}
I prefer lxml and requests and haven't worked with BeautifulSoup, so I would prefer a solution that uses the libraries in my code.
Would appreciate any help.
Upvotes: 2
Views: 452
Reputation: 26129
Don't scrape; use yfinance instead, which is regularly updated and simplifies everything:
import yfinance as yf
df = yf.download('TWTR')
If you wish to plot it:
import finplot as fplt
fplt.candlestick_ochl(df[['Open','Close','High','Low']])
fplt.show()
Upvotes: 0
Reputation: 11
Maybe you could use BeautifulSoup in combination with a regex search to extract the location:
import re

import requests
from bs4 import BeautifulSoup

symbol = 'TEVA'
url = 'https://finance.yahoo.com/quote/' + symbol + '/profile?p=' + symbol
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
# The address block is the <p> element whose class attribute is exactly this string.
base_tag = soup.find_all('p', {'class': "D(ib) W(47.727%) Pend(40px)"})
# Each address line sits between comment markers: " -->text<!--".
# A raw string avoids invalid escape sequences in the pattern.
matches = re.findall(r" -->(.*?)<!--", str(base_tag))
# The last capture is the country.
print(matches[-1])
I tested it with Google (GOOG), Apple (AAPL), and Teva Pharmaceutical Industries Limited (TEVA), and it seems to work.
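To see what the regex captures without hitting the network, here is a minimal sketch run against a hand-written fragment. The SAMPLE markup, the placeholder address, and the extract_lines helper are all assumptions made for illustration; they only imitate the comment markers the pattern relies on, and the page Yahoo actually serves may differ.

```python
import re

# Hypothetical fragment imitating a comment-delimited address block.
# The address is a placeholder, not real data.
SAMPLE = (
    '<p class="D(ib) W(47.727%) Pend(40px)">'
    '<!-- react-text: 8 -->1 Example Street<!-- /react-text --><br/>'
    '<!-- react-text: 10 -->Example City<!-- /react-text --><br/>'
    '<!-- react-text: 12 -->United States<!-- /react-text --><br/>'
    '</p>'
)


def extract_lines(markup):
    """Return everything found between ' -->' and the next '<!--'."""
    return re.findall(r" -->(.*?)<!--", markup)


lines = extract_lines(SAMPLE)
# The pattern also captures the '<br/>' tags that sit between two comments,
# so only the last capture is relied on here.
country = lines[-1]
print(country)
```

Because the captures include the markup between comments as well as the address lines, indexing from the end is more dependable than indexing from the start.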
Upvotes: 1
Reputation: 24930
See if this works for you, using the tree object from your code:
xpp = tree.xpath('//div[@data-reactid=7]/p/text()[3]')[0].strip()
xpp
Output:
'United States'
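For anyone who wants to try the expression offline, here is a small sketch that runs the same XPath against a hand-written document. The SAMPLE markup and the placeholder address are assumptions for illustration only; the real page structure and data-reactid values may be different or may change.

```python
from lxml import html

# Hypothetical markup: an address paragraph whose lines are separated by <br>.
# The address and phone number are placeholders, not real data.
SAMPLE = """
<html><body>
<div data-reactid="7">
<p>1 Example Street<br>Example City, CA 00000<br>United States<br><a href="tel:000">000-000-0000</a></p>
</div>
</body></html>
"""

tree = html.fromstring(SAMPLE)
# text() selects the text nodes that are direct children of <p>;
# [3] keeps the third one, which is the country in this sample.
country = tree.xpath('//div[@data-reactid=7]/p/text()[3]')[0].strip()
print(country)
```

The index [3] assumes exactly two address lines come before the country, so it is worth checking the result against a few different tickers.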
Upvotes: 1