Python: extract a specific piece of text from HTML

Question

I am trying to extract a specific piece of text that is written in Korean. Is there a more efficient way of doing this than converting the element to a string and parsing it from there?

CODE:

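import urllib2  # Python 2 standard library
from bs4 import BeautifulSoup  # assumes BeautifulSoup 4 is installed

#input:     url
#output:    name (the <span id="lblKName"> element)
def urlSC(url):
    soup = BeautifulSoup(urllib2.urlopen(url).read())
    name = soup.find('span', id='lblKName')
    return name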

OUTPUT:

구세군앵커리지한인교회
The Salvation Army Anch. Korean Corps.

Want: 구세군앵커리지한인교회

url: http://www.koreanchurchyp.com/ViewDetail.aspx?OrgID=4102

Casimir et Hippolyte · Accepted Answer

If the Korean part of the text always comes first, before the <br> tag, you can use:

name = soup.find(id = 'lblKName').contents[0]
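
A slightly more defensive variant of the same idea might look like the sketch below. It assumes BeautifulSoup 4 and Python 3, and it runs against a hypothetical inline HTML snippet modelled on the output shown in the question (the real page's markup may differ). The helper name `first_text` is made up for illustration. Instead of `.contents[0]`, it uses `stripped_strings`, which skips whitespace-only text nodes, and it returns `None` when the element is missing or empty.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the output shown in the question;
# the real page's HTML may differ.
SAMPLE_HTML = (
    '<span id="lblKName">구세군앵커리지한인교회<br/>'
    'The Salvation Army Anch. Korean Corps.</span>'
)


def first_text(html, element_id):
    """Return the first non-empty text node inside the element, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    element = soup.find(id=element_id)
    if element is None:
        return None
    # stripped_strings yields each text node with surrounding whitespace
    # removed and skips nodes that contain only whitespace.
    return next(element.stripped_strings, None)


korean_name = first_text(SAMPLE_HTML, 'lblKName')
print(korean_name)
```

Like `.contents[0]`, this relies on the Korean name being the first text node in the span; if the page ever puts the English name first, both approaches return the wrong line.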
