Reputation: 86779
I'm trying to do some simple string manipulation with the href attribute of a hyperlink extracted using Beautiful Soup:
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup('<a href="http://www.some-site.com/">Some Hyperlink</a>')
href = soup.find("a")["href"]
print href
print href[href.indexOf('/'):]
All I get is:
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    print href[href.indexOf('/'):]
AttributeError: 'unicode' object has no attribute 'indexOf'
How should I convert whatever href is into a normal string?
Upvotes: 5
Views: 5575
Reputation: 100826
Python strings do not have an indexOf method.
Use href.index('/') instead.
href.find('/') is similar, but find returns -1 if the substring is not found, while index raises a ValueError.
So index is the safer choice here: if the substring is missing, find returns -1, and href[-1:] silently gives you the last character of the string instead of signalling an error.
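To illustrate the difference, here is a minimal sketch. The href value is hard-coded as a stand-in for what Beautiful Soup returns, and the '#' lookup is just an example of a substring that is not present:

```python
# Stand-in for the unicode value extracted by Beautiful Soup (assumption).
href = u"http://www.some-site.com/"

# index() returns the position of the first match.
first_slash = href.index("/")
tail = href[first_slash:]
print(tail)

# find() returns -1 on a miss, so the slice below becomes href[-1:].
missing = href.find("#")
last_char = href[missing:]
print(last_char)

# index() raises ValueError on a miss instead of returning -1.
try:
    href.index("#")
    index_raised = False
except ValueError:
    index_raised = True
print(index_raised)
```

With find, the miss goes unnoticed and you get a one-character string back; with index, the failure is explicit and can be handled.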
Upvotes: 10
Reputation: 3617
href is a unicode string. If you need a regular (byte) string, use:
regular_string = str(href)
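One caveat: under Python 2, str() encodes with the ASCII codec, so it raises UnicodeEncodeError if the URL contains non-ASCII characters. A minimal sketch of encoding explicitly instead, assuming UTF-8 is the encoding you want (the URL below is made up for illustration):

```python
# Hypothetical URL containing a non-ASCII character (e with acute accent).
href = u"http://www.some-site.com/caf\u00e9"

# Encode explicitly rather than relying on str() and the default codec.
regular_string = href.encode("utf-8")
print(repr(regular_string))
```

For the slicing in the question none of this is needed, since unicode strings support index, find and slicing directly.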
Upvotes: 0