Reputation: 1687
When I use
import urllib2
from bs4 import BeautifulSoup
page = urllib2.urlopen("https://somewebpage.com")
soup = BeautifulSoup(page, "html.parser")
soup.get_text()
I get a result that looks like a list of lists, but it isn't one. It comes back as a single text value:
["<a href='/path<a>","tableNameAAA","FINISHED","SUCCEEDED","<br title='100.0'> <div class='ui-progressbar ui-widget ui-widget-content ui-corner-all' title='100.0%'> ,"0"],
["<a href='/path<a>","tableNameBBB","INPROCESS","SUCCEEDED","<br title='100.0'> <div class='ui-progressbar ui-widget ui-widget-content ui-corner-all' title='100.0%'> ,"0"],...
How do I convert this to a list so I can iterate through it? I tried list(soup.get_text()), but that splits the string into individual characters, so iterating over it goes bananas:
...v', u'>', u'"', u',', u'"', u'<', u'a', u' ', u'...
What I expect to iterate over is a sequence of lists: [list1],[list2]
What I have now is one string: "[list1],[list2]"
Upvotes: 0
Views: 258
Reputation: 1687
Eventually I just stripped all the single quotes and then built a list of all the tables. I probably could have done this without BeautifulSoup, but it works.
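For anyone landing here later, this is a rough sketch of one way to turn that text into rows, not necessarily exactly what I ended up with. It assumes each row sits on its own line starting with "[" and that the fields are wrapped in double quotes, as in the sample above. The raw string and the parse_rows helper are made up for illustration, and the sample is shortened.

```python
import re

# Hypothetical, shortened sample shaped like the output in the question.
raw = '''["<a href='/path<a>","tableNameAAA","FINISHED","SUCCEEDED","<br title='100.0'> ,"0"],
["<a href='/path<a>","tableNameBBB","INPROCESS","SUCCEEDED","<br title='100.0'> ,"0"],'''


def parse_rows(text):
    """Return one list of double-quoted fields per line that starts with '['."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue  # skip anything that is not a row
        # Collect whatever sits between each pair of double quotes.
        rows.append(re.findall(r'"([^"]*)"', line))
    return rows


rows = parse_rows(raw)
for row in rows:
    # In this sample the second field is the table name, the third its state.
    print("%s %s" % (row[1], row[2]))
```

Because the markup in the later fields is not cleanly quoted, only the first four fields of each row are reliable with this pattern. If the text were a well-formed list literal, something like ast.literal_eval would be the simpler route.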
Upvotes: 1