shantanuo

Reputation: 32286

Extract only a few lines from output

The following code works as expected. It prints 7 lines.

from bs4 import BeautifulSoup
import urllib2
url="http://www.findandtrace.com/trace-mobile-number-location?mobilenumber=9834900000&submit=Trace"
page=urllib2.urlopen(url)
soup = BeautifulSoup(page.read())
universities=soup.findAll('b')
for eachuniversity in universities:
   print eachuniversity.string

But I need only the 3rd and 4th lines of this output:

9834900000
9834900000
MADHYA PRADESH & CHHATISGARH 
AIRTEL
GSM
 LIVE - Active 
Mobile Reputation & Monitoring 

The expected output is a tuple:

('MADHYA PRADESH & CHHATISGARH', 'AIRTEL')

How do I achieve this result?

Upvotes: 0

Views: 116

Answers (1)

Martijn Pieters

Reputation: 1123620

Rather than look for bold tags, look for the table and parse out the rows:

data = {}
for row in soup.select('#content #table .row'):
    key, value = (cell.text for cell in row.select('.cell'))
    data[key.rstrip(' :')] = value.strip()

This produces:

{u'Connection Status': u'LIVE - Active',
 u'Mobile Phone': u'9834900000',
 u'Network Operator / Service Provider': u'AIRTEL',
 u'Service Type / Signal': u'GSM',
 u'Telecom Circle / State': u'MADHYA PRADESH & CHHATISGARH'}

allowing you to pull out the data you want by key rather than index:

data['Telecom Circle / State'], data['Network Operator / Service Provider']
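For anyone who wants to try the approach without hitting the live site, here is a minimal, self-contained sketch in Python 3. The HTML in `SAMPLE_HTML` is a hypothetical stand-in, inferred only from the selectors used above; the real page's markup may differ. The helper name `parse_rows` is also just an illustration. Rows that do not contain exactly two cells are skipped, so a stray header row would not break the unpacking.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a stand-in inferred from the selectors in the answer,
# not a copy of the real page.
SAMPLE_HTML = """
<div id="content">
  <div id="table">
    <div class="row"><div class="cell">Mobile Phone :</div><div class="cell">9834900000</div></div>
    <div class="row"><div class="cell">Telecom Circle / State :</div><div class="cell"> MADHYA PRADESH &amp; CHHATISGARH </div></div>
    <div class="row"><div class="cell">Network Operator / Service Provider :</div><div class="cell"> AIRTEL </div></div>
  </div>
</div>
"""


def parse_rows(html):
    """Map each row's label cell to its value cell."""
    soup = BeautifulSoup(html, "html.parser")
    data = {}
    for row in soup.select("#content #table .row"):
        cells = [cell.get_text() for cell in row.select(".cell")]
        if len(cells) != 2:
            # Skip rows that are not a simple label/value pair.
            continue
        key, value = cells
        # Drop the trailing " :" from the label and surrounding whitespace from the value.
        data[key.rstrip(" :")] = value.strip()
    return data


data = parse_rows(SAMPLE_HTML)
result = (data["Telecom Circle / State"], data["Network Operator / Service Provider"])
print(result)
```

Looking the values up by label keeps the code working if the page later adds or reorders bold tags, which an index-based slice of the `findAll('b')` results would not survive.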

Upvotes: 3
