Reputation: 41
I have the following HTML:
<td>
<input maxlen="1" name="db" size="1" type="text" value="25"/>
<div style="display:inline-block;position:relative;top:6px;left:0px;width:20px;">
<input class="p_b" name="ta" style="height:1em; width:1.5em;line-height:1em;padding:0px;margin:0px;border:0px;background-color:#f3f3f3" type="submit" value="▴"/>
<input class="p_b" name="ta" style="height:1em; width:1.5em;line-height:1em;padding:0px;margin:0px;border:0px;background-color:#f3f3f3" type="submit" value="▾"/>
</div>
<span style="position:relative;top:8px">
</span>
<input maxlen="1" name="dc" size="1" type="text" value="0"/>
<div style="display:inline-block;position:relative;top:6px;left:0px;width:20px;">
<input class="p_b" name="tb" style="height:1em; width:1.5em;line-height:1em;padding:0px;margin:0px;border:0px;background-color:#f3f3f3" type="submit" value="▴"/>
<input class="p_b" name="tb" style="height:1em; width:1.5em;line-height:1em;padding:0px;margin:0px;border:0px;background-color:#f3f3f3" type="submit" value="▾"/>
</div>
</td>
I need to extract both numbers from value="25" and value="0". I made a workaround like:
y = soup.findAll('input', {'type':'text'})
a = re.findall(r'(?<=value=")(\d*)', str(y))
But I think there should be a more direct way to do it via the parser. Can anyone help with it?
Upvotes: 4
Views: 67
Reputation: 407
Try the line below to extract the value attribute from each text input node:
values = [element['value'] for element in soup.findAll('input', {'type':'text'})]
P.S. Note that using regex to parse HTML is bad practice: it breaks easily when attribute order, quoting, or whitespace changes. Dedicated parsing tools can do this for you reliably (for instance, BeautifulSoup and lxml in Python).
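For completeness, here is a minimal self-contained sketch of that approach. It assumes BeautifulSoup 4 is installed as bs4 and uses a trimmed copy of the question's markup; the variable names (html, text_inputs, numbers) are only illustrative. The CSS-selector variant is an alternative way to write the same lookup, not something the question requires.

```python
from bs4 import BeautifulSoup

# Trimmed version of the markup from the question (styles omitted).
html = """
<td>
<input maxlen="1" name="db" size="1" type="text" value="25"/>
<input class="p_b" name="ta" type="submit" value="▴"/>
<input class="p_b" name="ta" type="submit" value="▾"/>
<input maxlen="1" name="dc" size="1" type="text" value="0"/>
<input class="p_b" name="tb" type="submit" value="▴"/>
<input class="p_b" name="tb" type="submit" value="▾"/>
</td>
"""

soup = BeautifulSoup(html, "html.parser")

# Select only the text inputs; the submit buttons are skipped.
text_inputs = soup.find_all("input", {"type": "text"})

# .get() returns a default instead of raising KeyError when
# an input has no value attribute.
values = [el.get("value", "") for el in text_inputs]

# Convert to integers, ignoring anything that is not purely digits.
numbers = [int(v) for v in values if v.isdigit()]

# Equivalent lookup written as a CSS selector.
values_css = [el.get("value", "") for el in soup.select('input[type="text"]')]

print(values)
print(numbers)
```

Using el.get("value", "") rather than el["value"] is a defensive choice: the direct subscript works for this markup but raises KeyError if some input lacks the attribute.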
Upvotes: 1