Larsson

Reputation: 39

Print list items in Beautiful Soup Python

I am currently extracting the following from a website using BeautifulSoup, but I am struggling to extract and print the data I need.

I am looking to extract for each list entry:

The data-qty value and the link text (the 4 in href="#">4). So for example, from the first list entry I am trying to extract the link text 4 and data-qty = 1.0000.

The code I am currently using is listed under the data.

<div class="content size-options size_us-options" data-sizegroup="size_us" style="display:none">
    <ul class="sizes small-block-grid-4">
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="24" data-price="0" data-qty="1.0000" data-qtymad="0.0000" data-qtybcn="1.0000" data-oblocators="BBAI-0B-05-05" href="#">4</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="172" data-price="0" data-qty="4.0000" data-qtymad="0.0000" data-qtybcn="2.0000" data-oblocators="BBAI-0B-05-05" href="#">4.5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="22" data-price="0" data-qty="10.0000" data-qtymad="0.0000" data-qtybcn="2.0000" data-oblocators="BBAI-0B-07-05" href="#">5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="160" data-price="0" data-qty="10.0000" data-qtymad="0.0000" data-qtybcn="3.0000" data-oblocators="BBAI-0B-07-05" href="#">5.5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="20" data-price="0" data-qty="9.0000" data-qtymad="0.0000" data-qtybcn="3.0000" data-oblocators="BBAI-0B-05-05" href="#">6</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="165" data-price="0" data-qty="11.0000" data-qtymad="0.0000" data-qtybcn="3.0000" data-oblocators="BBAI-0B-05-05" href="#">6.5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="18" data-price="0" data-qty="28.0000" data-qtymad="0.0000" data-qtybcn="3.0000" data-oblocators="BBAI-0B-05-05" href="#">7</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="110" data-price="0" data-qty="41.0000" data-qtymad="0.0000" data-qtybcn="3.0000" data-oblocators="BBAI-0B-05-05" href="#">7.5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="16" data-price="0" data-qty="53.0000" data-qtymad="0.0000" data-qtybcn="3.0000" data-oblocators="BBAI-0B-05-05" href="#">8</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="121" data-price="0" data-qty="68.0000" data-qtymad="0.0000" data-qtybcn="3.0000" data-oblocators="BBAI-0B-06-02;BBAI-0B-05-05" href="#">8.5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="14" data-price="0" data-qty="85.0000" data-qtymad="0.0000" data-qtybcn="4.0000" data-oblocators="BBAI-0B-07-05" href="#">9</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="114" data-price="0" data-qty="64.0000" data-qtymad="0.0000" data-qtybcn="4.0000" data-oblocators="BBAI-0B-07-05" href="#">9.5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="12" data-price="0" data-qty="71.0000" data-qtymad="0.0000" data-qtybcn="4.0000" data-oblocators="BBAI-0B-07-05" href="#">10</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="105" data-price="0" data-qty="59.0000" data-qtymad="0.0000" data-qtybcn="3.0000" data-oblocators="BBAI-0B-07-05" href="#">10.5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="10" data-price="0" data-qty="61.0000" data-qtymad="0.0000" data-qtybcn="3.0000" data-oblocators="BBAI-0B-07-05" href="#">11</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="117" data-price="0" data-qty="39.0000" data-qtymad="0.0000" data-qtybcn="2.0000" data-oblocators="BBAI-0B-07-05" href="#">11.5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="8" data-price="0" data-qty="39.0000" data-qtymad="0.0000" data-qtybcn="2.0000" data-oblocators="BBAI-0B-07-05" href="#">12</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="202" data-price="0" data-qty="25.0000" data-qtymad="0.0000" data-qtybcn="0.0000" data-oblocators="" href="#">12.5</a>
        </li>
        <li>
            <a rel="nofollow" class="size-button available" data-optionIndex="126" data-price="0" data-qty="26.0000" data-qtymad="0.0000" data-qtybcn="0.0000" data-oblocators="" href="#">13</a>
        </li>
    </ul>
</div>

This is the code that I am currently using. I am struggling to extract and print the data I need and would be thankful for any help!

soup = BeautifulSoup(response.content, 'html.parser')
ukattributes = soup.find('div', {'class':'content size-options size_uk-options'})
print ukattributes
sizes = ukattributes.findAll('li')
print sizes
for size in sizes:
    response = s.get(size.find('a')['href'])
    soup = BeautifulSoup(response.content, 'html.parser')
    print size

Please let me know if you can help me with this as I have been trying for a while now! Thanks again

Upvotes: 0

Views: 1876

Answers (2)

Dan-Dev

Reputation: 9420

You can't make a GET request to the URL #, because the fragment is never sent to the server. It is probably used by JavaScript on the page, or it simply links to the same page. See my answer to Pagination giving the first page in every iteration for more details. So:

response = s.get(size.find('a')['href'])

Will not work as you expected. To get the data you requested try:

soup = BeautifulSoup(response.content, 'html.parser')
ukattributes = soup.find('div', {'class':'content size-options size_us-options'})
print (ukattributes)
sizes = ukattributes.findAll('li')
print (sizes)
for size in sizes:
    href = size.find('a',href=True)
    print (href.text)
    print (href["data-qty"])

The loop outputs (first four entries shown):

4
1.0000
4.5
4.0000
5
10.0000
5.5
10.0000
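
For anyone who wants to try this without the live site, here is a minimal, self-contained sketch under a few assumptions: Python 3, a trimmed two-entry copy of the markup from the question, and a hypothetical helper is_fragment_only that is not part of BeautifulSoup. It shows why there is nothing to request for href="#", and it guards against find() returning None, which is what happens when the class searched for (size_uk-options in the question) is not present in the markup (size_us-options).

```python
from urllib.parse import urlsplit

from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (first two entries only).
HTML = """
<div class="content size-options size_us-options" data-sizegroup="size_us" style="display:none">
    <ul class="sizes small-block-grid-4">
        <li><a rel="nofollow" class="size-button available" data-qty="1.0000" href="#">4</a></li>
        <li><a rel="nofollow" class="size-button available" data-qty="4.0000" href="#">4.5</a></li>
    </ul>
</div>
"""


def is_fragment_only(href):
    """Hypothetical helper: True if href has no scheme, host, path or query,
    i.e. it is empty or only a fragment such as '#'."""
    parts = urlsplit(href)
    return not (parts.scheme or parts.netloc or parts.path or parts.query)


soup = BeautifulSoup(HTML, "html.parser")

# Matching on one class avoids depending on the exact full class string.
container = soup.find("div", class_="size_us-options")

rows = []
if container is None:
    print("size group not found, check the class name")
else:
    for a in container.find_all("a", href=True):
        # The data lives on the tag itself, so no second request is needed.
        rows.append((a.get_text(strip=True), a["data-qty"], is_fragment_only(a["href"])))

for label, qty, fragment_only in rows:
    print(label, qty, fragment_only)
```

Searching for size_uk-options against this sample returns None, so calling findAll on the result would raise an AttributeError; checking for None first gives a clearer failure.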

Upvotes: 1

t.m.adam

Reputation: 15376

You can use a simple list comprehension to select the data you need.

ukattributes = soup.find('div', {'class':'content size-options size_us-options'})
data = [ [a.text, a.get('data-qty')] for a in ukattributes.find_all('a') ]
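
As a rough illustration of what the comprehension produces, the sketch below runs it against a trimmed three-entry copy of the markup from the question (Python 3 assumed). The names stock and in_stock are only illustrative additions: they show one possible way to turn the string quantities into numbers, not something the original answer requires.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (first three entries only).
HTML = """
<div class="content size-options size_us-options" data-sizegroup="size_us" style="display:none">
    <ul class="sizes small-block-grid-4">
        <li><a class="size-button available" data-qty="1.0000" href="#">4</a></li>
        <li><a class="size-button available" data-qty="4.0000" href="#">4.5</a></li>
        <li><a class="size-button available" data-qty="10.0000" href="#">5</a></li>
    </ul>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")
ukattributes = soup.find('div', {'class': 'content size-options size_us-options'})

# Same comprehension as in the answer: [link text, data-qty] per <a> tag.
data = [[a.text, a.get('data-qty')] for a in ukattributes.find_all('a')]

# Illustrative follow-up: map each size label to a numeric quantity.
stock = {size: float(qty) for size, qty in data}

# Sizes with a quantity above zero.
in_stock = [size for size, qty in stock.items() if qty > 0]

print(data)
print(stock)
```

Note that a.get('data-qty') returns None for a tag without that attribute, so the float() conversion assumes every link carries data-qty, as in the markup shown in the question.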

Upvotes: 1
