sound wave

Reputation: 3537

Count <li> in all <ul> and count all specific <a> in one line of code

I wrote a Python script that analyzes a web page with Beautiful Soup.

Once the code was finished, I started removing unnecessary variables and lines.

I'm also trying to remove for loops, if possible.

For example, I'd like to replace these two loops (which are in two different files) with a single line of code each (e.g. len(some_object)):

(1) Count <li> in all <ul>

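import requests
from bs4 import BeautifulSoup

response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
# every <ul class="class-name"> on the page
list = soup.find_all('ul',{'class':'class-name'})
counter = 0
for ul in list:
    # iterating over a tag walks through its direct children
    for li in ul:
        counter += 1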

where the list object looks something like this:

[<ul class="class-name">
<li class="section"><a href="...">...</a></li>
<li class="section"><a href="...">...</a></li>
<li class="section"><a href="...">...</a></li></ul>, 
<ul class="class-name">
<li class="section"><a href="...">...</a></li>
<li class="section"><a href="...">...</a></li></ul>]

(2) Count all specific <a>

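import requests
from bs4 import BeautifulSoup

response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
# every <a> inside the first <table class="class-name">
list = soup.find('table',{'class':'class-name'}).find_all('a')
counter = 0
for el in list:
    # skip links whose first child is the text 'Train'
    if el.contents[0] != 'Train':
        counter += 1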

where list looks something like this:

[<a href="…">Train</a>,
<a href="…">Car</a>,
<a href="…">Plane</a>]

What I tried

I tried numpy, but both np.array(list) and np.asarray(list) raise the same error:

Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    np.array(list)
  File "C:\...\Python37-32\lib\site-packages\bs4\element.py", line 1016, in __getitem__
    return self.attrs[key]
KeyError: 0

In case (2) I tried slice notation to reach the individual elements, but an expression such as list[1:3] just returns another list, which has no contents attribute. So, for example, this attempt to take the total length minus the number of elements whose contents[0] is 'Train' doesn't work:

counter = len(list) - (list[:].contents[0] == 'Train')

Is it possible to replace each of the two loops with one line of code?

Upvotes: 3

Views: 1783

Answers (2)

bharatk

Reputation: 4315

strip() is a built-in string method that removes all leading and trailing whitespace from a string. sum() is a built-in function that takes an iterable and returns the sum of its items.

list1 = soup.find('table',{'class':'class-name'}).find_all('a')
counter = len(list1) - sum(1 for a in list1 if a.text.strip() == 'Train')
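To see how strip() and sum() work together here, the following is a minimal sketch that uses plain strings in place of the a.text values. The labels list is made up for illustration, so it runs without fetching any page:

```python
# Hypothetical link texts standing in for a.text of each <a> tag.
labels = ["Train", " Car ", "Plane\n", "  Train"]

# Count the entries that equal 'Train' once surrounding whitespace is removed.
trains = sum(1 for text in labels if text.strip() == "Train")

# Total minus the matches gives the number of non-'Train' entries.
counter = len(labels) - trains

# Equivalent single pass: count the non-matching entries directly.
counter_direct = sum(1 for text in labels if text.strip() != "Train")
```

Both forms give the same count; the second one avoids the subtraction and reads closer to the original loop.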

Upvotes: 1

Corentin Limier

Reputation: 5006

For the first loop:

counter = sum(1 for ul in list for li in ul)

For the second one:

counter = sum(1 for el in list if el.contents[0] != 'Train')
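A rough way to try both one-liners without a live page is to parse a small inline document. The markup below is made up, modelled on the examples in the question, and the variable names (uls, links and so on) are only illustrative. It also shows one possible variant, assuming only the <li> elements should be counted: iterating over a tag visits all of its children, text nodes included, whereas find_all('li') returns just the <li> tags.

```python
from bs4 import BeautifulSoup

# Made-up markup modelled on the question's examples.
html = """
<ul class="class-name">
<li class="section"><a href="#">Train</a></li>
<li class="section"><a href="#">Car</a></li>
<li class="section"><a href="#">Plane</a></li></ul>
<ul class="class-name">
<li class="section"><a href="#">Train</a></li>
<li class="section"><a href="#">Bus</a></li></ul>
"""
soup = BeautifulSoup(html, "html.parser")
uls = soup.find_all("ul", {"class": "class-name"})

# Same as the nested loop: one count per child node of each <ul>,
# which includes the newline text nodes between the <li> tags.
all_children = sum(1 for ul in uls for child in ul)

# Variant that counts only the <li> elements.
li_count = sum(len(ul.find_all("li")) for ul in uls)

# Links whose first child is not the string 'Train'.
links = soup.find_all("a")
non_train = sum(1 for el in links if el.contents[0] != "Train")
```

With markup laid out like this, all_children comes out larger than li_count because of the whitespace between the items, so which expression is right depends on what should be counted.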

Upvotes: 2
