Reputation: 3537
I wrote a Python script that analyzes a web page with Beautiful Soup.
Once the code was finished, I started removing unnecessary variables and lines.
I'm also trying to remove for loops, if possible.
For example, I'd like to replace these two loops (which are in two different files) with a one-liner (e.g. len(some_object)):
(1) Count the <li> elements in all <ul> elements:
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
list = soup.find_all('ul',{'class':'class-name'})
counter = 0
for ul in list:
    for li in ul:
        counter += 1
where the list object is something like this:
[<ul class="class-name">
<li class="section"><a href="...">...</a></li>
<li class="section"><a href="...">...</a></li>
<li class="section"><a href="...">...</a></li></ul>,
<ul class="class-name">
<li class="section"><a href="...">...</a></li>
<li class="section"><a href="...">...</a></li></ul>]
(2) Count the <a> elements whose text is not 'Train':
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
list = soup.find('table',{'class':'class-name'}).find_all('a')
counter = 0
for el in list:
    if el.contents[0] != 'Train':
        counter += 1
where list is something like:
[<a href="…">Train</a>,
<a href="…">Car</a>,
<a href="…">Plane</a>]
I tried using numpy, but both np.array(list) and np.asarray(list) raise the same error:
Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    np.array(list)
  File "C:\...\Python37-32\lib\site-packages\bs4\element.py", line 1016, in __getitem__
    return self.attrs[key]
KeyError: 0
In case (2) I tried using slice notation to retrieve the elements one by one, but expressions such as list[1:3] return another list. So, for example, this code, which tries to compute the total length minus the number of elements whose contents[0] is 'Train', doesn't work:
counter = len(list) - (list[:].contents[0] == 'Train')
Is it possible to replace each of the two loops with a one-liner?
Upvotes: 3
Views: 1783
Reputation: 4315
Python's built-in str.strip() method removes all leading and trailing whitespace from a string. The built-in sum() function takes an iterable and returns the sum of its items:
list1 = soup.find('table',{'class':'class-name'}).find_all('a')
counter = len(list1) - sum(1 for a in list1 if a.text.strip() == 'Train')
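As a variation on the line above, the subtraction can be avoided by counting the non-matching links directly, since Python treats True as 1 when summing. This is only a sketch: the HTML below is made-up sample markup standing in for the real page, and the names html and links are placeholders.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<table class="class-name">
  <tr><td><a href="#"> Train </a></td></tr>
  <tr><td><a href="#">Car</a></td></tr>
  <tr><td><a href="#">Plane</a></td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")
links = soup.find("table", {"class": "class-name"}).find_all("a")

# Each comparison yields True or False; sum() adds them up as 1 and 0.
# get_text(strip=True) trims surrounding whitespace before comparing.
counter = sum(a.get_text(strip=True) != "Train" for a in links)
print(counter)  # 2
```

Both forms should give the same number; which one reads better is mostly a matter of taste.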
Upvotes: 1
Reputation: 5006
For the first loop:
counter = sum(1 for ul in list for li in ul)
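One caveat, which applies to the original loop as well: iterating over a Tag yields all of its direct children, including the whitespace strings between the tags, so the result can be larger than the number of <li> elements. A possible way around this, sketched here on sample markup modelled on the question (the link texts and the names html and uls are placeholders), is to count only the direct <li> children.

```python
from bs4 import BeautifulSoup

# Sample markup shaped like the list shown in the question.
html = """<ul class="class-name">
<li class="section"><a href="#">one</a></li>
<li class="section"><a href="#">two</a></li>
<li class="section"><a href="#">three</a></li></ul>
<ul class="class-name">
<li class="section"><a href="#">four</a></li>
<li class="section"><a href="#">five</a></li></ul>"""
soup = BeautifulSoup(html, "html.parser")
uls = soup.find_all("ul", {"class": "class-name"})

# Counts every direct child, including the newline strings between tags.
all_children = sum(1 for ul in uls for child in ul)

# Counts only the <li> tags that are direct children of each <ul>.
li_only = sum(len(ul.find_all("li", recursive=False)) for ul in uls)

print(all_children, li_only)
```

If the page has no whitespace between the tags, the two counts coincide, which may be why the difference goes unnoticed.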
For the second one:
counter = sum(1 for el in list if el.contents[0] != 'Train')
Upvotes: 2