Find all list (li) elements in a specific part of a website

My idea was to get all the elements tagged with "li" from vitoshacademy.com/all, using BeautifulSoup4 and Python.

Initially, I thought about running a simple soup.findAll("li"), but it also returned the entries from the menus, which are list items as well. So I needed a way to filter out the unwanted items. After checking the printed tags, I noticed that they are rather descriptive. For example, with "Inspect Element" in Chrome one could not immediately see the class, but the printed "soup" shows it nicely:

<li class="subpost"><a href="https://www.vitoshacademy.com/c-implement-crud-functionality-asp-net-mvc-with-ef-core-video/">C# - Implement CRUD Functionality - ASP.NET MVC with EF Core - Video</a><span class="righttext">[Vitosh Doynov]</span></li>
C# - Get started with EF Core in an ASP.NET MVC Web App - Video[Vitosh Doynov]
<li class="subpost"><a href="https://www.vitoshacademy.com/c-get-started-with-ef-core-in-an-asp-net-mvc-web-app-video/">C# - Get started with EF Core in an ASP.NET MVC Web App - Video</a><span class="righttext">[Vitosh Doynov]</span></li>
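To see which classes the list items on a page actually carry, one option is to count them. The following is only a sketch: SAMPLE_HTML is a hypothetical, shortened stand-in for the real page, and count_li_classes is a helper name made up for this illustration.

```python
from collections import Counter

from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: one menu item and two post items.
SAMPLE_HTML = """
<ul class="menu"><li class="menu-item"><a href="/about/">About</a></li></ul>
<ul>
  <li class="subpost"><a href="/post-1/">Post 1</a><span class="righttext">[Author]</span></li>
  <li class="subpost"><a href="/post-2/">Post 2</a><span class="righttext">[Author]</span></li>
</ul>
"""


def count_li_classes(html):
    """Count how often each CSS class appears on <li> elements."""
    soup = BeautifulSoup(html, features="html.parser")
    counts = Counter()
    for tag in soup.find_all("li"):
        # tag.get("class") is a list of class names, or None if the attribute is missing
        for name in tag.get("class") or ["(no class)"]:
            counts[name] += 1
    return counts


class_counts = count_li_classes(SAMPLE_HTML)
for name, count in class_counts.most_common():
    print(name, count)
```

A class that shows up once per post, like "subpost" here, is a good candidate for the filter.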

After some research, it turned out that the list items of a given class can be selected like this:

for tag in soup.findAll("li", attrs={'class': 'class_name'}):
    print(tag.text)
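As far as I know, BeautifulSoup offers a few equivalent ways to write this filter. The sketch below compares three of them on a small, made-up HTML snippet (SAMPLE_HTML is an assumption, not the real page). Note that select() relies on the soupsieve package, which is normally installed together with recent versions of bs4.

```python
from bs4 import BeautifulSoup

# Made-up snippet: one menu item and two post items, one of them with a second class.
SAMPLE_HTML = (
    '<ul><li class="menu-item">Home</li>'
    '<li class="subpost">First post</li>'
    '<li class="subpost featured">Second post</li></ul>'
)

soup = BeautifulSoup(SAMPLE_HTML, features="html.parser")

# 1) attrs dictionary, as used in this post
by_attrs = [t.text for t in soup.find_all("li", attrs={"class": "subpost"})]

# 2) class_ keyword argument (trailing underscore, because "class" is reserved in Python)
by_keyword = [t.text for t in soup.find_all("li", class_="subpost")]

# 3) CSS selector
by_selector = [t.text for t in soup.select("li.subpost")]

print(by_attrs)
print(by_keyword)
print(by_selector)
```

All three match an item if "subpost" is one of its classes, so the item with class="subpost featured" is included as well.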

In my case, the complete script looks like this:

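import requests
from bs4 import BeautifulSoup


def main():
    url = 'https://www.vitoshacademy.com/all/'
    # Download the page and stop early on an HTTP error status
    response = requests.get(url)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, features="html.parser")
    # Keep only the list items that carry the "subpost" class
    for tag in soup.findAll("li", attrs={'class': 'subpost'}):
        print(tag.text)
        # print(tag)  # uncomment to print the full HTML of each item


if __name__ == "__main__":
    main()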

This produces a "report" with one line per post, in the form of the title followed by the author in square brackets, for example:

C# - Get started with EF Core in an ASP.NET MVC Web App - Video[Vitosh Doynov]
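Since each "subpost" item shown earlier consists of a link followed by a span with the class "righttext", the title, URL and author could also be pulled out separately instead of printing them glued together. This is only a sketch: ITEM_HTML is a single item copied from the printed tags above, and parse_posts is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# One list item, copied from the printed tags earlier in the post.
ITEM_HTML = (
    '<li class="subpost">'
    '<a href="https://www.vitoshacademy.com/c-get-started-with-ef-core-in-an-asp-net-mvc-web-app-video/">'
    'C# - Get started with EF Core in an ASP.NET MVC Web App - Video</a>'
    '<span class="righttext">[Vitosh Doynov]</span></li>'
)


def parse_posts(html):
    """Return a list of dicts with title, url and author for each "subpost" item."""
    soup = BeautifulSoup(html, features="html.parser")
    posts = []
    for tag in soup.find_all("li", class_="subpost"):
        link = tag.find("a")
        author = tag.find("span", class_="righttext")
        posts.append({
            "title": link.get_text(strip=True) if link else "",
            "url": link.get("href", "") if link else "",
            # Remove the surrounding square brackets from "[Name]"
            "author": author.get_text(strip=True).strip("[]") if author else "",
        })
    return posts


posts = parse_posts(ITEM_HTML)
for post in posts:
    print(post["title"], "|", post["author"], "|", post["url"])
```

To run it against the live page, the downloaded text from requests.get(url).text could be passed to parse_posts in place of ITEM_HTML.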