Find all HTML tags in a web page and print them from a sorted dictionary

Finding all HTML tags in a web page and recording them in a dictionary is a magically easy task in Python (if you compare it with #VBA) when Beautiful Soup 4 is used. soup.find_all(True) returns every tag in the document, and looping over the result with a simple if-else is quite a standard recipe for filling the dictionary.
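As a rough illustration of that recipe, here is a minimal sketch rather than the original code of this post. It assumes the beautifulsoup4 package is installed and parses a small inline HTML string instead of downloading a page. The names SAMPLE_HTML, count_tags and tag_counts are made up for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, standing in for a downloaded web page.
SAMPLE_HTML = """
<html><head><title>Demo</title></head>
<body>
<div><p>First</p><p>Second</p></div>
<div><a href="#">Link</a></div>
</body></html>
"""


def count_tags(html):
    """Return a dict mapping each tag name to its number of occurrences."""
    # "html.parser" is the parser from the standard library.
    soup = BeautifulSoup(html, "html.parser")
    tag_counts = {}
    # find_all(True) matches every tag in the document.
    for tag in soup.find_all(True):
        if tag.name in tag_counts:
            tag_counts[tag.name] += 1
        else:
            tag_counts[tag.name] = 1
    return tag_counts


tag_counts = count_tags(SAMPLE_HTML)
print(tag_counts)
```

For a real page, the HTML string would come from whatever download step you prefer. The if-else could also be replaced by collections.Counter, but the explicit version stays closest to the recipe described above.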

However, once the dictionary is printed, it looks a bit “ugly”, because the key-value pairs appear in no particular order. Thus, it makes sense to sort the dictionary items into a list, based on how often each tag occurs.
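The sorting step could look roughly like the sketch below. It is not the original code of this post: the counts in tag_counts are invented for the example, the function name sort_by_count is hypothetical, and descending order (most frequent tag first) is an assumption.

```python
# Hypothetical tag counts, standing in for the dictionary built from a page.
tag_counts = {"html": 1, "div": 12, "a": 30, "p": 7, "span": 12}


def sort_by_count(counts):
    """Return a list of (tag, count) tuples, most frequent tag first."""
    # items() yields (key, value) pairs; item[1] is the count.
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


sorted_tags = sort_by_count(tag_counts)
for name, count in sorted_tags:
    print(f"{name}: {count}")
```

Python's sort is stable, so tags with the same count keep the order in which they were first added to the dictionary.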

The code is about 20 lines, and it works quite flawlessly!

Enjoy it! If you like Beautiful Soup, you may consider taking a look at my walk-through tutorial from the official documentation here:
