VBA – Extracting financial data from a website in table format

Although there are plenty of other tools for extracting data from a website (take a look at Beautiful Soup), VBA is somehow good, because … well, because it is somehow challenging to do it every time. Yesterday, I answered a question in StackOverflow about extracting financial data, and today I have considered to rewrite the code a bit and write an article about it. Fun, eh?

The idea:

This is the result in Excel, scraping the shown data:


So, how did I manage to do the “magic”? If you take a look at the Main() function, it is somehow self-explanatory:

At first, the website is loaded. Then the headers are recorded and written to an array called headers. Title is generated. A dictionary titlePeriods is filled with data. At the end, the data is written to Excel. Now let’s start function by function.

StartIE

The idea is to keep away some lines from the Main() function and to start the Internet Explorer separately. WaitSomeMilliseconds 2000 is somehow brutal, but the DoEvents  sometimes runs quite fast and appIE is not even starting to be “Busy”.

GenerateHeaders and GenerateTitle

Here we get the headers and the title for the report. The headers are the second row from the table:

After getting them, we split them by new line (because they come into different lines from .innerText property). Cleaning and removing the empty values in the array is done with RemoveEmptyElementsFromArray(headers). Concerning the title, it is quite self-explanatory – “title1” is the class name of the element and its innerText is “Daily Future Price Listing Fri November 29, 2019”.

GenerateTargetPeriods

Generating the target periods is the “meat” (I am not vegetarian) of the code. It returns a dictionary with key string and value list. In the key, I am writing the target and the first cell of the row:

Once it is generated, we return to writing the data to Excel.

WriteToExcel

Writing to Excel is nothing advanced – we loop through the worksheet, taking into account its rows and columns, incrementing the columns and the rows, whenever we need to. Probably it is a bit tricky at the beginning, as far as the columns should always start from 1 on the new row and the rows should be always incremented, but after some time it becomes clear.

Auxiliary Functions

  • WaitSomeMilliseconds

This is an old function, that I use for “asking” Excel to freeze for a couple of seconds, to compensate for some slow internet connection:

  • NumberToLetters

This one is used in the column autofit of the WriteToExcel  function – .Columns(startingCol & ":" & endingCol).EntireColumn.AutoFit

  • RemoveEmptyElementsFromArray

This is an interesting function, which removes the empty elements of an array. And it does it so nicely, I will add it to my VBA_personal repo at GitHub:

Pretty much that’s it. The whole code is available in GitHub here:

https://github.com/Vitosh/VBA/tree/master/ScrapingDataFromWebsiteWithVBA

Enjoy it!

Tagged with: , , , , , ,