VBA – Extracting financial data from a website in table format
Although there are plenty of other tools for extracting data from a website (take a look at Beautiful Soup), VBA is somehow good, because … well, because it is somehow challenging to do it every time. Yesterday, I answered a question in StackOverflow about extracting financial data, and today I have considered to rewrite the code a bit and write an article about it. Fun, eh?
The idea:
- Extract all the data for Brent Crude Oil(ICE) from the table in the site mrci.com/ohlc/ohlc-all.php.
- Write it to Excel
This is the result in Excel, scraping the shown data:


So, how did I manage to do the “magic”? If you take a look at the Main() function, it is somehow self-explanatory:
Sub Main()
Dim url As String: url = "https://www.mrci.com/ohlc/ohlc-all.php"
Dim target As String: target = "Brent Crude Oil(ICE)"
Dim appIE As Object
Set appIE = StartIE(url, appIE)
Dim allRowsOfData As Variant
allRowsOfData = appIE.document.getElementsByClassName("strat")
Dim headers As Variant
headers = GenerateHeaders(allRowsOfData)
Dim title As String
title = GenerateTitle(appIE)
Dim targetPeriods As Object
Set targetPeriods = CreateObject("Scripting.Dictionary")
Set targetPeriods = GenerateTargetPeriods(allRowsOfData, headers, target, targetPeriods)
WriteToExcel targetPeriods, headers, target, title
appIE.Quit
End Sub
At first, the website is loaded. Then the headers are recorded and written to an array called headers. Title is generated. A dictionary titlePeriods is filled with data. At the end, the data is written to Excel. Now let’s start function by function.
StartIE
Public Function StartIE(url As String, appIE As Object) As Object
Set appIE = CreateObject("InternetExplorer.Application")
With appIE
.Navigate url
.Visible = True
End With
WaitSomeMilliseconds 2000
Do While appIE.Busy: DoEvents: Loop
Set StartIE = appIE
End Function
The idea is to keep away some lines from the Main() function and to start the Internet Explorer separately. WaitSomeMilliseconds 2000 is somehow brutal, but the DoEvents sometimes runs quite fast and appIE is not even starting to be “Busy”.
GenerateHeaders and GenerateTitle
Public Function GenerateHeaders(allRowsOfData As Variant) As Variant
Dim headers As Variant
headers = Split(allRowsOfData.Rows(2).innerText, vbCrLf)
headers = RemoveEmptyElementsFromArray(headers)
GenerateHeaders = headers
End Function
Public Function GenerateTitle(appIE As Object) As String
Dim titleObj As Variant
titleObj = appIE.document.getElementsByClassName("title1")
GenerateTitle = titleObj.innerText
End Function
Here we get the headers and the title for the report. The headers are the second row from the table:

After getting them, we split them by new line (because they come into different lines from .innerText property). Cleaning and removing the empty values in the array is done with RemoveEmptyElementsFromArray(headers). Concerning the title, it is quite self-explanatory – “title1” is the class name of the element and its innerText is “Daily Future Price Listing Fri November 29, 2019”.
GenerateTargetPeriods
Public Function GenerateTargetPeriods(allRowsOfData As Variant, headers As Variant, target As String, targetPeriods As Object) As Variant
Dim child As Variant
Dim child2 As Variant
Dim myKey As Variant
Dim i As Long
Dim found As Boolean: found = False
For Each child In allRowsOfData.Children
For Each child2 In child.Children
If InStr(1, child2.innerText, target) Then found = True
If InStr(1, child2.outerhtml, "th class=") And (Not InStr(1, child2.outerhtml, target) > 0) Then
found = False
End If
If found And child2.Cells.Length = UBound(headers) + 1 Then
myKey = target & child2.Children(0).innerText
targetPeriods.Add myKey, New Collection
For i = LBound(headers) To UBound(headers)
targetPeriods(myKey).Add (child2.Children(i).innerText)
Debug.Print child2.Children(i).innerText
Next
End If
Next
Next
Set GenerateTargetPeriods = targetPeriods
End Function
Generating the target periods is the “meat” (I am not vegetarian) of the code. It returns a dictionary with key string and value list. In the key, I am writing the target and the first cell of the row:

Once it is generated, we return to writing the data to Excel.
WriteToExcel
Public Sub WriteToExcel(targetPeriods As Object, headers As Variant, target As String, title As String)
Dim row As Long: row = 1
Dim col As Long: col = 1
Dim element As Variant
Dim unit As Long
Dim child As Variant
With Worksheets(1)
.Cells.Delete
.Cells(row, col) = title
row = row + 1
.Cells(row, col) = target
row = row + 2
For Each element In headers
.Cells(row, col) = element
col = col + 1
Next
row = row + 1
For Each child In targetPeriods
col = 1
For unit = 1 To targetPeriods(child).Count
.Cells(row, col) = targetPeriods(child)(unit)
col = col + 1
Next
row = row + 1
Next child
Dim startingCol As String
Dim endingCol As String
startingCol = NumberToLetters(2)
endingCol = NumberToLetters(UBound(headers) + 1)
.Columns(startingCol & ":" & endingCol).EntireColumn.AutoFit
End With
End Sub
Writing to Excel is nothing advanced – we loop through the worksheet, taking into account its rows and columns, incrementing the columns and the rows, whenever we need to. Probably it is a bit tricky at the beginning, as far as the columns should always start from 1 on the new row and the rows should be always incremented, but after some time it becomes clear.
Auxiliary Functions
-
WaitSomeMilliseconds
This is an old function, that I use for “asking” Excel to freeze for a couple of seconds, to compensate for some slow internet connection:
Public Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As LongPtr)
Public Sub WaitSomeMilliseconds(Optional Milliseconds As Long = 1000)
Sleep Milliseconds
End Sub
-
NumberToLetters
This one is used in the column autofit of the WriteToExcel function – .Columns(startingCol & “:” & endingCol).EntireColumn.AutoFit
Public Function NumberToLetters(number As Long) As String
NumberToLetters = Split(Cells(1, number).Address, "$")(1)
End Function
-
RemoveEmptyElementsFromArray
This is an interesting function, which removes the empty elements of an array. And it does it so nicely, I will add it to my VBA_personal repo at GitHub:
Public Function RemoveEmptyElementsFromArray(myArray As Variant) As Variant
Dim i As Long, j As Long
ReDim newArray(LBound(myArray) To UBound(myArray))
For i = LBound(myArray) To UBound(myArray)
If Trim(myArray(i)) <> "" Then
j = j + 1
newArray(j) = myArray(i)
End If
Next i
ReDim Preserve newArray(LBound(myArray) To j - 1)
RemoveEmptyElementsFromArray = newArray
End Function
Pretty much that’s it. The whole code is available in GitHub here:
https://github.com/Vitosh/VBA/tree/master/ScrapingDataFromWebsiteWithVBA
Enjoy it!