Code Review Asked on November 11, 2021
I need to extract unique Product Group names along with their corresponding services from a table in a worksheet. The table is generated by a bot and is not filtered; I have sorted it alphabetically. The data is not fixed and can contain anywhere from 5 to 100 rows, depending on the month in which the bot generates the report.
I decided to use a Dictionary to store the Product Group name as the Key, while using a Collection to store services. The Collection only stores unique services by using On Error Resume Next.
What changes could I make to my code?
Snippet of my Table
Code
Public Sub BuildTMProductDictionary()
    Dim tmData As Variant
    tmData = Sheet1.ListObjects("Table1").DataBodyRange.Value
    Dim i As Long
    For i = LBound(tmData, 1) To UBound(tmData, 1)
        Dim product As String
        product = tmData(i, 1)
        'store unique services in a collection, On Error Resume Next used to avoid duplicates
        On Error Resume Next
        Dim services As New Collection
        services.Add (tmData(i, 2)), (tmData(i, 2))
        'get the product name of the next row
        Dim nextProduct As String
        nextProduct = tmData(i + 1, 2)
        'compare the current product against the next product create New Dictionary if <>
        If product <> nextProduct Then
            Dim productGroup As New Dictionary
            productGroup.Add product, services
            Set services = New Collection
        End If
    Next
End Sub
Edit
My Collection of services needs to be unique. For example, "Positive Pay", which belongs to the "ARP" product group, should only appear once in the collection.
I have sorted it by alphabetical order
A year from now, are you going to remember that the data is supposed to be presorted? A comment noting it would help. Better yet, add it as a suffix to the routine's name:
Public Sub BuildTMProductDictionaryFromSortedTable()
The best approach is not to rely on the data being sorted at all. We use dictionaries for their lightning-fast lookups and the ability to check whether a key exists. Simply store a new collection each time you create a key in the dictionary, and use the key to retrieve the collection as needed.
If Not productGroup.Exists(product) Then productGroup.Add product, New Collection
On Error Resume Next
'the key (second argument) is what makes a duplicate service raise an error
productGroup(product).Add tmData(i, 2), CStr(tmData(i, 2))
On Error GoTo 0
It is best to limit the scope of On Error Resume Next as much as possible by using On Error GoTo 0. The tighter the scope, the better the chance that we will find errors while debugging.
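One way to keep that scope tight is to move the error-suppressed call into its own tiny helper, so that nothing else runs while errors are being ignored. The sketch below uses a hypothetical helper name, TryAddUnique, and assumes the value itself (converted with CStr) serves as the collection key; it is an illustration, not part of the original code.

```vba
' Hypothetical helper: adds Value to Target unless its key is already present.
' Returns True when the item was added, False when the Add failed
' (a duplicate key raises run-time error 457).
Public Function TryAddUnique(ByVal Target As Collection, ByVal Value As Variant) As Boolean
    On Error Resume Next            ' errors are ignored for the next statement only
    Target.Add Value, CStr(Value)   ' Collection keys are compared case-insensitively
    TryAddUnique = (Err.Number = 0)
    On Error GoTo 0                 ' restore normal error handling right away
End Function
```

A caller would then write TryAddUnique productGroup(product), tmData(i, 2) and never needs an On Error statement of its own.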
Public Sub BuildTMProductDictionary()
So you have a subroutine that compiles the data just the way you want it. Excellent! Now what? You could, of course, add more functionality to the method, but that isn't what you should be doing. Ideally, every routine should do as few things as possible and do them flawlessly in a very easy-to-read manner.
It would be better to change BuildTMProductDictionary() from a subroutine to a function and have it return the data.
Something like this:
Public Function GetTMProductDictionary() As Dictionary
    Const productCol As Long = 1, serviceCol As Long = 2
    Dim Data As Variant
    Data = Sheet1.ListObjects("Table1").DataBodyRange.Value
    Dim productGroup As New Dictionary
    Dim i As Long
    For i = LBound(Data, 1) To UBound(Data, 1)
        If Not productGroup.Exists(Data(i, productCol)) Then productGroup.Add Data(i, productCol), New Collection
        On Error Resume Next
        'keyed Add: a duplicate service raises an error, which is skipped
        productGroup(Data(i, productCol)).Add Data(i, serviceCol), CStr(Data(i, serviceCol))
        On Error GoTo 0
    Next
    Set GetTMProductDictionary = productGroup
End Function
This is pretty good but is the function as simple as it can be? What does it actually do?
If the function is compiling data, it really needs to return it. But does it need to retrieve the data?
What effect would passing the data in as a parameter have on our overall design?
Decoupling data gathering from data processing makes it far easier to test the code. In this case we could make a test table and a unit test that will run independently of the actual data.
It reduces the size of the method, which, in turn, makes the code easier to read and modify.
Public Function GetTMProductDictionary(Data As Variant) As Dictionary
    Const productCol As Long = 1, serviceCol As Long = 2
    Dim productGroup As New Dictionary
    Dim i As Long
    For i = LBound(Data, 1) To UBound(Data, 1)
        If Not productGroup.Exists(Data(i, productCol)) Then productGroup.Add Data(i, productCol), New Collection
        On Error Resume Next
        'keyed Add: a duplicate service raises an error, which is skipped
        productGroup(Data(i, productCol)).Add Data(i, serviceCol), CStr(Data(i, serviceCol))
        On Error GoTo 0
    Next
    Set GetTMProductDictionary = productGroup
End Function
How does this affect the meaning of our names? Should the variable names remain the same?
The larger the scope, the more descriptive the names should be.
Let's take a closer look at the names. Can they be simplified or improved? Can they be shortened or generalized?
GetTMProductDictionary(), productCol, serviceCol: this all makes sense.
But productGroup? What is a productGroup? It's a dictionary. How many dictionaries are there in this small function? Only one. Why not just call it Dictionary? I name my dictionaries Map or somethingMap because it is a simple and clean naming pattern and I hate seeing dic.
So now we have a Map. Maps use key/value pairs. The Map doesn't care that the key is a product group or that the value is a collection. Does knowing about product groups and services even help us review the code? Maybe...just a little.
What would happen if we just generalized the code? If we gave everything simple, common, familiar, and meaningful names that we see every time we work with this type of code? What would it look like?
Public Function GetMapCollection(Data As Variant, keyColumn As Long, valueColumn As Long) As Dictionary
    Dim Map As New Dictionary
    Dim i As Long
    For i = LBound(Data, 1) To UBound(Data, 1)
        If Not Map.Exists(Data(i, keyColumn)) Then Map.Add Data(i, keyColumn), New Collection
        On Error Resume Next
        'keyed Add: a duplicate value raises an error, which is skipped
        Map(Data(i, keyColumn)).Add Data(i, valueColumn), CStr(Data(i, valueColumn))
        On Error GoTo 0
    Next
    Set GetMapCollection = Map
End Function
It looks to me like we found a generic, reusable function hiding in the code. Not only have data retrieval and compilation been decoupled, but the context in which the compiled data is going to be used has been washed away.
This is what we should strive for when we are refactoring. Our methods should be so small and simple that they only know the bare minimum.
I modified the function to use only dictionaries and added sample usage.
Sub Usage()
    Dim productGroupServices As Scripting.Dictionary
    Dim serviceProductGroups As Scripting.Dictionary
    Dim tmData As Variant
    tmData = Sheet1.ListObjects("Table1").DataBodyRange.Value
    Set productGroupServices = GetUniqueGroups(tmData, 1, 2)
    Set serviceProductGroups = GetUniqueGroups(tmData, 2, 1)
    'pause here to inspect both dictionaries in the Locals window
    Stop
End Sub
Public Function GetUniqueGroups(Data As Variant, keyColumn As Long, valueColumn As Long) As Dictionary
    Dim Map As New Dictionary
    Dim i As Long
    Dim Key As Variant
    Dim Value As Variant
    For i = LBound(Data, 1) To UBound(Data, 1)
        Key = Data(i, keyColumn)
        Value = Data(i, valueColumn)
        If Not Map.Exists(Key) Then Map.Add Key, New Dictionary
        If Not Map(Key).Exists(Value) Then Map(Key).Add Value, Value
    Next
    Set GetUniqueGroups = Map
End Function
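Because GetUniqueGroups takes its data as a parameter, the earlier point about testing against a test table can be sketched without touching the worksheet. The routine name TestGetUniqueGroups and the sample rows (other than "ARP" / "Positive Pay" from the question) are invented for illustration, and a reference to Microsoft Scripting Runtime is assumed.

```vba
' Sketch of a data-independent test; relies on GetUniqueGroups as defined above.
' The sample rows below are made up for illustration.
Public Sub TestGetUniqueGroups()
    Dim sample(1 To 4, 1 To 2) As Variant
    sample(1, 1) = "ARP":   sample(1, 2) = "Positive Pay"
    sample(2, 1) = "ARP":   sample(2, 2) = "Positive Pay"    ' duplicate row
    sample(3, 1) = "ARP":   sample(3, 2) = "Reconciliation"
    sample(4, 1) = "Wires": sample(4, 2) = "Domestic Wire"

    Dim result As Scripting.Dictionary
    Set result = GetUniqueGroups(sample, 1, 2)

    ' Debug.Assert halts execution in the editor if a condition is False
    Debug.Assert result.Count = 2                 ' two product groups
    Debug.Assert result("ARP").Count = 2          ' duplicate service stored once
    Debug.Assert result("Wires").Exists("Domestic Wire")
End Sub
```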
Answered by TinMan on November 11, 2021
You seem to be misunderstanding how to use a Scripting.Dictionary.
There is no need to sort the data before processing into a dictionary.
There is also no need to construct a collection before you add to the dictionary.
It's also slightly more sensible to write the sub as a function.
As a final tweak I'd pass the array in as a parameter rather than hardwiring it into the function, but I'll leave that as an exercise for the reader (smile).
Public Function BuildTMProductDictionary() As Scripting.Dictionary
    Dim tmData As Variant
    tmData = Sheet1.ListObjects("Table1").DataBodyRange.Value
    Dim myDict As Scripting.Dictionary
    Set myDict = New Scripting.Dictionary
    Dim i As Long
    For i = LBound(tmData, 1) To UBound(tmData, 1)
        Dim myProduct As String
        myProduct = tmData(i, 1)
        Dim myService As String
        myService = tmData(i, 2)
        If Not myDict.exists(myProduct) Then
            myDict.Add myProduct, New Collection
        End If
        myDict.Item(myProduct).Add myService
    Next
    Set BuildTMProductDictionary = myDict
End Function
Replace
If Not myDict.exists(myProduct) Then
    myDict.Add myProduct, New Collection
End If
myDict.Item(myProduct).Add myService
with
If Not myDict.exists(myProduct) Then
    myDict.Add myProduct, New Scripting.Dictionary
End If
If Not myDict.Item(myProduct).exists(myService) Then
    myDict.Item(myProduct).Add myService, myService
End If
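Once the inner Collection has been swapped for a Scripting.Dictionary as described, the result can be read back by walking the keys at both levels. The routine below is a minimal sketch; the name PrintProductDictionary is hypothetical, and it assumes every value in the outer dictionary is itself a dictionary and that Microsoft Scripting Runtime is referenced.

```vba
' Sketch: list each product and its unique services in the Immediate window.
' Assumes a dictionary whose values are themselves dictionaries.
Public Sub PrintProductDictionary(ByVal products As Scripting.Dictionary)
    Dim product As Variant
    Dim service As Variant
    For Each product In products.Keys
        Debug.Print product
        For Each service In products(product).Keys
            Debug.Print "    " & service
        Next service
    Next product
End Sub
```

It could be called as PrintProductDictionary BuildTMProductDictionary() after making the replacement.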
Answered by Freeflow on November 11, 2021