Full and Part Summaries with the Dictionary

Creating summaries using the scripting dictionary - I have touched on this issue in prior posts on both the main site and the blog.  It can be a little confusing to get your head around the scripting dictionary but here is some code which will help you on your journey.  I first saw this method years ago on Ozgrid by a poster named Jindon – he is rather famous in Excel circles as his posts are insightful and really fast and accurate.  The method takes a dataset and has the ability to summarise many thousands of rows instantly and when I say instantly it happens so fast when compared to other methods.


The following outlines the process in a video on Youtube.

 

Let’s take a simple data set and summarise it by a unique identifier.  If we have a fruit stall and want to summaries the sales amount by items sold.

Scripting Dictionary

The above is a sample of the data which is a lot longer inside the file but will give you a feel for the data we are trying to summarise.

Summary with Dic

 

The summary by fruit would look a bit like the above.  The scripting dictionary was used to produce this simplistic summary.  The first method I will show you will generate a sum based on unique criteria.  The unique part being the name of the fruit. 


Sub SumIt()
    Dim ar
    Dim i As Long
    Dim arr As Variant
    Dim n As Long
    
    ar = [a1].CurrentRegion.Value
    With CreateObject("Scripting.Dictionary")
        For i = 1 To UBound(ar, 1)
            .Item(ar(i, 1)) = .Item(ar(i, 1)) + ar(i, 6)
        Next
        arr = Array(.keys, .items)
        n = .Count
    End With
    [T4].CurrentRegion.ClearContents
    [T4].Resize(n, 2).Value = Application.Transpose(arr)
End Sub

The reason the above coding works is the dictionary will only include unique items.  The fruit is unique, so each item is assessed and summed in turn. 

.Item(ar(i, 1)) = .Item(ar(i, 1)) + ar(i, 6)

The 6 at the end of the line above refers to column 6 (the summation) in the example dataset.  Change the 6 to the column which is relevant for your data.

.Item(ar(i, 1)) = .Item(ar(i, 1)) + ar(i, 6)

Additionally the current items value is added with the addition of the next like value in the list.  When the loop is complete the entire contents of the dictionary is added to the variable arr.

arr = Array(.keys, .items)

Now all that needs to be done is to define a space to put the contents of the array (arr).

[T4].Resize(n, 2).Value = Application.Transpose(arr)

Where n is the length of the array and 2 is the width (two columns Fruit and the value we place against it.

 

The second method is to produce a summary but include each unique line of the table.

Excel Scripting dictionary

The above has more details about each of the items in the list.  You can see more details have been included in this type of summary.  This time Columns O and P have been summarised. 

 

Sub SumMultiple()
    Dim ar As Variant
    Dim i As Long
    Dim j As Long
    Dim n As Long
    Dim str As String
    
    n = 1
    ar = Cells(10, 1).CurrentRegion.Value
    With CreateObject("Scripting.Dictionary")
        For i = 2 To UBound(ar, 1)
            str = ar(i, 1)
            If Not .Exists(str) Then
                n = n + 1
                For j = 1 To UBound(ar, 2)
                    ar(n, j) = ar(i, j)
                Next
                .Item(str) = n
            Else
                For j = 5 To UBound(ar, 2)
                    ar(.Item(str), j) = ar(.Item(str), j) + ar(i, j)
                Next
            End If
        Next
    End With
    [K4].CurrentRegion.ClearContents
    [K4].Resize(n, UBound(ar, 2)).Value = ar
End Sub

The above has a lot more going on but all you really need to know is this:

 

                For j = 5 To UBound(ar, 2)
                    ar(.Item(str), j) = ar(.Item(str), j) + ar(i, j)
                Next

The columns to Sum start in Column 5 (E) and go to the end of the columns you need to sum in this case 6.  Now if the dataset was larger say 10 columns to sum it would sum these columns all in turn with the above loop starting in Column 5 summing all the like items and ending in Column 14.  It is very efficient and quite simple to adapt to your needs.  You will be remarkably surprised how fast it runs even on datasets which stretch into the thousands of rows.

 

SumWithDictionary.xlsm

 

Combining Data with the Scripting Dictionary

 

OK let’s take this puppy one step further.  Let’s say our data needs to be examined over multiple columns.  The data in Column A is linked to the Data in Column B to create a unique identifier.  We want to create a summary based on 2 columns.  Not just one as shown in the prior example.  Let’s say the supplier is Bob and Bob orders – Apples and Oranges.  The order is split over 6 different lines but variations of either Apples or Oranges. 

The following outlines the process in a video on Youtube. The following Excel file goes with the video.

MultiJoin.xlsm


Let’s say you need to summarise the data based on the supplier Bob and the fruit Apple or Orange.  If Bob buys a different fruit then we want our code to be uber flexible so it traps and records the data.

Embed Block
Add an embed URL or code. Learn more
Scripting Dictionary explained

The VBA code to achieve the task is as follows and it is remarkably easy to change to suit your needs.  The parts to change will be explained in the below segment.

Sub SumJoinCol()
Dim rng As Range
Dim r As Range
Dim i As Integer
Dim j As Long
Dim n As Long
Dim txt As String
Dim ar As Variant

Set rng = Range(Range("A1"), Range("A" & Rows.Count).End(xlUp))
ar = [a1].CurrentRegion

    With CreateObject("scripting.dictionary")
        For Each r In rng
            txt = Join(Application.Transpose(Application.Transpose(r.Resize(, 2))), ",") 'First 2 columns (2)
            If Not .Exists(txt) Then
                n = n + 1
                 .Add txt, n
                    For j = 1 To UBound(ar, 2) 'Number of Columns
                        ar(n, j) = r.Offset(, j - 1)
                    Next j            
            Else
                For i = 6 To UBound(ar, 2) 'Start of Calc Columns (Col 6) in this case.
                    ar(.Item(txt), i) = ar(.Item(txt), i) + r.Offset(, i - 1)
                Next i
            End If
        Next
      Sheet3.[a1].Resize(n, UBound(ar, 2)) = ar
    End With
End Sub

The summary from the above VBA procedure is as follows.  You will find it reconciles perfectly which is the result we want.

Excel VBA scripting dictionary

The secret with the above is the combining of data with the VBA  join.  The creation of a text join  between the first two columns –

 

txt = Join(Application.Transpose(Application.Transpose(r.Resize(, 2))), ",")

 

This allows the columns to be joined which in turn creates a unique identifier between Column A and Column B.

 

BobApple

BobOrange

 

The scripting key must  be unique so it adds up all of the BobApple  and BobOrange in Column 6 and Column 7.

 

 For i = 6 To UBound(ar, 2) 'Start of Calc Columns (Col 6) in this case.

 

In the above case the instruction is for the loop to start in Col 6 and go to the last column in the array which is 7.  If your data is larger then the above will handle this.  The hard coding of the start column is all you need to worry about. 

If you wanted to extend the procedure to cover the join for 3 or more columns the code would look as follows for 3 columns. 

 

txt = Join(Application.Transpose(Application.Transpose(r.Resize(, 3))), ",")

 

Where the first 3 columns are being joined to create a unique identifier.  The following file takes the procedure one step further.  If you are into your scripting dictionary code this has vast implications as you can spit out summary tables like lightning. :)  Enjoy gently!!!

 

MutiColJoin.xlsm