• VBA Design Help (Ranking Arrays) (xl97, Win2000)

    Home » Forums » AskWoody support » Productivity software by function » MS Excel and spreadsheet help » VBA Design Help (Ranking Arrays) (xl97, Win2000)

    Author
    Topic
    #407431

    I have an unsorted column of numbers and each number has an associated probability. I need to sort the numbers and then average the n lowest values. n is determined by starting at the lowest number and then proceeding up the list until the sum of the associated probabilities reaches some desired threshhold (i.e. 25%). I’ve got a function that works and I’m just wanting to see if anyone can help make this more efficient since this function does quite a few loops and it is expected to be used many times in a workbook:

    Option Explicit
    Option Base 1
    Function CTE(Level, Values As Object, Optional Max0 = False, _
     Optional Probabilities As Object, Optional Smallest = 1)
    ' Computes Conditional Tail Expectation from the specified 
    ' percentage (i.e. 1-Level) of Values
    '
    ' If the specified number of Values is non-integer, it will linearly interpolate
    ' between the CTEs given by the two integer number of Values
    '
    ' If Max0=TRUE, any Values greater than 0 will be set to 0
    ' If Smallest=1, it will compute the average of the smallest Values
    ' If Smallest1, it will compute the average of the largest Values
    '
    ' Created by DC 9/23/2003
    '  *7/15/2004 Modified to handle duplicate values. Prior version did not do this properly.
    '    Also modified to require explicit declaration of variable types. Also modified to
    '    normalize Probabilities so they sum to 1.00
    '
        If Level > 1 Or Level < 0 Then
            CTE = CVErr(xlValue)
            Exit Function
        End If
        Dim SortedValues(), SortedProbs(), SumProbs As Double
        Dim PriorProb, NewProb, PriorTotal, NewTotal, CTE1, CTE2 As Double
        Dim i, j, k, N, R As Long
        N = Values.Count
        ReDim SortedValues(1 To N), SortedProbs(1 To N)
        SumProbs = 0
        For i = 1 To N
            R = Application.Rank(Values(i), Values)
            Do While Not (IsEmpty(SortedValues®))
                R = R + 1
            Loop
            If Max0 Then
                SortedValues® = Application.Min(0, Values(i))
            Else
                SortedValues® = Values(i)
            End If
            If IsArray(Probabilities) Then
                SortedProbs® = Probabilities(i)
            Else
                SortedProbs® = 1 / N
            End If
            SumProbs = SumProbs + SortedProbs®
        Next i
        For i = 1 To N
            If IsEmpty(SortedProbs(i)) Then
                CTE = CVErr(xlValue)
                Exit Function
            End If
            SortedProbs(i) = SortedProbs(i) / SumProbs
        Next i
        If Smallest = 1 Then
            j = N + 1: k = -1
        Else
            j = 0: k = 1
        End If
        NewTotal = 0: NewProb = 0: CTE1 = 0: CTE2 = 0
        Do While NewProb < (1 - Level)
            PriorTotal = NewTotal
            PriorProb = NewProb
            j = j + k
            NewTotal = NewTotal + SortedProbs(j) * SortedValues(j)
            NewProb = NewProb + SortedProbs(j)
        Loop
        CTE1 = PriorTotal / PriorProb
        CTE2 = NewTotal / NewProb
        CTE = CTE1 + (CTE2 - CTE1) * ((1 - Level) - PriorProb) / (NewProb - PriorProb)
    End Function
    
    Viewing 1 reply thread
    Author
    Replies
    • #852154

      Since I don’t find a property or method of the Application object named Rank, I am not sure what the routine is doing. So I have not analyzed it completely. However, I will make one comment that should make it more efficient. VBA works more efficiently if variables are DIMed to the proper type. You have several statements like the following:

          Dim i, j, k, N, R As Long
      

      If you are expecting that to DIM i,j,k,N,R as Long, that is not what is happening. That statement will DIM i, j, k, and N as Variants, and R as a Long. To do what I think you want, you would need to change that to:

          Dim i As Long, j As Long, k As Long, N As Long, R As Long
      

      You must specify t he type of each variable in t he list.

      • #852229

        Legare,
        Thanks for the guidance on the Dim statement. I’m sure I’ve repeated this mistake many times so thanks for setting me straight.

        The Application.Rank statement calls the Rank worksheet function which gives me the ranking of each Values(i) within the Values array. This is how I’m sorting my array.

        Upon reviewing the Excel97 help, it says, “In previous versions of Microsoft Excel, worksheet functions were contained by the Application object.” I guess the proper syntax is now application.worksheetfunction.rank

        • #852323

          I kind of thought that was what you were doing. I’ll have to spend some more time looking at it, but I would think that this would be a pretty slow way to sort.

          • #853615

            Legare,
            Your comments made me think back to my university days and programming “bubble sort” routines in FORTRAN. Back then, it was considered an efficient way to sort so I tried implementing it in VBA and my testing indicates that it is indeed faster. The code is a little less readable than using the RANK function, but it seems to run considerably faster in most cases. I was also having trouble with the RANK function returning a zero value in some cases (perhaps due to inadequate numeric precision) so this solution avoids that.

            Function CTE(Level As Double, Values As Object, Optional Max0 As Boolean = False, _
             Optional Probabilities As Variant, Optional Smallest As Boolean = True)
            ' Computes Conditional Tail Expectation from the specified percentage (i.e. 1-Level) of Values
            '
            ' If Max0=TRUE, any Values greater than 0 will be set to 0
            ' If Smallest=TRUE, it will compute the average of the smallest Values
            ' If SmallestTRUE, it will compute the average of the largest Values
            '
            ' DC 9/23/2003
            '  *7/15/2004 Modified to handle duplicate values. Prior version did not do this properly.
            '    Also modified to require explicit declaration of variable types. Also modified to
            '    normalize Probabilities so they sum to 1.00
            '  *7/20/2004 Modified to improve efficiency
            '
                CTE = CVErr(xlErrValue)
                If Level >= 1 Or Level < 0 Then Exit Function
                
                Dim SortedValues() As Double, SortedProbs() As Double, SumProbs As Double, Temp As Double
                Dim TotalProb As Double, TotalValue As Double, ProbLimit As Double
                Dim i As Long, j As Long, N As Long
                Dim SortFinished As Boolean, UniqueProbs As Boolean
                Dim wfunc As Object
                Set wfunc = Application.WorksheetFunction
                
                N = Values.Count
                ReDim SortedValues(1 To N), SortedProbs(1 To N)
                
                UniqueProbs = IsArray(Probabilities)
                SumProbs = 0
                For i = 1 To N
                    If Max0 Then
                        SortedValues(i) = wfunc.Min(0, Values(i))
                    Else
                        SortedValues(i) = Values(i)
                    End If
                    If UniqueProbs Then
                        SortedProbs(i) = Probabilities(i)
                    Else
                        SortedProbs(i) = 1 / N
                    End If
                    SumProbs = SumProbs + SortedProbs(i)
                Next i
                
                SortFinished = False
                Do While Not (SortFinished)
                    SortFinished = True
                    For i = 1 To N - 1
                        If SortedValues(i) < SortedValues(i + 1) Then
                            SortFinished = False
                            Temp = SortedValues(i)
                            SortedValues(i) = SortedValues(i + 1)
                            SortedValues(i + 1) = Temp
                            If UniqueProbs Then
                                Temp = SortedProbs(i)
                                SortedProbs(i) = SortedProbs(i + 1)
                                SortedProbs(i + 1) = Temp
                            End If
                        End If
                    Next i
                Loop
                
                If Smallest Then
                    i = N + 1: j = -1
                Else
                    i = 0: j = 1
                End If
                TotalValue = 0: TotalProb = 0: ProbLimit = 1 - Level
                Do While TotalProb < ProbLimit
                    i = i + j
                    TotalValue = TotalValue + SortedProbs(i) / SumProbs * SortedValues(i)
                    TotalProb = TotalProb + SortedProbs(i) / SumProbs
                Loop
                CTE = (TotalValue - (TotalProb - ProbLimit) * SortedValues(i)) / ProbLimit
            End Function
            
            • #853627

              Sorry, I meant to get back to this message and I forgot about it. What you did was similar to what I was going to do but with a different sort algorithm. Try the sort algorithm in the code below, it may be a little faster than what you did. I don’t have your data so I could not test it.

              Function CTE(Level As Double, Values As Object, Optional Max0 As Boolean = False, _
               Optional Probabilities As Variant, Optional Smallest As Boolean = True)
              ' Computes Conditional Tail Expectation from the specified percentage (i.e. 1-Level) of Values
              '
              ' If Max0=TRUE, any Values greater than 0 will be set to 0
              ' If Smallest=TRUE, it will compute the average of the smallest Values
              ' If SmallestTRUE, it will compute the average of the largest Values
              '
              ' DC 9/23/2003
              '  *7/15/2004 Modified to handle duplicate values. Prior version did not do this properly.
              '    Also modified to require explicit declaration of variable types. Also modified to
              '    normalize Probabilities so they sum to 1.00
              '  *7/20/2004 Modified to improve efficiency
              '
                  CTE = CVErr(xlErrValue)
                  If Level >= 1 Or Level  SortedValues(j) Then
                              Temp = SortedValues(i)
                              SortedValues(i) = SortedValues(j)
                              SortedValues(j) = Temp
                              If UniqueProbs Then
                                  Temp = SortedProbs(i)
                                  SortedProbs(i) = SortedProbs(j)
                                  SortedProbs(j) = Temp
                              End If
                          End If
                      Next j
                  Next i
                      
                  If Smallest Then
                      i = N + 1: j = -1
                  Else
                      i = 0: j = 1
                  End If
                  TotalValue = 0: TotalProb = 0: ProbLimit = 1 - Level
                  Do While TotalProb < ProbLimit
                      i = i + j
                      TotalValue = TotalValue + SortedProbs(i) / SumProbs * SortedValues(i)
                      TotalProb = TotalProb + SortedProbs(i) / SumProbs
                  Loop
                  CTE = (TotalValue - (TotalProb - ProbLimit) * SortedValues(i)) / ProbLimit
              End Function
              
            • #853772

              Legare,
              Thanks for your thoughts on this. It looks like your algorithm will always pass through the data exactly .5*N^2 times ((N-1)+(N-2)+…+1). If I recall from my old comp sci classes, I think I will average N/2 passes through the data to sort my list. Since each pass is N-1 loops, I think I will have an average of .5*(N^2-N), but I could be wrong.

            • #853773

              Legare,
              Thanks for your thoughts on this. It looks like your algorithm will always pass through the data exactly .5*N^2 times ((N-1)+(N-2)+…+1). If I recall from my old comp sci classes, I think I will average N/2 passes through the data to sort my list. Since each pass is N-1 loops, I think I will have an average of .5*(N^2-N), but I could be wrong.

            • #853628

              Sorry, I meant to get back to this message and I forgot about it. What you did was similar to what I was going to do but with a different sort algorithm. Try the sort algorithm in the code below, it may be a little faster than what you did. I don’t have your data so I could not test it.

              Function CTE(Level As Double, Values As Object, Optional Max0 As Boolean = False, _
               Optional Probabilities As Variant, Optional Smallest As Boolean = True)
              ' Computes Conditional Tail Expectation from the specified percentage (i.e. 1-Level) of Values
              '
              ' If Max0=TRUE, any Values greater than 0 will be set to 0
              ' If Smallest=TRUE, it will compute the average of the smallest Values
              ' If SmallestTRUE, it will compute the average of the largest Values
              '
              ' DC 9/23/2003
              '  *7/15/2004 Modified to handle duplicate values. Prior version did not do this properly.
              '    Also modified to require explicit declaration of variable types. Also modified to
              '    normalize Probabilities so they sum to 1.00
              '  *7/20/2004 Modified to improve efficiency
              '
                  CTE = CVErr(xlErrValue)
                  If Level >= 1 Or Level  SortedValues(j) Then
                              Temp = SortedValues(i)
                              SortedValues(i) = SortedValues(j)
                              SortedValues(j) = Temp
                              If UniqueProbs Then
                                  Temp = SortedProbs(i)
                                  SortedProbs(i) = SortedProbs(j)
                                  SortedProbs(j) = Temp
                              End If
                          End If
                      Next j
                  Next i
                      
                  If Smallest Then
                      i = N + 1: j = -1
                  Else
                      i = 0: j = 1
                  End If
                  TotalValue = 0: TotalProb = 0: ProbLimit = 1 - Level
                  Do While TotalProb < ProbLimit
                      i = i + j
                      TotalValue = TotalValue + SortedProbs(i) / SumProbs * SortedValues(i)
                      TotalProb = TotalProb + SortedProbs(i) / SumProbs
                  Loop
                  CTE = (TotalValue - (TotalProb - ProbLimit) * SortedValues(i)) / ProbLimit
              End Function
              
        • #852324

          I kind of thought that was what you were doing. I’ll have to spend some more time looking at it, but I would think that this would be a pretty slow way to sort.

      • #852230

        Legare,
        Thanks for the guidance on the Dim statement. I’m sure I’ve repeated this mistake many times so thanks for setting me straight.

        The Application.Rank statement calls the Rank worksheet function which gives me the ranking of each Values(i) within the Values array. This is how I’m sorting my array.

        Upon reviewing the Excel97 help, it says, “In previous versions of Microsoft Excel, worksheet functions were contained by the Application object.” I guess the proper syntax is now application.worksheetfunction.rank

      • #852233

        IN Excel 5, the worksheetfunction object did not exist (to call worksheet functions one just used Application.FunctionName) and for backward compatibility it is still allowed to omit that.

      • #852234

        IN Excel 5, the worksheetfunction object did not exist (to call worksheet functions one just used Application.FunctionName) and for backward compatibility it is still allowed to omit that.

    • #852155

      Since I don’t find a property or method of the Application object named Rank, I am not sure what the routine is doing. So I have not analyzed it completely. However, I will make one comment that should make it more efficient. VBA works more efficiently if variables are DIMed to the proper type. You have several statements like the following:

          Dim i, j, k, N, R As Long
      

      If you are expecting that to DIM i,j,k,N,R as Long, that is not what is happening. That statement will DIM i, j, k, and N as Variants, and R as a Long. To do what I think you want, you would need to change that to:

          Dim i As Long, j As Long, k As Long, N As Long, R As Long
      

      You must specify t he type of each variable in t he list.

    Viewing 1 reply thread
    Reply To: VBA Design Help (Ranking Arrays) (xl97, Win2000)

    You can use BBCodes to format your content.
    Your account can't use all available BBCodes, they will be stripped before saving.

    Your information: