Reputation: 54838

Function vs Sub(ByRef)

About

This question is not about when to use a Function or a Sub, or the difference between ByRef and ByVal (although some insights will be unavoidable).
It is about scenarios which are 'commonly' solved with a Function, but can optionally be solved with a Sub using ByRef in the sense of 'modifying'.

The Code

Consider the following function:

' Returns the worksheet (object) with a specified name in a specified workbook (object).
Function getWsF(wb As Workbook, _
                ByVal wsName As String) _
         As Worksheet
    ' 'getWsF' is 'Nothing' by default.
    ' Try to define worksheet.
    On Error Resume Next
    Set getWsF = wb.Worksheets(wsName)
End Function

You can utilize it like the following:

' Writes the name of a specified worksheet, if it exists, to the `Immediate` window...
Sub testFunction()
    
    Const wsName As String = "Sheet1"
    Dim wb As Workbook
    Set wb = ThisWorkbook ' The workbook containing this code.
    
    ' Define worksheet.
    Dim ws As Worksheet
    Set ws = getWsF(wb, wsName)
    
    ' Test if worksheet exists.
    If Not ws Is Nothing Then
        Debug.Print "The worksheet name is '" & ws.Name & "'."
    Else
        Debug.Print "Worksheet '" & wsName & "' doesn't exist in workbook '" _
                  & wb.Name & "'."
    End If

End Sub

But you can also write each of the procedures in the following way:

' Although 'ByRef' is not necessary, I'm using it to indicate that whatever
' its variable is referring to in another procedure (in this case
' a worksheet object), is going to be modified (possibly written to
' for other datatypes).
Sub getWsS(ByRef Sheet As Worksheet, _
           wb As Workbook, _
           ByVal wsName As String)
    ' 'Sheet' could be 'Nothing' or an existing worksheet. You could omit
    ' the following line if you plan to use the procedure immediately
    ' after declaring the worksheet object, but I would consider it
    ' as too risky. Therefore:
    ' 'Reinitialize' worksheet variable.
    Set Sheet = Nothing
    ' Try to define worksheet.
    On Error Resume Next
    Set Sheet = wb.Worksheets(wsName)
End Sub

' Writes the name of a specified worksheet, if it exists, to the `Immediate` window...
Sub testSub()
    
    Const wsName As String = "Sheet1"
    Dim wb As Workbook
    Set wb = ThisWorkbook ' The workbook containing this code.
    
    ' Define worksheet.
    Dim ws As Worksheet
    getWsS ws, wb, wsName
    
    ' Test if worksheet exists.
    If Not ws Is Nothing Then
        Debug.Print "The worksheet name is '" & ws.Name & "'."
    Else
        Debug.Print "Worksheet '" & wsName & "' doesn't exist in workbook '" _
                  & wb.Name & "'."
    End If

End Sub

Side by Side

Procedure

Function getWsF(wb As Workbook, _       Sub getWsS(ByRef Sheet As Worksheet, _
                wsName As String) _                wb As Workbook, _
         As Worksheet                              wsName As String)
                                          Set Sheet = Nothing
  On Error Resume Next                    On Error Resume Next
  Set getWsF = wb.Worksheets(wsName)      Set Sheet = wb.Worksheets(wsName)
End Function                            End Sub

Usage (relevant)

  ' Define worksheet.                     ' Define worksheet.
  Dim ws As Worksheet                     Dim ws As Worksheet
  Set ws = getWsF(wb, wsName)             getWsS ws, wb, wsName

The Question(s)

Is the second solution viable?
I'm looking for a proper description of what each of the relevant two procedures do and some insights in terms of common practice, readability, efficiency, pitfalls ...

Upvotes: 5

Answers (3)

Cristian Buse

Reputation: 4588

In your case, I would use the Function approach and here are a few reasons:

I can use the result of the Function without storing the returning variable:

With getWsF(ThisWorkbook, "Sheet1")
    '...
End With

Obviously, I would need to be sure it never returns Nothing or have some error handling in place.

DoSomething getWsF(ThisWorkbook, "Sheet1")

where DoSomething is a method expecting a Worksheet/Nothing

As @TimWilliams mentioned in the comments, if you don't expect multiple return values then this is the "expected" way to do it. A well-established convention is that if the method has no return value it should be a Sub. If it has a single return value then it should be a Function. If it returns multiple values then it should also be a Function and:
- you either use a Class or Type to pack them as one result
- or, the Function returns a primary value as the result and the rest of the return values ByRef (see @UnhandledException's answer for an example).
If you ever need to call the method with Application.Run then Function is safe. Using a Sub often leads to Automation Errors or code simply stops running after the method is executed. It doesn't matter if you need to use the result of the Function or not, don't call a Sub with Application.Run if you don't want nasty errors. Of course, to avoid issues with Application.Run, you could have a Function that doesn't get assigned a return value and still return the Worksheet ByRef but this would be too confusing for the reader.

Edit #1

Forgot to mention that the Application.Run automation errors are happening when calling methods from a different document (for Excel - different Workbook)

Edit #2

In this section I will try to address the proper description side of your question, without doing a begginer and an advanced explanation but a combined one.

Difference between a Sub and a Function

A Sub is just a function that does not return a value after the function executes. In lots of languages, such a function is called a Void Function.

The implications is that a Sub is just a stand-alone statement. It cannot be called from inside an expression. You can only call it with one of:

MySub [argsList]
Call MySub([argsList])

On the other hand, a Function can be used inside statements like:

arguments to other methods e.g. DoSomething MyFunction(...); Debug.Print MyFunction(...)
assignment e.g. x = MyFunction(...)
With blocks e.g. With MyFunction(...)
method chaining e.g. MyFunction(...).DoSomething

The convention mentioned above:

A well-established convention is that if the method has no return value it should be a Sub. If it has a single return value then it should be a Function

becomes quite clear when we understand that a Sub does something and a Function returns a single value, by definition.

Similarity between a Sub and a Function

Both value-returning functions (Function in VBA) and void functions (Sub in VBA) are receiving values as parameters. In VBA, it is possible to return results via ByRef parameters. Not all languages support ByRef parameters (e.g. Java - except modifying members of Objects for example).

Note that porting code from a platform that supports ByRef to another one that does not, can be quite time-consuming if the ByRef approach is abused in the source platform.

Difference between ByVal and ByRef parameters

Passing by value (ByVal):

a new memory space is allocated for a new variable which will be of local scope to the method being called
the contents of the original variable are copied in the newly allocated space of the new variable (for Objects the address of the interface Virtual Table is copied instead)
contents of the original variable are NOT changed regardless of what the method does
it is much safer because the programer does not need to keep in mind/care about other parts of the program (specifically the calling method)

Passing by reference (ByRef):

a new variable is created but no new memory space is allocated. Instead the new variable points to the memory space occupied by the original variable being passed from the calling method. Note that for Objects, the original variable is passed entirely (no new variable is created) unless the interface is different (passing a Collection as an Object type parameter - Object stands for IDispatch) but this is a discussion outside of the scope of this answer. Also note that if the parameter is declared as Variant, there are more complex operations happening to facilitate the redirection
the contents of the original variable can now be changed remotely because both the original variable and the newly created one point to the same memory space
it is considered more efficient because no new memory is allocated but this comes with the downside of increasing complexity

Comparison of the presented methods

Now that we have some understanding of the differences, we can look at both the presented methods. Let's start with the Sub:

Sub getWsS(ByRef Sheet As Worksheet, wb As Workbook, ByVal wsName As String)
    Set Sheet = Nothing
    On Error Resume Next
    Set Sheet = wb.Worksheets(wsName)
End Sub

First of all, there should be an On Error GoTo 0 statement before the End Sub because otherwise the Error 9 is propagated up the calling chain (if sheet not found) and can affect logic inside other methods, long after the getWsS method has returned.

The method name starts with the verb "get" which implies that this method returns something but the method declaration is a Sub which is, by definition, more like a Do Something than a Return Something. Readability is certainly affected.

There is a need for an extra ByRef parameter to return the single piece of result. Implications:

it affects readability
it requires a declared variable inside the calling method
the result cannot he chained/used in other expressions within the calling method
it requries the extra line Set Sheet = Nothing to make sure the original variable does not retain previous contents

Now, let's look at the Function approach:

Function getWsF(wb As Workbook, ByVal wsName As String) As Worksheet
    On Error Resume Next
    Set getWsF = wb.Worksheets(wsName)
End Function

Same as before, there should be an On Error GoTo 0 statement before the End Function because otherwise the Error 9 is propagated up the calling chain. Also, the Workbook can be passed ByVal as best practice.

Obvious differences:

the name getSomething is perfect for a function that returns Something. Readability is far better than the Sub couterpart
the reader/maintainer of the code instantly knows that the function returns a Worksheet just by looking at the return type (as opposed to looking through a list of ByRef parameters and figuring out which one is the return variable)
the result can be chained/used in expressions
no extra lines of code are needed, the default returning value is already Nothing
the most widely accepted convention is used

I've used CTimer and it seems like on my x64 machine, the Sub approach runs faster with about 20ms when running the methods for a million times. Are these minor efficiency gains worth the loss in readability and flexibility of use? That is something that only the maintainer of the code base can decide.

Upvotes: 5

AHeyne

Reputation: 3465

A good example for using both, a function and a ByRef parameter is a 'Try' function:

Public Function TryGetMyValue(ByRef outMyValue As Long) As Boolean
    On Error Goto Catch

    outMyValue = GetMyValue() 'Does anything to retrieve the value and could raise an error...

    TryGetMyValue = True

Done:
    Exit Function

Catch:
    Resume Done
End Function

It could be used like that:

Public Sub TestTryGetMyValue()
    Dim myValue As Long
    If Not TryGetMyValue(myValue) Then Exit Sub

    Debug.? "MyValue is: " & myValue
End Sub

Upvotes: 0

SierraOscar

Reputation: 17637

To answer your question directly:

Q: Is it viable?
A: Yes, it will compile and carry out the functionality that you're expecting.

The grey area comes about when you ask should you do this.

There's definitely nothing stopping you (assuming you aren't subject to some company coding standards or anything). Typically however, functions are used to take in parameters, perform some kind of logic and return a value or object at the end of that logic.

Functions are typically non-destructive and don't change the values or properties of the input parameters. This becomes especially important for code readability and maintenance because other developers (and even yourself a few months from now) will read the code expecting functions to behave in a certain way.

Sub routines on the other hand are not expected to return anything, and so they are used to run concise, related sections of code that carry out some kind of logic relevant to the application. Going back to the point of readability and maintenance, it's reasonable to assume that objects and properties will change inside of a sub routine and so again this makes the developer's life a little easier.

Ultimately there's no hard and fast rules - but there are years of experience and best practice which are often good advice to take on in these scenarios :)

Upvotes: 3

Function vs Sub(ByRef)

Answers (3)

Related Questions