504more

Reputation: 449

PowerShell arrays: When to use them; when to avoid them; and problems using them

Why doesn't the .NET Framework ArrayList class .Add method work in a PowerShell implementation?

Unless I'm otherwise corrected, I think the overall moral of my story might be: Don't assume that native PowerShell methods are going to be the same as .NET methods, and be careful when attempting .NET methods in PowerShell.

The original solution I was seeking was to return a list of dates from a function, as an array, with a user-defined date range as parameters. The array of dates would then be referenced to move and read files that are named with date stamps.

The first problem I encountered was creating a dynamic array. I didn't know what I was doing, and was incorrectly calling a .NET .Add method on an @() array declaration.

Exception calling "Add" with "1" argument(s): "Collection was of a fixed size."

I thought I needed to find a dynamic array type, when my real problem was that I wasn't doing it right. That sent me in a different direction, until very much later, I discovered that objects should be added to PowerShell arrays using the += syntax.
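The fixed-size error is easy to reproduce. The following is a minimal sketch; the variable names `$fixed` and `$addFailed` are made up for illustration and are not part of the original script.

```powershell
# A PowerShell @() literal is a .NET Object[] array, which has a fixed size.
$fixed = @(1, 2, 3)
$fixed.IsFixedSize          # True

# Calling .Add on it throws "Collection was of a fixed size."
$addFailed = $false
try {
    $fixed.Add(4)
}
catch {
    $addFailed = $true
}

# += succeeds because it builds a new, longer array and reassigns the variable.
$fixed += 4
$fixed.Count                # 4
```

The array itself never grows; only the variable is pointed at a new array.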

Anyway, I was off on some other tangents before I returned to how to use a PowerShell array correctly.

I then found the .NET ArrayList class. Okay, fine. Now I had a dynamic array object. I read the documentation, which said I should use the .Add method to add elements to the collection.

Then began my quest for deeper understanding, as I traversed a span of a couple of days of head-clasping frustration trying to troubleshoot problems.

I produced an implementation that seemed at first to work. It produced a date range - but it also produced some strange behavior. I observed weird dates returned, such as:

Monday, January 1, 0001 12:00:00 AM

It turns out, I discovered, that this is the result obtained when you do this:

Get-Date 0

The ArrayList was returning, first, a list of index values to array elements, and then the array values. That didn't make any sense at all. I began exploring whether I was calling functions correctly, whether I was experiencing some sort of variable scope issue, or whether I was just nuts.
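A likely explanation for the index values, sketched under the assumption that the function looked like the failing example in this question: ArrayList.Add returns the index of the element it just added, and a PowerShell function emits every value that is not captured or discarded, not only what follows Return. The function name `Get-LeakyList` is hypothetical.

```powershell
function Get-LeakyList {
    $list = [System.Collections.ArrayList]@()
    $list.Add('a')    # Add returns the new element's index (0); PowerShell emits it
    $list.Add('b')    # emits 1
    return $list      # emits 'a' and 'b'
}

$leaky = @(Get-LeakyList)
$leaky.Count          # 4
$leaky -join ','      # 0,1,a,b

# An integer converted to a date is read as a tick count, so index 0
# becomes the very first date .NET can represent.
$zeroDate = [datetime]0
$zeroDate.Year        # 1
```

That would account for both symptoms: the list of index values ahead of the real elements, and the January 1, 0001 date when an index of 0 is treated as a date.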

I'm now fairly convinced that my frustration was caused by the lack of a solid beginner's reference that doesn't just show a couple of examples for how to do a simple array implementation, but that describes some of the caveats, with alternative solutions.

Let me explain here, then, three ways to implement arrays/collections, with a solution for what I was attempting to produce - that being, a list of dates in a date range.

For some reason, I initially thought that the proper method to add an element to a .NET ArrayList in Powershell is to use the .Add method. It's documented. I still don't understand the reason why this doesn't work (seriously - somebody please enlighten me). Through experimentation, I discovered that I could, however, obtain accurate results by using the += method for adding objects to an ArrayList.

Don't do this. This is absolutely WRONG. It will produce the errors I described above:

Function Get-DateRangeList {
    [cmdletbinding()]
    Param (
        [datetime] $startDate,
        [datetime] $endDate
    )

    $datesArray = [System.Collections.ArrayList]@()  # Second method

    for ($d = $startDate; $d -le $endDate; $d = $d.AddDays(1)) {
        if ($d.DayOfWeek -ne 'Sunday') {
            $datesArray.Add($d)
        }
    }

    Return $datesArray
}

# Get one week of dates, ending with yesterday's date
$startDate = Get-Date
$endDate = $startDate.AddDays(-1)  # Get yesterday's date as last date in range
$startDate = $endDate.AddDays(-7)  # Get 7th prior date as first date in range

$datesList = Get-DateRangeList  $startDate $endDate

# Loop through the dates
Foreach ($d in $datesList) {
    # Do something with each date, e.g., format the date as part of a list
    # of date-stamped files to retrieve
    $d
}
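For comparison, here is a sketch of the same function with the value returned by .Add discarded, which appears to be all that is needed to make the ArrayList version behave. The name `Get-DateRangeListFixed` and the fixed test dates are assumptions chosen for illustration.

```powershell
function Get-DateRangeListFixed {
    [CmdletBinding()]
    param (
        [datetime] $startDate,
        [datetime] $endDate
    )

    $datesArray = [System.Collections.ArrayList]@()

    for ($d = $startDate; $d -le $endDate; $d = $d.AddDays(1)) {
        if ($d.DayOfWeek -ne 'Sunday') {
            # Discard the index that Add returns so it does not leak into the output
            [void] $datesArray.Add($d)
        }
    }

    return $datesArray
}

# 2024-01-01 is a Monday, so this range contains exactly one Sunday (2024-01-07)
$dates = @(Get-DateRangeListFixed ([datetime]'2024-01-01') ([datetime]'2024-01-07'))
$dates.Count    # 6
```

Assigning the call to `$null` or piping it to Out-Null would discard the index equally well; `[void]` is just the shortest form.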

Now, there are three code examples below that DO WORK. In each example, the code is the same. All that I've done is commented/uncommented the corresponding instantiation lines, and method lines.

First, using the native PowerShell array object:

Function Get-DateRangeList {
    [cmdletbinding()]
    Param (
        [datetime] $startDate,
        [datetime] $endDate
    )

    $datesArray = @()  # First method
    #$datesArray = [System.Collections.ArrayList]@()  # Second method
    #$datesArray = New-Object System.Collections.Generic.List[System.Object]  # Third method

    for ($d = $startDate; $d -le $endDate; $d = $d.AddDays(1)) {
        if ($d.DayOfWeek -ne 'Sunday') {
            $datesArray += $d     # First and second method: += is the method to add elements to: Powershell array; or .NET ArrayList (confusing)
            #$datesArray.Add($d)  # Third method: .Add is the method to add elements to: .NET Generic List
        }
    }

    Return $datesArray
}

# Get one week of dates, ending with yesterday's date
$startDate = Get-Date
$endDate = $startDate.AddDays(-1)  # Get yesterday's date as last date in range
$startDate = $endDate.AddDays(-7)  # Get 7th prior date as first date in range

$datesList = Get-DateRangeList  $startDate $endDate

# Loop through the dates
Foreach ($d in $datesList) {
    # Do something with each date, e.g., format the date as part of a list
    # of date-stamped files to retrieve
    "FileName_{0}.txt" -f $d.ToString("yyyyMMdd")
}

Second, using a .NET Framework ArrayList:

Function Get-DateRangeList {
    [cmdletbinding()]
    Param (
        [datetime] $startDate,
        [datetime] $endDate
    )

    #$datesArray = @()  # First method
    $datesArray = [System.Collections.ArrayList]@()  # Second method
    #$datesArray = New-Object System.Collections.Generic.List[System.Object]  # Third method

    for ($d = $startDate; $d -le $endDate; $d = $d.AddDays(1)) {
        if ($d.DayOfWeek -ne 'Sunday') {
            $datesArray += $d     # First and second method: += appends to a PowerShell array; it also works on a .NET ArrayList, but converts the list into a plain array
            #$datesArray.Add($d)  # Third method: .Add is the method to add elements to: .NET Generic List
        }
    }

    Return $datesArray
}

# Get one week of dates, ending with yesterday's date
$startDate = Get-Date
$endDate = $startDate.AddDays(-1)  # Get yesterday's date as last date in range
$startDate = $endDate.AddDays(-7)  # Get 7th prior date as first date in range

$datesList = Get-DateRangeList  $startDate $endDate

# Loop through the dates
Foreach ($d in $datesList) {
    # Do something with each date, e.g., format the date as part of a list
    # of date-stamped files to retrieve
    "FileName_{0}.txt" -f $d.ToString("yyyyMMdd")
}

Third, using a .NET Framework Generic List:

Function Get-DateRangeList {
    [cmdletbinding()]
    Param (
        [datetime] $startDate,
        [datetime] $endDate
    )

    #$datesArray = @()  # First method
    #$datesArray = [System.Collections.ArrayList]@()  # Second method
    $datesArray = New-Object System.Collections.Generic.List[System.Object]  # Third method

    for ($d = $startDate; $d -le $endDate; $d = $d.AddDays(1)) {
        if ($d.DayOfWeek -ne 'Sunday') {
            #$datesArray += $d     # First and second method: += is the method to add elements to: Powershell array; or .NET ArrayList (confusing)
            $datesArray.Add($d)  # Third method: .Add is the method to add elements to: .NET Generic List
        }
    }

    Return $datesArray
}

# Get one week of dates, ending with yesterday's date
$startDate = Get-Date
$endDate = $startDate.AddDays(-1)  # Get yesterday's date as last date in range
$startDate = $endDate.AddDays(-7)  # Get 7th prior date as first date in range

$datesList = Get-DateRangeList  $startDate $endDate

# Loop through the dates
Foreach ($d in $datesList) {
    # Do something with each date, e.g., format the date as part of a list
    # of date-stamped files to retrieve
    "FileName_{0}.txt" -f $d.ToString("yyyyMMdd")
}

All three of these work. Why would you prefer one over another? The native PowerShell array, and the .NET Framework ArrayList class, both produce collections of objects that aren't strongly typed, so you can do this (in a Powershell array implementation):

$myArray = @(1, 2, 3, "A", "B", "C")

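The PowerShell array is not efficient for building a very large collection, because it has a fixed size and every += copies the existing elements into a new array. The ArrayList grows in place, so it is a better choice for a very large collection.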

The .NET Framework Generic List is the best choice, it seems, for very large collections of objects that are the same type. In my example, I want a list of dates. Each date is the same data type, so I have no need to mix object types. Therefore, the solution I'm deploying is the third working example above.
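Note that a List[System.Object] still accepts any object. If the goal is a list that holds only dates, the element type can be named explicitly. A minimal sketch, with `$dateList` and `$rejected` as illustrative names:

```powershell
# A generic list whose elements must be DateTime values
$dateList = New-Object 'System.Collections.Generic.List[datetime]'

# List[T].Add returns nothing, so there is no index to discard
$dateList.Add([datetime]'2024-01-01')

# A value that cannot be converted to DateTime is rejected
$rejected = $false
try {
    $dateList.Add('not a date')
}
catch {
    $rejected = $true
}

$dateList.Count    # 1
```

PowerShell will still convert a string such as '2024-01-02' to a DateTime when it is passed to .Add, so the type check guards against wrong kinds of values rather than against every string.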

I appreciate Dave Wyatt's 2013 Powershell.org article on the topic: PowerShell Performance: The += Operator (and When to Avoid It). In particular, on each pass within a loop the += operator creates a new array object, copies the existing elements into it, adds the new element, and then discards the old array. This becomes very inefficient with a large collection.
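The copy-on-append behavior can be observed directly by keeping a second reference to the array. This is a small sketch with illustrative variable names, not code from the article:

```powershell
$original = @(1, 2, 3)
$alias = $original          # second reference to the same array object

$original += 4              # builds a new array and reassigns $original

[object]::ReferenceEquals($original, $alias)    # False
$alias.Count                # 3 (the old array is untouched)
$original.Count             # 4

# A generic list grows in place, so both references keep pointing at one object
$list = New-Object 'System.Collections.Generic.List[int]'
$sameList = $list
foreach ($i in 1..1000) { $list.Add($i) }

[object]::ReferenceEquals($list, $sameList)     # True
$sameList.Count             # 1000
```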

I'm posting these solutions and discussion in the hope that some other beginner will more readily find the answers I was seeking.

Yes - that's right - I don't adhere to what seems in some people's minds to be strict PowerShell syntax etiquette. I use the return statement in a function, so it's obvious what the function produces. I prefer readable code that might look sprawling rather than tight. That's my preference, and I'm sticking to it.

For a more PowerShell-esque implementation of a date list, I refer readers to the tidy implementation posted by The Surly Admin.

Upvotes: 3

Views: 2160

Answers (2)

Χpẘ

Reputation: 3451

Regarding the third paragraph of the question: System.Collections.ArrayList does work in PowerShell, for instance:

# Create an ArrayList with initial capacity for 20 objects
$ar = new-object collections.arraylist 20
$ar.add("hello everybody")
$ar.add([datetime]::now)
$ar.add( (gps)[9])
$ar[0]  # returns string
$ar[1]  # returns datetime
$ar[2]  # returns tenth process
$ar.count # returns 3

I think the takeaway from this is to read the MSDN documentation for arraylist more carefully.

If you use += on an ArrayList in PowerShell, it takes the elements from the ArrayList plus the new element and creates a new array. I believe that is an attempt to shield users from the complexity of .NET that you stumbled upon. (I suspect that one of the PowerShell product team's primary use cases is a user who is not familiar with .NET in general and ArrayList in particular. You apparently don't fall into that category.)
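The conversion can be checked by looking at the type before and after the operator. A short sketch, using an illustrative variable `$al`:

```powershell
$al = [System.Collections.ArrayList]@(1, 2)
$typeBefore = $al.GetType().Name    # ArrayList

$al += 3                            # PowerShell builds a new array from the elements

$typeAfter = $al.GetType().Name     # Object[]
$al.IsFixedSize                     # True
$al.Count                           # 3
```

So after the first +=, the variable no longer holds an ArrayList at all, which is why the "second method" in the question behaves exactly like the plain array version.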

I will mention a stumbling block with PowerShell and arrays: PowerShell automatically unrolls arrays in some cases. For example, if I have an array of chars and I want to create a string (using the String..ctor([char[]]) overload), then this doesn't work:

# Fails because PS unrolls the array and thinks that each element is a
# different argument to String..ctor
$stringFromCharArray = new-object string $charArray
# Wrap $charArray to get it to work
$stringFromCharArray = new-object string @(,$charArray)
# This also works
$stringFromCharArray = new-object string (,$charArray)

There are also similar issues when you pass an array down a pipeline. If you want the array passed down the pipeline (versus the array elements) then you need to wrap it in another array first.
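The pipeline case can be sketched with Measure-Object, which counts the objects it receives. The variable names here are illustrative:

```powershell
$numbers = 1, 2, 3

# Sent as-is, the array is unrolled and the pipeline sees three objects
$unrolledCount = ($numbers | Measure-Object).Count     # 3

# Wrapped with the unary comma, only the outer array is unrolled,
# so the pipeline sees one object: the original array
$wrappedCount = (,$numbers | Measure-Object).Count     # 1

# The single object that arrives is still the full three-element array
$innerCount = ,$numbers | ForEach-Object { $_.Count }  # 3
```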

Upvotes: 2

mjolinor

Reputation: 68273

Most of the time I see array addition, it's totally unnecessary. The Powershell pipeline will automatically create arrays for you any time an expression returns more than one object, and it will do it very efficiently.

Consider:

Clear-Host 

Function Get-DateRangeList {

    [cmdletbinding()]
    Param (
        [datetime] $startDate,
        [datetime] $endDate
    )

    $datesArray = 
    for ($d = $startDate; $d -le $endDate; $d = $d.AddDays(1)) {

        if ($d.DayOfWeek -ne 'Sunday') {

            $d
        }

    }

    Return ,$datesArray

}


# Get one week of dates, ending with yesterday's date
$startDate = Get-Date
$endDate = $startDate.AddDays(-1)  # Get yesterday's date as last date in range
$startDate = $endDate.AddDays(-7)  # Get 7th prior date as first date in range


$datesList = Get-DateRangeList  $startDate $endDate

# Loop through the dates
Foreach ($d in $datesList) {

    # Do something with each date, e.g., format the date as part of a list of date-stamped files to retrieve
    "FileName_{0}.txt" -f $d.ToString("yyyyMMdd")
}

All that's required is to create and output your objects, and assign the result back to your variable and you'll have an array.
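One caveat worth knowing, shown here as a small sketch with made-up variable names: when the loop emits exactly one object, the variable receives that object rather than a one-element array. Wrapping the expression in @() guarantees an array either way.

```powershell
# Several objects emitted: PowerShell collects them into an array
$evens = foreach ($i in 1..6) { if ($i % 2 -eq 0) { $i } }
$evens.GetType().Name     # Object[]
$evens.Count              # 3

# Exactly one object emitted: the variable holds the bare value
$single = foreach ($i in 1..2) { if ($i -eq 2) { $i } }
$single.GetType().Name    # Int32

# @() forces an array regardless of how many objects come out
$wrapped = @(foreach ($i in 1..2) { if ($i -eq 2) { $i } })
$wrapped.GetType().Name   # Object[]
$wrapped.Count            # 1
```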

Upvotes: 6
