deeksha
deeksha

Reputation: 17

comparision of 2 csv files using powershell

The format of two files is same and as follows:

ServiceName Status computer State

AdobeARMservice OK NEE Running

Amazon Assistan OK NEE Running

the requirement is, i have to check the service name and computer name..if both are same, then i have to check whether the state of particular service is same in both the files or not. And if it is not same then display it..

$preser = import-csv C:\info.csv
$postser = import-csv C:\serviceinfo.csv

foreach($ser1 in $preser)
{
    foreach($ser2 in $postser)
    {
        if(($ser1.computer -eq $ser2.computer) -and ($ser1.ServiceName -eq $ser2.ServiceName))
        {
            if($ser1.State -eq $ser2.State)
            {

            }
            else
            {
                write-host $ser1,$ser2
            }

        }
    }
}

This code is working fine but as the files length is very large, the time of execution is more.

Is there any alternative method to reduce the time of execution..?

Thank you

Upvotes: 0

Views: 91

Answers (2)

Theo
Theo

Reputation: 61068

Although Import-Csv on very large files will take its time, maybe this will be faster:

$preser  = Import-Csv -Path 'C:\info.csv'
$postser = Import-Csv -Path 'C:\serviceinfo.csv'

# build a lookup Hashtable for $preser
$hash = @{}
foreach ($item in $preser) {
    # combine the ServiceName and Computer to form the hash key
    $key = '{0}@{1}' -f $item.ServiceName, $item.computer
    $hash[$key] = $item
}

# now loop through the items in $postser
foreach ($item in $postser) {
    $key = '{0}@{1}' -f $item.ServiceName, $item.computer
    if ($hash.ContainsKey($key)) {
        if ($hash[$key].State -ne $item.State) {
            # create a new object for output
            $out = $hash[$key] | Select-Object * -ExcludeProperty State
            $out | Add-Member -MemberType NoteProperty -Name 'State in Preser'  -Value $hash[$key].State
            $out | Add-Member -MemberType NoteProperty -Name 'State in Postser' -Value $item.State
            $out
        }
    }
}

The output on screen will look something like this:

ServiceName      : AdobeARMservice
Status           : OK
computer         : NEE
State in Preser  : Running
State in Postser : Stopped

Of course, you can capture this output and save it as new csv if you do

$result = foreach ($item in $postser) {
    # rest of the above foreach loop
}

# output on screen
$result

# output to new csv
$result | Export-Csv -Path 'C:\ServiceInfoDifference.csv' -NoTypeInformation

Upvotes: 1

Vish
Vish

Reputation: 466

There are a few ways to do this:

1. Sorting the columns

If the columns are unsorted in the files, sort them first, and then try finding a match by using linear search.

2. Binary search

What you are currently doing is an implementation of a linear search. You can implement binary search (works best on sorted lists) to find a result faster.

Taken from dfinkey's github repo

function binarySearch {
    param($sortedArray, $seekElement, $comparatorCallback)

    $comparator = New-Object Comparator $comparatorCallback

    $startIndex = 0
    $endIndex = $sortedArray.length - 1

    while ($startIndex -le $endIndex) {
        $middleIndex = $startIndex + [Math]::floor(($endIndex - $startIndex) / 2)

        # If we've found the element just return its position.
        if ($comparator.equal($sortedArray[$middleIndex], $seekElement)) {
            return $middleIndex
        }

        # Decide which half to choose for seeking next: left or right one.
        if ($comparator.lessThan($sortedArray[$middleIndex], $seekElement)) {
            # Go to the right half of the array.
            $startIndex = $middleIndex + 1
        }
        else {
            # Go to the left half of the array.
            $endIndex = $middleIndex - 1
        }
    }

    return -1
}

3. Hashes

I am not completely sure of this method, but, you can load the columns into hashes and then compare them. Hash comparisons are generally faster than array comparisons.

Upvotes: 0

Related Questions