Reputation: 1
I am new to PowerShell. We need a PowerShell script to compare two large CSV files (about 100000 rows and n columns, where n > 300 and the column headers are dates, one for each Wednesday). The value of n grows by one each week. We need to compare the current week's file with last week's file and make sure that the only difference between the two is the last column.
I have gone through some forums and blogs, but being new to this I could only get so far.
If there is a way to drop the last column from a CSV file in PowerShell, we may be able to use the script below to compare the previous week's file with the current week's file, after dropping the last column from the current week's file.
It would be really helpful if someone could help me here with their hard-earned knowledge.
# Load both files as sorted lists of raw text lines.
[System.Collections.ArrayList]$file1Array = Get-Content "C:\Risk Management\ref_previous.csv" | Sort-Object
[System.Collections.ArrayList]$file2Array = Get-Content "C:\Risk Management\ref_current.csv" | Sort-Object
# Collect the lines that appear in both files.
$matchingEntries = @()
foreach ($entry in $file1Array) {
    if ($file2Array.Contains($entry)) {
        $matchingEntries += $entry
    }
}
# Remove the shared lines so that only the differences remain in each list.
foreach ($entry in $matchingEntries) {
    $file1Array.Remove($entry)
    $file2Array.Remove($entry)
}
Cheers, Anil
Upvotes: 0
Views: 3874
Reputation: 81
Both Import-Csv and Export-Csv give you a way to exclude columns.
Import-Csv has the -Header parameter: name the incoming headers and leave out the last column's header. If there are 10 columns, name only 9, and the last column will be excluded.
For Export-Csv, select the columns you'd like to write out ( | Select-Object col1,col2,col3 | Export-Csv ... ) and don't select the column you're trying to exclude.
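As a rough sketch of the -Header approach, the header names can be read from the first line of the file instead of being typed out. The sample file, its path and its column names are made up for illustration, and the simple Split(',') assumes the header names contain no quoted commas. Note that when -Header is supplied, Import-Csv treats the file's own header row as data, so that row has to be skipped.

```powershell
# Hypothetical sample file standing in for the real weekly CSV.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'ref_current_sample.csv'
@'
Id,2024-01-03,2024-01-10
A,1,2
B,3,4
'@ | Set-Content -Path $path

# Read the header row and keep every column name except the last one.
$headers = (Get-Content -Path $path -TotalCount 1).Split(',')
$keep = $headers[0..($headers.Count - 2)]

# With -Header, the file's own header row is read as a data row, so skip it.
# Data columns beyond the supplied header names are discarded.
$rows = @(Import-Csv -Path $path -Header $keep | Select-Object -Skip 1)

$rows | Format-Table
```

Here $rows should hold only the Id and 2024-01-03 columns, which can then be piped to Export-Csv or compared with the previous week's data.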
Upvotes: 0
Reputation: 1543
Based on the answer that alroc gave, you should be able to get the last column name using a split operation on the first line of the CSV file, and then pass that name to the -ExcludeProperty parameter.
However, Compare-Object on the result doesn't work for me, although the right data is pulled back into each variable.
$CurrentFile = "C:\Temp\Current.csv"
$PreviousFile = "C:\Temp\Previous.csv"
# Read the header row and take the name of the last column, without quotes.
$CurrentHeaders = Get-Content $CurrentFile | Select-Object -First 1
$CurrentHeadersSplit = $CurrentHeaders.Split(",")
$LastColumn = $CurrentHeadersSplit[-1] -Replace '"'
# Import both files, dropping the last column from the current week's data.
$Current = Import-Csv $CurrentFile | Select-Object -Property * -ExcludeProperty $LastColumn | Sort-Object
$Previous = Import-Csv $PreviousFile | Sort-Object
Compare-Object $Current $Previous
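One likely reason Compare-Object reports nothing useful here is that, without -Property, it does not compare the rows column by column. A possible workaround, sketched below with made-up in-memory data in place of the two files, is to pass the column names explicitly. Because only the listed properties are compared, the extra column in the current data is ignored even without -ExcludeProperty. With several hundred columns and 100000 rows this may be slow, so treat it as a starting point rather than a tuned solution.

```powershell
# Hypothetical in-memory rows standing in for the two imported files.
$Previous = @'
Id,2024-01-03
A,1
B,3
'@ | ConvertFrom-Csv

$Current = @'
Id,2024-01-03,2024-01-10
A,1,2
B,9,4
'@ | ConvertFrom-Csv

# Column names shared by both files: everything in the previous week's header.
$Columns = $Previous[0].PSObject.Properties.Name

# Name the properties explicitly so the comparison uses the column values.
$Diff = Compare-Object -ReferenceObject $Previous -DifferenceObject $Current -Property $Columns

$Diff | Format-Table
```

An empty $Diff would mean the shared columns match in both files. In this sample the B row differs, so it shows up once for each side.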
Upvotes: 1
Reputation: 28154
Assuming that the column name you want to exclude is LastCol
(adjust to your actual column name):
$previous = Import-Csv "C:\Risk Management\ref_previous.csv" | Sort-Object;
$current = Import-Csv "C:\Risk Management\ref_current.csv" | Select-Object -Property * -ExcludeProperty LastCol | Sort-Object;
Compare-Object $previous $current;
This is meant to drop the last column from the current week's file, which is the one with the extra column, and indicate whether the remaining content differs.
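For files of the size mentioned in the question, a text-based variant may be worth trying, since it avoids building an object per row. The sketch below is only one possible approach: it uses made-up sample lines in place of Get-Content output, assumes the last field never contains a quoted comma, and needs PowerShell 5 or later for the ::new syntax. The hash sets compare case-sensitively and ignore duplicate rows.

```powershell
# Hypothetical sample lines standing in for Get-Content output of each file.
$previousLines = @('Id,2024-01-03', 'A,1', 'B,3')
$currentLines  = @('Id,2024-01-03,2024-01-10', 'A,1,2', 'B,9,4')

# Strip the final comma-separated field from every line of the current file.
# Assumes the last field contains no quoted commas.
$trimmed = $currentLines | ForEach-Object { $_ -replace ',[^,]*$', '' }

# Hash sets give fast membership tests, unlike a linear ArrayList.Contains scan.
$previousSet = [System.Collections.Generic.HashSet[string]]::new([string[]]$previousLines)
$currentSet  = [System.Collections.Generic.HashSet[string]]::new([string[]]$trimmed)

# Lines present on one side only; both empty means only the last column changed.
$onlyInPrevious = @($previousLines | Where-Object { -not $currentSet.Contains($_) })
$onlyInCurrent  = @($trimmed | Where-Object { -not $previousSet.Contains($_) })

"Only in previous: $($onlyInPrevious -join '; ')"
"Only in current:  $($onlyInCurrent -join '; ')"
```

With real files, $previousLines and $currentLines would come from Get-Content on the two paths.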
Upvotes: 1