PowerShell: How to merge unique headers from one CSV to another?

Question

Edit 1:

So I've figured out how to get the headers that are unique to CSV 2, which is the first step toward appending them to CSV 1.

$header = ($table | Get-Member -MemberType NoteProperty).Name
$header_add = ($table_add | Get-Member -MemberType NoteProperty).Name
$header_diff = $header + $header_add
$header_diff = ($header_diff | Sort-Object -Unique)
$header_diff = (Compare-Object -ReferenceObject $header -DifferenceObject $header_diff -PassThru)

$header is an array of headers from CSV 1 ($table). $header_add is an array of headers from CSV 2 ($table_add). $header_diff houses the unique headers in CSV 2 by the end of the code block.

So as far as I'm aware, my next step would be:

$append = ($table_add | Select-Object $header_diff)

My problem now is: how do I append these objects to my CSV 1 ($table) object? I don't quite see a way for Add-Member to do this in a particularly nice fashion.

Original:

Here are the headers for the two CSV files I'm trying to combine.

CSV 1:

Date, Name, Assigned Router, City, Country, # of Calls  , Calls in  , Calls out

CSV 2:

Date, Name, Assigned Router, City, Country, # of Minutes, Minutes in, Minutes out

So a quick rundown of what these files are; both files contain call information for a set of names for one day (the date column has the same date for each row; this is because this eventually gets sent to a master .xlsx file with all dates combined). All of the columns up to Country contain the same values in the same order in both files. The files simply separate the # of calls and # of minutes data. I was wondering if there was a convenient way to move the unlike columns from one CSV to another.

I've tried using something along the lines of:

Import-Csv (Get-ChildItem  -Include ) | Export-Csv  -NoTypeInformation

This didn't combine all of the matching headers and append the unique ones afterwards. Only the first file that's processed kept its unique headers. The second file that was processed had all of those headers and data discarded in the output. Shared header data in the second CSV was added as additional rows.

An example of the failed output I described:

PS > $small | Format-Table

Column_1 Column_2 Column_3
-------- -------- --------
1        a        a
1        b        b
1        c        c


PS > $small_add | Format-Table

Column_1 Column_4 Column_5
-------- -------- --------
1        x        x
1        y        y
1        z        z


PS > Import-Csv (Get-ChildItem ./*.* -Include "small*.csv") | Select-Object * -unique | Format-Table

Column_1 Column_2 Column_3
-------- -------- --------
1        a        a
1        b        b
1        c        c
1
1
1

I was wondering if I could do something like the following algorithm:

Import-Csv CSV_1 and CSV_2 to separate variables
Compare CSV_2 headers to CSV_1 headers, storing the unlike headers in CSV_2 into a separate variable
Select-Object all CSV_1 headers and unlike CSV_2 headers
Pipe the Select-Object output to Export-Csv

The only other method I could think of is doing it line by line, where I would:

Import-Csv both
remove all of the shared columns from CSV_2
change it from the custom object PowerShell uses for CSVs to a string
append each line of CSV_2 to each line of CSV_1

It feels a bit unrefined and inflexible (flexibility can probably be dealt with by how columns/headers are isolated so there's no problem appending strings).

mklement0 · Accepted Answer

Note:

* This answer focuses on a high-level-of-abstraction OO solution.
* The OP's own solution relies more on string processing, which has the potential to be faster.

# The input file paths.
$files = 'csv1.csv', 'csv2.csv'
$outFile = 'csvMerged.csv'

# Read the 2 CSV files into collections of custom objects.
# Note: This reads the entire files into memory.
$doc1 = Import-Csv $files[0]
$doc2 = Import-Csv $files[1]

# Determine the column (property) names that are unique to document 2.
$doc2OnlyColNames = (
  Compare-Object $doc1[0].psobject.properties.name $doc2[0].psobject.properties.name |
    Where-Object SideIndicator -eq '=>'
).InputObject

# Initialize an ordered hashtable that will be used to temporarily store
# each document 2 row's unique values as key-value pairs, so that they
# can be appended as properties to each document-1 row.
$htUniqueRowD2Props = [ordered] @{}

# Process the corresponding rows one by one, construct a merged output object
# for each, and export the merged objects to a new CSV file.
$i = 0
$(foreach($rowD1 in $doc1) {
  # Get the corresponding row from document 2.
  $rowD2 = $doc2[$i++]
  # Extract the values from the unique document-2 columns and store them in the ordered
  # hashtable.
  foreach($pname in $doc2OnlyColNames) { $htUniqueRowD2Props.$pname = $rowD2.$pname }
  # Add the properties represented by the hashtable entries to the
  # document-1 row at hand and output the augmented object (-PassThru).
  $rowD1 | Add-Member -NotePropertyMembers $htUniqueRowD2Props -PassThru
}) | Export-Csv -NoTypeInformation -Encoding Utf8 $outFile
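
The loop above pairs rows purely by position, which relies on the question's statement that both files list the same rows in the same order. If that is ever in doubt, a check along the following lines could be run before merging. This is only a minimal sketch: Test-RowAlignment is a hypothetical helper name, and the in-memory sample data stands in for the Import-Csv results; neither is part of the original answer.

```powershell
# Hypothetical helper (an assumption, not part of the original answer):
# returns $true only if both collections have the same number of rows and
# every shared column holds the same value in each same-position row pair.
function Test-RowAlignment {
  param(
    [Parameter(Mandatory)] [object[]] $Rows1,
    [Parameter(Mandatory)] [object[]] $Rows2
  )
  if ($Rows1.Count -ne $Rows2.Count) { return $false }
  # Column names that are present in both documents.
  $names2 = $Rows2[0].psobject.Properties.Name
  $shared = $Rows1[0].psobject.Properties.Name | Where-Object { $_ -in $names2 }
  for ($i = 0; $i -lt $Rows1.Count; $i++) {
    foreach ($name in $shared) {
      if ($Rows1[$i].$name -ne $Rows2[$i].$name) { return $false }
    }
  }
  return $true
}

# Assumed sample data, standing in for: Import-Csv csv1.csv / csv2.csv
$calls = @'
Name,City,Calls in
nm,ct,ci
nm2,ct2,ci2
'@ | ConvertFrom-Csv

$minutes = @'
Name,City,Minutes in
nm,ct,mi
nm2,ct2,mi2
'@ | ConvertFrom-Csv

if (-not (Test-RowAlignment -Rows1 $calls -Rows2 $minutes)) {
  throw 'Rows do not line up; merging by position would mix up records.'
}
```

The check compares every shared column, so it costs one extra pass over the data; for large files, comparing only a key column such as Name may be enough.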

To put the above to the test, you can use the following sample input:

# Create sample input CSV files
@'
Date,Name,Assigned Router,City,Country,# of Calls,Calls in,Calls out
dt,nm,ar,ct,cy,cc,ci,co
dt2,nm2,ar2,ct2,cy2,cc2,ci2,co2
'@ > csv1.csv

# Same column layout and data as above through column 'Country', then different.
@'
Date,Name,Assigned Router,City,Country,# of Minutes,Minutes in,Minutes out
dt,nm,ar,ct,cy,mc,mi,mo
dt2,nm2,ar2,ct2,cy2,mc2,mi2,mo2
'@ > csv2.csv

The code should produce the following content in csvMerged.csv:

"Date","Name","Assigned Router","City","Country","# of Calls","Calls in","Calls out","# of Minutes","Minutes in","Minutes out"
"dt","nm","ar","ct","cy","cc","ci","co","mc","mi","mo"
"dt2","nm2","ar2","ct2","cy2","cc2","ci2","co2","mc2","mi2","mo2"

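The first outline in the question (Select-Object on all CSV 1 headers plus the unlike CSV 2 headers) can also be expressed fairly directly with calculated properties. The following is a minimal sketch under the same positional row-pairing assumption; the shortened sample data and the variable names $names1, $extra, and $merged are illustrative and not taken from the original answer.

```powershell
# Assumed in-memory sample data, standing in for Import-Csv output.
$doc1 = @'
Date,Name,Calls in
dt,nm,ci
dt2,nm2,ci2
'@ | ConvertFrom-Csv

$doc2 = @'
Date,Name,Minutes in,Minutes out
dt,nm,mi,mo
dt2,nm2,mi2,mo2
'@ | ConvertFrom-Csv

# Column names that occur only in document 2.
$names1 = $doc1[0].psobject.Properties.Name
$doc2OnlyColNames = $doc2[0].psobject.Properties.Name |
  Where-Object { $_ -notin $names1 }

# For each row pair, build one calculated property per document-2-only
# column, then let Select-Object emit a new, merged object.
$merged = for ($i = 0; $i -lt $doc1.Count; $i++) {
  $rowD2 = $doc2[$i]
  $extra = foreach ($name in $doc2OnlyColNames) {
    # GetNewClosure() captures the current $rowD2 and $name; without it,
    # every expression would see the last value of $name.
    @{ Name = $name; Expression = { $rowD2.$name }.GetNewClosure() }
  }
  $doc1[$i] | Select-Object -Property (@('*') + @($extra))
}

# Show the result as CSV text; pipe to Export-Csv instead to write a file.
$merged | ConvertTo-Csv -NoTypeInformation
```

Unlike the Add-Member version, this creates new objects rather than modifying the document-1 rows in place, which may matter if $doc1 is reused afterwards.
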