Reputation: 15
I have 3 CSV files in C:\temp (F1.csv, F2.csv, F3.csv) and I am trying to combine all 3 into a single file. They all have unique headers and different numbers of rows and columns. Below are sample contents of the files.
F1.csv:
F1C1 F1C2
ABC 123

F2.csv:
F2C1 F2C2
DEF 456
GHI 789
JKL 101112

F3.csv:
F3C1
MNO
PQR

Expected output:
F1C1  F1C2  F2C1  F2C2    F3C1
ABC   123   DEF   456     MNO
            GHI   789     PQR
            JKL   101112
I tried running the script below, but the output file (FR.csv) ends up with everything in a single column.
Get-Content C:\temp\*csv | Add-Content C:\temp\FinalResult.csv
Upvotes: 1
Views: 1008
Reputation: 437100
The following solutions assume that Get-ChildItem *.csv enumerates the files to merge in the desired order (which works with input files F1.csv, F2.csv, F3.csv in the current dir).
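If the enumeration order cannot be relied on for your file names, the input list can be built explicitly. The following is only a sketch: the temp folder and the empty demo files are hypothetical and exist just to make it runnable on its own.

```powershell
# Hypothetical demo folder with three empty CSV files, created out of order.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
$null = New-Item -ItemType Directory -Path $dir
'F3.csv', 'F1.csv', 'F2.csv' | ForEach-Object {
  $null = New-Item -ItemType File -Path (Join-Path $dir $_)
}

# Option 1: sort explicitly by name instead of relying on enumeration order.
$byName = Get-ChildItem -Path (Join-Path $dir '*.csv') | Sort-Object Name

# Option 2: name the files in exactly the order you want them merged.
$explicit = 'F1.csv', 'F2.csv', 'F3.csv' | ForEach-Object {
  Get-Item -LiteralPath (Join-Path $dir $_)
}

# Show the resulting order for both options.
$byName.Name
$explicit.Name
```

Either variable can then take the place of the Get-ChildItem *.csv call in the solutions below.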
Plain-text solution, using .NET APIs, System.IO.StreamReader and System.IO.StreamWriter:
This solution performs much better than the OO solution below, but the latter gives you more flexibility. Input files without a Unicode BOM are assumed to be UTF-8-encoded, and the output is saved to a BOM-less UTF-8 file named FR.csv in the current dir (the APIs used do allow you to specify different encodings, if needed).
$outFile = 'FR.csv'
# IMPORTANT: Always use *full* paths with .NET APIs.
# Writer for the output file.
$writer = [System.IO.StreamWriter] (Join-Path $Pwd.ProviderPath $outFile)
# Readers for all input files.
$readers = [System.IO.StreamReader[]] (Get-ChildItem *.csv -Exclude $outFile).FullName
# Read all files in batches of corresponding lines, join the
# lines of each batch with ",", and save to the output file.
$isHeader = $true
while ($readers.EndOfStream -contains $false) {
  if ($isHeader) {
    $headerLines = $readers.ReadLine()
    $colCounts = $headerLines.ForEach({ ($_ -split ',').Count })
    $writer.WriteLine($headerLines -join ',')
    $isHeader = $false
  } else {
    $i = 0
    $lines = $readers.ForEach({
      if ($line = $_.ReadLine()) { $line }
      else { ',' * ($colCounts[$i] - 1) }
      ++$i
    })
    $writer.WriteLine($lines -join ',')
  }
}
$writer.Close()
$readers.Close()
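As a sketch of the "different encodings" remark for the plain-text solution above: both classes have constructor overloads that take a System.Text.Encoding. The example below assumes, purely for illustration, UTF-16LE input and UTF-8 output with a BOM; the temp folder and the single demo file are hypothetical, and the ::new() syntax requires PowerShell v5 or later.

```powershell
# Assumption: inputs are UTF-16LE ("Unicode"); output should be UTF-8 *with* a BOM.
$inEnc  = [System.Text.Encoding]::Unicode
$outEnc = [System.Text.UTF8Encoding]::new($true)   # $true = write a BOM

# Hypothetical demo file in a temp folder, so the sketch is self-contained.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
$null = New-Item -ItemType Directory -Path $dir
$inPath  = Join-Path $dir 'F1.csv'
$outPath = Join-Path $dir 'FR.csv'
[System.IO.File]::WriteAllLines($inPath, [string[]] ('F1C1,F1C2', 'ABC,123'), $inEnc)

# StreamReader(path, encoding) and StreamWriter(path, append, encoding).
$reader = [System.IO.StreamReader]::new($inPath, $inEnc)
$writer = [System.IO.StreamWriter]::new($outPath, $false, $outEnc)
try {
  # Copy line by line; the merge loop above would go here instead.
  while ($null -ne ($line = $reader.ReadLine())) { $writer.WriteLine($line) }
}
finally {
  $writer.Close()
  $reader.Close()
}
```

In the merge script itself, the same two constructor calls would replace the [System.IO.StreamWriter] and [System.IO.StreamReader[]] casts, with one reader created per input file.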
OO solution, using Import-Csv and ConvertTo-Csv / Export-Csv:
# Read all CSV files into an array of object arrays.
$objectsPerCsv =
  Get-ChildItem *.csv -Exclude FR.csv |
    ForEach-Object {
      , @(Import-Csv $_.FullName)
    }
# Determine the max. row count.
$maxCount = [Linq.Enumerable]::Max($objectsPerCsv.ForEach('Count'))
# Get all column names per CSV.
$colNamesPerCsv = $objectsPerCsv.ForEach({ , $_[0].psobject.Properties.Name })
0..($maxCount-1) | ForEach-Object {
  $combinedProps = [ordered] @{}
  $row = $_; $col = 0
  $objectsPerCsv.ForEach({
    if ($object = $_[$row]) {
      foreach ($prop in $object.psobject.Properties) {
        $combinedProps.Add($prop.Name, $prop.Value)
      }
    }
    else {
      foreach ($colName in $colNamesPerCsv[$col]) {
        $combinedProps.Add($colName, $null)
      }
    }
    ++$col
  })
  [pscustomobject] $combinedProps
} | ConvertTo-Csv
Replace ConvertTo-Csv with Export-Csv to export the data to a file; use the -NoTypeInformation parameter and -Encoding as needed; e.g. ... | Export-Csv -NoTypeInformation -Encoding utf8 Merged.csv
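A minimal sketch of that export step, using two hand-built objects in place of the merged rows. The objects and the temp file name Merged-demo.csv are hypothetical. Note that -NoTypeInformation only matters in Windows PowerShell; PowerShell (Core) 6+ omits the type-information line by default. Likewise, -Encoding utf8 writes a BOM in Windows PowerShell but not in PowerShell (Core) 7+.

```powershell
# Hypothetical stand-ins for the merged rows; $null marks a padded (empty) cell.
$rows = @(
  [pscustomobject] @{ F1C1 = 'ABC'; F1C2 = '123'; F3C1 = 'MNO' }
  [pscustomobject] @{ F1C1 = $null; F1C2 = $null; F3C1 = 'PQR' }
)

# Hypothetical output path in the temp folder.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'Merged-demo.csv'

# Export to a file instead of emitting CSV lines to the pipeline.
$rows | Export-Csv -NoTypeInformation -Encoding utf8 $outPath

# Read the file back to inspect the result.
$roundTrip = @(Import-Csv $outPath)
$roundTrip | Format-Table
```

Reading the file back with Import-Csv is a quick way to confirm that the padded cells came through as empty fields rather than shifting the later columns.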
Upvotes: 3