quickshare

Reputation: 165

Merge csv files with the same filename from different zip archives

This is my situation:

I have multiple zip archives with file names like 20130101_001.zip, 20130102_001.zip, 20130103_001.zip, etc.

Each of those archives contains csv files with the same names: file1.csv, file2.csv, file3.csv (the contents differ from archive to archive, but the names are the same across all of the archives).

I'm using those files in an ETL process, and I would like to unzip all of the archives and merge the same-named files together so that I have to run the process only once. If there is a way to do this without ending up with duplicate records, that would be great; if that can't be achieved, I would use ETL tools to remove them.

This has to be done on Windows; I don't have a language preference.

Upvotes: 0

Views: 446

Answers (2)

quickshare

Reputation: 165

Thanks for the reply. In the end I solved it without cmdlets.

I use the 7-Zip command line to unzip all the files, and then this batch script to merge them:

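setlocal
rem "first" stays defined only until the first file has been copied
set first=1
rem Everything written inside the parentheses is redirected to pro.txt
>pro.txt (
  for %%F in (file1*.csv) do (
    if defined first (
      rem First file: copy it whole, header line included
      type "%%F"
      set "first="
    ) else more +1 "%%F"
  )
)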

I have about 20 files, so I repeat this loop for each of them. Afterwards I normalize the records with SyncSort.
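For comparison, the same keep-the-first-header idea can be sketched in Python, which runs on Windows and can read the members straight out of the archives without a separate unzip step. This is only a sketch under assumptions: the function name merge_member and the output name file1_merged.csv are made up for illustration, every file is assumed to start with a single header line, and archives are processed in file name order.

```python
import glob
import zipfile


def merge_member(zip_pattern, member, out_path):
    """Concatenate `member` from every archive matching `zip_pattern`.

    The header line is kept from the first archive only (assumption: each
    CSV starts with exactly one header line).
    """
    wrote_header = False
    with open(out_path, "wb") as out:
        for zip_path in sorted(glob.glob(zip_pattern)):
            with zipfile.ZipFile(zip_path) as archive:
                if member not in archive.namelist():
                    continue  # this archive does not contain the file
                with archive.open(member) as src:
                    header = src.readline()
                    pending = b""  # last line written from this file
                    if not wrote_header:
                        out.write(header)
                        wrote_header = True
                        pending = header
                    for line in src:
                        out.write(line)
                        pending = line
                    # Make sure the next file starts on a fresh line.
                    if pending and not pending.endswith(b"\n"):
                        out.write(b"\n")
```

A call such as merge_member("*.zip", "file1.csv", "file1_merged.csv") would then be repeated for each of the file names, much like the batch loop above. It does not remove duplicate records.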

Upvotes: 0

KevinD

Reputation: 3153

Take a look at the cmdlets ConvertFrom-Csv and ConvertTo-Csv. They allow you to convert csv to an array of PowerShell objects, and vice-versa.

The syntax is fairly simple:

$csvObject1 = Get-Content $pathToCSVFile | ConvertFrom-Csv 

Repeat this for any csv files you want to process, and you can then perform any logic you need in PowerShell to merge them. When done, use this:

$csvOutputObject | ConvertTo-Csv -NoTypeInformation | Set-Content $pathToOutputCSVFile
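The merge logic itself is left open above. Since the question mentions duplicate records and states no language preference, here is one possible shape for that step, sketched in Python rather than PowerShell. The function name write_unique_rows is hypothetical, "duplicate" is assumed to mean a record whose fields all match an earlier record, and all input files are assumed to share the same header.

```python
import csv


def write_unique_rows(in_paths, out_path):
    """Merge CSV files with a common header, skipping exact duplicate rows."""
    seen = set()    # rows already written, stored as tuples
    header = None   # header taken from the first non-empty file
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in in_paths:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # empty file, nothing to merge
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    key = tuple(row)
                    if key not in seen:
                        seen.add(key)
                        writer.writerow(row)
```

The set of seen rows lives in memory, so this suits files of moderate size; for very large inputs, removing duplicates in the ETL tool, as the question suggests, may be the better fit.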

Upvotes: 1
