Reputation: 93
I need to read data from a CSV file and load it into a database. I am using the bcp command-line utility for this. My CSV file looks like this:
First_name,Last_name,EmpID,company,languages
"Jack","Thomas","57616","IBM","C
C++
JAVA
COBOL
PERL
SQL
"
"Tim","Cook","10001","Apple","Python
C++
Java
XML
"
As you can see, the last column (languages) has each of its values on a new line. The bcp command is scripted to check for the row delimiter, so it terminates as soon as it reads the first value of the last column. How can I parse this file using bcp?
Upvotes: 1
Views: 884
Reputation: 1051
Here is a SQL solution. It walks through your import file and parses the data into two tables using two loops: one for the 'master' table (emps) and one for the 'detail' table (langs).
Setup
-- Drop the temp table only if it already exists
IF OBJECT_ID('tempdb..#tempTable') IS NOT NULL
DROP TABLE #tempTable
/*
Create Table emps
(
First_Name Varchar(25),
Last_Name VarChar(25),
EmpID VarChar(10),
Company VarChar(30)
)
Create Table langs
(
EmpID VarChar(10),
Lang VarChar(15)
)
*/
Delete From langs
Delete From emps
CREATE TABLE #tempTable
(
RowVal VarChar(Max)
)
Query
BULK INSERT #tempTable
FROM 'c:\Downloads\EmpLangs.txt'
WITH
(
FIRSTROW = 2,
ROWTERMINATOR = '\n'
)
Declare @RowV VarChar(100)
--Use the following to get the location of each delimiter
Declare @f1q1 Int
Declare @f1q2 Int
Declare @f2q1 Int
Declare @f2q2 Int
Declare @f3q1 Int
Declare @f3q2 Int
Declare @f4q1 Int
Declare @f4q2 Int
Declare @f5q1 Int
Declare @empid VarChar(10)
Declare @vHeader Int = 1 --Is header row?
Declare vCursor CURSOR For Select RowVal From #tempTable
Open vCursor;
Fetch Next From vCursor Into @RowV
While @@FETCH_STATUS = 0 --Walk through rows to parse
Begin
If @vHeader = 1
Begin
Set @f1q1 = CHARINDEX('"',@RowV,1)
Set @f1q2 = CHARINDEX('"',@RowV,@f1q1+1)
Set @f2q1 = CHARINDEX('"',@RowV,@f1q2+1)
Set @f2q2 = CHARINDEX('"',@RowV,@f2q1+1)
Set @f3q1 = CHARINDEX('"',@RowV,@f2q2+1)
Set @f3q2 = CHARINDEX('"',@RowV,@f3q1+1)
Set @f4q1 = CHARINDEX('"',@RowV,@f3q2+1)
Set @f4q2 = CHARINDEX('"',@RowV,@f4q1+1)
Set @f5q1 = CHARINDEX('"',@RowV,@f4q2+1)
Insert Into emps Values
(SUBSTRING(@RowV,@f1q1+1,@f1q2-@f1q1-1),
SUBSTRING(@RowV,@f2q1+1,@f2q2-@f2q1-1),
SUBSTRING(@RowV,@f3q1+1,@f3q2-@f3q1-1),
SUBSTRING(@RowV,@f4q1+1,@f4q2-@f4q1-1)
)
Set @vHeader = 0
Set @empid = SUBSTRING(@RowV,@f3q1+1,@f3q2-@f3q1-1)
Insert Into langs Values (@empid,SUBSTRING(@RowV,@f5q1+1,Len(@RowV)- @f5q1 + 1)) -- ADDED to get the trailing language from the header row
End
Fetch Next From vCursor Into @RowV
While @@FETCH_STATUS = 0 And @vHeader = 0 And @RowV <> ' "'
Begin
Insert Into langs Values (@empid,@RowV)
Fetch Next From vCursor Into @RowV
If @RowV = ' "'
Begin
If @@FETCH_STATUS = 0
Begin
Fetch Next From vCursor Into @RowV
Set @vHeader = 1
End
End
End
End;
Close vCursor
Deallocate vCursor
Select e.*,l.lang From emps e
INNER JOIN
langs l ON e.EmpID = l.EmpID
Result
First_Name Last_Name EmpID Company Lang
Jack Thomas 57616 IBM C
Jack Thomas 57616 IBM C++
Jack Thomas 57616 IBM JAVA
Jack Thomas 57616 IBM COBOL
Jack Thomas 57616 IBM PERL
Jack Thomas 57616 IBM SQL
Tim Cook 10001 Apple Python
Tim Cook 10001 Apple C++
Tim Cook 10001 Apple Java
Tim Cook 10001 Apple XML
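For comparison, the same master/detail split can be sketched outside SQL. This is only an illustration, not part of the T-SQL above: it assumes Python's standard csv module, which treats line breaks inside a quoted field as part of that field. The function name split_master_detail and the inline SAMPLE string are hypothetical, and blank or whitespace-only lines inside the languages field are assumed to be noise.

```python
import csv
import io

# Sample data copied from the question (assumed to be representative of the real file).
SAMPLE = '''First_name,Last_name,EmpID,company,languages
"Jack","Thomas","57616","IBM","C
C++
JAVA
COBOL
PERL
SQL
"
"Tim","Cook","10001","Apple","Python
C++
Java
XML
"
'''


def split_master_detail(text):
    """Return (emps, langs) row lists from CSV text whose last column spans lines."""
    emps, langs = [], []
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header row
    for row in reader:
        if not row:  # ignore completely empty records
            continue
        first, last, emp_id, company, languages = row
        emps.append((first, last, emp_id, company))
        # One detail row per non-blank line of the quoted languages field.
        for lang in languages.splitlines():
            lang = lang.strip()
            if lang:
                langs.append((emp_id, lang))
    return emps, langs


emps, langs = split_master_detail(SAMPLE)
for emp_id, lang in langs:
    print(emp_id, lang)
```

The two lists map onto the emps and langs tables, so each could be written to its own single-line-per-record file and loaded separately.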
Upvotes: 1
Reputation:
I don't see much evidence of your own research or attempts at a solution, which is expected on Stack Overflow.
Here is a possible PowerShell solution that imports the CSV,
converts the multi-line column into a semicolon-separated one, and exports the result as CSV.
Import-Csv .\old.csv| ForEach-Object {
$_.Languages=$_.Languages -split "`r?`n" -ne ' ' -join ';'
$_
} | Export-Csv .\New.csv -NoTypeInformation
This will result in all columns double quoted:
> Get-Content .\new.csv
"First_name","Last_name","EmpID","company","languages"
"Jack","Thomas","57616","IBM","C;C++;JAVA;COBOL;PERL;SQL"
"Tim","Cook","10001","Apple","Python;C++;Java;XML"
Another PowerShell one-liner strips the quotes again (this is safe here because no field contains a comma or an embedded quote):
(Get-Content .\new.csv).trim('"') -replace '","',',' | Set-Content .\new.csv
First_name,Last_name,EmpID,company,languages
Jack,Thomas,57616,IBM,C;C++;JAVA;COBOL;PERL;SQL
Tim,Cook,10001,Apple,Python;C++;Java;XML
EDIT: one combined .ps1 file
## Q:\Test\2018\12\14\SO_53777634.ps1
$FileIn = '.\old.csv'
$FileOut= '.\new.csv'
Import-Csv $FileIn | ForEach-Object {
$_.Languages=$_.Languages -split "`r?`n" -ne ' ' -join ';'
$_
} | Export-Csv $FileOut -NoTypeInformation
(Get-Content $FileOut).trim('"') -replace '","',',' | Set-Content $FileOut
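If PowerShell is not available, the same flattening can be sketched in Python. This is a rough equivalent, not a translation of the script above: flatten_languages and the SAMPLE string are hypothetical names, and it assumes the multi-line column is always the last one. Using csv.QUOTE_MINIMAL means quotes are only written where a field actually needs them, so no separate quote-stripping pass is required.

```python
import csv
import io

# Shortened sample in the same shape as the file from the question.
SAMPLE = '''First_name,Last_name,EmpID,company,languages
"Jack","Thomas","57616","IBM","C
C++
SQL
"
"Tim","Cook","10001","Apple","Python
XML
"
'''


def flatten_languages(src_text, sep=";"):
    """Return CSV text where the multi-line last column is joined with sep."""
    out = io.StringIO(newline="")
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in csv.reader(io.StringIO(src_text, newline="")):
        if not row:  # ignore completely empty records
            continue
        # Split the last field on line breaks and drop blank entries.
        parts = [p.strip() for p in row[-1].splitlines()]
        row[-1] = sep.join(p for p in parts if p)
        writer.writerow(row)
    return out.getvalue()


flattened = flatten_languages(SAMPLE)
print(flattened, end="")
```

Each record now sits on a single line, so a plain newline row terminator should be enough for bcp to read the file.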
Upvotes: 3