Reputation: 147
I usually don't have to deal with external files but now I have to automate a process to import entire directories of JSON files. All the files have the same structure. I'm using the following sql to import a single file:
DECLARE @JSON VARCHAR(MAX)
SELECT @JSON = BulkColumn
FROM OPENROWSET
(BULK 'R:\data\ETL\json\3781.json', SINGLE_CLOB)
AS j
If (ISJSON(@JSON)=1)
Select * FROM OPENJSON (@JSON, '$.api.events')
with (FixtureID int '$.fixture_id',
EventDate datetime2 '$.event_date',
Venue nvarchar(50) '$.venue',
CityName nvarchar(20) '$.region.city_name',
EventID tinyint '$.region.event_id'
)
But is there an easy way to import an entire directory of similar files ? Or do I have to create a sp in which I list all the files of a directory (using something like xp_dirtree) and loop through them to import the whole batch ?
Upvotes: 0
Views: 646
Reputation: 46223
Below is a PowerShell example ETL script that imports all json files in the specified directory into a relational database table using a parameterized INSERT
query. The script reads the file contents from each file and uses ConvertFrom-Json to convert the json string into a custom object that exposes the parsed values as properties, making it easy to extract the values.
This example can be extended according to your needs, such as additional validation and transformation, handle missing fields, using SqlBulkCopy for bulk inserts, etc.
Example DDL for target table:
CREATE TABLE dbo.YourTable(
FixtureID int
,EventDate datetime2
,Venue nvarchar(50)
,CityName nvarchar(20)
,EventID tinyint
);
PS script:
$connectionString = "Data Source=YourServer;Initial Catalog=YourDatabase;Integrated Security=SSPI;Application Name=JsonImportExample"
Function Insert-YourTable($json) {
$connection = New-Object System.Data.SqlClient.SqlConnection($connectionString)
$insertQuery = @"
INSERT INTO dbo.YourTable(
FixtureID
,EventDate
,Venue
,CityName
,EventID
)
VALUES(
@FixtureID
,@EventDate
,@Venue
,@CityName
,@EventID
);
"@
$insertCommand = New-Object System.Data.SqlClient.SqlCommand($insertQuery, $connection)
[void]($insertCommand.Parameters.Add("@FixtureID", [System.Data.SqlDbType]::Int).Value = $json.api.events.fixture_id)
[void]($insertCommand.Parameters.Add("@EventDate", [System.Data.SqlDbType]::DateTime2).Value = $json.api.events.event_date)
[void]($insertCommand.Parameters.Add("@Venue", [System.Data.SqlDbType]::VarChar, 50).Value = $json.api.events.venue)
[void]($insertCommand.Parameters.Add("@CityName", [System.Data.SqlDbType]::VarChar, 20).Value = $json.api.events.region.city_name)
[void]($insertCommand.Parameters.Add("@EventID", [System.Data.SqlDbType]::TinyInt).Value = $json.api.events.region.event_id)
$connection.Open()
[void]$insertCommand.ExecuteNonQuery()
$connection.Close()
}
# ############
# ### MAIN ###
# ############
try {
$jsonFiles = Get-ChildItem "C:\temp\JsonFiles\*.json"
foreach ($jsonFile in $jsonFiles) {
Write-Host "Importing $($jsonFile.FullName)..."
try {
$json = [System.IO.File]::ReadAllText($jsonFile.FullName) | ConvertFrom-Json
}
catch [System.ArgumentException]{
Write-Host "Error parsing json. File ignored" -ForegroundColor Yellow
}
Insert-YourTable -json $json
}
}
catch {
throw
}
EDIT:
In cases where the source json value is null, you can use null coalescing to substute DbNull.Value
(or some other value) using a technique like:
[void]($insertCommand.Parameters.Add("@FixtureID", [System.Data.SqlDbType]::Int).Value = if($json.api.events.fixture_id -eq $null) {[DbNull]::Value} else {$json.api.events.fixture_id} )
[void]($insertCommand.Parameters.Add("@EventDate", [System.Data.SqlDbType]::DateTime2).Value = if($json.api.events.event_date -eq $null) {[DbNull]::Value} else {$json.api.events.event_date} )
[void]($insertCommand.Parameters.Add("@Venue", [System.Data.SqlDbType]::VarChar, 50).Value = if($json.api.events.venue -eq $null) {[DbNull]::Value} else {$json.api.events.venue} )
[void]($insertCommand.Parameters.Add("@CityName", [System.Data.SqlDbType]::VarChar, 20).Value = if($json.api.events.region.city_name -eq $null) {[DbNull]::Value} else {$json.api.events.region.city_name} )
[void]($insertCommand.Parameters.Add("@EventID", [System.Data.SqlDbType]::TinyInt).Value = if($json.api.events.region.event_id) {[DbNull]::Value} else {$json.api.events.region.event_id} )
Upvotes: 1