Ryan Mann
Ryan Mann

Reputation: 5357

How can I deploy a script with 130,000+ entries without running out of memory

I have a database project in Visual Studio 2015 build with SSDT. In this project is a T4 template that generates SQL calls to stored procedures to enter inital data for a world database.

It includes, Countries, zones, time zones, regions, and cities.

When the T4 template is done it generates about 130,000 of these:

EXEC [geo].[addUpdateCountry] @countryCode = N'AD', @countryName = N'Andorra', @initData = 1
EXEC [geo].[addUpdateCountry] @countryCode = N'AE', @countryName = N'United Arab Emirates', @initData = 1
EXEC [geo].[addUpdateCountry] @countryCode = N'AF', @countryName = N'Afghanistan', @initData = 1
EXEC [geo].[addUpdateCountry] @countryCode = N'AG', @countryName = N'Antigua and Barbuda', @

I then linked in a file to a PostDeploy script to run those when a specific publishing profile is used.

It's just to large, the publishing process runs out of memory so I figure I have to break them up into batches some how, but am not sure how I would do that in this context.

Upvotes: 0

Views: 703

Answers (3)

Ed Elliott
Ed Elliott

Reputation: 6856

When I have hit issues with deployment scripts and memory I have put my put my scripts inside stored procedures and call those stored procedures from the deployment scripts do your t4 template would generate something like:

create proc deploy.country_data
as
   EXEC [geo].[addUpdateCountry] @countryCode = N'AD', @countryName = N'Andorra', @initData = 1
   EXEC [geo].[addUpdateCountry] @countryCode = N'AE', @countryName = N'United Arab Emirates', @initData = 1
   EXEC [geo].[addUpdateCountry] @countryCode = N'AF', @countryName = N'Afghanistan', @initData = 1
   etc...

and then just call it from your post-deployment script.

I normally separate these deploy stored procedures into a separate ssdt project and deploy using /p:IncludeCompositeObject=true.

I would also raise a connect item about the OO exception.

You can also separate deploy scripts by importing different scripts into your main post-deploy script using :r import but you need to know the location of the script.

Upvotes: 1

Ryan Mann
Ryan Mann

Reputation: 5357

I ran into a few issues using a T4 template to generate initial setup data for my database.

1: T4 Templates without an output extension of .sql reset their Build Action to Build every time the template reruns. This was a problem because I wanted to link the script into the postDeploy Script and didn't want it to build as it building would throw build errors.

2: The post deploy script was so large it ran out of memory deploying it, as it had print statements for debugging as well.

To solve the problems I went with @Ed Alliot's solution to make all my init data stored procedures. However I created a new schema called setup and put all the setup stored procedures in that schema so that they can be secured separately of the rest of the database. Because I am not generated stored procedures I can let their build action stay "Build" it will deploy the stored procedures.

In case any one is interested in this design process, I'll post some screens and the t4 template for reference.

enter image description here

geoDataSql.ttinclude

<#@ template language="C#" hostspecific="true" #>
<#@ CleanupBehavior processor="T4VSHost" CleanupAfterProcessingtemplate="true" #>
<#@ output extension=".sql" #>
<#@ import namespace="System" #>
<#@ import namespace="System.IO" #>
<#@ import namespace="System.Diagnostics" #>
<#@ import namespace="System.Collections" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ import namespace="System.Text" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ import namespace="System.Text.RegularExpressions" #>
<#+
    string countriesFile = "";
    string zonesFile = "";
    string timeZonesFile = "";
    string regionsFile = "";
    string citiesFile = ""; 

    public void GenerateCountries() 
    {       
        List<country> countries = ParseCSV<country>(Host.ResolvePath("..\\T4\\geoSetup\\" + countriesFile));        
        WriteLine("");
        WriteLine("-- Sql to add/update countries");
        foreach(var country in countries)
        {
            if(string.IsNullOrEmpty(country.country_name) || string.IsNullOrEmpty(country.country_code))
            {
                WriteLine("--Skipped: country_code:" + country.country_code + "\ncountry_code was null or empty in the csv file.");
                continue;
            }
            WriteLine(string.Format("EXEC [geo].[addUpdateCountry] @countryCode = N'{0}', @countryName = N'{1}', @initData = 1", country.country_code, country.country_name.Replace("'", "''")));
        }
    }

    public void GenerateZones()
    {
        List<zone> zones = ParseCSV<zone>(Host.ResolvePath("..\\T4\\geoSetup\\"  + zonesFile));
        WriteLine("");
        WriteLine("-- Sql to add/update zones");
        foreach(var zone in zones)
        {
            WriteLine(string.Format("EXEC [geo].[addUpdateZone] @id={0}, @countryCode = N'{1}', @zoneName = N'{2}', @initData = 1", zone.zone_id, zone.country_code, zone.zone_name.Replace("'", "''")));
        }
    }
    public void GenerateTimeZones()
    {
        List<timezone> timeZones = ParseCSV<timezone>(Host.ResolvePath("..\\T4\\geoSetup\\"  + timeZonesFile));
        WriteLine("");
        WriteLine("-- Sql to add/update zones");
        foreach(var timeZone in timeZones)
        {
            if (string.IsNullOrEmpty(timeZone.time_start))
                timeZone.time_start = "0";
            if (string.IsNullOrEmpty(timeZone.gmt_offset))
                timeZone.gmt_offset = "0";
            WriteLine(string.Format("EXEC [geo].[addUpdateTimeZone] @zoneId={0}, @zoneShortName = N'{1}', @timeStart = {2}, @gmtOffset = {3}, @dst = {4}, @initData = 1", timeZone.zone_id, timeZone.abbreviation, timeZone.time_start, timeZone.gmt_offset, timeZone.dst));
        }
    }

    public void GenerateRegions()
    {
        List<city> cities = ParseCSV<city>(Host.ResolvePath("..\\T4\\geoSetup\\"  + citiesFile));
        List<region> regionsOther = ParseCSV<region>(Host.ResolvePath("..\\T4\\geoSetup\\" + regionsFile));
        Dictionary<string, region> regions = new Dictionary<string, region>();

        foreach(var city in cities)
        {
            if (string.IsNullOrEmpty(city.city_name))
                continue;

            if (!string.IsNullOrEmpty(city.subdivision_1_iso_code))
            {
                string rKey = city.country_iso_code + "_" + city.subdivision_1_iso_code;
                region r = new region() { countryCode = city.country_iso_code, regionCode = city.subdivision_1_iso_code, regionName = city.subdivision_1_name };
                if (!regions.ContainsKey(rKey))
                    regions[rKey] = r;
            }           
            if (!string.IsNullOrEmpty(city.subdivision_2_iso_code))
            {
                string rKey = city.country_iso_code + "_" + city.subdivision_2_iso_code;
                region r = new region() { countryCode = city.country_iso_code, regionCode = city.subdivision_2_iso_code, regionName = city.subdivision_2_name };
                if (!regions.ContainsKey(rKey))
                    regions[rKey] = r;              
            }
        }

        foreach (var region in regionsOther)
        {
            string rKey = region.countryCode + "_" + region.regionCode;
            if (!regions.ContainsKey(rKey))
                regions[rKey] = region;
        }

        WriteLine("");
        WriteLine("-- Regions (Pulled from cities.csv, only regions with cities/towns etc are here.)");
        foreach(var region in regions.Values)
        {
            WriteLine(string.Format("EXEC [geo].[addUpdateRegion] @countryCode = N'{0}', @regionCode = N'{1}', @regionName = N'{2}', @initData = 1", region.countryCode, region.regionCode, region.regionName.Replace("'", "''")));
        }

    }

    public void GenerateCities()
    {
        List<city> cities = ParseCSV<city>(Host.ResolvePath("..\\T4\\geoSetup\\"  + citiesFile));
        WriteLine("");
        WriteLine("-- Cities");
        foreach (var city in cities)
        {
            if (string.IsNullOrEmpty(city.city_name))
                continue;
            if (string.IsNullOrEmpty(city.subdivision_1_iso_code) && string.IsNullOrEmpty(city.subdivision_2_iso_code))
            {
                WriteLine("--Skipped City: " + city.geoname_id.ToString() + " it doesn't have any region info!");
                continue;
            }
            string sql = "EXEC [geo].[addUpdateCity] @countryCode = N'{0}', @cityName = N'{1}', @region1Code = {2}, @region2Code = {3},  @zoneName = N'{4}', @initData = 1";
            WriteLine(string.Format(sql, city.country_iso_code, city.city_name.Replace("'", "''"), city.subdivision_1_iso_code == null ? "null" : "N'" + city.subdivision_1_iso_code + "'", city.subdivision_2_iso_code == null ? "null" : "N'" + city.subdivision_2_iso_code + "'", city.time_zone));
        }
    }
    public List<T> ParseCSV<T>(string filePath) where T: class, new()
    {
        if (!System.IO.File.Exists(filePath))
            return null;

        string[] csvContents = File.ReadAllText(filePath).Split(new string[] { "\n" }, StringSplitOptions.RemoveEmptyEntries);
        if (csvContents.Length <= 1)
            return null;
        Regex csvSplit = new Regex("(?:^|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)", RegexOptions.Compiled);
        string[] fieldNames = csvContents[0].Split(new string[] { "," }, StringSplitOptions.None);
        if (fieldNames.Length <= 0)
            return null;
        List<T> ret = new List<T>();
        Type objType = typeof(T);

        for(int i = 1; i < csvContents.Length; ++i)
        {
            List<string> values = new List<string>();
            foreach (Match match in csvSplit.Matches(csvContents[i]))                
                values.Add(match.Value.TrimStart(',').Trim('"').Trim());   
            if (values.Count != fieldNames.Length)                              
                throw new Exception("Test");                

            T obj = new T();
            for(int i2 = 0; i2 < fieldNames.Length; ++i2)
            {
                var field = fieldNames[i2];
                var props = objType.GetProperties();
                var property = objType.GetProperty(field.Trim(), System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.IgnoreCase | System.Reflection.BindingFlags.SetProperty);
                if (property == null) {
                    throw new Exception("PROPERTY NULL");
                }                   
                if (property == null)
                    throw new Exception("Unable to parse field: " + field + " into Type: " + objType.FullName + "\n unable to find property on the Type matching the fields name.");
                if (property.PropertyType != typeof(string))
                    throw new Exception("Unable to parse field: " + field + " into Property becuase the Property is not of Type string. Type:" + objType.FullName);
                string v = values[i2] == null || values[i2] == string.Empty ? null : values[i2].Trim();
                property.SetValue(obj, v);
            }
            ret.Add(obj);
        }
        return ret;
    }
    public class city
    {
        public string geoname_id { get; set;}
        public string locale_code { get; set;}
        public string continent_code { get; set;}
        public string continent_name { get; set;}
        public string country_iso_code { get; set;}
        public string country_name { get; set;}
        public string subdivision_1_iso_code { get; set;}
        public string subdivision_1_name { get; set;}
        public string subdivision_2_iso_code { get; set;}
        public string subdivision_2_name { get; set;}
        public string city_name { get; set;}
        public string metro_code { get; set;}
        public string time_zone { get; set;}
    }
    public class region
    {
        public string countryCode { get; set;}
        public string regionCode { get; set;}
        public string regionName { get; set;}
    }
    public class country
    {
        public string country_code { get; set; }
        public string country_name { get; set; }
    }
    public class zone
    {
        public string zone_id { get; set;}
        public string country_code { get; set;}
        public string zone_name { get; set;}
    }
    public class timezone
    {
        public string zone_id { get; set;}
        public string abbreviation { get; set;}
        public string time_start { get; set;}
        public string gmt_offset { get; set;}
        public string dst { get; set;}
    }
#>    

setupCountryData.tt

<#@ include file="..\T4\geoSetup\geoDataSql.ttinclude" #>

<#  
    countriesFile = "countries.csv";
    WriteLine("CREATE PROCEDURE [setup].[setupCountryData] AS");
    GenerateCountries();
    WriteLine("RETURN 0;");
#>

To create other csv driven stored procedures just add a method called GenerateXYZ() to geoDataSql.ttinclude. Then create a file like setupCountryData.sql and set the file name and call the appropriate generate method you added.

CONS: This makes the build take forever.... So I'll probably move this logic out to a console app after a while... Then just run that after deploys/changes.

Upvotes: 1

Steven Green
Steven Green

Reputation: 3517

I would suggest opening a Connect issue related to the OOM exception you're encountering. This can be done at https://connect.microsoft.com/SQLServer/feedback/CreateFeedback.aspx using the category "Developer Tools (SSDT, BIDS, etc.)".

To work around an out of memory issue during publishing it's possible to use the (64-bit) command line tool SqlPackage.exe to perform publish operations. SqlPackage.exe is a command-line wrapper around the same data-tier application framework code that performs the publish operation in Visual Studio and SSMS.

Upvotes: 1

Related Questions