Joseph
Joseph

Reputation: 570

Loading data from Excel file into Azure Datawarehouse

I have succeeded in loading data from CSV to Azure SQL Server data warehouse, however I now have a excel file as a source and when I try reading this excel file into BLOB CSV it creates a file with junk characters. Any help would be appreciated.

Upvotes: 1

Views: 5522

Answers (3)

Talha Tayyab
Talha Tayyab

Reputation: 27790

Azure supports .xlsx and .xls files as a source but not as a sink.

https://learn.microsoft.com/en-us/azure/data-factory/format-excel

Excel format is supported for the following connectors:

Amazon S3, Amazon S3 Compatible Storage, Azure Blob, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure Files, File System, FTP, Google Cloud Storage, HDFS, HTTP, Oracle Cloud Storage and SFTP.

Upvotes: 0

Paul Andrew
Paul Andrew

Reputation: 3253

To handle this within data factory you'll need to use a custom activity (DotNotActivity) that first converts the Excel file to CSV. Then have a downstream activity that deals with the CSV dataset as required.

The custom activity will require some C# classes to be written that handle the conversation. Either using the Office Interoperability libraries or by doing something like this treating the Excel file as a data table.

    public static string ToCSV(this DataTable table)
    {
        var result = new StringBuilder();
        for (int i = 0; i < table.Columns.Count; i++)
        {
            result.Append(table.Columns[i].ColumnName);
            result.Append(i == table.Columns.Count - 1 ? "\n" : ",");
        }

        foreach (DataRow row in table.Rows)
        {
            for (int i = 0; i < table.Columns.Count; i++)
            {
                result.Append(row[i].ToString());
                result.Append(i == table.Columns.Count - 1 ? "\n" : ",");
            }
        }
        return result.ToString();
    }

Or check out other SO questions about the same. Eg:

Is there any simple way to convert .xls file to .csv file? (Excel)

In terms of the other Azure Data Factory glue, the compiled libraries will need to be stored in blob storage and will actually get executed by an Azure Batch Service. Which will require an Azure AD service principal if to authenticate against your Azure Data Lake storage.

Check out this blob post for more details on creating the custom activity.

https://www.purplefrogsystems.com/paul/2016/11/creating-azure-data-factory-custom-activities/

And this one for authentication for ADL:

https://www.purplefrogsystems.com/paul/2016/12/azure-data-lake-authentication-from-azure-data-factory/

Hope this helps.

Upvotes: 1

wBob
wBob

Reputation: 14389

Neither Azure SQL Data Warehouse or Polybase support Excel natively so you will either have to use a flat-file format, or use a tool with the ability to connect with and/or transform Excel.

One method might be to convert your source file to flat file and use Polybase and/or Azure Data Factory.

One method might be to use SQL Server Integration Services (SSIS) which does support Excel natively and now has better integration with Azure since the Azure Feature Pack for Integration Services.

HTH

Upvotes: 0

Related Questions