Reputation: 3
I received an XML file where the date fields are Excel Serial Date Numbers instead of mm/dd/yyyy dates like they usually are. I'm having trouble adding a data conversion to my preexisting SSIS package because the serial number is in an XML file, not an excel file.
The SSIS package cleans and loads one xml file into a SQL Server table.
Does anyone have any ideas? I'm working in Visual Studios 2015 fyi.
XML Data Snippet:
<AGREEMENT_CODE>1960-EMPR</AGREEMENT_CODE>
<AGREEMENT_NAME>1960-Legacy Employer Conversion
Default</AGREEMENT_NAME>
<AGREEMENT_TYPE>MBA</AGREEMENT_TYPE>
<FUND_TYPE>Health & Pension</FUND_TYPE>
<CONTRACT_START_DATE>21916</CONTRACT_START_DATE>
<EMPLOYER_ID>25568</EMPLOYER_ID>
<EMPLOYER_NAME>10409</EMPLOYER_NAME>
<BILLING_ENTITY_CODE>ACT III TELEVISION, L.P.</BILLING_ENTITY_CODE>
<BILLING_ENTITY_NAME>10409</BILLING_ENTITY_NAME>
<PARTICIPATION_START_DATE>ACT III TELEVISION, L.P.
</PARTICIPATION_START_DATE>
<PARTICIPATION_SIGNED_DATE>35917</PARTICIPATION_SIGNED_DATE>
The Excel Serial Date Numbers are on lines 6 and 13. They are 5 digit numbers.
Upvotes: 0
Views: 429
Reputation: 61211
The code that I have used for more than a decade now came from a defunct site
/// <summary>
/// Seriously? For the loss
/// <see cref="http://www.debugging.com/bug/19252"></see>
/// </summary>
/// <param name="excelDate">Number of days since 1900-01-01</param>
/// <returns>The converted days to date</returns>
public static DateTime ConvertXlsdtToDateTime(int excelDate)
{
DateTime dt = new DateTime(1899, 12, 31);
// adjust for 29 Feb 1900 which Excel considers a valid date
if (excelDate >= 60)
{
excelDate--;
}
return dt.AddDays(excelDate);
}
If you're going to use this in a Derived Column Task, then you need to convert the C# into the SSIS expression language. We'll use the ternary operator (boolean) ? truevalue : falsevalue
to make this happen
DATEADD("Day", [CONTRACT_START_DATE] - ([CONTRACT_START_DATE] >= 60 ? 1 : 0), (DT_DATE)"1899-12-30")
Typically, I'd have two Derived Column Components in my data flow as it's the only way to debug things. In your case, I'd have the first Derived Column add 1 + Number Of Excel columns to the data flow.
Note, the assignment =
I show here is short hand syntax for the values you put into Derived Column Name
and the Expression
columns in the Derived Task
BaseDate = (DT_DATE)"1899-12-30"
CONTRACT_START_DATE_Offset = [CONTRACT_START_DATE] >= 60 ? -1 : 0
PARTICIPATION_SIGNED_DATE_Offset = [PARTICIPATION_SIGNED_DATE] >= 60 ? -1 : 0
etc. Now I can check that I am correctly handling the offset value for the 1900 bug.
That then simplifies my first expression to
DATEADD("Day", [CONTRACT_START_DATE] - [CONTRACT_START_DATE_Offset], [BaseDate])
If you're really trying to track down where something could go wrong, I'd even look at encapsulating the math logic there Col1 - Col1_Offset
into a precursor derived column just so I can put a data tap/data viewer on it and capture values.
As a minimal repro, I created a package that had an OLESRC component that adds our CONTRACT_START_DATE to the data flow
SELECT 35 AS CONTRACT_START_DATE
UNION ALL SELECT 21916
My Derived Column "OneShot" which uses the first expression adds the column "OneShot"
DER Multistatement adds the BaseDate and CONTRACT_START_DATE_Offset values from the second set of expressions
DER CSD adds ContractStartDate to the data flow using the third expression.
In the above image, you can see the results of a data viewer (and a copy of the Metadata at that point). My datatypes are as expected after using the DATEADD function - DT_DBTIMESTAMP.
If your type is showing as DT_WSTR, ensure that you are adding a new column and not attempting to replace the existing column.
Upvotes: 1
Reputation: 6685
The date in Excel is actually a number, representing (for all intents and purposes) the number of days since 30 December 1899. However there is one caveat below.
Therefore, you can convert the number to a date with DATEADD(day, @yourDateNum, '18991230')
The caveat is that Microsoft (and Lotus 1-2-3 before it) have a bug in the date calculation: they assume 1900 is a leap year, which it isn't. Therefore there's an extra day in the count, and I've adjusted the formula above to take this into account. The result, however, is that this formula doesn't work for dates before 1 March 1900.
Please check your formula with known values before putting in production!
Upvotes: 1