Reputation: 772
Given a set of date/times in Excel (formatted to maintain and display milliseconds)
2020-01-01 12:00:00.500 AM
2020-01-01 12:00:01.500 AM
2020-01-01 12:00:02.500 AM
2020-01-01 12:00:03.500 AM
2020-01-01 12:00:04.500 AM
2020-01-01 12:00:05.500 AM
2020-01-01 12:00:06.500 AM
2020-01-01 12:00:07.500 AM
2020-01-01 12:00:08.500 AM
2020-01-01 12:00:09.500 AM
2020-01-01 12:00:10.500 AM
Excel rounds these when displayed (or collected by a Pivot Table) in this way
2020-01-01 12:00:00 AM Down
2020-01-01 12:00:01 AM Down
2020-01-01 12:00:03 AM Up
2020-01-01 12:00:03 AM Down
2020-01-01 12:00:04 AM Down
2020-01-01 12:00:05 AM Down
2020-01-01 12:00:06 AM Down
2020-01-01 12:00:07 AM Down
2020-01-01 12:00:08 AM Down
2020-01-01 12:00:09 AM Down
2020-01-01 12:00:11 AM Up
How do I replicate this seemingly random midpoint rounding?
Note I know that Excel stores dates as doubles, and I actually have direct access to the values stored in Excel. I'm also aware of the floating point precision issues that floating point date-time values can have.
What I'm looking for is a concrete set of operations (preferably in C#) that provides parity with what Excel in obviously doing. I've tried a number of ways to calculate and round to the "nearest" second nothing I've tried can replicate the seemingly random half rounding.
I'm pretty sure there's something to do with precision at play as well, since the date value (whole number portion) can affect how the time (fractional) portion rounds to the nearest second.
Update
Regular DateTime
rounding isn't going to cut it.
Converting the double date to a DateTime
(which works in ticks) will always result in exactly 500ms and therefore a standard rounding will always round up. This is not what Excel is doing (see data above).
Neither is this an issue of simple accuracy or floating-point rounding... @Jeremy Lakeman provided some code to get a more accurate string representation of the values. So I tried it.
Take the following data. Values in brackets are the stringified double values that are stored in the XLSX file XML for the date-time values shown. Left are unrounded, right are rounded by Excel.
2020-01-01 12:00:16.500 AM [43831.00019097222] --> 2020-01-01 12:00:16.000 AM [43831.000185185185]
2020-01-01 12:00:17.500 AM [43831.000202546296] --> 2020-01-01 12:00:17.000 AM [43831.000196759262]
2020-01-01 12:00:18.500 AM [43831.000214120373] --> 2020-01-01 12:00:19.000 AM [43831.000219907408]
2020-01-01 12:00:19.500 AM [43831.000225694443] --> 2020-01-01 12:00:19.000 AM [43831.000219907408]
The DoubleConverter
gives the following outputs for the unrounded values multiplied by seconds: DoubleConverter.ToExactString(d * 86400)
2020-01-01 12:00:16.500 AM --> 3786998416.5
2020-01-01 12:00:17.500 AM --> 3786998417.5
2020-01-01 12:00:18.500 AM --> 3786998418.5
2020-01-01 12:00:19.500 AM --> 3786998419.5
Note that only the 3rd value rounds up, while the other 3 round down. How do I replicate that?
Upvotes: 1
Views: 106
Reputation: 11100
I know I'm on the right track here, because the following (using DoubleConverter);
var epoch = new DateTime(1900, 1, 1);
var start = new DateTime(2020, 1, 1);
var offset = new TimeSpan(0, 0, 0, 0, 500);
var values = Enumerable.Range(0, 100)
.Select(i =>
{
var v = start + offset * (i * 2 + 1);
var seconds1 = (v - epoch).TotalSeconds;
var seconds2 = (v - epoch).TotalDays * 86400;
return $"{v:yyyy-MM-dd HH:mm:ss.f} => {DoubleConverter.ToExactString(seconds2)}, { (seconds2>seconds1 ? "Up" : "Down")}";
})
.ToList();
Produces;
2020-01-01 00:00:00.5 => 3786825600.5, Down
2020-01-01 00:00:01.5 => 3786825601.5, Down
2020-01-01 00:00:02.5 => 3786825602.500000476837158203125, Up
2020-01-01 00:00:03.5 => 3786825603.5, Down
2020-01-01 00:00:04.5 => 3786825604.5, Down
2020-01-01 00:00:05.5 => 3786825605.499999523162841796875, Down
2020-01-01 00:00:06.5 => 3786825606.5, Down
2020-01-01 00:00:07.5 => 3786825607.5, Down
2020-01-01 00:00:08.5 => 3786825608.5, Down
2020-01-01 00:00:09.5 => 3786825609.5, Down
2020-01-01 00:00:10.5 => 3786825610.500000476837158203125, Up
...
Each of the values that are slightly greater than 0.5
are the values that excel is rounding up, the others round down. Note that TimeSpan.TotalSeconds
gives a double representing the exact number of milliseconds, while .TotalDays * 86400
appears to have the same accuracy as your example.
So the trick is to work out which results are slightly above .5
and round them up.
You can perform this same test in excel with =MOD(A2*86400,1)
. To see the same floating point accuracy problem and determine which values excel will round up.
private static readonly DateTime EPOCH = new DateTime(1900, 1, 1);
public DateTime ExcelRound(DateTime value)
{
var m = value.Ticks % TimeSpan.TicksPerSecond;
if (value.Millisecond > 500)
m -= TimeSpan.TicksPerSecond;
else if (value.Millisecond == 500)
{
var seconds1 = (value - EPOCH).TotalSeconds;
var seconds2 = (value - EPOCH).TotalDays * 86400;
if (seconds2 > seconds1)
m -= TimeSpan.TicksPerSecond;
}
return value.Add(new TimeSpan(-m));
}
However, now that you know what excel is doing. I would personally take that evidence and complain about your requirement. Just because excel has a bug, doesn't mean you should be forced to replicated it.
Upvotes: -1