user1854628
user1854628

Reputation: 253

Timestamp Processing Brain Teaser

I am processing 1Hz timestamps (variable 'timestamp_1hz') from a logger which doesn't log exactly at the same time every second (the difference varies from 0.984 to 1.094, but sometimes 0.5 or several seconds if the logger burps). The 1Hz dataset is used to build a 10 minute averaged dataset, and each 10 minute interval must have 600 records. Because the logger doesn't log exactly at the same time every second, the timestamp slowly drifts through the 1 second mark. Issues come up when the timestamp cross the 0 mark, as well as the 0.5 mark.

I have tried various ways to pre-process the timestamps. The timestamps with around 1 second between them should be considered valid. A few examples include:

% simple
% this screws up around half second and full second values    
rawseconds = raw_1hz_new(:,6)+(raw_1hz_new(:,7)./1000);
rawsecondstest = rawseconds;
    rawsecondstest(:,1) = floor(rawseconds(:,1))+ rawseconds(1,1);

% more complicated
% this screws up if there is missing data, then the issue compounds because k+1 timestamp is dependent on k timestamp
rawseconds = raw_1hz_new(:,6)+(raw_1hz_new(:,7)./1000);
 A = diff(rawseconds);
    numcheck = rawseconds(1,1);
    integ = floor(numcheck);
    fract = numcheck-integ;
    if fract>0.5
        rawseconds(1,1) = rawseconds(1,1)-0.5;
    end
 for k=2:length(rawseconds)
        rawsecondstest(k,1) =  rawsecondstest(k-1,1)+round(A(k-1,1)); 
end

I would like to pre-process the timestamps then compare it to a contiguous 1Hz timestamp using 'intersect' in order to find the missing, repeating, etc data such as this:

 % pull out the time stamp (round to 1hz and convert to serial number)
timestamp_1hz=round((datenum(raw_1hz_new(:,[1:6])))*86400)/86400;

% calculate new start time and end time to find contig time
starttime=min(timestamp_1hz);
endtime=max(timestamp_1hz);

% determine the contig time
contigtime=round([floor(mean([starttime endtime])):1/86400:ceil(mean([starttime endtime]))-1/86400]'*86400)/86400;
% find indices where logger time stamp matches real time and puts
% the indices of a and b
clear Ia Ib Ic Id
[~,Ia,Ib]=intersect(timestamp_1hz,contigtime);
% find indices where there is a value in real time that is not in
% logger time
[~,Ic] = setdiff(contigtime,timestamp_1hz);
% finds the indices that are unique
[~,Id] = unique(timestamp_1hz);

You can download 10 days of the raw_1hz_new timestamps here. Any help or tips would be much appreciated!

Upvotes: 2

Views: 172

Answers (1)

nkjt
nkjt

Reputation: 7817

The problem you have is that you can't simply match these stamps up to a list of times, because you could be expecting a set of datapoints at seconds = 1000, 1001, 1002, but if there was an earlier blip you could have entirely legitimate data at 1000.5, 1001.5, 1002.5 instead.

If all you want is a list of valid times/their location in your series, why not just something like (times in seconds):

A = diff(times); % difference between times
n = find(abs(A-1)<0.1) % change 0.1 to whatever your tolerance is
times2 = times(n+1); 

times2 should then be a list of all your timestamps where the previous timestamp was approximately 1 second ago - works on a small set of fake data I constructed, didn't try it on yours. (For future reference: it would be more help to provide a small subset of your data, e.g. just a few minutes worth, that you know contains a blip).

I would then take the list of valid timestamps and split it up into 10 minute sections for averaging, counting how many valid timestamps were obtained in each section. If it's working, you should end up with no more than 600 - but not much less if the blips are occasional.

Upvotes: 0

Related Questions