Reputation: 117
The Problem is found under Windows XP when I want to write a function like GetTicketCount64 which does not exists on this platform. Here is my test code:
uint64_t GetTickCountEx()
{
#if _WIN32_WINNT > _WIN32_WINNT_WINXP
return GetTickCount64();
#else
// http://msdn.microsoft.com/en-us/library/windows/desktop/dn553408.aspx
LARGE_INTEGER Frequency = {};
LARGE_INTEGER Counter = {};
BOOST_VERIFY(QueryPerformanceFrequency(&Frequency));
BOOST_VERIFY(QueryPerformanceCounter(&Counter));
return 1000 * Counter.QuadPart / Frequency.QuadPart;
#endif
}
for (int i = 0; ++i < 1000; Sleep(30000))
{
const auto utc = time(nullptr); // System time
const auto xp = GetTickCount(); // API of Windows XP SP3
const auto ex = GetTickCountEx(); // Performance counter
const auto diff = ex - xp;
printf("%lld %I32u %I64u %I64u \n", utc, xp, ex, diff);
}
I cannot understand the below result. From this article, reply from Angstrom seems not correct. Last column suggests that the difference of GTC and GPC is closer as time goes by! ... and, will it reaches zero some hours later?
So, my question is: Is my implementation of GetTickCount64 correct, and why?
1401778679 503258484 503355416 96932
1401778709 503288484 503385374 96890
1401778739 503318484 503415354 96870
1401778769 503348484 503445289 96805
1401778799 503378484 503475274 96790
1401778829 503408484 503505272 96788
1401778859 503438484 503535245 96761
1401778889 503468500 503565210 96710
1401778919 503498500 503595143 96643
1401778949 503528500 503625137 96637
1401778979 503558500 503655100 96600
1401779009 503588500 503685069 96569
1401779039 503618500 503715069 96569
1401779069 503648500 503745006 96506
1401779099 503678500 503774951 96451
1401779129 503708500 503804958 96458
1401779159 503738500 503834943 96443
1401779189 503768500 503864911 96411
1401779219 503798500 503894792 96292
1401779249 503828500 503924759 96259
1401779279 503858500 503954607 96107
1401779309 503888500 503984607 96107
1401779339 503918500 504014392 95892
1401779369 503948500 504044362 95862
CPU Core Info from coreinfo.exe: Coreinfo v3.21 - Dump information on system CPU and memory topology Copyright (C) 2008-2013 Mark Russinovich Sysinternals - www.sysinternals.com
Intel(R) Core(TM) i3 CPU M 380 @ 2.53GHz
x86 Family 6 Model 37 Stepping 5, GenuineIntel
HTT * Hyperthreading enabled
HYPERVISOR - Hypervisor is present
VMX * Supports Intel hardware-assisted virtualization
SVM - Supports AMD hardware-assisted virtualization
EM64T * Supports 64-bit mode
SMX - Supports Intel trusted execution
SKINIT - Supports AMD SKINIT
NX * Supports no-execute page protection
SMEP - Supports Supervisor Mode Execution Prevention
SMAP - Supports Supervisor Mode Access Prevention
PAGE1GB - Supports 1 GB large pages
PAE * Supports > 32-bit physical addresses
PAT * Supports Page Attribute Table
PSE * Supports 4 MB pages
PSE36 * Supports > 32-bit address 4 MB pages
PGE * Supports global bit in page tables
SS * Supports bus snooping for cache operations
VME * Supports Virtual-8086 mode
RDWRFSGSBASE - Supports direct GS/FS base access
FPU * Implements i387 floating point instructions
MMX * Supports MMX instruction set
MMXEXT - Implements AMD MMX extensions
3DNOW - Supports 3DNow! instructions
3DNOWEXT - Supports 3DNow! extension instructions
SSE * Supports Streaming SIMD Extensions
SSE2 * Supports Streaming SIMD Extensions 2
SSE3 * Supports Streaming SIMD Extensions 3
SSSE3 * Supports Supplemental SIMD Extensions 3
SSE4a - Supports Sreaming SIMDR Extensions 4a
SSE4.1 * Supports Streaming SIMD Extensions 4.1
SSE4.2 * Supports Streaming SIMD Extensions 4.2
AES - Supports AES extensions
AVX - Supports AVX intruction extensions
FMA - Supports FMA extensions using YMM state
MSR * Implements RDMSR/WRMSR instructions
MTRR * Supports Memory Type Range Registers
XSAVE - Supports XSAVE/XRSTOR instructions
OSXSAVE - Supports XSETBV/XGETBV instructions
RDRAND - Supports RDRAND instruction
RDSEED - Supports RDSEED instruction
CMOV * Supports CMOVcc instruction
CLFSH * Supports CLFLUSH instruction
CX8 * Supports compare and exchange 8-byte instructions
CX16 * Supports CMPXCHG16B instruction
BMI1 - Supports bit manipulation extensions 1
BMI2 - Supports bit manipulation extensions 2
ADX - Supports ADCX/ADOX instructions
DCA - Supports prefetch from memory-mapped device
F16C - Supports half-precision instruction
FXSR * Supports FXSAVE/FXSTOR instructions
FFXSR - Supports optimized FXSAVE/FSRSTOR instruction
MONITOR * Supports MONITOR and MWAIT instructions
MOVBE - Supports MOVBE instruction
ERMSB - Supports Enhanced REP MOVSB/STOSB
PCLULDQ - Supports PCLMULDQ instruction
POPCNT * Supports POPCNT instruction
LZCNT - Supports LZCNT instruction
SEP * Supports fast system call instructions
LAHF-SAHF * Supports LAHF/SAHF instructions in 64-bit mode
HLE - Supports Hardware Lock Elision instructions
RTM - Supports Restricted Transactional Memory instructions
DE * Supports I/O breakpoints including CR4.DE
DTES64 * Can write history of 64-bit branch addresses
DS * Implements memory-resident debug buffer
DS-CPL * Supports Debug Store feature with CPL
PCID * Supports PCIDs and settable CR4.PCIDE
INVPCID - Supports INVPCID instruction
PDCM * Supports Performance Capabilities MSR
RDTSCP * Supports RDTSCP instruction
TSC * Supports RDTSC instruction
TSC-DEADLINE - Local APIC supports one-shot deadline timer
TSC-INVARIANT * TSC runs at constant rate
xTPR * Supports disabling task priority messages
EIST * Supports Enhanced Intel Speedstep
ACPI * Implements MSR for power management
TM * Implements thermal monitor circuitry
TM2 * Implements Thermal Monitor 2 control
APIC * Implements software-accessible local APIC
x2APIC - Supports x2APIC
CNXT-ID - L1 data cache mode adaptive or BIOS
MCE * Supports Machine Check, INT18 and CR4.MCE
MCA * Implements Machine Check Architecture
PBE * Supports use of FERR#/PBE# pin
PSN - Implements 96-bit processor serial number
PREFETCHW * Supports PREFETCHW instruction
Maximum implemented CPUID leaves: 0000000B (Basic), 80000008 (Extended).
Logical to Physical Processor Map:
*-*- Physical Processor 0 (Hyperthreaded)
-*-* Physical Processor 1 (Hyperthreaded)
Logical Processor to Socket Map:
**** Socket 0
Logical Processor to NUMA Node Map:
**** NUMA Node 0
Logical Processor to Cache Map:
*-*- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
*-*- Instruction Cache 0, Level 1, 32 KB, Assoc 4, LineSize 64
*-*- Unified Cache 0, Level 2, 256 KB, Assoc 8, LineSize 64
-*-* Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
-*-* Instruction Cache 1, Level 1, 32 KB, Assoc 4, LineSize 64
-*-* Unified Cache 1, Level 2, 256 KB, Assoc 8, LineSize 64
**** Unified Cache 2, Level 3, 3 MB, Assoc 12, LineSize 64
Upvotes: 2
Views: 4499
Reputation: 941407
You cannot compare the two timing sources, they have drastically different implementations in PCs.
GetTickCount()
is derived from the clock tick interrupt, a signal that's generated by the real-time clock. Traditionally a dedicated chip, originally the Motorola MC146818, nowadays integrated in the south-bridge. It has the kind of oscillator that was used in watches, crystal stabilized and usually running at 32768 Hertz. This oscillator keeps running when the machine power is turned off, running off a lithium battery or a super-capacitor.
So resolution is quite poor, but it is made very accurate with very good long-term stability by periodically resynchronizing the clock with time provided by a time server, most Windows machines use time.windows.com. Review GetSystemTimeAdjustment() for details.
QueryPerformanceCounter()
uses a frequency source available in the chipset. Traditionally the 8053 counter running at 1193182 Hertz. Nowadays the HPET timer, the HAL (Hardware Abstraction Layer) allows a system integrator to pick any frequency source he's got available. Using the CPU clock is not unusual in cheaper designs.
So resolution is very high, but it is inaccurate and there is no mechanism to calibrate this timer. Being off by 800 ppm from the reported QPF is not unusual. This timer should only ever be used for short interval measurements, the kind that a profiler would use for example.
So no, using QueryPerformanceCounter() as an alternative for GetTickCount64() isn't a very good idea, unless you can live with the inaccuracy. Technically you can synthesize your own 64-bit counter, as long as you keep track of the value of GetTickCount() overflowing. You could, say, increment the course count when the previous value was negative and the new value is positive, indicating that it overflowed. The only requirement is that you sample GetTickCount() often enough to see the transition, at least once in 24 days.
Upvotes: 9
Reputation: 5477
In the same thread as the one you linked, the reply from Raymond Chen specifically says that you should consider neither to be the time 'since' anything. Only consider time differences (intervals) to be relevant quantities. Therefore, what you should be testing is intervals, for instance before your loop the start value(s), and each time in your loop the elapsed time since the start of the loop.
Upvotes: 0