Disclaimer: Words cannot describe how much I detest AT&T-style syntax.
I have a problem that I hope is caused by register clobbering. If not, I have a much bigger problem.
The first version I used was:
static unsigned long long rdtscp(void)
{
    unsigned int hi, lo;
    __asm__ __volatile__("rdtscp" : "=a"(lo), "=d"(hi));
    return (unsigned long long)lo | ((unsigned long long)hi << 32);
}
I notice there is no 'clobbering' stuff in this version. Whether or not this is a problem I don't know... I suppose it depends if the compiler inlines the function or not. Using this version causes me problems that aren't always reproducible.
The next version I found is:
static unsigned long long rdtscp(void)
{
    unsigned long long tsc;
    __asm__ __volatile__(
        "rdtscp;"
        "shl $32, %%rdx;"
        "or %%rdx, %%rax"
        : "=a"(tsc)
        :
        : "%rcx", "%rdx");
    return tsc;
}
This is reassuringly unreadable and official looking, but like I said my issue isn't always reproducible so I'm merely trying to rule out one possible cause of my problem.
The reason I believe the first version is a problem is that it is overwriting a register that previously held a function parameter.
What's correct... version 1, or version 2, or both?
Here's C++ code that returns the TSC and stores the auxiliary 32 bits into the reference parameter:
#include <cstdint>

static inline uint64_t rdtscp( uint32_t & aux )
{
    uint64_t rax, rdx;
    // RDTSCP writes EAX, EDX and ECX, so all three are declared as outputs.
    asm volatile ( "rdtscp\n" : "=a" (rax), "=d" (rdx), "=c" (aux) : : );
    return (rdx << 32) + rax;
}
It is better to do the shift and add that merge the two 32-bit halves in a C++ statement rather than in the inline assembly; this allows the compiler to schedule those instructions as it sees fit.
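As for the two versions in the question: RDTSCP also loads IA32_TSC_AUX into ECX, and version 1 never tells the compiler about that. Once the function is inlined, the compiler may keep a live value in RCX across the instruction and get it silently overwritten. Under the System V x86-64 ABI, RCX carries the fourth integer argument (the first on Windows x64), which would fit the "overwrites a function parameter" symptom. Version 2 lists RCX as clobbered, so it does not have this problem. Below is a minimal sketch of version 1 with only the missing clobber added, assuming GCC or Clang targeting x86-64; the name rdtscp_v1_fixed is made up for illustration.

```cpp
#include <cstdint>

// Hypothetical name: same body as version 1 from the question, plus the
// ECX clobber. Assumes GCC or Clang on x86-64 with RDTSCP support.
static inline std::uint64_t rdtscp_v1_fixed()
{
    std::uint32_t hi, lo;
    // RDTSCP writes EAX (low half), EDX (high half) and ECX (IA32_TSC_AUX).
    // Listing "ecx" as a clobber tells the compiler not to keep a live
    // value in that register across the instruction.
    __asm__ __volatile__("rdtscp" : "=a"(lo), "=d"(hi) : : "ecx");
    // Merge the halves in C++ so the compiler can schedule the shift and OR.
    return static_cast<std::uint64_t>(lo) |
           (static_cast<std::uint64_t>(hi) << 32);
}
```

If you actually want the auxiliary value, use the "=c" output form shown above instead of a clobber; a register cannot be both an output and a clobber in the same asm statement.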