How do I print a CFStringRef in a DTrace action?

TALlama · Sep 12, 2009

I have a DTrace probe catching calls to a function, and one of the function's arguments is a CFStringRef. This is a private structure that holds a pointer to a Unicode string. But the CFStringRef is not itself a char*, so normal DTrace methods like copyinstr() just return ?cp?, which isn't exactly helpful.

So how can I print out the string in the DTrace action?

Answer

gavinb · Sep 22, 2009

As far as I know, there is no built-in support for this kind of thing. Usually a library would publish a probe that decodes the string for you (as Brad mentions). Since in your case you can't modify the library, you'll need to use the pid provider to hook into a user function and decode the string yourself.

The solution (which is very similar to the approach you would use in C++ to dump a std::string) is to dump out the pointer that is stored at a 2-word offset from the base CFStringRef pointer. Note that a CFString can store strings internally in a variety of formats and representations, so this offset is an implementation detail and is subject to change.

Given the trivial test application:

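#include <stdio.h>
#include <string.h>
#include <CoreFoundation/CoreFoundation.h>

/* Returns the string's length, or 0 if no C-string pointer is available. */
int mungeString(CFStringRef someString)
{
    /* Returns NULL if the internal storage cannot be exposed directly
       as a C string in this encoding. */
    const char* str = CFStringGetCStringPtr(someString, kCFStringEncodingMacRoman);
    if (str)
        return (int)strlen(str);
    else
        return 0;
}

int main(int argc, char* argv[])
{
    CFStringRef data = CFSTR("My test data");

    printf("%d\n", mungeString(data));

    return 0;
}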

The following DTrace script will print the string value of the first argument, assuming it is a CFStringRef:

#!/usr/sbin/dtrace -s

/*
    Dumps a CFStringRef parameter to a function,
    assuming MacRoman or ASCII encoding.
    The C-style string is found at an offset of
    2 words past the CFStringRef pointer.
    This appears to work in 10.6 in 32- and 64-bit
    binaries, but is an implementation detail that
    is subject to change.

    Written by Gavin Baker <gavinb.antonym.org>
*/

#pragma D option quiet

/* Uncomment for LP32 */
/* typedef long ptr_t; */
/* Uncomment for LP64 */
typedef long long ptr_t;

pid$target::mungeString:entry
{
    printf("Called mungeString:\n");
    printf("arg0 = 0x%p\n",arg0);

    this->str = *(ptr_t*)copyin(arg0+2*sizeof(ptr_t), sizeof(ptr_t));
    printf("string addr = %p\n", this->str);
    printf("string val  = %s\n", copyinstr(this->str));

}

And the output will be something like:

$ sudo dtrace -s dump.d -c ./build/Debug/dtcftest 
12
Called mungeString:
arg0 = 0x2030
string addr = 1fef
string val  = My test data

Simply uncomment the right typedef depending on whether you are running against a 32-bit or 64-bit binary. I have tested this against both architectures on Mac OS X 10.6 and it works fine.