pthread condition variables vs Win32 events (Linux vs Windows CE)

dacwe · Jan 16, 2012

I am doing a performance evaluation of Windows CE versus Linux on an ARM i.MX27 board. The code was originally written for CE and measures the time taken by various kernel calls: using OS primitives such as mutexes and semaphores, opening and closing files, and networking.

While porting this application to Linux (pthreads) I stumbled upon a problem that I cannot explain. Almost all tests ran 5 to 10 times faster on Linux, but not my version of Win32 events (SetEvent and WaitForSingleObject): CE actually "won" this test.

To emulate the behaviour I used pthreads condition variables (I know that my implementation doesn't fully emulate the CE version, but it's enough for the evaluation).

The test code uses two threads that "ping-pong" each other using events.


Windows code:

Thread 1: (the thread I measure)

HANDLE hEvt1, hEvt2;
hEvt1 = CreateEvent(NULL, FALSE, FALSE, TEXT("MyLocEvt1"));
hEvt2 = CreateEvent(NULL, FALSE, FALSE, TEXT("MyLocEvt2"));

ResetEvent(hEvt1);
ResetEvent(hEvt2);

for (i = 0; i < 10000; i++)
{
    SetEvent (hEvt1);
    WaitForSingleObject(hEvt2, INFINITE);
}        

Thread 2: (just "responding")

while (1)
{
    WaitForSingleObject(hEvt1, INFINITE);
    SetEvent(hEvt2);
}

Linux code:

Thread 1: (the thread I measure)

struct event_flag *event1, *event2;
event1 = eventflag_create();
event2 = eventflag_create();

for (i = 0; i < 10000; i++)
{
    eventflag_set(event1);
    eventflag_wait(event2);
}

Thread 2: (just "responding")

while (1)
{
    eventflag_wait(event1);
    eventflag_set(event2);
}

My implementation of eventflag_*:

struct event_flag* eventflag_create()
{
    struct event_flag* ev;
    ev = (struct event_flag*) malloc(sizeof(struct event_flag));

    pthread_mutex_init(&ev->mutex, NULL);
    pthread_cond_init(&ev->condition, NULL);
    ev->flag = 0;

    return ev;
}

void eventflag_wait(struct event_flag* ev)
{
    pthread_mutex_lock(&ev->mutex);

    while (!ev->flag)
        pthread_cond_wait(&ev->condition, &ev->mutex);

    ev->flag = 0;

    pthread_mutex_unlock(&ev->mutex);
}

void eventflag_set(struct event_flag* ev)
{
    pthread_mutex_lock(&ev->mutex);

    ev->flag = 1;
    pthread_cond_signal(&ev->condition);

    pthread_mutex_unlock(&ev->mutex);
}

And the struct:

struct event_flag
{
    pthread_mutex_t mutex;
    pthread_cond_t  condition;
    unsigned int    flag;
};

Questions:

  • Why don't I see the performance boost here?
  • What can be done to improve performance (e.g., are there faster ways to implement CE's behaviour)?
  • I'm not used to coding with pthreads; are there bugs in my implementation that might cause the performance loss?
  • Are there any alternative libraries for this?

Answer

Michael Burr · Jan 16, 2012

Note that you don't need to be holding the mutex when calling pthread_cond_signal(), so you might be able to increase the performance of your condition variable 'event' implementation by releasing the mutex before signaling the condition:

void eventflag_set(struct event_flag* ev)
{
    pthread_mutex_lock(&ev->mutex);

    ev->flag = 1;

    pthread_mutex_unlock(&ev->mutex);

    pthread_cond_signal(&ev->condition);
}

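This might prevent the awakened thread from immediately blocking on the mutex: pthread_cond_wait() has to reacquire the mutex before it returns, so a waiter that is woken while the signalling thread still holds the mutex can do nothing but block again until the mutex is released.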
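Whether this helps is best checked by measurement on the target board. The following is a minimal sketch of a harness that times the ping-pong test with either variant of eventflag_set(). Several things here are assumptions for illustration and not part of the original code: the names run_pingpong(), responder(), struct pingpong and eventflag_destroy(), the extra signal_after_unlock parameter, the use of clock_gettime(CLOCK_MONOTONIC) for timing, and the responder running a fixed number of rounds (instead of while (1)) so that it can be joined.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

struct event_flag {
    pthread_mutex_t mutex;
    pthread_cond_t  condition;
    unsigned int    flag;
};

static struct event_flag *eventflag_create(void)
{
    struct event_flag *ev = malloc(sizeof *ev);
    assert(ev != NULL);               /* sketch only: abort if out of memory */
    pthread_mutex_init(&ev->mutex, NULL);
    pthread_cond_init(&ev->condition, NULL);
    ev->flag = 0;
    return ev;
}

/* Hypothetical helper, not in the original code. */
static void eventflag_destroy(struct event_flag *ev)
{
    pthread_cond_destroy(&ev->condition);
    pthread_mutex_destroy(&ev->mutex);
    free(ev);
}

static void eventflag_wait(struct event_flag *ev)
{
    pthread_mutex_lock(&ev->mutex);
    while (!ev->flag)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&ev->condition, &ev->mutex);
    ev->flag = 0;                     /* auto-reset, like the CE event */
    pthread_mutex_unlock(&ev->mutex);
}

/* signal_after_unlock == 0: signal while holding the mutex (question's version).
   signal_after_unlock != 0: unlock first, then signal (answer's version). */
static void eventflag_set(struct event_flag *ev, int signal_after_unlock)
{
    pthread_mutex_lock(&ev->mutex);
    ev->flag = 1;
    if (!signal_after_unlock)
        pthread_cond_signal(&ev->condition);
    pthread_mutex_unlock(&ev->mutex);
    if (signal_after_unlock)
        pthread_cond_signal(&ev->condition);
}

struct pingpong {
    struct event_flag *event1, *event2;
    int rounds;
    int signal_after_unlock;
};

/* Thread 2: answers each event1 with event2, for a fixed number of rounds. */
static void *responder(void *arg)
{
    struct pingpong *pp = arg;
    for (int i = 0; i < pp->rounds; i++) {
        eventflag_wait(pp->event1);
        eventflag_set(pp->event2, pp->signal_after_unlock);
    }
    return NULL;
}

/* Thread 1: returns the elapsed time in microseconds, or -1.0 on error. */
double run_pingpong(int rounds, int signal_after_unlock)
{
    struct pingpong pp = { eventflag_create(), eventflag_create(),
                           rounds, signal_after_unlock };
    pthread_t thread;
    struct timespec start, end;

    if (pthread_create(&thread, NULL, responder, &pp) != 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < rounds; i++) {
        eventflag_set(pp.event1, signal_after_unlock);
        eventflag_wait(pp.event2);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    pthread_join(thread, NULL);       /* responder is done before we destroy */
    eventflag_destroy(pp.event1);
    eventflag_destroy(pp.event2);

    return (end.tv_sec - start.tv_sec) * 1e6 +
           (end.tv_nsec - start.tv_nsec) / 1e3;
}
```

Calling run_pingpong(10000, 0) and run_pingpong(10000, 1) from main() and printing the two results gives a direct comparison of the variants. Build with -pthread; the numbers will depend on the kernel, the C library and the scheduler, so no particular outcome should be assumed.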