Facebook mutual friends and FQL 4999/5000 record limit

ow3n · Nov 25, 2010 · Viewed 7.8k times

I'm trying to select all of my mutual friends' connections with PHP/FQL. My UID has 540 friends, which works out to more than 12,000 connections, of which more than 6,500 are unique. This code should return all of those connections, but Facebook apparently enforces a 4999/5000 row limit on FQL queries.

// select mutual unique friends
$unique_connections = $facebook->api_client->fql_query("
    SELECT uid1, uid2 FROM friend
    WHERE uid1 IN (SELECT uid2 FROM friend WHERE uid1 = $uid)
      AND uid2 IN (SELECT uid2 FROM friend WHERE uid1 = $uid)
");

I know the numbers above because the original code I wrote loops through my friend list and sends a friends_getMutualFriends query for each one.

// collect every (friend, mutual friend) pair, one API call per friend
$all_connections = array();
foreach ($friends as $key)
{
    $mutual_friends = $facebook->api_client->friends_getMutualFriends($key);
    foreach ($mutual_friends as $f_uid)
    {
        array_push($all_connections, array($key, $f_uid));
    }
}

Of course, that script takes almost 3 minutes to run, while the FQL query returns in 5 seconds. After an hour of searching, I've come to the conclusion that the only way around this is to use a mixture of the two methods. Well that, and posting here. Any ideas on a better way to write this script and beat the 4999/5000 row limit?

Here's an fql_multiquery that should do the same as the above. It is also limited to 4999/5000 rows.

$queries = '{
"user_friends":"SELECT uid2 FROM friend WHERE uid1 = '.$uid.'",
"mutual_friends":"SELECT uid1, uid2 FROM friend WHERE uid1 IN (SELECT uid2 FROM #user_friends) AND uid2 IN (SELECT uid2 FROM #user_friends)"
}';

$mq_test = $facebook->api_client->fql_multiquery(trim($queries));
print_r($mq_test);

Answer

ow3n · Nov 28, 2010

So, I'm posting the answer to my original question. I was able to get around the 5000-row limit on FQL queries by chunking the array of UIDs (with the appropriately named array_chunk() PHP function), looping through the chunks to run a mini-query for each, and appending the results back into one array. The whole script averages 14 seconds for over 12,000 rows, which is a huge improvement. You can see the application at work here: givememydata.com
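For reference, here is a minimal sketch of that chunking approach, assuming the same old PHP SDK interface ($facebook->api_client->fql_query()) used above. The chunk size of 20 and the exact per-chunk query are my own illustrative choices, not the actual code behind givememydata.com.

// Sketch of the chunking workaround. The chunk size (20) and the
// per-chunk query shape are illustrative assumptions.

// 1. Fetch the friend list once.
$friend_uids = array();
foreach ($facebook->api_client->fql_query(
    "SELECT uid2 FROM friend WHERE uid1 = $uid") as $row)
{
    $friend_uids[] = $row['uid2'];
}

// 2. Split the UID list into chunks small enough that each
//    mini-query stays well under the 5000-row limit.
$all_connections = array();
foreach (array_chunk($friend_uids, 20) as $chunk)
{
    $uid_list = implode(',', $chunk);
    $result = $facebook->api_client->fql_query("
        SELECT uid1, uid2 FROM friend
        WHERE uid1 IN ($uid_list)
          AND uid2 IN (SELECT uid2 FROM friend WHERE uid1 = $uid)
    ");

    // 3. Append each chunk's rows back into one array.
    $all_connections = array_merge($all_connections, $result);
}

Each chunk only pulls the rows whose uid1 falls in that chunk, so merging the per-chunk results reproduces the full result set of the original single query without any one query hitting the limit.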

Oh, and Facebook should reconsider their (still undocumented) FQL row limit. Which is more taxing on their servers: a single query that executes in 5 seconds, or 500 queries that take 180 seconds? Sorry, had to vent. ;-)