I'm trying to select all the connections between my friends (our mutual friendships) with PHP/FQL. My UID has 540 friends, which works out to more than 12,000 connections, of which more than 6,500 are unique. The code below should return all of those connections, but Facebook apparently caps FQL query results at 4,999/5,000 rows.
// select mutual unique friends
$unique_connections = $facebook->api_client->fql_query("
SELECT uid1, uid2 FROM friend
WHERE uid1 IN
(SELECT uid2 FROM friend WHERE uid1=$uid)
AND uid2 IN
(SELECT uid2 FROM friend WHERE uid1=$uid)
");
I know the numbers above because the original code I wrote loops through my friend list and makes a friends_getMutualFriends call for each friend.
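// $friends is assumed to hold the current user's friend UIDs
$all_connections = array();
foreach ($friends as $key)
{
    // one API round trip per friend, which is why this is slow
    $mutual_friends = $facebook->api_client->friends_getMutualFriends($key);
    foreach ($mutual_friends as $f_uid)
    {
        $all_connections[] = array($key, $f_uid);
    }
}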
Of course, that script takes almost 3 minutes to run, while the FQL query returns in 5 seconds. After an hour of searching for an answer, I've concluded that the only way around this is to use a mixture of the two methods. Well, that, and post here. Any ideas on a better way to write this script and beat the 4999/5000 row limit?
Here's an fql_multiquery that should do the same as the single query above. It is also limited to 4999/5000 rows.
$queries = '{
"user_friends":"SELECT uid2 FROM friend WHERE uid1 = '.$uid.'",
"mutual_friends":"SELECT uid1, uid2 FROM friend WHERE uid1 IN (SELECT uid2 FROM #user_friends) AND uid2 IN (SELECT uid2 FROM #user_friends)"
}';
$mq_test = $facebook->api_client->fql_multiquery(trim($queries));
print_r($mq_test);
So, I'm posting the answer to my original question. I was able to circumvent the 5000 row limit on FQL queries by chunking the array of UIDs (using the appropriately named array_chunk() PHP function), looping through the chunks to execute mini-queries, and then merging the results back into one array. The whole script averages 14 seconds for over 12,000 rows, so that is a huge improvement. You can see the application at work here: givememydata.com
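For anyone who wants to try the same thing, here is a rough sketch of the chunking idea, not the exact code from the application. The function name fetch_connections_in_chunks, the $runQuery callable, and the shape of the per-chunk query are all assumptions made for illustration. The chunk size is not stated above either, so it has to be picked by experiment so that each mini-query stays under the row limit.

```php
<?php
// Sketch only. $runQuery is a hypothetical callable standing in for
// $facebook->api_client->fql_query(); it takes an FQL string and
// returns an array of rows. UIDs are assumed to be numeric values
// that came straight from the API.
function fetch_connections_in_chunks(array $friendUids, $uid, $chunkSize, $runQuery)
{
    $allConnections = array();

    // Split the friend list into pieces of at most $chunkSize UIDs.
    foreach (array_chunk($friendUids, $chunkSize) as $chunk) {
        $uidList = implode(',', $chunk);

        // Assumed query shape: same as the single query in the question,
        // but uid1 is restricted to the current chunk.
        $fql = "SELECT uid1, uid2 FROM friend"
             . " WHERE uid1 IN ($uidList)"
             . " AND uid2 IN (SELECT uid2 FROM friend WHERE uid1 = $uid)";

        $rows = call_user_func($runQuery, $fql);

        // Guard in case the client returns a non-array (for example on
        // an empty result); only real row arrays are merged.
        if (is_array($rows)) {
            $allConnections = array_merge($allConnections, $rows);
        }
    }

    return $allConnections;
}
```

With the old client library it could be called as fetch_connections_in_chunks($friends, $uid, 50, array($facebook->api_client, 'fql_query')), where 50 is only a placeholder chunk size. Passing the query runner in as a callable also makes it easy to swap in a stub and check the chunking without hitting Facebook.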
Oh, and Facebook should reconsider their (still undocumented) FQL row limit. What is more taxing on their servers? A single query that executes in 5 seconds or 500 queries that take 180 seconds? Sorry, had to vent. ;-)