Mike Timberlake

Reputation: 63

Fetching photos in which 2 or more people are tagged

As the title says, I'm trying to fetch up to 10 photos in which the currently logged-in user and one or more of his/her friends are tagged. I'm trying to do this with the PHP API and FQL.

I'm new to FQL, but not new to programming etc. The only way I am currently able to achieve what I want is by dynamically building multiple queries which look like this:

SELECT pid, src_big FROM photo WHERE pid IN(  
  SELECT pid FROM photo_tag WHERE subject = me() 
) AND pid IN( 
  SELECT pid FROM photo_tag WHERE 
    subject = '1530195' OR 
    subject = '3612831' OR 
    subject = '6912041' OR 
    ...
)

Apart from being ugly, this is slow. Queries are limited to about the length shown above because they fail when they get much longer.

Multi-queries didn't help me because I can't use 'as'. SQL isn't my greatest strength, though, and I'm really hoping I've missed something.

There must be a better way! Anyone?

Upvotes: 2

Views: 120

Answers (2)

phwd

Reputation: 19995

Just use two queries and do the intersection in your programming language of choice.

  • SELECT subject, pid FROM photo_tag WHERE subject = me()
  • SELECT pid, subject FROM photo_tag WHERE subject IN (SELECT uid2 FROM friend WHERE uid1=me())

Batch these two calls into a single FQL multi-query:

fql?q={"userphotos":"SELECT subject, pid 
                     FROM photo_tag 
                     WHERE subject = me()",
       "friendphotos":"SELECT pid, subject 
                       FROM photo_tag 
                       WHERE subject IN 
                      (SELECT uid2 FROM friend WHERE uid1=me())"}
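
A rough sketch of how that request string could be assembled in Python: the two named queries are serialized as JSON and then URL-encoded into the q parameter. The helper name build_multiquery_url and the BASE_URL value are assumptions made for illustration, and an access token would still have to be added to the request.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint, for illustration only; adjust to whatever your client uses.
BASE_URL = "https://graph.facebook.com/fql"

# The two named queries from the multi-query above.
QUERIES = {
    "userphotos": "SELECT subject, pid FROM photo_tag WHERE subject = me()",
    "friendphotos": (
        "SELECT pid, subject FROM photo_tag WHERE subject IN "
        "(SELECT uid2 FROM friend WHERE uid1 = me())"
    ),
}


def build_multiquery_url(queries, base_url=BASE_URL):
    """Return a URL whose q parameter is the JSON-encoded dict of queries."""
    return base_url + "?" + urlencode({"q": json.dumps(queries)})


url = build_multiquery_url(QUERIES)
```

Letting json.dumps and urlencode do the quoting avoids hand-escaping the spaces, quotes and parentheses inside the queries.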

Then in your programming language do an intersection of these two sets.

a0 = data['data'][0]['fql_result_set']
a1 = data['data'][1]['fql_result_set']

For example, in Python, something as simple as

>>> photos = {}
>>> for p in a0:
...     for q in a1:
...         if p['pid'] == q['pid']:
...             pid = p['pid']
...             if pid in photos:
...                 # photo already seen: record the additional friend
...                 photos[pid].append(q['subject'])
...             else:
...                 # first match: store the user, then the friend
...                 photos[pid] = [p['subject'], q['subject']]

photos will then be a dict in which each key is a pid and each value is a list of the user ids tagged in that photo. Then just take 10 of these pids and supply them to the photo table FQL call.
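
If the friend result set is large, the nested loop compares every pair of rows. A possible alternative, sketched here under the assumption that each row is a dict with 'pid' and 'subject' keys as in the result sets above, is to put the user's pids in a set and make a single pass over the friend rows. The function name photos_with_friends and the sample rows are made up for illustration; unlike the loop above, the lists hold only the friends' ids.

```python
from collections import defaultdict


def photos_with_friends(user_rows, friend_rows, limit=10):
    """Map each pid the user is tagged in to the friends tagged in it."""
    user_pids = {row["pid"] for row in user_rows}
    tagged = defaultdict(list)
    for row in friend_rows:
        # Set membership is a constant-time check, so this is one pass.
        if row["pid"] in user_pids:
            tagged[row["pid"]].append(row["subject"])
    # Keep at most `limit` photos, in the order they were first seen.
    return dict(list(tagged.items())[:limit])


# Made-up sample rows shaped like the photo_tag results.
a0 = [{"subject": "1", "pid": "p1"}, {"subject": "1", "pid": "p2"}]
a1 = [
    {"pid": "p1", "subject": "2"},
    {"pid": "p3", "subject": "2"},
    {"pid": "p1", "subject": "3"},
]
result = photos_with_friends(a0, a1)
```

With the sample rows, only p1 has both the user and friends tagged, so result holds a single entry listing the two friends.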

In Python with the facepy module it might look something like

photoquery = ('SELECT pid, src_big FROM photo WHERE ' +
              'pid = ' + pid1 + ' OR ' +
              'pid = ' + pid2 + ' OR ' +
              ...)  # remaining pids elided

graph.fql(photoquery)
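
Rather than concatenating each pid by hand, the query string could be generated from the keys of photos. This is only a sketch: build_photo_query is a hypothetical helper, the sample pids are invented, and it assumes pids are strings that need single quotes and contain no quote characters themselves.

```python
def build_photo_query(pids, limit=10):
    """Build one photo query for up to `limit` pids using an IN list."""
    chosen = list(pids)[:limit]
    if not chosen:
        raise ValueError("need at least one pid")
    # Assumption: pids are strings and must be single-quoted in FQL.
    quoted = ", ".join("'{}'".format(pid) for pid in chosen)
    return "SELECT pid, src_big FROM photo WHERE pid IN ({})".format(quoted)


# Invented pids, for illustration only.
photoquery = build_photo_query(["111_1", "222_2"])
```

The resulting string can then be passed to graph.fql in place of the hand-built one.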

The slowest section overall will be the photo_tag query on friends:

SELECT pid, subject FROM photo_tag WHERE subject IN (SELECT uid2 FROM friend WHERE uid1=me())

Upvotes: 0

Lix

Reputation: 47956

I'm not sure how much this will improve performance, but at the very least it might make your query more readable:

You could use the familiar WHERE IN (...) statement instead of your OR xxx OR yyy format for the list of user ids...

SELECT pid, src_big FROM photo WHERE pid IN( 
  SELECT pid FROM photo_tag WHERE subject = me() 
) AND pid IN(
  SELECT pid FROM photo_tag WHERE subject IN(
    '1530195', 
    '3612831',
    '36800240',
    ...
  )
)

Queries that involve a user's entire friend list tend to be problematic for a number of reasons. Firstly, there may be hundreds of friends, so even if you manage to speed up one query, you will still have to execute a series of queries to iterate through the whole list. Secondly, running these intensive queries one after the other against the API might trigger throttling of your application.

The only thing I can suggest is to let your users know that you are performing an intensive operation and "it might take a moment" :P Behind the scenes you'll have to execute the queries, possibly placing a limit on each one, so as not to trigger mechanisms on Facebook's side that might restrict your access to the API.
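
As a rough illustration of that idea, the friend ids could be split into fixed-size batches with a short pause between calls. Everything here is an assumption: run_query stands in for whatever function actually sends one query for a batch of ids, and the batch_size and pause defaults are arbitrary values, not documented API limits.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(friend_ids, run_query, batch_size=50, pause=1.0):
    """Call run_query once per batch of ids, sleeping between calls."""
    results = []
    batches = list(chunked(friend_ids, batch_size))
    for i, batch in enumerate(batches):
        results.extend(run_query(batch))
        # No need to wait after the final batch.
        if i < len(batches) - 1:
            time.sleep(pause)
    return results
```

Passing the query function in as a parameter keeps the batching logic independent of the API client being used.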

Another option (which might need a small change to your application's flow) is, instead of selecting ALL of the user's friends, to display a list of the user's friends and let them decide which friends they want to scan for mutual photos. This might also let you limit the number of queries.

Upvotes: 0
