user2075354

Reputation: 405

How to intersect a multi-dimensional array

I'm building a search engine for my site. I have an index of all the words contained in the pages of my site and their positions. I am using PHP arrays, and the info returned after a search looks like this:

'jeff' => 
    array
      1 => 
        array
          0 => int 0
          1 => int 259
          2 => int 444
          3 => int 461
          4 => int 486
'seka' => 
    array
      1 => 
        array
          0 => int 1
          1 => int 260
          2 => int 445
          3 => int 462
          4 => int 487

If I want to find the postings list for jeff, I look for "jeff" as a key; if it exists, I assign it to a variable, like $v = $index['jeff'].

That's simple, but what if I have a multi-term query like "jeff and seka"? How do I check that both exist and return them as separate arrays (one for jeff and another for seka) so I can easily intersect them to find the documents that contain both search terms?

Upvotes: -2

Views: 195

Answers (1)

Bob Sammers

Reputation: 3290

Edit: re-written after comments. Some feedback would be good, to see if we're going in the right direction!

Have you looked at the array_intersect_key() function? You should be able to do:

$common = array_intersect_key($index['jeff'], $index['seka']);

This will give you a new array with just the keys (and values from 'jeff') of those pages common to Jeff and Seka. You can supply any number of additional arrays to the function, which will allow you to search for (for example) five different terms together and only retrieve pages which contain all five.

Your return array will contain a key for each page. Each key's value will come from the first argument in the array_intersect_key() call ("jeff", in my example). In other words, a subset of the $index['jeff'] array is returned.
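As a rough sketch of how this could be wrapped up for any number of terms: the helper name pagesContainingAll() and the extra pages 2 and 3 in the sample index are made up for illustration, and the index layout (term => page ID => positions) is assumed from the dump in the question. It also covers the "do both terms exist?" check by returning an empty array as soon as one term is missing.

```php
<?php
// Hypothetical index: term => [page ID => [word positions]].
// Page 1 mirrors the question; pages 2 and 3 are invented sample data.
$index = [
    'jeff' => [
        1 => [0, 259, 444, 461, 486],
        2 => [12, 40],
    ],
    'seka' => [
        1 => [1, 260, 445, 462, 487],
        3 => [7],
    ],
];

// Hypothetical helper: returns the pages that contain every term.
// The values in the result come from the first term's postings list.
function pagesContainingAll(array $index, array $terms): array
{
    $lists = [];
    foreach ($terms as $term) {
        if (!isset($index[$term])) {
            return []; // a missing term means no page can match
        }
        $lists[] = $index[$term];
    }

    if (count($lists) === 0) {
        return [];
    }
    if (count($lists) === 1) {
        return $lists[0]; // nothing to intersect with
    }

    // Spread the postings lists as separate arguments.
    return array_intersect_key(...$lists);
}

$common = pagesContainingAll($index, ['jeff', 'seka']);
print_r(array_keys($common));
```

With the sample data above, only page 1 appears in both lists, so $common holds a single entry keyed by 1 with jeff's positions as its value.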

If you want to retrieve the positions of the other terms on each page, you can either repeat the search with a different term as the first argument (don't: it is quite inefficient) or loop through the keys in your returned results (you can get an array of the keys with $pages = array_keys($common);) and use each key as an index into the arrays for each of the other terms.
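The second option might look something like the following. This is only a sketch: the $positions variable and the sample pages 2 and 3 are assumptions for illustration, and the index layout is again taken from the dump in the question.

```php
<?php
// Hypothetical index: term => [page ID => [word positions]].
$index = [
    'jeff' => [
        1 => [0, 259, 444, 461, 486],
        2 => [12, 40],
    ],
    'seka' => [
        1 => [1, 260, 445, 462, 487],
        3 => [7],
    ],
];

$terms  = ['jeff', 'seka'];
$common = array_intersect_key($index['jeff'], $index['seka']);

// For every common page, look up each term's positions in the index.
$positions = [];
foreach (array_keys($common) as $page) {
    foreach ($terms as $term) {
        $positions[$page][$term] = $index[$term][$page];
    }
}

print_r($positions);
```

The result is keyed by page first and term second, which keeps each term's positions side by side for later steps such as checking whether the words are adjacent.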

Upvotes: 1
