Makio21

Reputation: 13

Positional index in Python?

I am able to create an inverted index, but I cannot quite implement a positional index. Each entry in a positional index has the format [doc_ID, pos_1, pos_2, ...]

Here doc_ID indicates which document the word appears in, and pos_1, pos_2, ... are the positions at which it appears in that document.

Ex. index = positional_index([['a','b','a'], ['a','c']]). When the user enters index['a'], it will return [[0,0,2], [1,0]]

The following code is for the inverted index mentioned above. I have no idea what else to add to make it a positional index:

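from collections import defaultdict

def positional_index(tokens):
    # Inverted index so far: maps each token to the docIDs it appears in
    d = defaultdict(list)

    for docID, t_list in enumerate(tokens):
        for t in t_list:
            d[t].append(docID)

    return d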

All help would be much appreciated.

Upvotes: 0

Views: 1954

Answers (2)

Padraic Cunningham

Reputation: 180441

Using your own code, you just need to store the positions of each element along with the docID, iterating over a set of each document's tokens so that a token is not added twice for the same document:

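from collections import defaultdict

def positional_index(tokens):
    d = defaultdict(list)
    for docID, sub_l in enumerate(tokens):
        # set() ensures each distinct token is handled once per document
        for t in set(sub_l):
            # one posting per document: [docID, pos_1, pos_2, ...]
            d[t].append([docID] + [ind for ind, ele in enumerate(sub_l) if ele == t])
    return d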

In [9]: index = positional_index([['a','b','a'], ['a','c']])

In [10]: index["a"]
Out[10]: [[0, 0, 2], [1, 0]]
In [11]: index["b"]
Out[11]: [[0, 1]]

In [12]: index["c"]
Out[12]: [[1, 1]]
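
The function above rescans each document once per distinct token. If that matters for longer documents, a single-pass version is possible. The following is only a sketch under the same input assumption (a list of token lists); the name positional_index_single_pass is hypothetical and not part of the original answer:

```python
from collections import defaultdict


def positional_index_single_pass(tokens):
    # Hypothetical single-pass variant; assumes tokens is a list of token lists.
    index = defaultdict(list)
    for doc_id, doc_tokens in enumerate(tokens):
        # Collect the positions of every token in this document in one scan.
        positions = defaultdict(list)
        for pos, token in enumerate(doc_tokens):
            positions[token].append(pos)
        # Emit one [doc_id, pos_1, pos_2, ...] posting per distinct token.
        for token, pos_list in positions.items():
            index[token].append([doc_id] + pos_list)
    return index


index = positional_index_single_pass([['a', 'b', 'a'], ['a', 'c']])
print(index['a'])  # [[0, 0, 2], [1, 0]]
```

Each document is walked exactly once, and postings still come out ordered by docID because documents are processed in order.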

Upvotes: 1

Kasravnd

Reputation: 107297

You can use the following function:

>>> def find_index(l,elem) :
...   return [[i]+[t for t,k in enumerate(j) if k==elem] for i,j in enumerate(l)]
... 
>>> l = [['a','b','a'], ['a','c']]
>>> find_index(l,'a')
[[0, 0, 2], [1, 0]]

All you need here is enumerate inside two nested list comprehensions.
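
One thing to watch with find_index: for a document that does not contain elem, the inner comprehension is empty, so that document still contributes an entry holding only its index (for example, looking up 'b' in the list above would also yield [1]). A possible variant that skips such documents is sketched below; the name find_index_nonempty is hypothetical and the input is assumed to be a list of token lists:

```python
def find_index_nonempty(docs, elem):
    # Hypothetical variant of find_index that omits documents without elem.
    result = []
    for doc_id, doc in enumerate(docs):
        positions = [pos for pos, tok in enumerate(doc) if tok == elem]
        if positions:  # keep only documents where elem actually occurs
            result.append([doc_id] + positions)
    return result


docs = [['a', 'b', 'a'], ['a', 'c']]
print(find_index_nonempty(docs, 'b'))  # [[0, 1]]
```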

Upvotes: 1
