user15874067

Reputation: 81

Indexing in Python

I am new to programming and I am trying to implement a function that creates a dictionary mapping each term to its inverted list. Given the collection:

collection = ["apple orange milk" , "bread milk meat apple" , "apple orange"]

For each term in the collection, I want the indices of the strings in which it appears. I am trying to get the following result:

inv_ls = {"apple": [0, 1, 2], "orange": [0, 2], "milk": [0, 1], "bread": [1], "meat": [1]}

Upvotes: 0

Views: 69

Answers (4)

d-k-bo

Reputation: 662

This could also be done using list, set and dict comprehension:

{  # generate dict
    word: [
        string.split(" ").index(word)  # get index
        for string in collection
        if word in string.split(" ")  # if word in string
    ]
    for word in {  # generate set with all words
        word for string in collection for word in string.split(" ")
    }
}

Consider using a fallback value so that all lists have the same length and each position in a list corresponds to the string at the same position in the original collection.

{  # generate dict
    word: [
        string.split(" ").index(word)  # get index
        if word in string.split(" ")  # if word in string
        else None  # if not in string use a fallback value instead
        for string in collection
    ]
    for word in {  # generate set with all words
        word for string in collection for word in string.split(" ")
    }
}

This would produce (the key order may vary, because the words are collected in a set):

{'bread': [None, 0, None], 'apple': [0, 3, 0], 'milk': [2, 1, None], 'orange': [1, None, 1], 'meat': [None, 2, None]}
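Note that both comprehensions above record the position of each word inside its string, not the index of the string within collection. If the string index is what is needed, as in the question, a minimal sketch of the same comprehension approach with enumerate could look like this (the names words and idx are just illustrative):

```python
collection = ["apple orange milk", "bread milk meat apple", "apple orange"]

# Set of all distinct words across the collection
words = {word for string in collection for word in string.split()}

# For each word, collect the index of every string that contains it
inv_ls = {
    word: [idx for idx, string in enumerate(collection) if word in string.split()]
    for word in words
}

print(inv_ls["apple"])  # [0, 1, 2]
print(inv_ls["meat"])   # [1]
```

This re-splits every string for every word, which is fine for a small collection but probably not what you want for a large one.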

Upvotes: 0

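You got the error because the terms list is [['apple', 'orange', 'milk'], ['bread', 'milk', 'meat', 'apple'], ['apple', 'orange']], which is a list of lists. So when you try to find 'apple' in that list, Python raises a ValueError, because none of its elements is the string 'apple'. The corrected code could look like this:

def create_index(C):
    index = {}
    # Split every string into its list of terms
    terms = [element.split() for element in C]
    # enumerate yields the position of each term list directly;
    # terms.index(element) would return the first match for repeated strings
    for i, element in enumerate(terms):
        for term in element:
            # Create the list on first sight of the term, then record the index
            index.setdefault(term, []).append(i)
    return index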

Upvotes: 0

Corralien

Reputation: 120399

Split each string into a list of words, then record the index of the string for each word. If a word has not appeared yet, the setdefault method creates a new entry in the dict and sets it to an empty list ([]).

collection = ["apple orange milk" , "bread milk meat apple" , "apple orange"]
inv_ls = {}

for idx, lst in enumerate([s.split() for s in collection]):
    for item in lst:
        inv_ls.setdefault(item, []).append(idx)
>>> inv_ls
{'apple': [0, 1, 2],
 'orange': [0, 2],
 'milk': [0, 1],
 'bread': [1],
 'meat': [1]}
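If a word can occur more than once in the same string, the loop above appends the same index once per occurrence. A possible variant, assuming one entry per string is wanted, uses collections.defaultdict and iterates over the unique words only. The sample collection with a repeated word is made up for illustration:

```python
from collections import defaultdict

# Hypothetical input: "apple" occurs twice in the first string
collection = ["apple orange apple", "bread milk", "apple orange"]

inv_ls = defaultdict(list)
for idx, text in enumerate(collection):
    # dict.fromkeys drops repeated words while keeping first-seen order
    for word in dict.fromkeys(text.split()):
        inv_ls[word].append(idx)

print(dict(inv_ls))
# {'apple': [0, 2], 'orange': [0, 2], 'bread': [1], 'milk': [1]}
```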

Upvotes: 1

DonKnacki

Reputation: 427

You can update the dictionary directly inside the loop over the elements:

def create_index(C):
    index = {}
    terms = []
    for element in C:
        terms.append(element.split())
    for i, element in enumerate(terms):
        for term in element:
            if term not in index:
                index[term] = [i]
            else:
                index[term].append(i)
    return index

C = ["apple orange milk" , "bread milk meat apple" , "apple orange"]
index = create_index(C)
print(index)
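Once the index is built, looking up a term is a plain dictionary access. As a small usage sketch, dict.get with a default avoids a KeyError for terms that never occur. The helper name lookup and the compact create_index below are illustrative, not part of the code above:

```python
def create_index(C):
    # Compact equivalent of the function above, repeated so this runs on its own
    index = {}
    for i, element in enumerate(C):
        for term in element.split():
            index.setdefault(term, []).append(i)
    return index


def lookup(index, term):
    # Hypothetical helper: indices of the strings containing term, or [] if absent
    return index.get(term, [])


C = ["apple orange milk", "bread milk meat apple", "apple orange"]
index = create_index(C)

print(lookup(index, "milk"))  # [0, 1]
print(lookup(index, "fish"))  # []
```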

Upvotes: 1
