Term Document matrix manual implementation. Can we make it more efficient?

Question

The code below builds a term-document matrix from a list of sentences and a fixed dictionary. Can it be made more efficient?

PREPROCESSED = ['He is a good boy','he loves studying']
DICTIONARY = ['He', 'is', 'a', 'good', 'boy', 'loves', 'studying']
MATRIX = []
for sent in PREPROCESSED:
    temp = []
    for i in DICTIONARY:
        count = 0
        for words in sent.split():
            if i == words:
                count = count + 1
        temp.append(count)
    test = 0
    for i in temp:
        if i != 0:
            test = 1
    if test == 1:
        MATRIX.append(temp)
    del temp


Answers (1)
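
One possible approach, offered as a minimal sketch rather than a definitive answer: count the words of each sentence once with collections.Counter, then look up each dictionary term in that counter. The question's loops compare every dictionary term against every word, so the work per sentence grows with (dictionary size x sentence length); with a counter it grows with (dictionary size + sentence length). The function name term_document_matrix is made up for this illustration. The sketch keeps the behaviour of the original code, including case-sensitive matching and skipping sentences that contain no dictionary term.

```python
from collections import Counter

PREPROCESSED = ['He is a good boy', 'he loves studying']
DICTIONARY = ['He', 'is', 'a', 'good', 'boy', 'loves', 'studying']


def term_document_matrix(sentences, dictionary):
    """Return one row of term counts per sentence (hypothetical helper)."""
    matrix = []
    for sent in sentences:
        # Count every word of the sentence in a single pass.
        counts = Counter(sent.split())
        # A Counter returns 0 for missing keys, so absent terms need no special case.
        row = [counts[term] for term in dictionary]
        # As in the question's code, drop rows that contain no dictionary term.
        if any(row):
            matrix.append(row)
    return matrix


MATRIX = term_document_matrix(PREPROCESSED, DICTIONARY)
print(MATRIX)
# [[1, 1, 1, 1, 1, 0, 0], [0, 0, 0, 0, 0, 1, 1]]
```

Note that matching stays case-sensitive, exactly as in the question, so 'he' in the second sentence is not counted under 'He'. If that is not intended, lower-casing both the sentences and the dictionary before counting would be one way to change it.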
