kevin

Reputation: 2014

Modifying corpus in python

I have a dataset (customer review corpus) looking like this:

documents = [["I like the product", "5"],["the product is poor", "2.5"],["it is an okay product", "3"],["the quality is poor", "1"],["color is great", "3.5"]]

The first value in each inner list is the review text, which I would like to modify based on the second value, the score. The score can be any number between 1 (lowest) and 5 (highest). I want to append the word "GOOD" to the text if its score is greater than 3, and the word "BAD" if its score is less than 3. The output should look like this:

[["I like the product GOOD", "5"],["the product is poor BAD", "2.5"],["it is an okay product", "3"],["the quality is poor BAD", "1"],["color is great GOOD", "3.5"]]

My code below raises AttributeError: 'str' object has no attribute 'insert':

for document in documents:
    if int(float(document[1])) > 3:
        document[0].insert('GOOD')
    elif int(float(document[1])) < 3:
        document[0].insert('BAD')
    else:
        document[0].insert()

Any suggestion? Thanks in advance.

Upvotes: 0

Views: 56

Answers (4)

Alexander

Reputation: 109626

You can use a list comprehension structure together with conditionals (ternary):

docs = [[doc[0] + (" GOOD" if float(doc[1]) > 3
                 else (" BAD" if float(doc[1]) < 3 else "")), doc[1]]
        for doc in documents]

>>> docs
[['I like the product GOOD', '5'],
 ['the product is poor BAD', '2.5'],
 ['it is an okay product', '3'],
 ['the quality is poor BAD', '1'],
 ['color is great GOOD', '3.5']]

Upvotes: 1

Padraic Cunningham

Reputation: 180481

Apart from strings being immutable and not having an insert method, your else branch is redundant. The score can only be greater than, less than, or equal to 3, so if the first two tests are False it must be equal, and nothing needs to be done to the text:

for doc in documents:
    f = float(doc[1])  # compare as float; int() would truncate "3.5" to 3
    if f > 3:
        doc[0] += " GOOD"
    elif f < 3:
        doc[0] += " BAD"
print(documents)

[['I like the product GOOD', '5'], ['the product is poor BAD', '2.5'],
 ['it is an okay product', '3'], ['the quality is poor BAD', '1'],
 ['color is great GOOD', '3.5']]
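One thing to watch for with the int(float(...)) conversion used in the question: int() truncates toward zero, so a score of "3.5" becomes 3 and is never tagged GOOD. A minimal sketch of the difference, assuming scores may be fractional as in the sample data (the variable names here are only for illustration):

```python
# int() drops the fractional part, so "3.5" is treated as exactly 3.
scores = ["5", "2.5", "3", "1", "3.5"]

truncated = [int(float(s)) for s in scores]
as_floats = [float(s) for s in scores]

# Which scores count as "greater than 3" under each conversion?
good_truncated = [s for s, v in zip(scores, truncated) if v > 3]
good_floats = [s for s, v in zip(scores, as_floats) if v > 3]

print(good_truncated)  # ['5']
print(good_floats)     # ['5', '3.5']
```

Comparing the float directly keeps 3.5 on the GOOD side of the threshold.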

Upvotes: 1

Cody Bouche

Reputation: 955

This can be accomplished with a list comprehension:

documents = [['I like the product', '5'],['the product is poor', '2.5'],['it is an okay product', '3'],['the quality is poor', '1'],['color is great', '3.5']]

documents = [[x[0] + (' GOOD' if float(x[1]) > 3 else ' BAD' if float(x[1]) < 3 else ''), x[1]] for x in documents]
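If the chained conditional is hard to read, the same logic can be moved into a small helper. This is only a sketch: the names tag_for, threshold, and tagged are made up for illustration and are not part of the original code.

```python
def tag_for(score, threshold=3.0):
    """Return the suffix to append for a score given as a string."""
    value = float(score)
    if value > threshold:
        return " GOOD"
    if value < threshold:
        return " BAD"
    return ""  # score equals the threshold: leave the text unchanged


documents = [["I like the product", "5"], ["the product is poor", "2.5"],
             ["it is an okay product", "3"], ["the quality is poor", "1"],
             ["color is great", "3.5"]]

# Build a new list, keeping each score next to its tagged text.
tagged = [[text + tag_for(score), score] for text, score in documents]
print(tagged)
```

The helper converts each score once and makes the threshold easy to change.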

Upvotes: 0

acushner

Reputation: 9946

Yep, str objects don't have an insert method.

Just append to it:

document[0] += ' GOOD'
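A quick sketch of why this works (the variable names are illustrative): the inner list is mutable and has insert, the string inside it does not, and += builds a new string and stores it back in the list slot.

```python
document = ["I like the product", "5"]

print(hasattr(document, "insert"))     # True: lists are mutable
print(hasattr(document[0], "insert"))  # False: str has no insert method

original = document[0]
document[0] += " GOOD"  # creates a new string and rebinds document[0]

print(document)  # ['I like the product GOOD', '5']
print(original)  # I like the product (the old string is unchanged)
```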

Upvotes: 2
