Sandy

Reputation: 21

Sentiment analysis Python TypeError: expected string or bytes-like object

I am doing sentiment analysis and want to add a negation marker (NEG_) to every word between a negation word and the next punctuation mark. I am running the following code:

import re


fin=open("aboveE1.txt",'r', encoding='UTF-8')

transformed = re.sub(r'\b(?:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint)\b[\w\s]+[^\w\s]', 
   lambda match: re.sub(r'(\s+)(\w+)', r'\1NEG_\2', match.group(0)), 
   fin,
   flags=re.IGNORECASE)

Traceback (most recent call last):
  line 14, in
    flags=re.IGNORECASE)
  line 182, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

I don't know how to fix this error. Can you help me?

Upvotes: 1

Views: 1338

Answers (1)

oxalorg

Reputation: 2798

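re.sub expects a string (or bytes) as its third argument, not a file object. You are passing fin itself, which is why you get the TypeError. Pass the text read from the file instead, for example one line at a time. See the re.sub documentation for details.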

import re

fin = open("aboveE1.txt", 'r', encoding='UTF-8')
transformed = ''

for line in fin:
    transformed += re.sub(
        r'\b(?:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint)\b[\w\s]+[^\w\s]',
        lambda match: re.sub(r'(\s+)(\w+)', r'\1NEG_\2', match.group(0)),
        line,
        flags=re.IGNORECASE)
    # No need to append '\n' to 'transformed'
    # because the line returned via the iterator includes the '\n'

fin.close()

Also remember to always close the file you open.
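
As a possible variation, here is a minimal sketch that reads the whole file in a with block, so the file is closed automatically even if an exception is raised. The names NEGATION_SCOPE, mark_negation and mark_negation_in_file are made up for this example; the pattern and the replacement are the ones from the question. Note one difference from the line-by-line loop: because [\w\s]+ also matches newlines, reading the whole file lets a negation scope continue across a line break until the next punctuation mark, which may or may not be what you want.

```python
import re

# Hypothetical name: the negation pattern from the question, compiled once.
NEGATION_SCOPE = re.compile(
    r'\b(?:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|'
    r'couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint)\b'
    r'[\w\s]+[^\w\s]',
    flags=re.IGNORECASE,
)


def mark_negation(text):
    """Prefix NEG_ to each word between a negation word and the next punctuation."""
    return NEGATION_SCOPE.sub(
        # Inside each matched span, prefix every word that follows whitespace.
        # The negation word itself starts the span, so it is left unchanged.
        lambda match: re.sub(r'(\s+)(\w+)', r'\1NEG_\2', match.group(0)),
        text,
    )


def mark_negation_in_file(path):
    """Read the whole file as one string and apply mark_negation to it."""
    # The with block closes the file when the block exits.
    with open(path, 'r', encoding='UTF-8') as fin:
        return mark_negation(fin.read())


print(mark_negation("I do not like this movie. It is great."))
# I do not NEG_like NEG_this NEG_movie. It is great.
```

For the file from the question you would call mark_negation_in_file("aboveE1.txt"). A span that has no punctuation after the negation word is left untouched, since the pattern requires a closing non-word, non-space character.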

Upvotes: 1
