Ananth Reddy

Reputation: 317

SyntaxError: EOL while scanning string literal in Python

In the following code, I am trying to build elements that can be used to train a spaCy NER model (see the 9th line of code).

from ast import literal_eval
import re

train_data_list = []

for i in range(len(train_data)):
    a = re.search(train_data.subtext[i], train_data.text[i])
    if a is not None:
        element = '("' +train_data.text[i] + '"' + ', {"entities": [(' + 
        str(a.start()) + ',' + str(a.end()) + ',"SKILL")]})'
        train_data_list.append(literal_eval(element))

But I am encountering the following error:

 SyntaxError: EOL while scanning string literal

Thanks in Advance.

Upvotes: 1

Views: 25828

Answers (2)

user2864740

Reputation: 61865

One (or more) of the element strings supplied to literal_eval cannot be parsed by literal_eval. That is, the program's own syntax is valid (or else the program would fail without running anything!); it is one or more of the element values supplied to literal_eval that is not valid Python!

The first step is to identify some of the 'invalid' values, e.g.:

from ast import literal_eval
import re

train_data_list = []

for i in range(len(train_data)):
    a = re.search(train_data.subtext[i], train_data.text[i])
    if a is not None:
        element = '("' +train_data.text[i] + '"' + ', {"entities": [(' + str(a.start()) + ',' + str(a.end()) + ',"SKILL")]})'
        try:
            data = literal_eval(element)
            train_data_list.append(data)
        except (SyntaxError, ValueError):  # the errors literal_eval raises on bad input
            print("Failed to parse element as a Python literal!")
            print(">>")
            print(repr(element))
            print("<<")

If the above "runs" (fsvo. "runs") and prints at least one failed element, then the proposed hypothesis holds and the non-relevant answers can be ignored ;-)
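To see why a value can break literal_eval, here is a minimal sketch. The sample texts and the helper names build_element and parses are made up for illustration; only the string concatenation is taken from the question. A raw newline or a double quote inside the text produces a string that is no longer a valid Python literal. (On newer Python versions the message reads "unterminated string literal" rather than "EOL while scanning string literal", but it is still a SyntaxError.)

```python
from ast import literal_eval


def build_element(text, start, end):
    # Same concatenation as in the question, with hypothetical inputs.
    return ('("' + text + '"' + ', {"entities": [('
            + str(start) + ',' + str(end) + ',"SKILL")]})')


def parses(element):
    # True if literal_eval accepts the string, False if it rejects it.
    try:
        literal_eval(element)
        return True
    except (SyntaxError, ValueError):
        return False


ok = build_element("knows Python well", 6, 12)
bad_newline = build_element("knows Python\nand SQL", 6, 12)  # real newline in text
bad_quote = build_element('knows "Python" well', 7, 13)      # double quote in text

for name, element in [("ok", ok), ("newline", bad_newline), ("quote", bad_quote)]:
    print(name, parses(element))
```

Only the first element should parse; the other two fail because the text is pasted into the middle of a double-quoted literal without any escaping.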

Anyway, the solution is to not use literal_eval at all. Instead, create an object directly:

for i in range(len(train_data)):
    a = re.search(train_data.subtext[i], train_data.text[i])
    if a is not None:
        # might be a bit off.. YMMV.
        # Offsets stay as ints, which is what literal_eval produced before.
        data = (train_data.text[i],
                {"entities": [(a.start(), a.end(), "SKILL")]})
        train_data_list.append(data)
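For something that can be run without the original data, here is a small self-contained sketch of the same idea. The texts and subtexts lists are invented stand-ins for train_data.text and train_data.subtext, and re.escape is an addition of mine, on the assumption that the subtext should be matched literally rather than as a regex pattern. The offsets are kept as integers, matching what the literal_eval version would have produced.

```python
import re

# Hypothetical sample data standing in for train_data.text / train_data.subtext.
texts = ["Experienced in Python and SQL", "Line one\nknows C++ well"]
subtexts = ["Python", "C++"]

train_data_list = []
for text, subtext in zip(texts, subtexts):
    # re.escape treats the subtext as plain text (assumption: no regex intended).
    match = re.search(re.escape(subtext), text)
    if match is not None:
        entry = (text, {"entities": [(match.start(), match.end(), "SKILL")]})
        train_data_list.append(entry)

for text, annotations in train_data_list:
    start, end, label = annotations["entities"][0]
    print(repr(text[start:end]), label)
```

Note that the second text contains a real newline and the second subtext contains regex metacharacters; neither is a problem here because nothing is round-tripped through a string of Python source.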

Now, if values of train_data.text[i] contain a \n (that is, the literal two-character sequence '\' followed by 'n', not an actual newline), there may be additional work required to turn those into newline characters .. but one step at a time. And no step should be backward! :D
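If that extra step does turn out to be needed, a minimal sketch could look like the following. The helper name unescape_newlines is made up, and it only handles the backslash-n case, nothing else.

```python
def unescape_newlines(text):
    # Replace the two-character sequence backslash + 'n' with a real newline.
    return text.replace("\\n", "\n")


raw = r"knows Python\nand SQL"   # raw string: contains a backslash and an 'n'
fixed = unescape_newlines(raw)

print(repr(raw))
print(repr(fixed))
```

Because the replacement shortens the text by one character per occurrence, it would have to be applied before calling re.search, so that the start and end offsets refer to the final text.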

Upvotes: 0

Vineeth Sai

Reputation: 3447

You cannot split a long statement across multiple lines just by pressing Enter. Either change your element = line to a single line like this

element = '("' +train_data.text[i] + '"' + ', {"entities": [(' + str(a.start()) + ',' + str(a.end()) + ',"SKILL")]})'

or add a \ at the end of the first line so that the statement continues on the next one

element = '("' +train_data.text[i] + '"' + ', {"entities": [(' + \
        str(a.start()) + ',' + str(a.end()) + ',"SKILL")]})'
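A third option, sketched here with made-up values for the text and offsets, is to wrap the whole expression in parentheses. Python continues a statement across lines inside any open bracket, so no backslash is needed.

```python
# Hypothetical stand-ins for train_data.text[i], a.start() and a.end().
text = "knows Python well"
start, end = 6, 12

# Line breaks are allowed anywhere inside the parentheses.
element = (
    '("' + text + '"' + ', {"entities": [('
    + str(start) + ',' + str(end) + ',"SKILL")]})'
)

print(element)
```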

Upvotes: 2
