Kena

Reputation: 43

Convert a txt document to a list starting with [[, instead of ['[

I have a .txt document with this type of text:

[(“Vazhdo”,”verb”),(“të”,”particle”),(“ecësh”,”verb”),(“!”,”excl.”)]

(which represents a sentence and the Parts of Speech tags for each word)

I want to have a list of lists in Python, like this:

[[(“Vazhdo”,”verb”),(“të”,”particle”),(“ecësh”,”verb”),(“!”,”excl.”)]]

But I obtain this:

['[(“Vazhdo”,”verb”),(“të”,”particle”),(“ecësh”,”verb”),(“!”,”excl.”)]\n']

The code I'm using is:

import io
f=io.open("test.txt", mode="r", encoding="utf-8-sig")
f_list = list(f)

How can I avoid the ['[ .... ]\n'] ?

Thank you!

Upvotes: 1

Views: 53

Answers (3)

Joran Beasley

Reputation: 114088

At first it looked like you could just do:

import json
data = json.load(open('test.txt'))

That answer was wrong, sorry: [("word","QQ")] is NOT valid JSON, because JSON does not support tuples.

Instead, you should be able to do:

import ast
import io

# read the whole file, then parse it as a Python literal
with io.open("test.txt", mode="r", encoding="utf-8-sig") as f:
    data = ast.literal_eval(f.read())
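To illustrate why json fails where ast.literal_eval works, here is a small sketch. The sample string is the one from the note above; the variable names (text, json_ok) are only illustrative.

```python
import ast
import json

# A list holding one tuple: valid Python literal syntax, not valid JSON.
text = '[("word","QQ")]'

try:
    json.loads(text)
    json_ok = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    json_ok = False

# literal_eval only accepts literals (lists, tuples, strings, numbers, ...),
# so it parses the parentheses as a tuple without executing any code.
data = ast.literal_eval(text)

print(json_ok)
print(data)
```

JSON only has arrays and objects, so the parentheses are a syntax error for the JSON parser, while Python reads them as a tuple.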

here is my version

import ast
import io
import requests

#text file available at
text_url = "https://gist.githubusercontent.com/joranbeasley/a50d940d9ac47e8458f027d3cc88e236/raw/3a65169d30e653e085284de16b1ee715f3596c95/example.txt"
with open("example.txt","wb") as f:
    # download and save textfile
    f.write(requests.get(text_url).content)

data = ast.literal_eval(io.open('example.txt',encoding='utf8').read())
print(data)
print(data[0])
print(data[0][0])

results in

[('Vazhdo', 'verb'), ('të', 'particle'), ('ecësh', 'verb'), ('!', 'excl.')]
('Vazhdo', 'verb')
Vazhdo
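If the file holds one sentence per line and the goal is the [[...]] shape from the question, a per-line variant might look like the sketch below. It assumes the file uses straight ASCII quotes (typographic quotes such as “ ” are not valid Python string delimiters, so literal_eval would reject them); the helper name read_tagged_sentences and the temporary test.txt are made up for the demo.

```python
import ast
import io
import os
import tempfile


def read_tagged_sentences(path):
    """Parse every non-empty line of the file as a Python literal."""
    sentences = []
    with io.open(path, mode="r", encoding="utf-8-sig") as f:
        for line in f:
            line = line.strip()  # drop the trailing newline
            if line:
                sentences.append(ast.literal_eval(line))
    return sentences


# Demo: write a one-line sample file, then read it back.
sample = '[("Vazhdo","verb"),("të","particle"),("ecësh","verb"),("!","excl.")]\n'
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with io.open(path, mode="w", encoding="utf-8") as f:
        f.write(sample)
    sentences = read_tagged_sentences(path)

print(sentences)
```

Each line becomes one inner list, so a file with several sentences gives one outer list with one entry per sentence.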

Upvotes: 3

Aditya Singh

Reputation: 41

io.open() returns a file object, and list(f) turns it into a list of strings, one per line. To get a list of lists instead of a list of strings, you'll need to evaluate each line of the .txt file.

Here's how you can accomplish that:

temp = ['[("Vazhdo","verb"),("të","particle"),("ecësh","verb"),("!","excl.")]\n']
f_list = []
for i in temp:
  f_list.append(eval(i.strip()))

print(f_list)

#[[('Vazhdo', 'verb'), ('të', 'particle'), ('ecësh', 'verb'), ('!', 'excl.')]]


# OR, as a list comprehension over the raw lines:

f_list = [eval(line.strip()) for line in temp]

Upvotes: 1

user17285516

Reputation:

You can remove the trailing newline with the rstrip method, like:

f_list[0] = f_list[0].rstrip()
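As a quick check of what rstrip does here, the sketch below uses a shortened sample line (an assumption, not the full file). It suggests that stripping removes the "\n" but leaves the element as a string, so a parsing step such as ast.literal_eval is still needed to get the nested list.

```python
import ast

# Shortened stand-in for list(f) from the question.
f_list = ['[("Vazhdo","verb"),("!","excl.")]\n']

f_list[0] = f_list[0].rstrip()            # removes the trailing newline only
still_string = isinstance(f_list[0], str)  # the element is still text

# Parsing each stripped line yields the list of lists.
parsed = [ast.literal_eval(item) for item in f_list]

print(still_string)
print(parsed)
```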

Upvotes: 0
