Reputation: 1006
I have a text file like this:
sample.txt
Hi I am student
I am from
What I've tried:
import string
import re

def read_to_list1(filepath):
    text_as_string = open(filepath, 'r').read()
    x = re.sub('['+string.punctuation+']', '', text_as_string).split("\n")
    for i in x:
        x_as_string = re.sub('['+string.punctuation+']', '', i).split()
        print(x_as_string)

read_to_list1('sample.txt')
This prints:
['Hi','I','am','student']
['I','am','from']
I want the result as:
[['Hi','I','am','student'],['I','am','from']]
Upvotes: 0
Views: 304
Reputation: 183
For the specific example sample.txt, this should also work:
import string
import re

def read_to_list1(filepath):
    text_as_string = open(filepath, 'r').read()
    x = re.sub('['+string.punctuation+']', '', text_as_string).split("\n")
    final_array=[]
    for i in x:
        x_as_string = re.sub('['+string.punctuation+']', '', i).split()
        final_array.append(x_as_string)
    return final_array

print(read_to_list1('sample.txt'))
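One thing to watch with split("\n"): if the file ends with a newline, the last element is an empty string, so the result picks up a trailing empty list. A minimal sketch that avoids this by using str.splitlines() and skipping lines with no words. The helper name lines_to_tokens is made up for illustration, and it takes the text directly rather than a file path:

```python
import re
import string

# Character class matching any single ASCII punctuation character.
# re.escape keeps characters such as ']' and '\' from being misread.
PUNCT_RE = re.compile('[' + re.escape(string.punctuation) + ']')

def lines_to_tokens(text):
    """Return one list of words per line of text (hypothetical helper)."""
    result = []
    for line in text.splitlines():
        words = PUNCT_RE.sub('', line).split()
        if words:  # skip blank or punctuation-only lines
            result.append(words)
    return result

print(lines_to_tokens("Hi, I am student.\nI am from\n"))
# [['Hi', 'I', 'am', 'student'], ['I', 'am', 'from']]
```

To use it with a file, pass in the result of reading the file, e.g. the text_as_string value from the function above.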
Upvotes: 1
Reputation: 117886
After opening the file, you can use a list comprehension to iterate over its lines and call str.split on each one.
With no arguments, str.split splits on whitespace, which gives the tokens for each sublist.
def read_to_list1(filepath):
    with open(filepath, 'r') as f_in:
        return [line.split() for line in f_in]
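Note that this keeps any punctuation attached to the words. If you also need it removed, as in the question, one possible variant deletes it from each line with str.translate before splitting. This is only a sketch, and the name read_to_list_no_punct is made up here:

```python
import string

# Translation table that deletes every ASCII punctuation character.
STRIP_PUNCT = str.maketrans('', '', string.punctuation)

def read_to_list_no_punct(filepath):
    """Hypothetical variant: one list of punctuation-free words per line."""
    with open(filepath, 'r') as f_in:
        return [line.translate(STRIP_PUNCT).split() for line in f_in]
```

A blank line in the file still produces an empty sublist with this version.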
Upvotes: 1