Reputation: 2387
The problem with reading the contents of a file into a list is that each line is stored as one long string. Students need to be able to work with the data read in from the file, for example to isolate the ID number and return the matching student.
I am aware of several ways this could be done (for instance regular expressions, or converting to a string and using the split method), but for teaching purposes I would be interested in the easiest, most elegant method (by elegant, I mean avoiding multiple and unnecessary steps). Ideally, is there a way to read the text file directly into a list in the required format?
For instance,
instead of the current format (which also includes \n that I would need to strip):
['001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n', '002,Ash,Smith,Test1:22,Test2:63,Test3:99\n']
Required format: Either a 1d or 2d list like below
[['001','Joe','Bloggs','Test1:99','Test2:100','Test3:33'],['002','Ash','Smith','Test1:22','Test2:63','Test3:99']]
I'd be happy for people to post solutions using regex and string splitting, as it will help others, but is there a simpler way to do this?
Full code listing with text file (also on repl.it online):
Code:
f = open("studentinfo.txt","r")
myList = []
for line in f:
    myList.append(line)
print(myList)
print()
print()
print(myList[0])
myList.split(",")
print(myList)
#split the list where all the individual elements in the current string (in the list) are split up at the ","
Text file:
001,Joe,Bloggs,Test1:99,Test2:100,Test3:33
002,Ash,Smith,Test1:22,Test2:63,Test3:99
Upvotes: 1
Views: 2960
Reputation: 140138
Once the list is built (or directly with the file handle as l; there's no need to store the list first), I would just use rstrip and split in a list comprehension like this:
l = ['001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n', '002,Ash,Smith,Test1:22,Test2:63,Test3:99\n']
newl = [v.rstrip().split(",") for v in l]
print(newl)
result:
[['001', 'Joe', 'Bloggs', 'Test1:99', 'Test2:100', 'Test3:33'], ['002', 'Ash', 'Smith', 'Test1:22', 'Test2:63', 'Test3:99']]
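The same comprehension can be applied to the open file itself, since iterating over a file handle yields one line at a time. A minimal sketch, assuming the file layout from the question; io.StringIO stands in for open("studentinfo.txt", "r") so the snippet runs on its own, and the names sample and rows are only illustrative:

```python
import io

# Assumption: this text mirrors studentinfo.txt from the question.
sample = (
    "001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n"
    "002,Ash,Smith,Test1:22,Test2:63,Test3:99\n"
)

# io.StringIO behaves like a text file handle; swap in
# open("studentinfo.txt", "r") to read the real file.
with io.StringIO(sample) as f:
    # Each iteration yields one line, trailing newline included.
    rows = [line.rstrip().split(",") for line in f]

print(rows)
```

This skips the intermediate list of raw lines entirely.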
For a flat list, do a double loop instead (or use itertools.chain.from_iterable; there are a lot of ways to do that):
newl = [x for v in l for x in v.rstrip().split(",")]
without listcomp (just for "readability" when you're not used to listcomps, after that, switch to listcomps :)):
newl = []
for v in l:
    newl.append(v.rstrip().split(","))
(use extend instead of append to get a flat list)
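To make the difference between append and extend concrete, here is a small side-by-side sketch. The shortened input lines and the names nested and flat are just for illustration:

```python
# Shortened sample lines, for illustration only.
l = ['001,Joe,Bloggs\n', '002,Ash,Smith\n']

nested = []
flat = []
for v in l:
    fields = v.rstrip().split(",")
    nested.append(fields)  # adds the whole list as a single element
    flat.extend(fields)    # adds each field as its own element

print(nested)
print(flat)
```

append gives one sub-list per student, while extend merges every field into a single list.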
Of course, I always forget to mention the csv module, which uses a comma as its default separator and strips the newlines:
import csv
newl = list(csv.reader(l))
flat (using itertools this time):
import itertools
newl = list(itertools.chain.from_iterable(csv.reader(l)))
(l can be a file handle or a list of lines for the csv module)
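As a quick check of that last point, the sketch below feeds csv.reader once with a list of lines and once with a file-like object. io.StringIO is used here as a stand-in for a real file handle, and the variable names are illustrative:

```python
import csv
import io

lines = [
    '001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n',
    '002,Ash,Smith,Test1:22,Test2:63,Test3:99\n',
]

# csv.reader accepts any iterable that yields one string per line.
from_list = list(csv.reader(lines))

# A file-like object (stand-in for an open file) works the same way.
from_handle = list(csv.reader(io.StringIO("".join(lines))))

print(from_list == from_handle)
```

Both calls produce the same list of lists, with the trailing newlines already removed.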
Upvotes: 4
Reputation: 148860
That is a good use case for the csv module:
import csv
with open("studentinfo.txt","r") as f:
    rd = csv.reader(f)
    lst = list(rd)  # lst is a list of lists in expected format
...  # further processing on lst
Alternatively, it is trivial to process the file line by line:
with open("studentinfo.txt","r") as f:
    rd = csv.reader(f)
    for row in rd:  # row is list of fields
        ...  # further processing on row
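As one possible shape for the "further processing" step, the sketch below builds a lookup keyed by ID so a student can be returned from an ID number, as the question asks. The helper load_students and the dictionary layout are hypothetical, not part of the csv module, and io.StringIO stands in for the real file so the example is self-contained:

```python
import csv
import io

# Assumption: this text mirrors studentinfo.txt from the question.
sample = (
    "001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n"
    "002,Ash,Smith,Test1:22,Test2:63,Test3:99\n"
)

def load_students(f):
    """Hypothetical helper: map each ID to that student's details."""
    students = {}
    for row in csv.reader(f):
        if not row:  # csv.reader yields [] for blank lines
            continue
        student_id, first, last, *tests = row
        scores = {}
        for item in tests:
            # "Test1:99" -> name "Test1", value "99"
            name, _, value = item.partition(":")
            scores[name] = int(value)
        students[student_id] = {"first": first, "last": last, "scores": scores}
    return students

students = load_students(io.StringIO(sample))
print(students["002"]["first"], students["002"]["scores"]["Test3"])
```

With the real file, the call would be made inside with open("studentinfo.txt", "r") as f: instead of on the StringIO object.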
Upvotes: 2