Reputation: 13
I am reading from a file that has the following in it.
87965164,Paris,Yu,6/27/1997
87965219,Heath,Moss,10/13/1996
87965187,Cale,Blankenship,10/22/1995
87965220,Terrence,Watkins,12/7/1996
87965172,Ansley,Padilla,3/30/1997
I need to split each line at the "," and the "/", and also remove the "\n" from the end.
I want my output to look like this when put into a list:
[['87965164', 'Paris', 'Yu', 6, 27, 1997], ['87965219', 'Heath', 'Moss', 10, 13, 1996], ['87965187', 'Cale', 'Blankenship', 10, 22, 1995], ['87965220', 'Terrence', 'Watkins', 12, 7, 1996], ['87965172', 'Ansley', 'Padilla', 3, 30, 1997]]
Upvotes: 0
Views: 210
Reputation: 21585
Simpler than regex:
[line.replace('/', ',').split(',') for line in text.split('\n')]
You can convert the numeric fields to ints afterwards.
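A minimal sketch of that conversion step, assuming the file contents are already in a string named text (as in the one-liner above) and that the last three fields are always month, day, and year. The name rows is only illustrative:

```python
# Two sample lines from the question, standing in for the real file contents.
text = "87965164,Paris,Yu,6/27/1997\n87965219,Heath,Moss,10/13/1996\n"

rows = []
for line in text.splitlines():  # splitlines() also drops the trailing "\n"
    fields = line.replace('/', ',').split(',')
    # Keep id, first and last name as strings; convert month, day, year to int.
    rows.append(fields[:3] + [int(x) for x in fields[3:]])

print(rows)
# [['87965164', 'Paris', 'Yu', 6, 27, 1997], ['87965219', 'Heath', 'Moss', 10, 13, 1996]]
```

Using splitlines() instead of split('\n') avoids an empty trailing entry when the text ends with a newline.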
However, I believe this is the wrong way to go about it. A better approach is to split on commas only, then give special fields such as the date their own dedicated treatment.
from datetime import datetime
from collections import namedtuple

Person = namedtuple('Person', ['idn', 'first', 'last', 'birth'])

def make_person(idn, first, last, birth):
    # Parse the birth date into a real datetime instead of three loose ints.
    return Person(idn, first, last,
                  datetime.strptime(birth, "%m/%d/%Y"))

records = [make_person(*line.split(',')) for line in text.split('\n')]
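As a rough illustration of why this pays off, the sketch below repeats the definitions so it runs on its own and assumes text holds two sample lines from the question. Fields are then accessed by name, and the dates compare like dates:

```python
from collections import namedtuple
from datetime import datetime

Person = namedtuple('Person', ['idn', 'first', 'last', 'birth'])

def make_person(idn, first, last, birth):
    return Person(idn, first, last, datetime.strptime(birth, "%m/%d/%Y"))

# Sample data from the question; in practice this would come from the file.
text = "87965164,Paris,Yu,6/27/1997\n87965219,Heath,Moss,10/13/1996"
records = [make_person(*line.split(',')) for line in text.splitlines()]

first_record = records[0]
print(first_record.first, first_record.birth.year)  # Paris 1997

# datetime values sort chronologically, so finding the oldest person is direct.
oldest = min(records, key=lambda p: p.birth)
print(oldest.first)  # Heath
```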
Upvotes: 1
Reputation: 49320
Rather than storing heterogeneous data in a homogeneous data type, I'd recommend using dictionaries or creating a class.
With dictionaries:
results = {}
with open('in.txt') as f:
    for line in f:
        id, first, last, day = line.split(',')
        month, day, year = map(int, day.split('/'))
        results[id] = {'id': id, 'first': first, 'last': last,
                       'month': month, 'day': day, 'year': year}
With a class:
class Person:
    def __init__(self, id, first, last, day):
        self.id = id
        self.first = first
        self.last = last
        self.month, self.day, self.year = map(int, day.split('/'))

results = {}
with open('in.txt') as f:
    for line in f:
        id, first, last, day = line.split(',')
        results[id] = Person(id, first, last, day)
Note that in each case I am storing each person's info as an entry in a dictionary, with a key of what looks like their ID number.
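A short sketch of what that keyed storage buys you, using the dictionary variant. Here io.StringIO stands in for open('in.txt') so the example is self-contained, and the names idn and date are my own, chosen to avoid shadowing the built-in id and reusing day:

```python
import io

# Stand-in for the real file; the two lines are from the question's sample.
f = io.StringIO("87965164,Paris,Yu,6/27/1997\n87965219,Heath,Moss,10/13/1996\n")

results = {}
for line in f:
    idn, first, last, date = line.strip().split(',')
    month, day, year = map(int, date.split('/'))
    results[idn] = {'id': idn, 'first': first, 'last': last,
                    'month': month, 'day': day, 'year': year}

# Look a person up directly by ID instead of scanning a list.
person = results['87965219']
print(person['first'], person['year'])  # Heath 1996
```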
Upvotes: 1
Reputation: 36013
For each line:
parts = line.split(',')
parts[-1:] = map(int, parts[-1].split('/'))
This will correctly handle input that has any slashes in the non-date parts, and easily handles the conversion to integers at the same time.
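Put into a full loop, the two lines above could look like the following sketch. It assumes the date is always the last comma-separated field, and uses io.StringIO in place of the real file; rows is an illustrative name:

```python
import io

# Stand-in for the open file; sample lines taken from the question.
f = io.StringIO("87965187,Cale,Blankenship,10/22/1995\n87965172,Ansley,Padilla,3/30/1997\n")

rows = []
for line in f:
    parts = line.rstrip('\n').split(',')
    # Replace the single date string with its three integer components.
    parts[-1:] = map(int, parts[-1].split('/'))
    rows.append(parts)

print(rows)
# [['87965187', 'Cale', 'Blankenship', 10, 22, 1995], ['87965172', 'Ansley', 'Padilla', 3, 30, 1997]]
```

The slice assignment parts[-1:] = ... swaps one element for three in place, which is why no separate concatenation step is needed.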
Upvotes: 0
Reputation: 81988
You're going to want regular expressions.
import re

results = []
# fl is the open file object
for line in fl:
    # [,/] means "match if either a , or a / is present"
    results.append(re.split(r'[,/]', line.strip()))
If you have a particularly big file, you can wrap it in a generator:
import re

def splitter(fl):
    for line in fl:
        # By using a generator, you are only accessing one line of the file at a time.
        yield re.split(r'[,/]', line.strip())
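Note that re.split returns every field as a string, so the month, day, and year still need converting to match the output asked for in the question. A possible way to consume the generator, with io.StringIO standing in for the open file and assuming the last three fields are always the date parts:

```python
import io
import re

def splitter(fl):
    for line in fl:
        yield re.split(r'[,/]', line.strip())

# Stand-in for the open file; sample lines taken from the question.
fl = io.StringIO("87965164,Paris,Yu,6/27/1997\n87965220,Terrence,Watkins,12/7/1996\n")

# Keep the first three fields as strings and convert the date parts to int.
rows = [fields[:3] + [int(x) for x in fields[3:]] for fields in splitter(fl)]

print(rows)
# [['87965164', 'Paris', 'Yu', 6, 27, 1997], ['87965220', 'Terrence', 'Watkins', 12, 7, 1996]]
```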
Upvotes: 2