DST

Reputation: 71

Python split() issue on whitespace, can someone explain?

AA  vowel
AE  vowel
AH  vowel
AO  vowel
AW  vowel
AY  vowel
B   stop
CH  affricate
D   stop
DH  fricative
EH  vowel
ER  vowel
EY  vowel
F   fricative
G   stop
HH  aspirate
IH  vowel
IY  vowel
JH  affricate
K   stop
L   liquid
M   nasal
N   nasal
NG  nasal
OW  vowel
OY  vowel
P   stop
R   liquid
S   fricative
SH  fricative
T   stop
TH  fricative
UH  vowel
UW  vowel
V   fricative
W   semivowel
Y   semivowel
Z   fricative
ZH  fricative

This is the content of a file. I read it line by line and parse each line. The problem is that when I use line.split(), or even re.split(r'\t+', line) (the whitespace between the columns looks like a tab), I end up with individual characters instead of the two fields. Help please, I don't understand where I am going wrong.

Code for the split:

try:
    datafile = open(filename,'r')
except IOError:
    print('Could not open ' + filename)
    sys.exit()
    pass

stypes = {}

for line in datafile.readlines():
    if line:
        re.split(r'\t+', line)
        phone = line[0]
        type = line[1]
    print(line[0] + ' ' + line[1] + ' ' + line[2])

Upvotes: 1

Views: 92

Answers (1)

Mike Müller

Reputation: 85482

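You are printing the original line, not the list with the split results. re.split() returns a new list and leaves line unchanged, so line[0] and line[1] are still the first two characters of the string. Assign the result to a name and use that instead. This should work better: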

with open('mywords.txt') as fobj:
    for line in fobj:
        res = line.split()
        print(res)

Output:

['AA', 'vowel']
['AE', 'vowel']
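
To make the difference concrete, here is a minimal sketch that fills the stypes dictionary from the question. It assumes every non-blank line holds exactly two whitespace-separated fields; the sample string and the parse_phones helper are hypothetical stand-ins for the real file and are not part of the original code.

```python
import re

# Hypothetical sample standing in for the file's contents
# (spaces on most lines, a tab on the last one, plus a blank line).
sample = "AA  vowel\nAE  vowel\nB   stop\n\nCH\taffricate\n"


def parse_phones(lines):
    """Map each phone to its type, skipping blank or malformed lines."""
    stypes = {}
    for line in lines:
        fields = line.split()  # no argument: split on any run of whitespace
        if len(fields) != 2:
            continue  # blank line or unexpected number of columns
        phone, kind = fields
        stypes[phone] = kind
    return stypes


stypes = parse_phones(sample.splitlines())
print(stypes)

# Indexing the string itself yields characters, not fields.
line = "AA  vowel\n"
print(line[0], line[1])

# A tab-only pattern does not split a line that is separated by spaces.
print(re.split(r'\t+', line))
```

Calling split() with no argument handles both spaces and tabs and also drops the trailing newline, which is why it is usually the safer choice when you are not sure which whitespace the file uses.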

The with statement opens the file and closes it automatically as soon as execution leaves the indented block below with, whether the block finishes normally or raises an exception. In other words, fobj is only open inside that block. The file object returned by open() acts as a context manager here, and the indented lines below with are its context.

Example:

with open('mywords.txt') as fobj:
    print('closed', fobj.closed)
print('closed', fobj.closed)

Output:

closed False
closed True
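
The file is also closed when the block is left because of an error. A minimal sketch of that case follows; it writes a throwaway temporary file so it does not depend on mywords.txt, and the ValueError is simulated purely for illustration.

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on 'mywords.txt'.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('AA  vowel\n')
    path = tmp.name

try:
    with open(path) as fobj:
        # Simulated failure inside the with block.
        raise ValueError('simulated parsing error')
except ValueError:
    pass

# The with statement closed the file even though the block raised.
closed_after_error = fobj.closed
print('closed', closed_after_error)

os.remove(path)  # clean up the temporary file
```

This is the main advantage over a bare open(): there is no need to remember a matching close() on every exit path.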

Upvotes: 4
