Reputation: 2897
I have an input file which consists of these lines:
['Some Name__________2.0 2.0 1.3\n', 'Some Name__________1.0 9.0 1.0\n', # and so on....]
I have formatted it with readlines, to this:
['Some Name', '', '', '', '2.0 2.0 1.3\n']
['Another Name', '', '', '', '1.0 9.0 1.0\n']
['Another Name', '', '', '', '1.0 9.0 1.0\n']
# and so on
What I want to do is print the names beneath each other while getting rid of the _ signs.
This is my code:
def openFile():
    fileFolder = open('TEXTFILE', 'r')
    readFile = fileFolder.readlines()
    for line in readFile:
        line = line.split("_")
        personNames = line[0]
        print personNames
openFile()
So what I get now, is:
Some Name
Another Name
Another Name
That is cool, but I want to go further, and that is where I am getting stuck. What I want to do now is get rid of the empty strings ("") and print the numbers right beside the names I've already formatted.
I thought that I could just do this:
for line in readFile:
    line = line.split("_")
    get_rid_of_spaces = line.split() #getting rid of spaces too
    personNames = line[0]
But this gives me this error:
AttributeError: 'list' object has no attribute 'split'
How can I do this? I want to learn this.
I also tried incrementing the index number, but this failed and I read it's not the best way to do this, so now I am going this way.
Besides that, I'd expect line[1] to give me the empty strings, but it doesn't.
What am I missing here?
Upvotes: 2
Views: 3514
Reputation: 8927
Use a list comprehension to remove the empty strings.
for line in read_file:
    tokens = [x for x in line.split("_") if x != ""]
    person_name = tokens[0]
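As a rough sketch of how this could be extended to print the numbers next to each name: after filtering, the first token is the name and the last token holds the numbers. The `parse_line` helper and the sample `lines` list are illustrative names, not part of the original code, and the example uses the `print()` function so that it runs on Python 3.

```python
def parse_line(line):
    """Split 'Name____1.0 2.0 3.0\\n' into (name, list of number strings)."""
    # Drop the empty strings produced by consecutive underscores.
    tokens = [x for x in line.split("_") if x != ""]
    person_name = tokens[0]
    # str.split() with no argument splits on any whitespace,
    # which also discards the trailing newline.
    numbers = tokens[-1].split()
    return person_name, numbers


# Sample data modelled on the question (assumed, not read from a file).
lines = ['Some Name__________2.0 2.0 1.3\n',
         'Another Name__________1.0 9.0 1.0\n']

for line in lines:
    name, numbers = parse_line(line)
    print(name, " ".join(numbers))
```

Keeping the numbers as a list makes it easy to convert them later, for example with `float(n)`, if they are needed for calculations.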
Upvotes: 2
Reputation: 14955
Just use re.split to take advantage of a multi-character delimiter:
>>> import re
>>>
>>> line = 'Some Name__________2.0 2.0 1.3\n'
>>> re.split(r'_+', line)
['Some Name', '2.0 2.0 1.3\n']
Example in a for loop:
>>> lines = ['Some Name__________2.0 2.0 1.3\n', 'Some Name__________1.0 9.0 1.0\n']
>>> for dat in [re.split(r'_+|\n', line) for line in lines]:
...     person = dat[0]
...     id = dat[1]
...     print person, id
...
Some Name 2.0 2.0 1.3
Some Name 1.0 9.0 1.0
Upvotes: 4
Reputation: 12037
The output of str.split is a list, and a list doesn't have a split method; that's why you get that error.
You can instead do:
with open('yourfile') as f:
    for line in f:
        split = line.split('_')
        # strip() removes the trailing newline from the numbers
        name, number = split[0], split[-1].strip()
        print '{}-{}'.format(number, name)
Several things to note:
1) Don't use camel case for variable and function names; PEP 8 recommends lowercase names with underscores
2) Use context managers for files, aka the with statement; it closes the file properly even if something fails
3) Pay attention to this line: for line in f:. It iterates through the file one line at a time, so the whole file is never held in memory
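The three notes above can be put together in a small sketch. The function name `print_names_and_numbers` and the throwaway temporary file are assumptions made only so the example is self-contained; it uses the `print()` function so that it runs on Python 3.

```python
import os
import tempfile


def print_names_and_numbers(path):
    """Print each name followed by its numbers and return the parsed rows."""
    rows = []
    # The with statement closes the file even if an exception is raised.
    with open(path) as f:
        # Iterating over the file object reads one line at a time.
        for line in f:
            parts = line.split('_')
            # First piece is the name; last piece holds the numbers.
            name, numbers = parts[0], parts[-1].strip()
            rows.append((name, numbers))
            print('{} {}'.format(name, numbers))
    return rows


# Build a small temporary input file (hypothetical data from the question).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('Some Name__________2.0 2.0 1.3\n')
    tmp.write('Another Name__________1.0 9.0 1.0\n')

rows = print_names_and_numbers(tmp.name)
os.remove(tmp.name)  # clean up the temporary file
```

Returning the rows as well as printing them is a design choice for the sketch; it keeps the parsing result available for further processing.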
Upvotes: 1
Reputation: 859
readfile = ['Some name____2.0 2.1 1.3', 'Some other name_____2.2 3.4 1.1']
data = []
for line in readfile:
    first_split = [part for part in line.split('_') if part != '']
    data.append([first_split[0], first_split[1].split(' ')])
print(data)
I think this does what you wanted if I understood you correctly. It prints out:
[['Some name', ['2.0', '2.1', '1.3']], ['Some other name', ['2.2', '3.4', '1.1']]]
Upvotes: 0
Reputation: 11134
>>> a =['Some Name__________2.0 2.0 1.3\n', 'Some Name__________1.0 9.0 1.0\n']
>>> import re
>>> [re.search(r'_+(.+)$', i.rstrip()).group(1) for i in a]
['2.0 2.0 1.3', '1.0 9.0 1.0']
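The expression above keeps only the numbers. If the name is needed as well, one possible variation is to capture both parts with named groups. The pattern and the group names `name` and `numbers` are my own illustration, assuming the name itself never contains an underscore; the example uses the `print()` function so that it runs on Python 3.

```python
import re

# name: everything before the first underscore
# numbers: everything after the run of underscores
pattern = re.compile(r'^(?P<name>[^_]+)_+(?P<numbers>.+)$')

a = ['Some Name__________2.0 2.0 1.3\n', 'Some Name__________1.0 9.0 1.0\n']

results = []
for i in a:
    match = pattern.match(i.rstrip())
    if match:  # skip lines that do not fit the expected layout
        results.append((match.group('name'), match.group('numbers')))

for name, numbers in results:
    print(name, numbers)
```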
Upvotes: 1
Reputation: 11486
You could do something like this:
for line in readFile:
    line = line.split("_")
    line = filter(bool, line)
This will remove all the empty strings from the line list.
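A short sketch of the same idea on a single sample line, written for Python 3: there `filter` returns a lazy iterator rather than a list, so it is wrapped in `list(...)` before indexing. The variable names `parts`, `name` and `numbers` are illustrative only.

```python
line = 'Some Name__________2.0 2.0 1.3\n'

# filter(bool, ...) keeps only truthy items, so empty strings are dropped.
# list(...) is needed on Python 3, where filter returns an iterator.
parts = list(filter(bool, line.split("_")))

name = parts[0]
numbers = parts[1].strip()  # remove the trailing newline
print(name, numbers)
```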
Upvotes: 1