Reputation: 179
I have a data file with entries that look like this:
6->26:32
10->39:30
26->28:24
3->16:19
10->35:35
10->37:19
10->31:36
10->33:32
This is how I was trying to read them into a list, but it doesn't work for double-digit numbers.
import sys, re
data = []
for line in sys.stdin.readlines():
    data.append(line.strip())
for i in range(len(data)):
    cleandata = re.findall(r"[\w']", data[i])
    print(cleandata)
The output I get is this:
['6', '2', '6', '3', '2']
['1', '0', '3', '9', '3', '0']
['2', '6', '2', '8', '2', '4']
['3', '1', '6', '1', '9']
['1', '0', '3', '5', '3', '5']
['1', '0', '3', '7', '1', '9']
['1', '0', '3', '1', '3', '6']
['1', '0', '3', '3', '3', '2']
What I want is:
[6, 26, 32]
[10, 39, 30]
[26, 28, 24]...etc
Any suggestions?
Upvotes: 3
Views: 60
Reputation: 305
I assume that the variable data is a list of strings:
data = ["6->26:32","10->39:30","26->28:24","3->16:19","10->35:35","10->37:19","10->31:36","10->33:32"]
If all entries are non-negative integers, this code may help:
import re
for line in data:
    entries = re.split(r"[^0-9]+", line)
    print(entries)
[^0-9]+ is a regex pattern that matches one or more consecutive non-digit characters, so re.split treats "->" and ":" as separators and drops them. The output that I get is:
['6', '26', '32']
['10', '39', '30']
... etc
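One caveat with the split approach: if a line starts or ends with a non-digit character (a trailing newline, for example), re.split returns an empty string at that end. A minimal sketch of filtering those out and converting to integers, where split_ints is a hypothetical helper name and not part of the answer above:

```python
import re

def split_ints(line):
    # Hypothetical helper: split on runs of non-digits, then drop the
    # empty strings that appear when the line starts or ends with a
    # separator (such as a trailing newline).
    parts = re.split(r"[^0-9]+", line)
    return [int(p) for p in parts if p]

print(re.split(r"[^0-9]+", "6->26:32\n"))  # note the trailing '' entry
print(split_ints("6->26:32\n"))
```

Stripping the line first, as the question's code already does, avoids the empty entry for these particular inputs.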
Upvotes: 0
Reputation: 71580
In addition to @blhsing's answer, you can use [0-9]+ too:
cleandata = re.findall(r"[0-9]+", data[i])
If you want them as integers rather than strings:
print(list(map(int, cleandata)))
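The quantifier is what makes the difference here. A character class on its own matches exactly one character, which is why multi-digit numbers get broken apart; adding + makes it match a whole run of digits. A small sketch on one sample line from the question:

```python
import re

line = "10->39:30"

# A bare character class matches one digit at a time.
single = re.findall(r"[0-9]", line)

# With +, each match is a maximal run of consecutive digits.
grouped = re.findall(r"[0-9]+", line)

print(single)                   # ['1', '0', '3', '9', '3', '0']
print(grouped)                  # ['10', '39', '30']
print(list(map(int, grouped)))  # [10, 39, 30]
```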
Upvotes: 0
Reputation: 26315
Here's a basic approach using str.replace():
with open('data.txt') as file:
    for line in file:
        line = line.replace('->', ' ').replace(':', ' ')
        print(list(map(int, line.split())))
Which outputs:
[6, 26, 32]
[10, 39, 30]
[26, 28, 24]
[3, 16, 19]
[10, 35, 35]
[10, 37, 19]
[10, 31, 36]
[10, 33, 32]
You can also use re.split():
from re import split
with open('data.txt') as file:
    for line in file:
        print(list(map(int, split('->|:', line.strip()))))
Upvotes: 0
Reputation: 106553
You can use the following regex instead:
cleandata = re.findall(r"\d+", data[i])
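This works because \d+ matches one or more consecutive digits, whereas the original [\w'] class matches a single character per hit. A rough end-to-end sketch that also converts the matches to integers, assuming input lines shaped like the ones in the question; parse_lines and sample are illustrative names only:

```python
import re

def parse_lines(lines):
    """Return one list of ints per line that contains any digits."""
    rows = []
    for line in lines:
        # \d+ grabs each maximal run of digits, so "26" stays together.
        numbers = re.findall(r"\d+", line)
        if numbers:  # skip blank lines
            rows.append([int(n) for n in numbers])
    return rows

# Illustrative data copied from the question.
sample = ["6->26:32\n", "10->39:30\n", "26->28:24\n"]
for row in parse_lines(sample):
    print(row)
```

Since file objects iterate line by line, passing sys.stdin to parse_lines in place of sample should work the same way.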
Upvotes: 4