user

Reputation: 97

How to extract numbers between square brackets?

I have a text file containing text like

index of cluster is 18585 points index are [18585, 14290, 18503, 7220, 6835, 10009,6615, 1269, 14161, 26545, 18140, 9292, 20355, 16401, 7713, 582, 1865, 17247, 26256, 19034, 7282, 1847, 19293, 16944, 27748, 29312,.... ] 
index of the cluster is 3014 points index are [ ....] and so on .. 

I need to extract the numbers between "[" and "]" for every cluster in a single file. I tried checking whether a line contains "[" and then getting the numbers, but it didn't work:

import os 
f = open("cluster.txt","r")
for line in f.readlines():
    if "[" in line:
        print("true")

Upvotes: 0

Views: 1078

Answers (3)

Moiap13

Reputation: 19

You could alternatively do the following:

lst = []

with open("cluster.txt", "r") as f:
    for line in f:
        # Take the text between "[" and "]", split on commas, convert to int
        lst += list(map(int, line.split("[")[1].split("]")[0].split(",")))

print(lst)

lst collects the numbers from every line of the file. map converts the extracted strings to integers; its result is then turned into a list and appended to the main list.

Upvotes: -1

Xenty

Reputation: 71

You can do something like this:

with open("cluster.txt", "r") as f:
    for line in f:
        # Keep only the text between "[" and "]"
        numbers_only = line.split('[')[1].split(']')[0]
        list_of_number_strings = numbers_only.split(',')
        # Convert each piece to an integer
        list_of_numbers = [int(number) for number in list_of_number_strings]

After each iteration, list_of_numbers holds the integers from the current line. The code first isolates the part between [ and ], then splits it on commas and converts each piece to an integer. This assumes every line contains a list; if some lines have a different format, you would need additional logic for those cases.
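As a rough sketch of what that additional logic could look like: the helper below (parse_bracketed_ints is a made-up name, not something from the question) returns None for lines without a bracketed list and an empty list for "[ ]". It assumes everything between the brackets is a comma-separated integer; any other text there would still raise a ValueError.

```python
def parse_bracketed_ints(line):
    """Return the integers between the first '[' and the next ']', or None."""
    start = line.find("[")
    end = line.find("]", start + 1)
    if start == -1 or end == -1:
        return None  # no bracketed list on this line
    inner = line[start + 1:end]
    # Skip empty pieces so "[ ]" or a trailing comma does not break int()
    return [int(piece) for piece in inner.split(",") if piece.strip()]


# Sample lines modelled on the question's format (shortened, illustrative only)
lines = [
    "index of cluster is 18585 points index are [18585, 14290, 10009,6615]",
    "a line without any list",
    "index of the cluster is 3014 points index are [ ]",
]
for line in lines:
    print(parse_bracketed_ints(line))
```

Returning None rather than an empty list lets the caller tell "no list on this line" apart from "an empty list".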

Upvotes: 2

Adon Bilivit

Reputation: 27043

For each line in the file you can use a regular expression to identify the data within the brackets. Then you can split the resulting string and use a list comprehension (or a map as shown here) to give you a list of all the numbers.

For example:

import re
line = '''index of cluster is 18585 points index are [18585, 14290, 18503, 7220, 6835, 10009,6615, 1269, 14161, 26545, 18140, 9292, 20355, 16401, 7713, 582, 1865, 17247, 26256, 19034, 7282, 1847, 19293, 16944, 27748, 29312]'''
a = re.findall(r'\[(.*?)\]', line)
if a:
    nums = list(map(int, a[0].split(',')))
    print(nums)

Output:

[18585, 14290, 18503, 7220, 6835, 10009, 6615, 1269, 14161, 26545, 18140, 9292, 20355, 16401, 7713, 582, 1865, 17247, 26256, 19034, 7282, 1847, 19293, 16944, 27748, 29312]
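To apply the same idea to every cluster in a file, something along these lines might work. It is only a sketch: clusters_from_lines and BRACKETS are illustrative names, each bracketed list is assumed to sit on a single line, and the numbers are assumed to be non-negative integers (\d+ ignores a minus sign).

```python
import re

# Non-greedy match of whatever sits between "[" and the next "]"
BRACKETS = re.compile(r"\[(.*?)\]")


def clusters_from_lines(lines):
    """Return one list of ints per bracketed group found in the lines."""
    clusters = []
    for line in lines:
        for group in BRACKETS.findall(line):
            # \d+ picks out each run of digits, so spacing does not matter
            clusters.append([int(n) for n in re.findall(r"\d+", group)])
    return clusters


# Sample lines modelled on the question's format (shortened, illustrative only)
sample = [
    "index of cluster is 18585 points index are [18585, 14290, 10009,6615]",
    "index of the cluster is 3014 points index are [3014, 7, 42]",
]
print(clusters_from_lines(sample))
```

Since a file object yields its lines when iterated, the same function can be fed the open file directly, e.g. with open("cluster.txt") as f: clusters = clusters_from_lines(f).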

Upvotes: 2
