Reputation: 119
I have a sequence of numbers that I obtain from an external source, not separated by commas and not in a data structure, e.g.: 1 1.5 120202.4343 58 -2442.5
Distinct numbers are separated by one or more spaces.
Is there a way to write a program that quickly converts this sequence into a list or numpy array, [1, 1.5, 120202.4343, 58, -2442.5]?
Upvotes: 0
Views: 255
Reputation: 53029
I can't believe nobody came up with the obvious:
np.array(your_string.split(),dtype=float)
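As a minimal sketch of the one-liner above, assuming the data is already in a Python string (the helper name parse_numbers and the sample text are only illustrative):

```python
import numpy as np

def parse_numbers(text):
    """Hypothetical helper: whitespace-separated numbers -> 1-D float array."""
    # split() with no argument splits on any run of whitespace,
    # so repeated spaces or tabs do not produce empty tokens.
    return np.array(text.split(), dtype=float)

arr = parse_numbers("1 1.5   120202.4343 58\t-2442.5")
print(arr.tolist())
```

Passing dtype=float lets numpy parse each token while building the array, so no separate conversion step is needed.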
Upvotes: 1
Reputation: 8122
You can use split() to split on the spaces and get a list of strings. Then convert each string to a float by calling float() on it. This can be done in one line with a list comprehension.
For example:
x = '1 1.5 120202.4343 58 -2442.5'
output = [float(i) for i in x.split()]  # split() with no argument also handles repeated spaces
Output:
[1.0, 1.5, 120202.4343, 58.0, -2442.5]
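Since the question mentions that numbers may be separated by more than one space, here is a small sketch of why the choice of separator matters. The function name to_floats and the sample string are assumptions made for illustration:

```python
text = "1  1.5 120202.4343   58 -2442.5"  # note the repeated spaces

# Splitting on a single space leaves empty strings between repeated spaces,
# and float("") raises ValueError.
tokens_single_space = text.split(" ")

def to_floats(s):
    """Hypothetical helper: split on any whitespace run, then convert."""
    return [float(tok) for tok in s.split()]

try:
    [float(tok) for tok in tokens_single_space]
    failed = False
except ValueError:
    failed = True

values = to_floats(text)
print(failed, values)
```

Calling split() with no argument is the safer default whenever the spacing in the input is not guaranteed to be uniform.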
If your input numbers arrive one at a time, you can simply append each one to an existing list:
output = []
# Loop until an escape string is provided, appending each input number to the list
while True:
    x = input()  # Next input number
    if x == 'escape_string_of_your_choice':
        break
    output.append(float(x))  # Convert the string to a float before storing it
Likewise, if your sequence length is known in advance, you can initialize the list to a certain length and use indexing to assign the next input number in the sequence (you need a counter to keep track of where you are):
N = 5  # Length of the sequence (example value; set this to your own length)
counter = 0  # First index has value 0
output = [0.0] * N
# Now looping is better defined (no need to provide escape strings)
while counter < N:
    x = input()  # Next input number
    output[counter] = float(x)  # Convert the string to a float before storing it
    counter += 1  # Increment after element added to list
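The two loops above can also be written against any iterable of lines rather than input() directly, which makes them easier to try out without typing. This is only a sketch: read_numbers and the "done" sentinel are hypothetical names, not part of any library.

```python
def read_numbers(lines, sentinel="done"):
    """Collect floats from an iterable of lines until the sentinel appears."""
    numbers = []
    for line in lines:
        token = line.strip()
        if token == sentinel:
            break  # stop at the escape string
        if token:  # skip blank lines
            numbers.append(float(token))
    return numbers

# Demo with an in-memory list standing in for the external source
demo = ["1", "1.5", "", "-2442.5", "done", "99"]
result = read_numbers(demo)
print(result)  # [1.0, 1.5, -2442.5]
```

For interactive use, the same function should accept sys.stdin, since a file object yields one line per iteration.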
Lastly, comparing the list comprehension with the numpy astype approach given in some answers, the list comprehension ran considerably faster in this test on a short string:
import timeit
code_to_test = '''
import numpy as np
number_string = "1 1.5 120202.4343 58 -2442.5"
number_list = number_string.split(" ")
numbers_array = np.array(number_list).astype(float)'''  # np.float was removed in NumPy 1.24; the builtin float is equivalent
code_comp = '''
import numpy as np # Not needed but just to compare fairly
x = '1 1.5 120202.4343 58 -2442.5'
output = [float(i) for i in x.split(" ")]'''
test_time = timeit.timeit(code_to_test, number=10000)
compt_time = timeit.timeit(code_comp, number=10000)
print(test_time) # 0.6834872080944479
print(compt_time) # 0.028420873917639256
print(test_time/compt_time) # 24.048775209204436
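These timings will vary from run to run and from machine to machine. Note that the first figure also includes the one-time cost of importing numpy inside the timed snippet, so the ratio overstates the gap. For a string this short, though, the comprehension should still usually come out ahead.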
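A somewhat fairer comparison keeps the numpy import out of the timed statement by moving it into timeit's setup argument. The sketch below assumes the same short sample string and makes no claim about the resulting numbers, which depend on the machine and numpy version:

```python
import timeit

# Setup runs once per timing and is not included in the measurement
setup = '''
import numpy as np
number_string = "1 1.5 120202.4343 58 -2442.5"
'''

stmt_numpy = "np.array(number_string.split(), dtype=float)"
stmt_list = "[float(tok) for tok in number_string.split()]"

# Take the best of several repeats to reduce noise
t_numpy = min(timeit.repeat(stmt_numpy, setup=setup, number=10000, repeat=5))
t_list = min(timeit.repeat(stmt_list, setup=setup, number=10000, repeat=5))

print("numpy array:        ", t_numpy)
print("list comprehension: ", t_list)
print("ratio:              ", t_numpy / t_list)
```

It is worth rerunning this with a string of realistic length, since the relative cost of the two approaches may change as the input grows.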
Upvotes: 0
Reputation: 4586
>>> in_str = '1 1.5 120202.4343 58 -2442.5'
>>> list(map(float, in_str.split()))
[1.0, 1.5, 120202.4343, 58.0, -2442.5]
Upvotes: 2
Reputation: 95948
I'm not sure what you mean by "not in a data structure"; that doesn't make much sense. But assuming you have a string, numpy even provides a utility function for this:
>>> import numpy as np
>>> data = '1 1.5 120202.4343 58 -2442.5'
>>> np.fromstring(data, sep=' ')
array([ 1.00000000e+00, 1.50000000e+00, 1.20202434e+05, 5.80000000e+01,
-2.44250000e+03])
Upvotes: 2
Reputation: 202
As the other answers say, split() solves this problem once you have the data as a string. It is also worth showing how to get that string from a file:
with open(filename, 'r') as fil:
    f = fil.read().split()
Here filename is a variable holding the path to your external source file. The file's contents are read and split on whitespace into a list of strings saved as f, and each string still needs to be converted with float().
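Putting the file-reading step together with the conversion to floats might look like the following sketch. The helper name read_floats is hypothetical, and a temporary file stands in for the real external source:

```python
import os
import tempfile

def read_floats(path):
    """Read a text file of whitespace-separated numbers into a list of floats."""
    with open(path, "r") as handle:
        # read() returns the whole file; split() handles spaces and newlines alike
        return [float(tok) for tok in handle.read().split()]

# Write a small sample file so the example is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 1.5 120202.4343\n58 -2442.5\n")
    path = tmp.name

values = read_floats(path)
os.remove(path)  # clean up the sample file
print(values)  # [1.0, 1.5, 120202.4343, 58.0, -2442.5]
```

Because split() treats newlines as whitespace, the numbers may be spread over several lines in the file.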
Upvotes: 1
Reputation: 1864
You're on the right track. You can load the numbers as a single string, then split the string on whitespace. This will give you a list of strings:
number_string = "1 1.5 120202.4343 58 -2442.5"
number_list = number_string.split()
Then you can convert that list of strings into a numpy array of floats using astype:
import numpy as np
numbers_array = np.array(number_list).astype(float)  # np.float was removed in NumPy 1.24
Upvotes: 0