A B

Reputation: 305

convert string to 2d numpy array

I am trying to convert 'b' (a string in which the column entries are separated by one delimiter and the rows are separated by another delimiter) to 'a' (a 2d numpy array), like:

b='191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'
a=numpy.array([[191.25,0,0,1],[191.251,0,0,1],[191.252,0,0,1]])

The way I do it is (using my knowledge that there are 4 columns in 'a'):

a=numpy.array(filter(None,re.split('[\n\t]+',b)),dtype=float).reshape(-1,4)

Is there a better way?

Upvotes: 3

Views: 15101

Answers (2)

Alex Riley

Reputation: 176978

Instead of splitting and filtering, you could use np.fromstring:

>>> np.fromstring(b, sep='\t').reshape(-1, 4)
array([[ 191.25 ,    0.   ,    0.   ,    1.   ],
       [ 191.251,    0.   ,    0.   ,    1.   ],
       [ 191.252,    0.   ,    0.   ,    1.   ]])

np.fromstring always returns a 1D array, so reshaping is necessary.
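If the number of columns is not known in advance, it can be read off the first row instead of being hard-coded. A minimal sketch, assuming every row has the same number of tab-separated fields (the name ncols is just illustrative):

```python
import numpy as np

b = '191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'

# Count the fields in the first row; assumes all rows have the same width.
first_row = b.split('\n', 1)[0]
ncols = first_row.count('\t') + 1

# Parse the whole string as a flat array, then reshape to (rows, ncols).
a = np.fromstring(b, sep='\t').reshape(-1, ncols)
```

If the rows are ragged, the reshape raises a ValueError whenever the total number of values is not a multiple of ncols, so malformed input is at least not silently misaligned in that case.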

Alternatively, to avoid reshaping: if you already have a string of bytes (as strings are in Python 2), you could pass it to np.genfromtxt with the help of the standard library's io module:

>>> import io
>>> np.genfromtxt(io.BytesIO(b))
array([[ 191.25 ,    0.   ,    0.   ,    1.   ],
       [ 191.251,    0.   ,    0.   ,    1.   ],
       [ 191.252,    0.   ,    0.   ,    1.   ]])

genfromtxt also handles missing values and offers much more control over how the final array is created.
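On Python 3, where a plain string is text rather than bytes, the same idea should work by wrapping the string in io.StringIO instead. A minimal sketch, assuming a reasonably recent NumPy; passing delimiter='\t' is my addition so that only tabs split columns:

```python
import io

import numpy as np

b = '191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'

# Wrap the text in a file-like object so genfromtxt can read it line by line.
# The result is already 2D (one row per line), so no reshape is needed.
a = np.genfromtxt(io.StringIO(b), delimiter='\t')
```

If the data arrives as bytes instead, io.BytesIO(b) as shown above is the equivalent wrapper.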

Upvotes: 6

Josh Smith

Reputation: 89

Here's what I did to get the result you're looking for:

import numpy as np

b='191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'
a = np.array([[float(j) for j in i.split('\t')] for i in b.splitlines()])

Upvotes: 2
