convert 2d numpy array to string

Question

I am new to Python and am trying to convert a 2d numpy array, like:

a=numpy.array([[191.25,0,0,1],[191.251,0,0,1],[191.252,0,0,1]])

to a string in which the column entries are separated by one delimiter '\t' and the rows are separated by another delimiter '\n', with control over the precision of each column, to get:

b='191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'

First, I create the array by:

import numpy as np

col1=np.arange(191.25,196.275,.001)[:, np.newaxis]
nrows=col1.shape[0]

# np.int was removed in NumPy 1.24; the built-in int is equivalent here
col2=np.zeros((nrows,1),dtype=int)
col3=np.zeros((nrows,1),dtype=int)
col4=np.ones((nrows,1),dtype=int)

a=np.hstack((col1,col2,col3,col4))

Then I produce b, by one of 2 methods:

Method 1:

b=''
for i in range(0,a.shape[0]):
    for j in range(0,a.shape[1]-1):
        b+=str(a[i,j])+'\t'
    b+=str(a[i,-1])+'\n'
b

Method 2:

b=''
for i in range(0,a.shape[0]):
    b+='\t'.join(['%0.3f' %x for x in a[i,:]])+'\n'
b

However, I'm guessing there are better ways of producing a and b. I am looking for the most efficient ways (i.e. memory, time, code compactness) to create a and b.

Follow up questions

Thank you Mike,

b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)+'\n'

worked for me but I have a few follow up questions (this couldn't fit in the comment section):

Though this is more compact, is the speed the same as executing a nested for loop, as this is what seems to be going on within the parentheses?
I understand that y and x are iterators across the 2 dimensions of a; however, how does Python "know" they are, and which dimensions they are supposed to iterate across? In Matlab, for example, these things have to be explicitly stated.
Is there a way to independently set the precision for each column (e.g. I'd like %0.3f for the first three columns and %0.0f for the last column)?
Is there an easy way to do the reverse procedure, i.e. given b, produce a? I have come up with 2 methods:

Method 1

y=b.split('\n')[:-1]   # drop the empty string after the trailing newline
z=[row.split('\t') for row in y]
a=np.array(z,dtype=float)

Method 2

import re
# list() is needed on Python 3, where filter() returns an iterator
a=np.array(list(filter(None,re.split('[\n\t]+',b))),dtype=float).reshape(-1,4)

Is there a better way?

Mike Müller · Accepted Answer

Solution

A one-liner will do:

b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)

Using a simpler example:

>>> a = np.arange(25, dtype=float).reshape(5, 5)
>>> a
array([[  0.,   1.,   2.,   3.,   4.],
       [  5.,   6.,   7.,   8.,   9.],
       [ 10.,  11.,  12.,  13.,  14.],
       [ 15.,  16.,  17.,  18.,  19.],
       [ 20.,  21.,  22.,  23.,  24.]])

This:

b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)
print(b)

prints this:

0.000   1.000   2.000   3.000   4.000
5.000   6.000   7.000   8.000   9.000
10.000  11.000  12.000  13.000  14.000
15.000  16.000  17.000  18.000  19.000
20.000  21.000  22.000  23.000  24.000
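The follow-up question about per-column precision can be handled with the same pattern. The following is only a sketch: the `fmts` list is an illustrative name and choice of formats, not part of the original answer. Each row is zipped with one format string per column:

```python
import numpy as np

a = np.array([[191.25, 0, 0, 1],
              [191.251, 0, 0, 1],
              [191.252, 0, 0, 1]])

# One format per column (illustrative values taken from the follow-up question).
fmts = ['%0.3f', '%0.3f', '%0.3f', '%0.0f']

# zip pairs each value in a row with the format for its column.
b = '\n'.join('\t'.join(f % x for f, x in zip(fmts, y)) for y in a)
print(b)
```

Because `zip` stops at the shorter input, `fmts` should have exactly as many entries as the array has columns.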

Explanation

You already used a list comprehension in your second method. Here we have a generator expression, which looks almost exactly like a list comprehension. The only syntactic difference is that the [] are replaced by (), and the parentheses can be omitted when the expression is the only argument of a call, as it is here. A generator expression does not build a list but hands a so-called generator to join, which pulls the items from it one by one. In the end it has the same effect, without you building the intermediate list yourself.
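A small side-by-side comparison may make the difference concrete. This is a minimal sketch with made-up values; `from_list` and `from_gen` are illustrative names:

```python
values = [0.0, 1.5, 2.25]

# List comprehension: the full list of strings exists before join() runs.
from_list = '\t'.join(['%0.3f' % x for x in values])

# Generator expression: the strings are produced on demand as join() asks for them.
from_gen = '\t'.join('%0.3f' % x for x in values)

print(from_list == from_gen)
```

Both forms give the same string, so the choice is mostly about readability. For `join` specifically, the memory difference is likely to be small in CPython, since `join` has to see every item before it can build the result.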

Such expressions can be nested, with one generator expression inside another, as is the case here. This:

b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)

is equivalent to:

res = []
for y in a:
    res.append('	'.join('%0.3f' %x for x in y))
b = '\n'.join(res)
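As for how Python "knows" which dimension to iterate over: iterating over a NumPy array always walks its first axis. For a 2-D array, the outer loop therefore yields rows, and iterating over a row yields its scalar elements. A minimal sketch (the `cells` list is only there to make the order visible):

```python
import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)

cells = []
for y in a:        # y is a[0], then a[1]: each a 1-D row of shape (3,)
    for x in y:    # x is one scalar element of that row
        cells.append('%0.3f' % x)

print(cells)
```

To iterate over columns instead, one option is to loop over the transpose, `a.T`.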

Performance

I use %%timeit in the IPython Notebook:

%%timeit
b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)

10 loops, best of 3: 42.4 ms per loop


%%timeit
b=''
for i in range(0,a.shape[0]):
    for j in range(0,a.shape[1]-1):
        b+=str(a[i,j])+'\t'
    b+=str(a[i,-1])+'\n'

10 loops, best of 3: 50.2 ms per loop


%%timeit
b=''
for i in range(0,a.shape[0]):
    b+='\t'.join(['%0.3f' %x for x in a[i,:]])+'\n'

10 loops, best of 3: 43.8 ms per loop

Looks like they are all about the same speed. This is because CPython optimizes += on strings; otherwise it would be much slower than the join() approach. Other Python implementations such as Jython or PyPy can show much bigger time differences, with join() much faster than +=.
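As an alternative to hand-written joins, NumPy's own text I/O functions can be pointed at an in-memory buffer. The sketch below is not part of the timings above and has not been benchmarked against them. It assumes a reasonably recent NumPy on Python 3 and uses `np.savetxt` with one format per column, then `np.loadtxt` for the reverse direction:

```python
import io
import numpy as np

a = np.array([[191.25, 0, 0, 1],
              [191.251, 0, 0, 1]])

# Array -> string: write into an in-memory text buffer instead of a file.
buf = io.StringIO()
np.savetxt(buf, a, fmt=['%0.3f', '%0.2f', '%d', '%d'], delimiter='\t')
b = buf.getvalue()

# String -> array: read the same text back as floats.
a_back = np.loadtxt(io.StringIO(b), delimiter='\t')

print(repr(b))
print(a_back.shape)
```

`savetxt` ends every row, including the last, with a newline, which matches the string layout asked for in the question.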
