Reputation: 23938
How can I convert an ndarray to a matrix in numpy? I'm trying to import data from a csv and turn it into a matrix.
from numpy import array, matrix, recfromcsv
my_vars = ['docid','coderid','answer1','answer2']
toy_data = matrix( array( recfromcsv('toy_data.csv', names=True)[my_vars] ) )
print toy_data
print toy_data.shape
But I get this:
[[(1, 1, 3, 3) (1, 2, 4, 1) (1, 3, 7, 2) (2, 1, 3, 3) (2, 2, 4, 4)
(2, 4, 3, 1) (3, 1, 3, 3) (3, 2, 4, 3) (3, 3, 3, 4) (4, 4, 5, 1)
(4, 5, 6, 2) (4, 2, 4, 3) (5, 2, 5, 4) (5, 3, 3, 1) (5, 4, 7, 2)
(6, 1, 3, 3) (6, 5, 4, 1) (6, 2, 5, 2)]]
(1, 18)
What do I have to do to get a 4 by 18 matrix out of this code? There's got to be an easy answer to this question, but I just can't find it.
Upvotes: 1
Views: 4703
Reputation: 879939
If the ultimate goal is to make a matrix, there's no need to create a recarray with named columns. You could use np.loadtxt to load the csv into an ndarray, then use np.asmatrix to convert it to a matrix:
import numpy as np
toy_data = np.asmatrix(np.loadtxt('toy_data.csv', delimiter=',', skiprows=1))
print(toy_data)
print(toy_data.shape)
yields
[[ 1. 1. 3. 3.]
[ 1. 2. 4. 1.]
[ 1. 3. 7. 2.]
[ 2. 1. 3. 3.]
[ 2. 2. 4. 4.]
[ 2. 4. 3. 1.]
[ 3. 1. 3. 3.]
[ 3. 2. 4. 3.]
[ 3. 3. 3. 4.]
[ 4. 4. 5. 1.]
[ 4. 5. 6. 2.]
[ 4. 2. 4. 3.]
[ 5. 2. 5. 4.]
[ 5. 3. 3. 1.]
[ 5. 4. 7. 2.]
[ 6. 1. 3. 3.]
[ 6. 5. 4. 1.]
[ 6. 2. 5. 2.]]
(18, 4)
Note: the skiprows argument is used to skip over the header in the csv.
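The question asks for a 4 by 18 matrix, while the result above is 18 by 4 (one row per csv record). If one row per column is really what's wanted, transposing should do it. Here is a minimal sketch of that; the in-memory `csv_text` is a made-up stand-in for the first three rows of toy_data.csv, so the shapes are 3 by 4 and 4 by 3 rather than 18 by 4 and 4 by 18:

```python
import io
import numpy as np

# Hypothetical stand-in for toy_data.csv: header plus the first three records.
csv_text = u"docid,coderid,answer1,answer2\n1,1,3,3\n1,2,4,1\n1,3,7,2\n"

# skiprows=1 drops the header line; the remaining fields are parsed as floats.
data = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1)

rows_by_cols = np.asmatrix(data)  # one row per csv record
cols_by_rows = rows_by_cols.T     # transposed: one row per csv column

print(rows_by_cols.shape)
print(cols_by_rows.shape)
```

With the real file, `cols_by_rows` would have one row for each of docid, coderid, answer1 and answer2.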
Upvotes: 5
Reputation: 43024
You can just read all your values into a vector, then reshape it.
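import numpy

def _ReadCSV(fileobj):
    # Yield every comma-separated field of the remaining lines as a float.
    for line in fileobj:
        for el in line.split(","):
            yield float(el)

fo = open("toy_data.csv")
# The first line holds the column names; its length gives the column count.
header = [name.strip() for name in fo.readline().split(",")]
a = numpy.fromiter(_ReadCSV(fo), numpy.float64)
fo.close()
# -1 lets numpy infer the number of rows from the total element count.
a.shape = (-1, len(header))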
But there may be an even more direct way with newer numpy.
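One candidate for that more direct way is np.genfromtxt, which parses the file and skips the header in a single call. This is only a sketch and not necessarily what was meant above; the in-memory `csv_text` is a made-up two-row stand-in for toy_data.csv:

```python
import io
import numpy as np

# Hypothetical stand-in for toy_data.csv: header plus two records.
csv_text = u"docid,coderid,answer1,answer2\n1,1,3,3\n1,2,4,1\n"

# skip_header=1 ignores the column-name line. Because names is left unset,
# the result is a plain 2-D float array, not a structured array.
a = np.genfromtxt(io.StringIO(csv_text), delimiter=',', skip_header=1)

print(a.shape)
print(a.dtype)
```

No manual reshape is needed here, since the array already comes back with one row per record.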
Upvotes: 0