Stefano Borini

Reputation: 143935

Matrices in Python

Yesterday I had the need for a matrix type in Python.

The obvious answer would be to use numpy.matrix(), but I also need the matrix to store arbitrary values of mixed types, as a list does. numpy.matrix does not support this. For example:

>>> numpy.matrix([[1,2,3],[4,"5",6]])
matrix([['1', '2', '3'],
        ['4', '5', '6']], 
       dtype='|S4')
>>> numpy.matrix([[1,2,3],[4,5,6]])
matrix([[1, 2, 3],
        [4, 5, 6]])

As you can see, a numpy.matrix must be homogeneous in content. If a string is present in the initializer, every value is implicitly stored as a string. Accessing individual values confirms this:

>>> numpy.matrix([[1,2,3],[4,"5",6]])[1,1]
'5'
>>> numpy.matrix([[1,2,3],[4,"5",6]])[1,2]
'6'

The Python list type, by contrast, accepts mixed types: a list can contain an integer and a string, each keeping its own type. What I need is something like a list, but with matrix-like behavior.

Therefore, I had to implement my own type. I had two choices for the internal implementation: a list of lists, or a dictionary. Both solutions have shortcomings:

Edit: clarification. The concrete reason I need this functionality is that I am reading CSV files. Once I have collected the values from a CSV file (values that can be strings, integers, or floats), I would like to perform swapping, removal, insertion and similar operations. For this reason I need a "matrix list".
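To make the list-of-lists option concrete, here is a minimal sketch. The class name ListMatrix and all of its methods are hypothetical, invented only for illustration; it is not the implementation I ended up with, just one way the swap, insert and remove operations could look while every cell keeps its own Python type.

```python
class ListMatrix:
    """Hypothetical list-of-lists matrix; each cell keeps its Python type."""

    def __init__(self, rows):
        rows = [list(r) for r in rows]
        # Reject ragged input so that column operations stay well defined.
        if rows and any(len(r) != len(rows[0]) for r in rows):
            raise ValueError("all rows must have the same length")
        self._rows = rows

    def __getitem__(self, index):
        r, c = index              # allows m[row, col]
        return self._rows[r][c]

    def __setitem__(self, index, value):
        r, c = index
        self._rows[r][c] = value

    def swap_rows(self, i, j):
        self._rows[i], self._rows[j] = self._rows[j], self._rows[i]

    def swap_columns(self, i, j):
        for row in self._rows:
            row[i], row[j] = row[j], row[i]

    def insert_column(self, index, values):
        values = list(values)
        if len(values) != len(self._rows):
            raise ValueError("need exactly one value per row")
        for row, value in zip(self._rows, values):
            row.insert(index, value)

    def remove_column(self, index):
        # Returns the removed column as a list.
        return [row.pop(index) for row in self._rows]

    def to_lists(self):
        return [list(row) for row in self._rows]


m = ListMatrix([[1, 2, 3], [4, "5", 6.0]])
m.swap_columns(0, 2)
m.insert_column(1, ["a", "b"])
```

Row operations are cheap with this layout, while every column operation has to touch each row, which is one of the trade-offs of choosing lists of lists.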

My curiosities are:

Upvotes: 6

Views: 59144

Answers (6)

ifmihai

Reputation: 89

Maybe it's a late answer, but why not use pandas?
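A rough sketch of what that could look like, assuming pandas is installed. The data is made up, and passing dtype=object is my own choice here so that the cells are stored as Python objects instead of being coerced per column.

```python
import pandas as pd

# Mixed-type table; dtype=object keeps the cells as Python objects.
df = pd.DataFrame([[1, 2, 3], [4, "5", 6.0]], dtype=object)

cell = df.iat[1, 1]                    # positional access to one cell
reordered = df[[2, 1, 0]]              # new frame with columns 0 and 2 swapped
without_first_row = df.drop(index=0)   # new frame without row 0
```

Both column selection and drop return new frames, so df itself is left unchanged.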

Upvotes: 2

ivan

Reputation: 348

Check out sympy: it handles polymorphism in its matrices quite well, and sympy.matrices.Matrix objects provide operations such as col_swap, col_insert, and col_del.

In [2]: import sympy as s 
In [6]: import numpy as np

In [11]: npM = np.array([[1,2,3.0], [4,4,"abc"]], dtype=object)
In [12]: npM
Out[12]: 
 [[1 2 3.0]
 [4 4 abc]]

In [14]: type( npM[0][0] )
Out[14]: <type 'int'>
In [15]: type( npM[0][2] )
Out[15]: <type 'float'>
In [16]: type( npM[1][2] )
Out[16]: <type 'str'>


In [17]: M = s.matrices.Matrix(npM)
In [18]: M
Out[18]: 
⎡1  2  3.0⎤
⎢         ⎥
⎣4  4  abc⎦


In [27]: type( M[0,2] )
Out[27]: 
In [28]: type( M[1,2] )
Out[28]: 

In [29]: sym= M[1,2] 
In [32]: print sym.name
abc

In [34]: sym.n
Out[34]: 
In [40]: sym.n(subs={'abc':45} )
Out[40]: 45.0000000000000

Upvotes: 1

Autoplectic

Reputation: 7676

You can have inhomogeneous types if your dtype is object:

In [1]: m = numpy.matrix([[1, 2, 3], [4, '5', 6]], dtype=object)
In [2]: m
Out[2]: 
matrix([[1, 2, 3],
        [4, 5, 6]], dtype=object)
In [3]: m[1, 1]
Out[3]: '5'
In [4]: m[1, 2]
Out[4]: 6

I have no idea what good this does you other than fancy indexing, because, as Don pointed out, you can't do math with this matrix.
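For what it's worth, here is a small sketch of the kind of fancy indexing I mean. I am using a plain numpy.array instead of numpy.matrix, which is my own substitution, and the values are made up.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, "5", 6]], dtype=object)

a[[0, 1]] = a[[1, 0]]          # swap rows 0 and 1 in place
a[:, [0, 2]] = a[:, [2, 0]]    # swap columns 0 and 2 in place

b = np.delete(a, 1, axis=1)                # copy without column 1
c = np.insert(a, 1, ["x", "y"], axis=1)    # copy with a new column at index 1
```

The two swaps modify a in place, while np.delete and np.insert return new arrays.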

Upvotes: 11

saffsd

Reputation: 24322

Have you considered the csv module for working with CSV files?

Python docs for csv module
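A minimal sketch of reading mixed values with it. The sample data and the parse_cell helper are hypothetical; csv.reader itself always yields strings, so any type conversion is up to you.

```python
import csv
import io


def parse_cell(text):
    """Hypothetical helper: try int, then float, otherwise keep the string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


# In-memory stand-in for a real file opened with open(path, newline="").
sample = io.StringIO("name,count,ratio\nfoo,3,0.5\nbar,7,1.25\n")
rows = [[parse_cell(cell) for cell in row] for row in csv.reader(sample)]
```

The result is an ordinary list of lists, so the usual list operations (insert, pop, slicing) apply to the rows directly.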

Upvotes: 0

Vicki Laidler

Reputation:

Have you looked at the numpy.recarray capabilities?

For instance here: http://docs.scipy.org/doc/numpy/reference/generated/numpy.recarray.html

It's designed to allow arrays with mixed datatypes.

I don't know if an array will suit your purposes, or if you really need a matrix - I haven't worked with the numpy matrices. But if an array is good enough, recarray might work.
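A short sketch of the idea, with made-up field names and data. Note the assumption baked into this approach: the type is fixed per field (column), not per individual cell.

```python
import numpy as np

# Structured array with one type per field, viewed as a recarray
# so that fields are available as attributes.
records = np.array(
    [(1, "abc", 2.5), (4, "de", 6.0)],
    dtype=[("id", "i4"), ("label", "U8"), ("value", "f8")],
).view(np.recarray)

labels = records.label         # whole "label" column
second_value = records[1].value  # one field of one record
```

This fits CSV data well when each column has a single type, and less well when types vary from cell to cell within a column.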

Upvotes: 3

Don Werve

Reputation: 5120

I'm curious why you want this functionality; as I understand it, the reason for having matrices (in numpy) is primarily for doing linear math (matrix transformations and so on).

I'm not sure what the mathematical definition would be for the product of a decimal and a string.

Internally, you'll probably want to look at sparse matrix implementations (http://www.inf.ethz.ch/personal/arbenz/pycon03_contrib.pdf). There are lots of ways to do this (hash, list, linked list), and each has its own advantages and drawbacks. If your matrix isn't going to have a lot of nulls or zeroes, then you can ditch the sparse implementations.

Upvotes: 5
