Reputation: 397
I am building a class that iterates over incoming data. The data come in as plain arrays (lists), without any NumPy objects, and in my code I often use .append to build another array. At some point I changed one of the big arrays (1000x2000) to numpy.array, and now I get error after error. I started to convert all of the arrays into ndarray, but methods like .append no longer work, I have problems addressing rows, columns or cells, and I have to rebuild all of my code.
I tried to google an answer to the question "what is the advantage of using ndarray over a normal array?", but I can't find a sensible answer. Can you explain when I should start using ndarrays, and whether in your own practice you use both of them or stick to one only?
Sorry if the question is novice level, but I am new to Python, just trying to move from Matlab, and I want to understand the pros and cons. Thanks.
Upvotes: 5
Views: 2600
Reputation: 7242
Another great advantage of NumPy arrays over built-in lists is that NumPy has a C API which allows native C and C++ code to access NumPy arrays directly. Hence, many Python libraries whose internals are written in low-level languages like C expect you to work with NumPy arrays instead of Python lists.
Reference: Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
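As a quick illustration (a sketch only; the printed addresses are simply whatever the interpreter happens to allocate), you can see from pure Python that an ndarray is backed by a single contiguous C buffer, which is what C extensions access through the NumPy C API:
import numpy as np
a = np.arange(6, dtype=np.float64)
# The array's data live in one contiguous C buffer; C/C++ extensions read
# this buffer directly instead of unboxing Python objects one by one.
print(a.ctypes.data)                  # address of the underlying buffer
print(a.__array_interface__['data'])  # (address, read-only flag)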
Upvotes: 1
Reputation: 94495
NumPy and Python arrays share the property of being efficiently stored in memory.
NumPy arrays can be added together or multiplied by a number; you can calculate, say, the sine of all their values in one function call; and so on. As HYRY pointed out, they can also have more than one dimension. You cannot do any of this with Python arrays.
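For example (a minimal sketch with throwaway values):
import numpy as np
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[10.0, 20.0], [30.0, 40.0]])
print(a + b)      # element-wise addition of two 2-D arrays
print(2 * a)      # multiplication by a number
print(np.sin(a))  # sine of every value in a single call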
On the other hand, Python arrays can indeed be appended to. Note that NumPy arrays can, however, be concatenated together (hstack(), vstack(), …). That said, NumPy arrays are mostly meant to have a fixed number of elements.
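For instance (a small sketch with throwaway arrays):
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.hstack([a, b]))  # [1 2 3 4 5 6]
print(np.vstack([a, b]))  # a 2x3 array with a and b as its rows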
It is common to first build a list (or a Python array) of values iteratively and then convert it to a NumPy array (with numpy.array(), or, more efficiently, with numpy.frombuffer(), as HYRY mentioned): this allows mathematical operations on arrays (or matrices) to be performed very conveniently (simple syntax for complex operations). Alternatively, numpy.fromiter() can be used to construct the array from an iterator, or loadtxt() to construct it from a text file.
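As an illustration (a minimal sketch; the squares are just placeholder values for whatever is being collected):
import numpy as np
values = []                  # grow a plain Python list first...
for i in range(5):
    values.append(i ** 2)
arr = np.array(values)       # ...then convert once the list is complete
# or build the array directly from an iterator
arr2 = np.fromiter((i ** 2 for i in range(5)), dtype=float)
print(arr)   # [ 0  1  4  9 16]
print(arr2)  # [ 0.  1.  4.  9. 16.]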
Upvotes: 8
Reputation: 97301
array.array can change size dynamically, so if you are collecting data from some source, it's better to use array.array. But array.array is only one-dimensional, and there are no calculation functions for it. So, when you want to do some calculation on your data, convert it to numpy.ndarray and use the functions in NumPy.
numpy.frombuffer can create a numpy.ndarray that shares the same data buffer as an array.array object; it's fast because it doesn't need to copy the data.
Here is a demo:
import numpy as np
import array
import random
a = array.array("d")
# a loop that collects 10 channels of data per iteration
for x in range(100):
    a.extend([random.random() for _ in range(10)])

# create an ndarray that shares the same data buffer as array a, and reshape it to 2D
na = np.frombuffer(a, dtype=float).reshape(-1, 10)

# call a NumPy function to do the calculation
np.sum(na, axis=0)
Upvotes: 5
Reputation: 879691
There are at least two main reasons for using NumPy arrays: they require less memory than Python lists, and operations on whole arrays are much faster than the equivalent loops over lists.
Yes, you cannot simply convert lists to NumPy arrays and expect your code to keep working. The methods are different, the bool semantics are different. For the best performance, even the algorithm may need to change.
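For example, here is the difference in bool semantics mentioned above (a small sketch; the exact wording of the error message depends on the NumPy version):
import numpy as np
lst = [1, 2, 3]
arr = np.array(lst)
if lst:          # a non-empty list is simply truthy
    print("the list is truthy")
# bool(arr) raises ValueError, because the truth value of an array with
# more than one element is ambiguous; use arr.any() or arr.all() instead.
print(arr.any(), arr.all())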
However, if you are looking for a Python replacement for Matlab, you will definitely find uses for NumPy. It is worth learning.
Upvotes: 7