Canther

Reputation: 41

Is it possible to access a numpy array with strings?

I have a table like this:

[Image: Matrix]

And I would like to access it like this in Python:

table["Water"]["Rice"] = 3

Is this possible with numpy.ndarray?

Upvotes: 2

Views: 2459

Answers (1)

Valdi_Bo

Reputation: 30971

Assume that you created the source NumPy array as:

import numpy as np

arr = np.array([[1, 3, 1],
                [2, 5, 3],
                [5, 6, 7]])

"Ordinary" NumPy arrays, however, have no "symbolic" indices (names). They have only integer indices, starting from 0 in each dimension.

NumPy does offer so-called structured arrays, but they allow symbolic names only within a row. Such an array is composed of rows (indexed by an integer) and each row is a collection of named fields. You want symbolic names for both rows and columns, so NumPy structured arrays are not an option for you.
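To illustrate that limitation, here is a minimal sketch of a structured array. The field names and the variable names (rows, first_row_rice, rice_column) are chosen here only for illustration; they are not part of the question.

```python
import numpy as np

# Hypothetical field names, chosen to mirror the table's column names.
dtype = [("Water", "i8"), ("Rice", "i8"), ("Sauce", "i8")]
rows = np.array([(1, 3, 1), (2, 5, 3), (5, 6, 7)], dtype=dtype)

# A field (column) can be addressed by name,
# but a row can be addressed only by its integer position.
first_row_rice = rows[0]["Rice"]
rice_column = rows["Rice"]
```

Note that something like rows["Water"]["Rice"] would not work here, because the first index selects a field, not a named row.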

In my opinion, the only reasonable choice is to use pandas DataFrames, where you can have symbolic names for both columns and rows.

In Pandas:

  • index is a collection of "names" for each row,
  • columns is actually also an index, but it holds names of columns.
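As a quick check of the two points above, the sketch below builds a small placeholder DataFrame (filled with zeros, named df only for this example) and inspects both attributes.

```python
import numpy as np
import pandas as pd

names = ["Water", "Rice", "Sauce"]
# Placeholder data; only the labels matter for this illustration.
df = pd.DataFrame(np.zeros((3, 3), dtype=int), index=names, columns=names)

row_labels = list(df.index)        # names of the rows
column_labels = list(df.columns)   # names of the columns

# Both attributes are pandas Index objects.
both_are_index = isinstance(df.index, pd.Index) and isinstance(df.columns, pd.Index)
```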

To create a DataFrame from the above Numpy array, you can run e.g.:

import pandas as pd

# Names of columns and rows (the same)
names = ['Water', 'Rice', 'Sauce']
# Actual creation of a DataFrame
table = pd.DataFrame(arr, index=names, columns=names)

Its content is:

       Water  Rice  Sauce
Water      1     3      1
Rice       2     5      3
Sauce      5     6      7

To read an element from this DataFrame, you can use your code, i.e.:

table['Water']['Rice']

In the above code:

  • Water (the first index) is a column name,
  • Rice (the second index) is a row index (name),

so the value read is 2.

But a more idiomatic pandas way to access elements of a DataFrame is:

table.loc['Water', 'Rice']

This time however:

  • Water (the first index) is a row index,
  • Rice (the second index) is a column name,

so the value read is 3.
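The difference in index order can be seen side by side in the following self-contained sketch. It rebuilds the same table; the .at accessor is an addition not mentioned above, shown as a label-based alternative for single cells.

```python
import numpy as np
import pandas as pd

names = ["Water", "Rice", "Sauce"]
table = pd.DataFrame(np.array([[1, 3, 1],
                               [2, 5, 3],
                               [5, 6, 7]]),
                     index=names, columns=names)

chained = table["Water"]["Rice"]      # column first, then row
by_loc = table.loc["Water", "Rice"]   # row first, then column
by_at = table.at["Water", "Rice"]     # single-cell accessor, same order as .loc
```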

You can also save a new value in an indicated cell, e.g.:

table.loc['Water', 'Rice'] = 12

Now, when you print(table), the result is:

       Water  Rice  Sauce
Water      1    12      1
Rice       2     5      3
Sauce      5     6      7

If you perform some operations on this DataFrame (table) and then want to pass it to other code that expects a NumPy array (not a DataFrame), you can pass it as:

table.values

i.e. you pass the underlying NumPy array (table.to_numpy() is the equivalent that the pandas documentation currently recommends).

The elements of that array, however, can again be referenced only by integer indices.
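If the names are still needed at that point, one possible approach is to translate them into integer positions before indexing the array. The sketch below assumes the labels are unique and rebuilds the original table; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

names = ["Water", "Rice", "Sauce"]
table = pd.DataFrame(np.array([[1, 3, 1],
                               [2, 5, 3],
                               [5, 6, 7]]),
                     index=names, columns=names)

arr = table.to_numpy()

# Translate labels into integer positions (assumes unique labels).
row = table.index.get_loc("Water")
col = table.columns.get_loc("Rice")

value = arr[row, col]
```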

Upvotes: 7
