exsonic01

Reputation: 637

Multidimensional array indexing and column access

I have a 3-dimensional array like this:

[[[   1    4    4 ...,  952    0    0]
  [   2    4    4 ...,   33    0    0]
  [   3    4    4 ..., 1945    0    0]
  ..., 
  [4079    1    1 ...,    0    0    0]
  [4080    2    2 ...,    0    0    0]
  [4081    1    1 ...,    0    0    0]]

 [[   1    4    4 ...,  952    0    0]
  [   2    4    4 ...,   33    0    0]
  [   3    4    4 ..., 1945    0    0]
  ..., 
  [4079    1    1 ...,    0    0    0]
  [4080    2    2 ...,    0    0    0]
  [4081    1    1 ...,    0    0    0]]

  .....

 [[   1    4    4 ...,  952    0    0]
  [   2    4    4 ...,   33    0    0]
  [   3    4    4 ..., 1945    0    0]
  ..., 
  [4079    1    1 ...,    0    0    0]
  [4080    2    2 ...,    0    0    0]
  [4081    1    1 ...,    0    0    0]]]

This array has a total of 5 data blocks. Each data block has 4081 lines and 9 columns.

My question is about accessing columns block by block.
I want to index data blocks, lines, and columns, access the columns, and do some work on them with if statements inside loops. I know how to access a column in a 2D array, like:

column_1 = [row[0] for row in inputfile]

But how can I access the columns of each data block?

I tried the following (inputfile is the 3D array above):

for i in range(len(inputfile)):
    AAA[i] = [row[0] for row in inputfile]
    print AAA[2]

But it says "name 'AAA' is not defined". How can I access the column for each data block? Do I need to create [None] arrays first? Is there another way that avoids empty arrays?

Also, how can I access specific elements of those columns? For example, AAA[i][j] would be the j-th line of the first column in the i-th data block. Should I use one more for loop for line-wise access?

P.S. I also tried to analyze this 3D array like this:

for i in range(len(inputfile)):      ### number of datablock = 5
    for j in range(len(inputfile[i])):  ### number of lines per a datablock = 4081
        AAA = inputfile[i][j]        ### Store first column for each datablocks to AAA
        print AAA[0]                 ### Working as I intended to access 1st column. 
        print AAA[0][1]              ### Not working, invalid index to scalar variable. I can't access to the each elemnt. 

But this way I cannot access the individual elements of the first column, AAA[0]. How can I access each element here?

I thought maybe two indices were not enough, so I used three for loops:

for i in range(len(inputfile)):                ### number of datablock = 5
    for j in range(len(inputfile[i])):         ### number of lines per a datablock = 4081
        for k in range(len(inputfile[i][j])):  ### number of columns per line = 9
           AAA = inputfile[i][j][0]
           print AAA[0]

Still, I cannot access the individual elements of the first column; it says 'invalid index to scalar variable'. Also, AAA ends up holding each element nine times, like this:

>>> print AAA
1
1
1
1
1
1
1
1
1
2
2
...
4080
4080
4080
4081
4081
4081
4081
4081
4081
4081
4081
4081

Each element repeats 9 times, which is not what I want.

I want to use the indices in my analysis, and will also use an index as a value in it. I want to access the columns, and each element of them, with all indices in this 3D array. How can I do this?

Upvotes: 0

Views: 164

Answers (2)

eswald

Reputation: 8406

Unless you're using something like NumPy, Python doesn't have multi-dimensional arrays as such. Instead, the structure you've shown is a list of lists of lists of integers. (Your choice of inputfile as the variable name is confusing here; such a variable would usually contain a file handle, iterating over which would yield one string per line, but I digress...)

Unfortunately, I'm having trouble understanding exactly what you're trying to accomplish, but at one point, you seem to want a single list consisting of the first column of each row. That's as simple as:

column = [row[0] for block in inputfile for row in block]

Granted, this isn't really a column in the mathematical sense, but it may be what you want.
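If the goal is instead one first-column list per data block, so that AAA[i][j] is line j of block i, a nested comprehension is one way to sketch it. The small inputfile here is made-up sample data standing in for the real 5 x 4081 x 9 structure:

```python
# Hypothetical sample data: 2 blocks, 3 rows each, 3 columns per row.
inputfile = [
    [[1, 4, 4], [2, 4, 4], [3, 4, 4]],
    [[10, 5, 5], [20, 5, 5], [30, 5, 5]],
]

# One list per block, each holding the first column of that block.
AAA = [[row[0] for row in block] for block in inputfile]

print(AAA[1])     # first column of the second block
print(AAA[1][2])  # third line of that column
```

No empty placeholder list is needed here, because the comprehension builds the outer list and the inner lists in one step.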

Now, as to why your other attempts failed:

for i in range(len(inputfile)):
    AAA[i] = [row[0] for row in inputfile]
    print AAA[2]

As the error message states, AAA is not defined. Python doesn't let you assign to an index of an undefined variable, because it doesn't know whether that variable is supposed to be a list, a dict, or something more exotic. For lists in particular, it also doesn't let you assign to an index that doesn't yet exist; instead, the append or extend methods are used for that:

AAA = []
for i, block in enumerate(inputfile):
    for j, row in enumerate(block):
        AAA.append(row[0])
print AAA[2]

(However, that isn't quite as efficient as the list comprehension above.)
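As for the question about [None] lists: pre-sizing the outer list is another option, since assigning to an index that already exists is allowed. A minimal sketch, using made-up sample data in place of the real structure:

```python
# Hypothetical sample data: 2 blocks, 2 rows each.
inputfile = [
    [[1, 4, 4], [2, 4, 4]],
    [[7, 5, 5], [8, 5, 5]],
]

# Pre-size the outer list so that every index i already exists.
AAA = [None] * len(inputfile)
for i, block in enumerate(inputfile):
    AAA[i] = [row[0] for row in block]

print(AAA[1])  # first column of the second block
```

This keeps one list per block, so AAA[i][j] addresses line j of block i. Whether it reads better than append is largely a matter of taste.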

for i in range(len(inputfile)):    ### number of datablock = 5
    for j in range(len(inputfile)):     ### number of lines per a datablock = 4081
        AAA = inputfile[i][j]          ### Store first column for each datablocks to AAA
        print AAA[0]      ### Working as I intended to access 1st column. 
        print AAA[0][1]   ### Not working, invalid index to scalar variable. I can't access to the each elemnt. 

There's an obvious problem with the range in the second line, and an inefficiency in looking up inputfile[i] multiple times, but the real problem is in the last line. At this point, AAA refers to one of the rows of one of the blocks; for example, on the first time through, given your dataset above,

AAA == [   1    4    4 ...,  952    0    0]

It's a single list, with no references to the data structure as a whole. AAA[0] works to access the number in the first column, 1, because that's how lists operate. The second column of that row will be in AAA[1], and so on. But AAA[0][1] throws an error, because it's equivalent to (AAA[0])[1], which in this case is equal to (1)[1], but numbers can't be indexed. (What's the second element of the number 1?)
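The same failure can be reproduced in isolation. This sketch uses a made-up plain list in place of AAA; with plain lists the exception is a TypeError, and the exact message depends on the Python version and on the container type:

```python
row = [1, 4, 4, 952, 0, 0]  # made-up row standing in for AAA

first = row[0]   # the value in the first column
second = row[1]  # the value in the second column

try:
    row[0][1]    # same as (1)[1]: indexing into an integer
except TypeError as exc:
    error = str(exc)
    print(error)
```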

for i in range(len(inputfile)):    ### number of datablock = 5
    for j in range(len(inputfile[i])):     ### number of lines per a datablock = 4081
        for k in range(len(inputfile[i][j])):      ### number of columns per line = 9
           AAA = inputfile[i][j][0]
           print AAA[0]

This time, your for loops, though still inefficient, are at least correct, if you want to iterate over every number in the whole data structure. At the bottom, you'll find that inputfile[i][j][k] is integer k in row j in block i of the data structure. However, you're throwing out k entirely, and printing the first element of the row, once for each item in the row. (The fact that it's repeated exactly as many times as you have columns should have been a clue.) And once again, you can't index any further once you get to the integers; there is no inputfile[i][j][0][0].
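To make the difference concrete, here is a small sketch with made-up data. The first loop uses k and visits every element once; the second expression ignores k and reproduces the repetition seen in the question:

```python
# Hypothetical sample data: 2 blocks, 2 rows each, 2 columns per row.
inputfile = [
    [[1, 4], [2, 5]],
    [[3, 6], [4, 7]],
]

visited = []
for i, block in enumerate(inputfile):
    for j, row in enumerate(block):
        for k, num in enumerate(row):
            # num is the same value as inputfile[i][j][k]
            visited.append((i, j, k, num))

# Ignoring k and always reading index 0 repeats each first-column
# value once per column in the row.
repeated = [row[0] for block in inputfile for row in block for _ in row]

print(visited[:3])
print(repeated)
```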

Granted, once you get to an element, you can look at nearby elements by changing the indexes. For example, a three-dimensional cellular automaton might want to look at each of its neighbors. With proper corrections for the edges of the data, and checks to ensure that each block and row are the right length (Python won't do that for you), that might look something like:

for i, block in enumerate(inputfile):
    for j, row in enumerate(block):
        for k, num in enumerate(row):
            # Edge handling is omitted: an index of -1 wraps around to the
            # last item, and an index past the end raises IndexError.
            neighbors = sum([
                inputfile[i][j][k-1],
                inputfile[i][j][k+1],
                inputfile[i][j-1][k],
                inputfile[i][j+1][k],
                inputfile[i-1][j][k],
                inputfile[i+1][j][k],
            ])
            alive = 3 <= neighbors <= 4
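The loop above leaves the edges uncorrected. One possible way to handle them is sketched here, under the assumption that positions outside the data simply contribute nothing; neighbor_sum and grid are hypothetical names used only for illustration:

```python
def neighbor_sum(data, i, j, k):
    """Sum the up-to-six axis-aligned neighbors of data[i][j][k]."""
    total = 0
    for di, dj, dk in [(-1, 0, 0), (1, 0, 0), (0, -1, 0),
                       (0, 1, 0), (0, 0, -1), (0, 0, 1)]:
        ni, nj, nk = i + di, j + dj, k + dk
        # Skip positions outside the data; the lower bound also avoids
        # Python's negative-index wraparound.
        if (0 <= ni < len(data)
                and 0 <= nj < len(data[ni])
                and 0 <= nk < len(data[ni][nj])):
            total += data[ni][nj][nk]
    return total


# Made-up 2 x 2 x 2 grid of cells.
grid = [
    [[1, 0], [0, 1]],
    [[1, 1], [0, 0]],
]
print(neighbor_sum(grid, 0, 0, 0))
```

A wraparound (toroidal) grid would be an equally valid choice for a cellular automaton; in that case the bounds check would be replaced by taking each index modulo the corresponding length.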

Upvotes: 1

DevLounge

Reputation: 8437

A good practice is to leverage zip:

For example:

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> for i in a:
...  for j in b:
...   print i, b
... 
1 [4, 5, 6]
1 [4, 5, 6]
1 [4, 5, 6]
2 [4, 5, 6]
2 [4, 5, 6]
2 [4, 5, 6]
3 [4, 5, 6]
3 [4, 5, 6]
3 [4, 5, 6]
>>> for i,j in zip(a,b):
...  print i,j
... 
1 4
2 5
3 6
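Applied to the question, zip can also turn the rows of one data block into columns. A minimal sketch, assuming every row in the block has the same length (zip stops at the shortest row otherwise); the block here is made-up sample data:

```python
# Hypothetical data block: 3 rows, 3 columns.
block = [
    [1, 4, 952],
    [2, 4, 33],
    [3, 4, 1945],
]

# zip(*block) groups the items at the same position in every row,
# which yields one tuple per column.
columns = list(zip(*block))

print(columns[0])  # first column of the block
```

Doing this for each block, for example with [list(zip(*block)) for block in inputfile], would give every column of every block at once.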

Upvotes: 1
