Moritz

Reputation: 5408

numpy reshape and tile

I am trying to convert a vector containing the letters A-L into the layout below, using only pandas and numpy built-in functions (tile, repeat and reshape) and no loops, but I cannot wrap my head around it:

    0   1   2   3   4   5   6   7   8   9   10  11
0   A   A   A   A   E   E   E   E   I   I   I   I
1   B   B   B   B   F   F   F   F   J   J   J   J
2   C   C   C   C   G   G   G   G   K   K   K   K
3   D   D   D   D   H   H   H   H   L   L   L   L
4   A   A   A   A   E   E   E   E   I   I   I   I
5   B   B   B   B   F   F   F   F   J   J   J   J
6   C   C   C   C   G   G   G   G   K   K   K   K
7   D   D   D   D   H   H   H   H   L   L   L   L

Do you have any ideas how I could do that without loops?

What I have tried so far:

import numpy as np

a = np.array(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L'])
b = a.reshape(3,4)

np.repeat(b, 4).reshape(4,12)

gives me:

array([['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
       ['D', 'D', 'D', 'D', 'E', 'E', 'E', 'E', 'F', 'F', 'F', 'F'],
       ['G', 'G', 'G', 'G', 'H', 'H', 'H', 'H', 'I', 'I', 'I', 'I'],
       ['J', 'J', 'J', 'J', 'K', 'K', 'K', 'K', 'L', 'L', 'L', 'L']],
      dtype='<U1')

EDIT: Some background. Depending on the number of samples and the layout we choose, a machine creates plates (like in this image). We can do consecutive operations (adding more chemicals etc.), and based on the previous layout, unique combinations are obtained. Afterwards the machine measures e.g. the concentration in each well, and I would like to link the output to the conditions in each well. Because the machine can measure e.g. the concentration after each step, a lot of data can be generated, and I am trying to find a generic solution without too many loops.

Upvotes: 3

Views: 1147

Answers (2)

MSeifert

Reputation: 152725

You could use:

>>> import numpy as np
>>> x = np.array(list('abcdefghijkl'.upper()))  # your "vector"
>>> np.repeat(np.tile(x.reshape(-1, 4), 2).T, 4, axis=1)
array([['A', 'A', 'A', 'A', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I'],
       ['B', 'B', 'B', 'B', 'F', 'F', 'F', 'F', 'J', 'J', 'J', 'J'],
       ['C', 'C', 'C', 'C', 'G', 'G', 'G', 'G', 'K', 'K', 'K', 'K'],
       ['D', 'D', 'D', 'D', 'H', 'H', 'H', 'H', 'L', 'L', 'L', 'L'],
       ['A', 'A', 'A', 'A', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I'],
       ['B', 'B', 'B', 'B', 'F', 'F', 'F', 'F', 'J', 'J', 'J', 'J'],
       ['C', 'C', 'C', 'C', 'G', 'G', 'G', 'G', 'K', 'K', 'K', 'K'],
       ['D', 'D', 'D', 'D', 'H', 'H', 'H', 'H', 'L', 'L', 'L', 'L']],
      dtype='<U1')

It first reshapes the vector so that each row holds 4 characters, then tiles the result to duplicate those rows side by side. Transposing then gives the correct rows and columns, and finally every character is repeated 4 times along the columns.

Step-by-step it looks like this:

>>> import pandas as pd
>>> x
array(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L'],
      dtype='<U1')
>>> x.reshape(-1, 4)
array([['A', 'B', 'C', 'D'],
       ['E', 'F', 'G', 'H'],
       ['I', 'J', 'K', 'L']],
      dtype='<U1')
>>> np.tile(_, 2)
array([['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D'],
       ['E', 'F', 'G', 'H', 'E', 'F', 'G', 'H'],
       ['I', 'J', 'K', 'L', 'I', 'J', 'K', 'L']],
      dtype='<U1')
>>> _.T
array([['A', 'E', 'I'],
       ['B', 'F', 'J'],
       ['C', 'G', 'K'],
       ['D', 'H', 'L'],
       ['A', 'E', 'I'],
       ['B', 'F', 'J'],
       ['C', 'G', 'K'],
       ['D', 'H', 'L']],
      dtype='<U1')
>>> np.repeat(_, 4, axis=1)
array([['A', 'A', 'A', 'A', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I'],
       ['B', 'B', 'B', 'B', 'F', 'F', 'F', 'F', 'J', 'J', 'J', 'J'],
       ['C', 'C', 'C', 'C', 'G', 'G', 'G', 'G', 'K', 'K', 'K', 'K'],
       ['D', 'D', 'D', 'D', 'H', 'H', 'H', 'H', 'L', 'L', 'L', 'L'],
       ['A', 'A', 'A', 'A', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I'],
       ['B', 'B', 'B', 'B', 'F', 'F', 'F', 'F', 'J', 'J', 'J', 'J'],
       ['C', 'C', 'C', 'C', 'G', 'G', 'G', 'G', 'K', 'K', 'K', 'K'],
       ['D', 'D', 'D', 'D', 'H', 'H', 'H', 'H', 'L', 'L', 'L', 'L']],
      dtype='<U1')
>>> pd.DataFrame(_)
   0  1  2  3  4  5  6  7  8  9  10 11
0  A  A  A  A  E  E  E  E  I  I   I  I
1  B  B  B  B  F  F  F  F  J  J   J  J
2  C  C  C  C  G  G  G  G  K  K   K  K
3  D  D  D  D  H  H  H  H  L  L   L  L
4  A  A  A  A  E  E  E  E  I  I   I  I
5  B  B  B  B  F  F  F  F  J  J   J  J
6  C  C  C  C  G  G  G  G  K  K   K  K
7  D  D  D  D  H  H  H  H  L  L   L  L
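
Since the question asks for something generic, the steps above could be wrapped in a small function. This is only a sketch: the name `plate_layout` and its parameters `rows`, `row_tiles` and `col_repeats` are made up for illustration, and it assumes the length of the vector is a multiple of `rows`.

```python
import numpy as np

def plate_layout(labels, rows=4, row_tiles=2, col_repeats=4):
    """Hypothetical helper: arrange a 1-D label vector as a plate layout."""
    labels = np.asarray(labels)
    # One group of `rows` labels per row; fails if len(labels) % rows != 0.
    blocks = labels.reshape(-1, rows)
    # Duplicate each group side by side `row_tiles` times.
    tiled = np.tile(blocks, row_tiles)
    # Transpose so each group becomes a column, then widen every column.
    return np.repeat(tiled.T, col_repeats, axis=1)

layout = plate_layout(list("ABCDEFGHIJKL"))
print(layout.shape)  # (8, 12)
```

With the defaults this should reproduce the 8 x 12 array shown above; other plate sizes would only need different arguments.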

Upvotes: 3

akuiper

Reputation: 215047

import numpy as np

a = np.array(list("ABCDEFGHIJKL"))

a
# array(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L'], 
#       dtype='<U1')

np.repeat(np.tile(a.reshape(3,4), 2).T, 4, axis=1)
#array([['A', 'A', 'A', 'A', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I'],
#       ['B', 'B', 'B', 'B', 'F', 'F', 'F', 'F', 'J', 'J', 'J', 'J'],
#       ['C', 'C', 'C', 'C', 'G', 'G', 'G', 'G', 'K', 'K', 'K', 'K'],
#       ['D', 'D', 'D', 'D', 'H', 'H', 'H', 'H', 'L', 'L', 'L', 'L'],
#       ['A', 'A', 'A', 'A', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I'],
#       ['B', 'B', 'B', 'B', 'F', 'F', 'F', 'F', 'J', 'J', 'J', 'J'],
#       ['C', 'C', 'C', 'C', 'G', 'G', 'G', 'G', 'K', 'K', 'K', 'K'],
#       ['D', 'D', 'D', 'D', 'H', 'H', 'H', 'H', 'L', 'L', 'L', 'L']], 
#      dtype='<U1')
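
The order of the operations is not fixed. As a rough sketch (the variable names `first` and `second` are just for illustration), transposing first, repeating along the columns, and only then tiling vertically with `np.tile(..., (2, 1))` appears to give the same array:

```python
import numpy as np

a = np.array(list("ABCDEFGHIJKL"))

# Same expression as above: tile, transpose, then repeat along the columns.
first = np.repeat(np.tile(a.reshape(3, 4), 2).T, 4, axis=1)

# Alternative ordering: transpose, repeat along the columns,
# then stack two copies vertically.
second = np.tile(np.repeat(a.reshape(3, 4).T, 4, axis=1), (2, 1))

print((first == second).all())  # True
```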

Upvotes: 2
