Reputation: 61
I have a NumPy array, shown below.
df=np.array([[None, 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', None],
[None, None, None, None, None, None, None, None, None, None, None, None],
['E ', 'F ', 'D ', 'F ', 'D ', 'F ', 'D ', 'F ', 'D ', 'F ', 'D ', 'E '],
['E ', 'G ', None, 'H ', 'B ', 'H ', None, 'H ', None, 'H ', 'I ', 'E '],
['E ', None, 'B ', 'A ', None, 'G ', 'C ', None, 'C ', 'G ', None, 'E '],
['E ', 'C ', 'D ', None, 'H ', None, 'I ', 'D ', None, 'J ', 'G ', 'E '],
['E ', 'A ', None, 'I ', None, 'A ', 'B ', None, 'G ', 'H ', None, 'E '],
['E ', 'F ', 'C ', None, 'I ', None, None, 'F ', None, None, 'J ', 'E '],
['E ', 'B ', None, 'D ', None, 'C ', 'B ', None, 'J ', 'J ', None, 'E '],
['E ', 'H ', 'C ', None, 'G ', None, 'H ', 'A ', 'C ', None, 'H ', 'E '],
['E ', 'C ', None, 'A ', None, 'G ', None, None, 'I ', 'D ', None, 'E '],
['E ', None, 'G ', 'F ', 'B ', None, 'I ', None, 'G ', None, 'G ', 'E '],
['E ', 'B ', None, 'C ', None, 'H ', None, 'J ', None, 'I ', None, 'E '],
['E ', 'C ', 'D ', None, 'F ', 'C ', 'D ', None, 'B ', 'F ', 'G ', 'E ']])
Now I want to get a new dataframe or NumPy array that contains the coordinates of each value. For example:
id c x y
1 A 1 0
2 B 2 0
...
11 E 0 2
12 F 1 2
...
How can I achieve this?
Thank you very much!
Upvotes: 1
Views: 516
Reputation: 88276
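You can use ndix_unique from the linked answer for a vectorized approach. Note that the coordinates in the output below are (row, column) pairs, so x is the row index and y is the column index, the reverse of the convention in the question. Construct a dataframe from the result, explode the (x, y) coordinate lists, and assign them back as columns: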
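import pandas as pd
# ndix_unique is defined in the linked answer; a is the question's array (named df there)
vals, ixs = ndix_unique(a)
# One row per unique value, then one row per (x, y) occurrence of that value
df = pd.DataFrame({'c':vals, 'xy':ixs}).explode('xy')
# Split the (x, y) pairs into two separate columns
x, y = zip(*df.xy.values.tolist())
df = df[['c']].assign(x=x, y=y).reset_index(drop=True)
print(df)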
c x y
0 A 0 1
1 A 6 1
2 B 0 2
3 B 8 1
4 B 12 1
5 B 4 2
6 C 5 1
7 C 9 2
....
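Since ndix_unique lives in the linked answer and is not reproduced here, the following is a minimal self-contained sketch of a similar vectorized idea. It is not the linked helper: it uses np.nonzero on a mask of the non-None cells instead. The small array a and the name coords are stand-ins chosen for illustration, and x is taken as the column index to match the question's convention.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's array (assumed data, for illustration only).
a = np.array([[None, 'A', 'B'],
              ['E', None, 'B']], dtype=object)

# Boolean mask of the non-None cells; pd.notna works elementwise on object arrays.
mask = pd.notna(a)

# np.nonzero returns the row and column indices of the True cells, in row-major order.
rows, cols = np.nonzero(mask)

# Following the question's convention: x is the column index, y is the row index.
coords = pd.DataFrame({'c': a[rows, cols], 'x': cols, 'y': rows})

# A stable sort groups equal values together while keeping row-major order within each group.
coords = coords.sort_values('c', kind='mergesort').reset_index(drop=True)
print(coords)
```

Dropping the sort_values call keeps the rows in the row-major order shown in the question's example instead of grouping them by value.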
Upvotes: 4
Reputation: 2897
Here is one straightforward way:
import pandas as pd
import numpy as np
data = np.array([[None, 'A', 'B'], ['E', 'A', 'B']])
values = []
# y is the row index, x is the column index
for y, row in enumerate(data):
    for x, char in enumerate(row):
        if char is not None:  # skip empty cells
            values.append({
                "id": 1 + len(values),  # 1-based running id
                "c": char,
                "x": x,
                "y": y
            })
df = pd.DataFrame(values)
df.set_index('id', inplace=True)
df
Output:
c x y
id
1 A 1 0
2 B 2 0
3 E 0 1
4 A 1 1
5 B 2 1
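The same loop can be sketched as a reusable function with np.ndenumerate, which yields each cell's (row, column) index together with its value. The name to_coordinates and the sample array are hypothetical. Stripping trailing whitespace is an assumption, made because the question's array holds values like 'E ' while its desired output shows 'E'.

```python
import numpy as np
import pandas as pd

def to_coordinates(grid):
    """Return a DataFrame with one row per non-None cell of a 2-D array."""
    records = []
    # np.ndenumerate yields ((row, column), value) pairs in row-major order.
    for (y, x), value in np.ndenumerate(grid):
        if value is None:
            continue  # skip empty cells
        # Assumption: trailing spaces such as 'E ' are not meaningful.
        records.append({'c': str(value).strip(), 'x': x, 'y': y})
    out = pd.DataFrame(records, columns=['c', 'x', 'y'])
    # 1-based id index, as in the question's desired output.
    out.index = pd.RangeIndex(1, len(out) + 1, name='id')
    return out

# Hypothetical sample with the trailing spaces seen in the question's data.
sample = np.array([[None, 'A ', 'B '],
                   ['E ', None, 'B ']], dtype=object)
result = to_coordinates(sample)
print(result)
```

Building the index with RangeIndex after the loop avoids tracking the running id by hand inside it.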
Upvotes: 1