From scatter plot to 2D array

Question

My mind has gone completely blank on this one.

I want to do what I think is very simple.

Suppose I have some test data:

import pandas as pd
import numpy as np
k=10
df = pd.DataFrame(np.array([range(k), 
                           [x + 1 for x in range(k)],
                           [x + 4 for x in range(k)], 
                           [x + 9 for x in range(k)]]).T,columns=list('abcd'))

where rows correspond to time and columns to angles, and it looks like this:

   a   b   c   d
0  0   1   4   9
1  1   2   5  10
2  2   3   6  11
3  3   4   7  12
4  4   5   8  13
5  5   6   9  14
6  6   7  10  15
7  7   8  11  16
8  8   9  12  17
9  9  10  13  18

Then, for reasons, I convert it to an ordered dictionary:

def highDimDF2Array(df):
    from collections import OrderedDict # Need to preserve order

    vels = [1.42,1.11,0.81,0.50]

    # Get dataframe shapes
    cols = df.columns

    trajectories = OrderedDict()
    for i,j in enumerate(cols):
        x = df[j].values
        # Remove construction nans
        x = x[~np.isnan(x)]

        maxTimeSteps = len(x)
        # Columns: time step, angle, velocity
        tmpTraj = np.empty((maxTimeSteps,3))
        tmpTraj[:,0] = range(maxTimeSteps)
        tmpTraj[:,1] = x
        tmpTraj[:,2].fill(vels[i])

        trajectories[j] = tmpTraj

    return trajectories

Then I plot it all

import matplotlib.pyplot as plt
m = highDimDF2Array(df)
M = np.vstack(list(m.values()))  # vstack needs a sequence, not a dict view
plt.scatter(M[:,0],M[:,1],15,M[:,2])
plt.title(r'Angle $[^\circ]$ vs. Time $[s]$')
plt.colorbar()
plt.show()

Now all I want to do is to put all of that into a 2D numpy array with the properties:

Time is mapped to the x-axis (or the y-axis, it doesn't matter)
Angle is mapped to the y-axis
The entries in the matrix correspond to the values of the coloured dots in the scatter plot
All other entries are treated as NaNs (i.e. those that are undefined by a point in the scatter plot)

In 3D the colour would correspond to the height.

I was thinking of using something like this: 3d Numpy array to 2d but am not quite sure how.

Molly · Accepted Answer

You can convert the values in M[:,0] and M[:,1] (time and angle) to integers and use them as indices into a 2D numpy array, then store the values from M[:,2] at those positions. Here's an example using the M you defined.

# 20 rows cover angles 0..18, 10 columns cover time steps 0..9
out = np.empty((20,10))
out[:] = np.nan
# Time (column 0) and angle (column 1) as integer indices
N = M[:,[0,1]].astype(int)
out[N[:,1], N[:,0]] = M[:,2]
plt.scatter(M[:,0],M[:,1],15,M[:,2])
plt.title(r'Angle $[^\circ]$ vs. Time $[s]$')
plt.colorbar()
plt.imshow(out, interpolation='none', origin = 'lower')
plt.show()

Here you can convert M to integers directly, but in general you might need a function that maps the time and angle columns of M to integer indices, depending on the resolution of the array you are creating.
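As a rough sketch of what such a mapping could look like when the coordinates are not already integers: the helper below, scatter_to_grid, and its x_step / y_step parameters are hypothetical names chosen for illustration, not part of numpy. It assumes the points lie on (or close to) a regular grid with the given spacing, and it sizes the output from the data instead of hard-coding the shape.

```python
import numpy as np

def scatter_to_grid(points, x_step=1.0, y_step=1.0):
    """Hypothetical helper: place (x, y, value) rows on a 2D grid, NaN elsewhere."""
    points = np.asarray(points, dtype=float)
    x, y, values = points[:, 0], points[:, 1], points[:, 2]

    # Map each coordinate to the nearest grid index, counted from the minimum.
    cols = np.rint((x - x.min()) / x_step).astype(int)
    rows = np.rint((y - y.min()) / y_step).astype(int)

    # Start from an all-NaN array just large enough to hold every point.
    grid = np.full((rows.max() + 1, cols.max() + 1), np.nan)

    # Points that land in the same cell overwrite one another.
    grid[rows, cols] = values
    return grid

# Small made-up example with a spacing of 0.5 on both axes.
demo = np.array([[0.0, 0.0, 1.42],
                 [0.5, 1.0, 1.11],
                 [1.0, 0.5, 0.81]])
grid = scatter_to_grid(demo, x_step=0.5, y_step=0.5)
print(grid)
```

With the M from the question, scatter_to_grid(M) with the default steps of 1 should give a 19 x 10 array, since the angles run from 0 to 18 and the time steps from 0 to 9. If several points can share a cell, you would need to decide how to combine them (for example by averaging) rather than rely on plain assignment.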
