Reputation: 11100
I have a 4-D dataset (as xr.DataArray) with dimensions temperature, datasource, time, and altitude.
How can I create a scatter plot of temperature(src0, z) vs. temperature(src1, z), so that I can select the altitude via a slider?
I'm currently having the problem that when I convert the data to a hv.Table, I have among others one column datasource and one column temperature, and I cannot figure out how to plot temperature(datasource=='src0') vs. temperature(datasource=='src1').
EDIT:
To clarify: I have a 4-D dataset DATA (which is an xr.DataArray) with dimensions data_variable, datasource, time, and altitude.
data_variable has 2 entries, temperature and humidity.
datasource has 2 entries, model and measurement.
There are 6 altitudes and ~2000 times.
How can I create a scatter plot which has datasource model on one axis and datasource measurement on the other, such that altitude and data_variable can be selected with a slider?
Upvotes: 0
Views: 1025
Reputation: 4080
If I'm understanding your question correctly, you want a scatter plot of temperature over time that compares the two datasources and is indexed by altitude?
import holoviews as hv

# Load the data into a holoviews Dataset
# (data_array is your xr.DataArray)
ds = hv.Dataset(data_array)
# Create Scatter objects plotting time vs. temperature
# and group by altitude and datasource
scatter = ds.to(hv.Scatter, 'time', 'temperature',
                groupby=['altitude', 'datasource'], dynamic=True)
# Now overlay the datasource dimension and display
scatter.overlay('datasource')
Hopefully I understood your question correctly but based on this basic pattern you should be able to plot the data in whatever arrangement you want.
Edit: Based on your edit, the main problem is that HoloViews expects each data_variable to be in a separate array; in pandas terms you need to do the equivalent of pd.melt.
import numpy as np
import xarray as xr
import holoviews as hv

# Define data array like yours
dataarray = xr.DataArray(np.random.rand(10, 10, 2, 2), name='variable',
                         coords=[('time', range(10)), ('altitude', range(10)),
                                 ('datasource', ['model', 'measurement']),
                                 ('data_variable', ['humidity', 'temperature'])])
# Groupby datasource and data_variable, combining the resultant array into a Dataset with 4 data variables
group_dims = ['datasource', 'data_variable']
grouped = hv.Dataset(dataarray, datatype=['xarray']).groupby(group_dims)
# Each group is renamed to e.g. 'model temperature' before merging
dataset = xr.merge([da.data.rename({'variable': ' '.join(key)}).drop(group_dims)
                    for key, da in grouped.items()])
ds = hv.Dataset(dataset)
# Plot model vs. measurement temperature, grouped by altitude
scatter = ds.to(hv.Scatter, 'model temperature', 'measurement temperature', 'altitude')
Note, however, that while testing this I ran into a bug, which I've now opened a PR for (see here).
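As a possible alternative for the reshaping step, xarray alone can split the array along datasource, so that model and measurement become separate variables that still share the time, altitude, and data_variable dimensions. The following is a minimal sketch under assumptions, not a tested HoloViews recipe: the array shape (10 times, 6 altitudes), the random values, and the names by_source, selected, x, and y are made up for illustration. It only shows that the paired values for the two axes line up.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the real data (shape and values are assumptions)
rng = np.random.default_rng(0)
dataarray = xr.DataArray(
    rng.random((10, 6, 2, 2)),
    name="variable",
    coords=[
        ("time", np.arange(10)),
        ("altitude", np.arange(6)),
        ("datasource", ["model", "measurement"]),
        ("data_variable", ["humidity", "temperature"]),
    ],
)

# Split along 'datasource': one data variable per datasource entry,
# each still indexed by time, altitude and data_variable
by_source = dataarray.to_dataset(dim="datasource")

# What a slider selection would pick: one data_variable and one altitude
selected = by_source.sel(data_variable="temperature", altitude=3)

# Paired values for the two scatter axes, aligned on time
x = selected["model"].values
y = selected["measurement"].values
print(x.shape, y.shape)
```

In principle by_source could then be wrapped in hv.Dataset and passed to .to(hv.Scatter, 'model', 'measurement', groupby=['altitude', 'data_variable']), following the same pattern as the snippet above, but I have not verified that this avoids the bug mentioned.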
Upvotes: 1