David P

Reputation: 3

Mixture parameters from a TensorFlow Probability mixture density network

How do you get the mixture parameters from a mixture density network created using TensorFlow Probability?

I'm a beginner trying to learn about mixture density networks, and I came across an example in the TensorFlow Probability documentation.

See below for my complete code, which uses that example as a starting point. I had to change the original's AdamOptimizer call, and I added a model.predict() at the end. Calling predict(X) seems to draw samples from the conditional distribution P(Y|X), but what I want are the parameters of the mixture model for supplied values of X, i.e. the weight, mean, and standard deviation of each of the num_components mixture components. Any ideas?

I have seen the convert_to_tensor_fn argument for the MixtureNormal layer, and tried adding:

convert_to_tensor_fn=tfp.distributions.Distribution.sample - to confirm that predict() draws samples

and

convert_to_tensor_fn=tfp.distributions.Distribution.mean - looks like predict() returns the conditional expectation

so I was then hoping that there would be some other option to get the mixture components, but I haven't been able to find it so far.
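
To make the difference between those two settings concrete, here is a minimal plain-Python sketch of what sampling from a Gaussian mixture and taking its mean amount to at one fixed x. It does not use TensorFlow Probability, and the weights, means, and standard deviations are made-up numbers rather than output from the model below, so treat it as an illustration of the idea and not of the library's internals.

```python
import random

# Hypothetical parameters of a 1-D Gaussian mixture at one fixed x
# (made-up numbers, not output from the model in this question).
weights = [0.2, 0.5, 0.3]   # mixture weights, sum to 1
means = [-1.0, 0.5, 2.0]    # component means
stddevs = [0.3, 0.2, 0.5]   # component standard deviations


def mixture_sample(rng):
    """Draw one value: pick a component by weight, then sample its normal."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], stddevs[k])


def mixture_mean():
    """Conditional expectation: weighted sum of the component means."""
    return sum(w * m for w, m in zip(weights, means))


rng = random.Random(0)
draws = [mixture_sample(rng) for _ in range(50_000)]
sample_average = sum(draws) / len(draws)

print("one draw:", draws[0])
print("average of draws:", sample_average)
print("mixture mean:", mixture_mean())
```

A single draw varies from call to call, while the average of many draws settles near the weighted sum of the component means. Neither quantity exposes the per-component weights, means, and standard deviations, which is what I am after.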

import tensorflow as tf
import tensorflow_probability as tfp
import numpy as np

tfd = tfp.distributions
tfpl = tfp.layers
tfk = tf.keras
tfkl = tf.keras.layers

# Load data -- graph of a [cardioid](https://en.wikipedia.org/wiki/Cardioid).
n = 2000
t = tfd.Uniform(low=-np.pi, high=np.pi).sample([n, 1])
r = 2 * (1 - tf.cos(t))
x = r * tf.sin(t) + tfd.Normal(loc=0., scale=0.1).sample([n, 1])
y = r * tf.cos(t) + tfd.Normal(loc=0., scale=0.1).sample([n, 1])

# Model the distribution of y given x with a Mixture Density Network.
event_shape = [1]
num_components = 5
params_size = tfpl.MixtureNormal.params_size(num_components, event_shape)
model = tfk.Sequential([
  tfkl.Dense(12, activation='relu'),
  tfkl.Dense(params_size, activation=None),
  tfpl.MixtureNormal(num_components=num_components,           
    event_shape=event_shape
  )
])

# Fit.
batch_size = 100
epochs=20

#model.compile(optimizer=tf.train.AdamOptimizer(learning_rate=0.02),
#              loss=lambda y, model: -model.log_prob(y))
model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.02), 
    loss=lambda y, model: -model.log_prob(y))

history = model.fit(x, y,
          batch_size=batch_size,
          epochs=epochs,
          steps_per_epoch=n // batch_size)

#
# use the model to make prediction (draws samples from the conditional distribution)
# but how do you get to the mixture parameters for each value of x_pred???
#
x_pred = tf.convert_to_tensor(np.linspace(-2.7,+2.7,1000))
y_pred = model.predict(x_pred)

Now that we have an answer, the complete code is as follows:

import tensorflow as tf
import tensorflow_probability as tfp
import numpy as np

tfd = tfp.distributions
tfpl = tfp.layers
tfk = tf.keras
tfkl = tf.keras.layers

# Load data -- graph of a [cardioid](https://en.wikipedia.org/wiki/Cardioid).
n = 2000
t = tfd.Uniform(low=-np.pi, high=np.pi).sample([n, 1])
r = 2 * (1 - tf.cos(t))
x = r * tf.sin(t) + tfd.Normal(loc=0., scale=0.1).sample([n, 1])
y = r * tf.cos(t) + tfd.Normal(loc=0., scale=0.1).sample([n, 1])

# Model the distribution of y given x with a Mixture Density Network.
event_shape = [1]
num_components = 5
params_size = tfpl.MixtureNormal.params_size(num_components, event_shape)
model = tfk.Sequential([
  tfkl.Dense(12, activation='relu'),
  tfkl.Dense(params_size, activation=None),
  tfpl.MixtureNormal(num_components=num_components,           
    event_shape=event_shape
  )
])

# Fit.
batch_size = 100
epochs=20

#model.compile(optimizer=tf.train.AdamOptimizer(learning_rate=0.02),
#              loss=lambda y, model: -model.log_prob(y))
model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.02), 
    loss=lambda y, model: -model.log_prob(y))

history = model.fit(x, y,
          batch_size=batch_size,
          epochs=epochs,
          steps_per_epoch=n // batch_size)

#
# use the model to get parameters of the conditional distribution:
#
x = np.linspace(-2.7,+2.7,1000)
x_pred = tf.convert_to_tensor(x[:,np.newaxis])

#
# compute the mixture parameters at each x:
#
gm = model(x_pred)

#
# get the mixture parameters:
#
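# weights: shape (1000, num_components); each row sums to 1
gm_weights = gm.mixture_distribution.probs_parameter().numpy()
# means and variances: shape (1000, num_components, 1)
gm_means = gm.components_distribution.mean().numpy()
gm_vars = gm.components_distribution.variance().numpy()
# standard deviations, as asked for in the question
gm_stds = gm.components_distribution.stddev().numpy()

print(gm_weights)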

Upvotes: 0

Views: 1153

Answers (1)

mtb96

Reputation: 36

I struggled with this too. Looking at the source code on GitHub, I found a way to get the parameters of a given output distribution.

For example, if you call a model named 'model' at a particular input 'x_star', it returns a distribution object, and the attributes you want can be accessed like so:

x_star = 1.0
# Shape (1, 1): one sample with one feature, matching the training inputs.
model_star = model(np.array([[x_star]], dtype=np.float32))
comp_weights = np.array(model_star.mixture_distribution.probs_parameter())
comp_means = np.array(model_star.components_distribution.mean())
comp_vars = np.array(model_star.components_distribution.variance())
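
Once you have those arrays, a little post-processing gives the weight, mean, and standard deviation per component. The sketch below is plain Python with made-up numbers standing in for comp_weights, comp_means, and comp_vars. It assumes the layout I would expect from event_shape = [1], namely weights of shape (batch, K) and means/variances of shape (batch, K, 1). The helper names mixture_summary and mixture_pdf are hypothetical, not part of TensorFlow Probability.

```python
import math

# Hypothetical extracted values for a batch of 2 inputs and K = 3 components,
# laid out like the arrays in the snippet above: weights are (batch, K);
# means and variances are (batch, K, 1) because event_shape = [1].
comp_weights = [[0.2, 0.5, 0.3], [0.6, 0.3, 0.1]]
comp_means = [[[-1.0], [0.5], [2.0]], [[0.0], [1.0], [3.0]]]
comp_vars = [[[0.09], [0.04], [0.25]], [[0.01], [0.16], [1.0]]]


def normal_pdf(y, mean, std):
    """Density of a univariate normal distribution at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_summary(weights, means, variances):
    """Return per-component (weight, mean, std) triples for one input."""
    return [(w, m[0], math.sqrt(v[0]))
            for w, m, v in zip(weights, means, variances)]


def mixture_pdf(y, summary):
    """Mixture density at y: weighted sum of the component densities."""
    return sum(w * normal_pdf(y, m, s) for w, m, s in summary)


summaries = [mixture_summary(w, m, v)
             for w, m, v in zip(comp_weights, comp_means, comp_vars)]

for i, summary in enumerate(summaries):
    total_weight = sum(w for w, _, _ in summary)
    print(f"input {i}: weights sum to {total_weight:.3f}")
    for w, m, s in summary:
        print(f"  weight={w:.2f} mean={m:+.2f} std={s:.2f}")
    print(f"  density at y=0.5: {mixture_pdf(0.5, summary):.4f}")
```

Checking that each row of weights sums to 1 is a cheap sanity test that you pulled out probabilities and not raw logits. With real NumPy arrays, the same steps would be np.sqrt(comp_vars) and dropping the trailing axis.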

I'm not sure why they don't advertise how to access this. Maybe they expect these models to be used as black boxes.

Upvotes: 2
