Introduction:
These notes record some discrete latent variable models; please credit the source when reposting.
Reference: Deep Bayes
Motivation
- Discrete categories are easier to interpret than a continuous spectrum
example: discrete variational autoencoder - lets the model make a discrete choice
example: hard attention
An attention module generates a binary mask of where to look.
The network then classifies the masked images.
We want the attention module to attend only to the important areas of the image.
REINFORCE Estimator
The score-function (REINFORCE) estimator rewrites the gradient as
$\nabla_\phi \mathbb{E}_{p(z|\phi)}[f(z)] = \mathbb{E}_{p(z|\phi)}[f(z)\,\nabla_\phi \log p(z|\phi)]$
and approximates the expectation with $M$ Monte Carlo samples.
However, this estimator typically has large variance:
- It requires sophisticated variance reduction methods.
- Just taking a bigger $M$ gives only a modest improvement, since the standard error shrinks as $1/\sqrt{M}$.
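As a sanity check, the estimator can be sketched on a toy problem where the true gradient is known in closed form. Everything below (the Bernoulli model, the choice of $f$, the function names) is an illustrative assumption, not from the notes:

```python
import random

def f(z):
    # toy objective on a binary choice z in {0, 1}
    return (z - 0.45) ** 2

def reinforce_grad(theta, num_samples, rng):
    """Score-function estimate of d/dtheta E_{z ~ Bern(theta)}[f(z)]."""
    total = 0.0
    for _ in range(num_samples):
        z = 1.0 if rng.random() < theta else 0.0
        # d/dtheta log p(z | theta) for a Bernoulli
        score = z / theta - (1.0 - z) / (1.0 - theta)
        total += f(z) * score
    return total / num_samples

rng = random.Random(0)
theta = 0.5
true_grad = f(1) - f(0)  # closed form: d/dtheta [theta*f(1) + (1-theta)*f(0)]
est = reinforce_grad(theta, 200_000, rng)
```

Each individual sample of $f(z)\,\nabla_\theta \log p(z|\theta)$ has variance that is large relative to the gradient itself, so a stable estimate needs many samples.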
Idea: relax the objective over discrete random samples $z$ into an objective over continuous random samples during training, and use the reparametrization trick:
Gumbel-Max trick
If $G_1,\dots,G_K$ are i.i.d. $\mathrm{Gumbel}(0,1)$, then $\arg\max_k(\log \pi_k + G_k) \sim \mathrm{Categorical}(\pi)$; replacing the $\arg\max$ with a temperature-controlled softmax gives a continuous relaxation that admits the reparametrization trick.
Some background on the Gumbel distribution:
https://qinqianshan.com/math/probability_distribution/gumbel-distribution/
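A numerical sanity check of the trick, using only the standard library; the function names and the relaxation temperature below are illustrative choices, not from the notes:

```python
import math
import random

def sample_gumbel(rng):
    # Gumbel(0,1) sample via inverse CDF: G = -log(-log U), U ~ Uniform(0,1)
    u = rng.random()
    return -math.log(-math.log(u))

def gumbel_max(log_probs, rng):
    """Exact categorical sample: argmax_k (log pi_k + G_k)."""
    noisy = [lp + sample_gumbel(rng) for lp in log_probs]
    return max(range(len(noisy)), key=noisy.__getitem__)

def gumbel_softmax(log_probs, temperature, rng):
    """Continuous relaxation: softmax((log pi_k + G_k) / tau), a point on the simplex."""
    noisy = [(lp + sample_gumbel(rng)) / temperature for lp in log_probs]
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    s = sum(exps)
    return [e / s for e in exps]

rng = random.Random(0)
probs = [0.2, 0.5, 0.3]
log_probs = [math.log(p) for p in probs]

# empirical argmax frequencies should match probs
n = 100_000
counts = [0, 0, 0]
for _ in range(n):
    counts[gumbel_max(log_probs, rng)] += 1
freqs = [c / n for c in counts]

soft = gumbel_softmax(log_probs, temperature=0.5, rng=rng)
```

The empirical argmax frequencies match $\pi$, while `gumbel_softmax` returns a point on the probability simplex that concentrates on the argmax as the temperature goes to 0.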
Variance Reduction
Control Variates
Consider some $c(z)$ with tractable expectation $\mathbb{E}[c(z)]$. Then
$\mathbb{E}[f(z)] = \mathbb{E}[f(z) - c(z)] + \mathbb{E}[c(z)]$,
and the Monte Carlo estimate of the right-hand side has lower variance whenever $c$ is correlated with $f$.
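A minimal sketch of the control-variate identity on a toy integral (the choice $f(x) = e^x$, $c(x) = 1 + x$ over $\mathrm{Uniform}(0,1)$ is an illustrative assumption): only the small residual $f - c$ is estimated by sampling, and the known $\mathbb{E}[c]$ is added back, which keeps the estimator unbiased while shrinking its variance.

```python
import math
import random

def mc_mean(samples, g):
    """Plain Monte Carlo estimate of E[g(X)]."""
    return sum(g(x) for x in samples) / len(samples)

def f(x):
    return math.exp(x)       # target: E[e^X], X ~ Uniform(0, 1)

def c(x):
    return 1.0 + x           # control variate with known mean E[c(X)] = 1.5

TRUE_VALUE = math.e - 1.0    # closed-form E[e^X], used only to measure error
E_C = 1.5

def plain_estimate(samples):
    return mc_mean(samples, f)

def cv_estimate(samples):
    # E[f] = E[f - c] + E[c]
    return mc_mean(samples, lambda x: f(x) - c(x)) + E_C

# compare mean-squared error of the two estimators over repeated trials
rng = random.Random(1)
n_trials, n_samples = 500, 100
mse_plain = mse_cv = 0.0
for _ in range(n_trials):
    s = [rng.random() for _ in range(n_samples)]
    mse_plain += (plain_estimate(s) - TRUE_VALUE) ** 2
    mse_cv += (cv_estimate(s) - TRUE_VALUE) ** 2
mse_plain /= n_trials
mse_cv /= n_trials
```

Since $e^x$ and $1 + x$ are strongly correlated on $[0, 1]$, the controlled estimator's error is several times smaller at the same sample budget.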
Simple baselines:
- Constant baseline: subtract a constant $b$ from $f(z)$; the estimator stays unbiased because $\mathbb{E}[\nabla_\phi \log p(z|\phi)] = 0$.
- Variance minimization: choose the baseline to directly minimize the variance of the gradient estimate.
- Gumbel-relaxed baselines: use a continuous Gumbel-Softmax relaxation of the objective as the control variate.
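In the REINFORCE setting, a constant baseline $b$ is subtracted from $f(z)$ before multiplying by the score. A toy Bernoulli sketch (the model and all names below are illustrative assumptions) compares the estimator's mean-squared error with and without the baseline:

```python
import random

def f(z):
    return (z - 0.45) ** 2  # toy objective on a binary choice

def grad_sample(theta, b, rng):
    """One-sample REINFORCE gradient with constant baseline b.
    Unbiased for any b, since E[d/dtheta log p(z|theta)] = 0."""
    z = 1.0 if rng.random() < theta else 0.0
    score = z / theta - (1.0 - z) / (1.0 - theta)
    return (f(z) - b) * score

rng = random.Random(0)
theta = 0.5
true_grad = f(1) - f(0)         # closed-form gradient at theta = 0.5
baseline = 0.5 * (f(0) + f(1))  # E[f(z)] under theta = 0.5

n = 50_000
mse_no_b = sum((grad_sample(theta, 0.0, rng) - true_grad) ** 2 for _ in range(n)) / n
mse_b = sum((grad_sample(theta, baseline, rng) - true_grad) ** 2 for _ in range(n)) / n
```

In this two-outcome toy at $\theta = 0.5$, the mean-of-$f$ baseline happens to cancel the variance entirely; in general a good baseline only reduces it.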