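A GAN that generates discrete tokens contains a non-differentiable step: sampling a token (or taking the argmax) blocks the gradient from the discriminator back to the generator.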

Gumbel softmax
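
As a rough illustration of the trick, the sketch below adds Gumbel(0, 1) noise to the logits and applies a temperature-scaled softmax, which gives a continuous, differentiable stand-in for a one-hot sample. The function names, the temperature `tau`, and the example logits are assumptions made for this note, and the code uses plain Python rather than an autodiff library.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Return a relaxed (continuous) sample over the categories in `logits`.

    `tau` is the temperature (an assumed parameter name): small values push
    the output toward a one-hot vector, large values toward uniform.
    """
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
        # U is clipped away from 0 and 1 so both logs stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / tau)
    return softmax(noisy)


rng = random.Random(0)          # fixed seed only for repeatability
logits = [2.0, 0.5, -1.0]       # made-up generator scores for 3 tokens
soft = gumbel_softmax(logits, tau=1.0, rng=rng)
sharp = gumbel_softmax(logits, tau=0.1, rng=rng)
```

With a lower `tau` the output tends to concentrate on a single token, at the cost of higher-variance gradients in a real training setup.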

Continuous input for discriminator:

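If real one-hot vectors are compared directly with the generator's continuous distributions, the two are too easy to tell apart. Instead, compare the embedding of each real token with the probability-weighted sum of embeddings from the generator; these are harder to distinguish, so they can confuse the discriminator.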
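
A minimal sketch of that comparison, assuming a tiny hypothetical embedding table: a real token is embedded by row lookup, while the generator's distribution is embedded as the probability-weighted sum of all rows. Both results live in the same continuous space. The helper names and the numbers are illustrative only.

```python
def embed_one_hot(token_id, embedding):
    """Embedding of a real token: look up one row of the table."""
    return list(embedding[token_id])


def embed_soft(probs, embedding):
    """Embedding of a generator distribution: weighted sum of all rows."""
    dim = len(embedding[0])
    return [
        sum(p * row[d] for p, row in zip(probs, embedding))
        for d in range(dim)
    ]


# Hypothetical 3-token vocabulary with 2-dimensional embeddings.
embedding = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]

real_vec = embed_one_hot(1, embedding)              # real token id 1
fake_vec = embed_soft([0.1, 0.8, 0.1], embedding)   # generator output
```

When the distribution is exactly one-hot, `embed_soft` reduces to the lookup, so the discriminator sees inputs of the same form in both cases.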

Reinforcement learning

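The environment keeps changing (here the reward comes from the discriminator, which is itself being updated), which makes reinforcement learning hard to train.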

Tip:

Reward for every generation step

  1. Monte Carlo search
  2. Discriminator for partially decoded sequences
  3. Step-wise evaluation
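
To make the first option concrete, here is a toy sketch of Monte Carlo search for per-step rewards: for each prefix of a generated sequence, the prefix is completed several times with a rollout policy, each completion is scored, and the average score is used as the reward for that step. The names `sample_next`, `score`, and `n_rollouts`, as well as the two toy functions, are assumptions for illustration; in a real setup the score would come from the discriminator and the rollouts from the generator.

```python
import random


def monte_carlo_step_rewards(tokens, max_len, sample_next, score,
                             n_rollouts=16, rng=None):
    """Estimate one reward per generation step by averaging rollout scores."""
    rng = rng or random.Random()
    rewards = []
    for t in range(1, len(tokens) + 1):
        prefix = tokens[:t]
        total = 0.0
        for _ in range(n_rollouts):
            seq = list(prefix)
            # Complete the prefix with the rollout policy up to max_len.
            while len(seq) < max_len:
                seq.append(sample_next(seq, rng))
            total += score(seq)
        rewards.append(total / n_rollouts)
    return rewards


def toy_sample_next(seq, rng):
    """Stand-in rollout policy: pick token 0 or 1 uniformly at random."""
    return rng.choice([0, 1])


def toy_score(seq):
    """Stand-in for the discriminator: fraction of tokens equal to 1."""
    return sum(seq) / len(seq)


rewards = monte_carlo_step_rewards(
    [1, 1, 0, 1], max_len=4,
    sample_next=toy_sample_next, score=toy_score,
    n_rollouts=16, rng=random.Random(0),
)
```

The last entry needs no rollout because the sequence is already complete, so it equals the score of the full sequence. Earlier entries are noisy estimates that improve with more rollouts, which is the main cost of this approach.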

The idea behind text style transfer can also be used to build an unsupervised translation model.