Model Architecture
[Figure: Hierarchical Attention network architecture]
Word Encoder
The encoder is a bidirectional GRU.
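A minimal NumPy sketch of the bidirectional GRU word encoder: run a GRU over the sentence in both directions and concatenate the per-word hidden states. The parameter names and initialization here are illustrative assumptions, not the paper's code; only the dimensions (200-d embeddings, 50-d GRU per direction) come from the settings below.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, P):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    z = sigmoid(P["Wz"] @ x + P["Uz"] @ h)
    r = sigmoid(P["Wr"] @ x + P["Ur"] @ h)
    h_tilde = np.tanh(P["Wh"] @ x + P["Uh"] @ (r * h))
    return (1 - z) * h + z * h_tilde

def init_params(d_in, d_h, rng):
    # W* act on the input embedding, U* on the previous hidden state
    return {k: rng.standard_normal((d_h, d_in if k[0] == "W" else d_h)) * 0.1
            for k in ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}

def bi_gru(xs, Pf, Pb, d_h):
    """Encode a word sequence forward and backward, concatenate the states."""
    hf, fwd = np.zeros(d_h), []
    for x in xs:                        # left-to-right pass
        hf = gru_step(x, hf, Pf)
        fwd.append(hf)
    hb, bwd = np.zeros(d_h), []
    for x in reversed(xs):              # right-to-left pass
        hb = gru_step(x, hb, Pb)
        bwd.append(hb)
    bwd.reverse()
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

rng = np.random.default_rng(0)
emb, d = 200, 50                        # 200-d word embeddings, 50-d GRU
words = rng.standard_normal((7, emb))   # a toy 7-word sentence
H = bi_gru(words, init_params(emb, d, rng), init_params(emb, d, rng), d)
print(H.shape)                          # (7, 100): one 100-d annotation per word
```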
Word Attention
Here $u_w$ is the context vector used to measure the importance of each word in the sentence; in practice it is a randomly initialized parameter learned jointly with the rest of the network.
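The word-attention step can be sketched as: project each word annotation through a one-layer MLP, score it against $u_w$, softmax the scores, and take the weighted sum as the sentence vector. A NumPy illustration (the attention dimension and weight names `W`, `b` are assumptions for exposition):

```python
import numpy as np

def word_attention(H, W, b, u_w):
    """H: (T, 2d) word annotations; u_w: learned context vector."""
    U = np.tanh(H @ W + b)              # (T, d_a) hidden representation u_it
    scores = U @ u_w                    # (T,) similarity of each word with u_w
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                # softmax -> normalized word importances
    s = alpha @ H                       # (2d,) sentence vector, weighted sum
    return s, alpha

rng = np.random.default_rng(0)
H = rng.standard_normal((7, 100))       # 7 words, 100-d annotations
W = rng.standard_normal((100, 100)) * 0.1
b = np.zeros(100)
u_w = rng.standard_normal(100) * 0.1    # randomly initialized, trained jointly
s, alpha = word_attention(H, W, b, u_w)
print(s.shape, alpha.shape)             # the weights alpha sum to 1
```

The same mechanism is reused one level up with a sentence-level context vector $u_s$ to weight sentence vectors into a document vector.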
Sentence Encoder
Sentence Attention
Document Classification
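The final layer maps the document vector $v$ to class probabilities with a softmax, $p = \mathrm{softmax}(W_c v + b_c)$, trained with the negative log-likelihood of the true label. A minimal sketch (the 5-class shape is an illustrative assumption, e.g. Yelp star ratings):

```python
import numpy as np

def classify(v, Wc, bc):
    logits = Wc @ v + bc                # one logit per class
    p = np.exp(logits - logits.max())   # numerically stable softmax
    return p / p.sum()

rng = np.random.default_rng(0)
v = rng.standard_normal(100)            # document vector from sentence attention
Wc = rng.standard_normal((5, 100)) * 0.1  # e.g. 5 classes (Yelp star ratings)
bc = np.zeros(5)
p = classify(v, Wc, bc)
loss = -np.log(p[2])                    # negative log-likelihood of true label 2
print(p)                                # probabilities sum to 1
```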
Experiments
Datasets
- Yelp reviews
- IMDB reviews
- Yahoo answers
- Amazon reviews
Hyperparameters
First, each document is split into sentences and tokenized with CoreNLP.
- train: val: test = 80%: 10%: 10%
- word_embedding: 200
- GRU dimension: 50 (100 after concatenating both directions)
- batch_size: 64
- optimizer: SGD with momentum(0.9)
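SGD with momentum 0.9 keeps a velocity that accumulates past gradients, $v \leftarrow \mu v - \eta \nabla$, then steps the weights along it. A toy sketch of the update rule (the learning rate here is an arbitrary illustration, not the paper's):

```python
import numpy as np

def sgd_momentum(w, grad, vel, lr=0.05, mu=0.9):
    """Classic momentum: accumulate a velocity, then step along it."""
    vel = mu * vel - lr * grad
    return w + vel, vel

# toy check: minimize f(w) = ||w||^2, whose gradient is 2w
w = np.array([1.0, -2.0])
vel = np.zeros_like(w)
for _ in range(200):
    w, vel = sgd_momentum(w, 2 * w, vel)
print(w)                                # converges toward the minimum at 0
```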