Attention
Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)
Transformer
The Illustrated Transformer
Transformer Implementation
BERT
Understanding How BERT Works in One Article
Text Classification in Practice: A BERT Fine-Tuning Tutorial in PyTorch