Deep Neural Networks for YouTube Recommendations

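Challenges:

Scale: the data volume is very large
Freshness: a large number of new videos are uploaded continuously
Noise: ground truth is hard to obtain, so user preference has to be modeled from implicit signals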

Model:

Two-stage model:
candidate generation
recommendation as classification

$P(w_t = i \mid U, C) = \frac{e^{v_i u}}{\sum_{j \in V} e^{v_j u}}$
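The softmax above can be checked on a toy example. This is a minimal sketch, assuming the user vector $u$ and the video vectors $v_j$ are plain lists of floats; the function name `watch_probabilities` and the toy values are hypothetical and not from the paper.

```python
import math

def watch_probabilities(user_vec, video_vecs):
    """Softmax over the dot products v_j . u for every video j in the corpus V."""
    logits = [sum(v_k * u_k for v_k, u_k in zip(v, user_vec)) for v in video_vecs]
    m = max(logits)  # subtract the max logit for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy data (hypothetical): one 2-dimensional user vector and three video vectors.
u = [1.0, 0.5]
V = [[0.2, 0.1], [1.0, 1.0], [-0.5, 0.3]]
probs = watch_probabilities(u, V)
```

The entries of `probs` sum to 1, and the video whose vector has the largest dot product with `u` gets the highest probability. With millions of videos the sum in the denominator is the expensive part, which is what motivates the sampling notes that follow.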
efficient extreme multiclass
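Solution: negative sampling, with importance weighting to correct for the sampling
Top-N search: hashing methods
Engineering tip: feed the age of the training example (time elapsed since it was generated) into the model as a feature during training; at serving time this feature is set to 0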
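The sampling note above can be illustrated as follows. This is a sketch of one common form of sampled softmax, assuming negatives are drawn from a known proposal distribution `Q` and each logit is corrected by subtracting `log Q(j)`; the notes do not say which correction was actually used, and all names and toy values here are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sampled_softmax_loss(user_vec, video_vecs, positive, sample_probs, num_negatives, rng):
    """Cross-entropy over the positive class plus a few sampled negatives."""
    # Draw negatives (with replacement) from the proposal distribution Q,
    # skipping the positive class so it cannot appear as its own negative.
    candidates = [j for j in range(len(video_vecs)) if j != positive]
    weights = [sample_probs[j] for j in candidates]
    negatives = rng.choices(candidates, weights=weights, k=num_negatives)
    classes = [positive] + negatives
    # Importance correction (assumed form): subtract log Q(j) from each logit so
    # that frequently sampled classes are not over-penalized.
    corrected = [dot(video_vecs[j], user_vec) - math.log(sample_probs[j]) for j in classes]
    m = max(corrected)
    log_norm = m + math.log(sum(math.exp(z - m) for z in corrected))
    return log_norm - corrected[0]  # negative log-probability of the positive

# Toy data (hypothetical): five videos, sampled in proportion to Q.
u = [1.0, 0.5]
V = [[0.2, 0.1], [1.0, 1.0], [-0.5, 0.3], [0.4, -0.2], [0.0, 0.9]]
Q = [0.4, 0.3, 0.15, 0.1, 0.05]
loss = sampled_softmax_loss(u, V, positive=1, sample_probs=Q, num_negatives=3,
                            rng=random.Random(0))
```

Each training step then touches only `num_negatives + 1` video vectors instead of the whole corpus. At serving time no normalized probability is needed, only the top-N videos by dot product, which is why an approximate nearest-neighbor search such as hashing is sufficient there.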
label and context selection
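Training examples are drawn from all YouTube watches, not only from watches on the recommendations we produce
Each user contributes the same number of training examples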
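The per-user limit can be sketched as below. This is an illustration only, assuming the watch log is a list of `(user_id, video_id)` pairs and that users with fewer watches than the limit simply keep all of them; the function name and the random subsampling strategy are assumptions, not details from the paper.

```python
import random
from collections import defaultdict

def cap_examples_per_user(watches, per_user, rng):
    """Keep at most `per_user` training examples for each user."""
    by_user = defaultdict(list)
    for user_id, video_id in watches:
        by_user[user_id].append(video_id)
    capped = []
    for user_id, videos in by_user.items():
        # Heavy users are randomly subsampled; light users keep everything.
        chosen = videos if len(videos) <= per_user else rng.sample(videos, per_user)
        capped.extend((user_id, v) for v in chosen)
    return capped

# Toy watch log (hypothetical): user "a" is far more active than user "b".
watches = [("a", 1), ("a", 2), ("a", 3), ("a", 4), ("b", 7)]
capped = cap_examples_per_user(watches, per_user=2, rng=random.Random(0))
```

Capping this way keeps a small group of very active users from dominating the loss.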
experiments with features and depth
Tower pattern: the bottom layer is the widest, and each successive layer is half as wide as the one before
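To make the tower pattern concrete, the sketch below lists the layer widths and counts the parameters of the resulting fully connected stack. The helper names, the input dimension of 256, and the starting width of 1024 are illustrative assumptions.

```python
def tower_widths(first_width, depth):
    """Layer widths that halve at each successive hidden layer."""
    return [first_width // (2 ** i) for i in range(depth)]

def dense_param_count(input_dim, widths):
    """Weights plus biases for a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for width in widths:
        total += fan_in * width + width
        fan_in = width
    return total

widths = tower_widths(1024, 3)            # [1024, 512, 256]
params = dense_param_count(256, widths)   # input dimension 256 is hypothetical
```

Because each layer halves the width, most of the parameters sit in the first two layers, so adding depth at the narrow end is comparatively cheap.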
ranking
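Feature engineering
Embed categorical features; normalize continuous features, and also feed powers of the normalized value $\tilde x$ as extra inputs ( $\sqrt{\tilde x},\tilde x,\tilde x^2$ )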
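The continuous-feature treatment can be sketched as follows, writing `x_norm` for the normalized value. The sketch assumes normalization by the empirical cumulative distribution of the training values, which maps each value into [0, 1]; the helper names and the toy values are hypothetical.

```python
import bisect
import math

def make_normalizer(training_values):
    """Return a function mapping x to the fraction of training values below x."""
    sorted_vals = sorted(training_values)
    n = len(sorted_vals)
    def normalize(x):
        # bisect_left counts how many training values are strictly less than x.
        return bisect.bisect_left(sorted_vals, x) / n
    return normalize

def expand(x_norm):
    """Feed the normalized value together with its square root and its square."""
    return [x_norm, math.sqrt(x_norm), x_norm ** 2]

# Toy feature (hypothetical), e.g. seconds since the user's last watch.
normalize = make_normalizer([10, 20, 30, 40])
features = expand(normalize(30))
```

The extra powers let the network represent super-linear and sub-linear effects of the same feature without having to learn them from the raw value alone.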