Neural Machine Translation (NMT) has drawn much attention in recent years due to its promising translation performance. The conventional optimization algorithm for NMT sets a unified learning rate for every gold target word during training. However, words under different probability distributions should be handled differently. We therefore propose a cost-aware learning rate method, which produces different learning rates for words with different costs. Specifically, for a gold word that ranks very low or has a large probability gap with the best candidate, the method produces a larger learning rate, and vice versa. Extensive experiments demonstrate the effectiveness of the proposed method.
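The idea of scaling a word's learning rate by its cost can be sketched as follows. This is a minimal illustration, not the paper's actual formula: the cost here is a hypothetical combination of the gold word's rank in the predicted distribution and its probability gap with the best candidate, so low-ranked gold words receive larger learning rates.

```python
import numpy as np

def cost_aware_lr_scale(probs, gold_idx, base_lr=0.1):
    """Return a per-word learning rate scaled by a 'cost' derived from the
    model's predictive distribution (hypothetical scaling rule for illustration).

    probs:    predicted probability distribution over the vocabulary
              for one target position.
    gold_idx: index of the gold target word.
    """
    probs = np.asarray(probs, dtype=float)
    best_idx = int(np.argmax(probs))
    # probability gap between the best candidate and the gold word
    gap = probs[best_idx] - probs[gold_idx]
    # rank of the gold word (0 = top); lower rank means more candidates beat it
    rank = int(np.sum(probs > probs[gold_idx]))
    # combine gap and normalized rank into a single nonnegative cost
    cost = gap + rank / len(probs)
    # larger cost -> larger learning rate; zero cost keeps the base rate
    return base_lr * (1.0 + cost)

# A gold word the model already ranks first keeps the base learning rate,
# while a low-ranked gold word gets a larger one.
probs = [0.7, 0.2, 0.1]
lr_easy = cost_aware_lr_scale(probs, gold_idx=0)  # gold is the top candidate
lr_hard = cost_aware_lr_scale(probs, gold_idx=2)  # gold ranks last
```

Here `lr_easy` equals the base learning rate (zero cost), while `lr_hard` is strictly larger, matching the behavior described in the abstract.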