Mongolian-Chinese Unsupervised Neural Machine Translation with Lexical Feature

Source: 18th China National Conference on Computational Linguistics and 2019 Annual Conference of the Chinese Information Processing Society of China

【Authors】: Ziyu Wu, Hongxu Hou, Ziyue Guo, Xuejiao Wang, Shuo Sun

【Affiliation】: Department of Computer Science, Inner Mongolia University, China

【Publication date】: 2019, Issue 8

【Keywords】: Mongolian-Chinese neural machine translation unsupervised method Stem-Affix Segm


【Abstract】

Machine translation has achieved impressive performance with advances in deep learning, but it relies on large-scale parallel corpora. There have been many attempts to extend these successes to low-resource languages, yet they still require large numbers of parallel sentences. In this study, we build a Mongolian-Chinese neural machine translation model based on unsupervised methods. Cross-lingual word embedding training plays a crucial role in unsupervised machine translation. Training methods based on generative adversarial networks (GANs) perform well only between two closely related languages, whereas the self-learning method can learn high-quality bilingual embedding mappings without any parallel corpora for low-resource languages. In this work, applying the self-learning method instead of GANs improves the BLEU score by 1.0. On this basis, we analyze the lexical features of Mongolian words and use stem-affix segmentation in Mongolian to replace the Byte-Pair-Encoding (BPE) operation, so that cross-lingual word embedding training is more accurate and yields higher-quality bilingual word embeddings that enhance translation performance. We report a BLEU score of 15.2 on the CWMT2017 Mongolian-Chinese dataset, without using any parallel corpora during training.
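The self-learning idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a small seed dictionary of index pairs (the method described in the abstract needs no parallel data, so the seed here is a simplification), uses plain orthogonal Procrustes for the mapping step, and uses nearest-neighbour retrieval for dictionary induction. The names `normalize`, `procrustes`, and `self_learning` are hypothetical.

```python
import numpy as np


def normalize(emb):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)


def procrustes(x, z):
    """Return the orthogonal matrix W that minimizes ||x @ W - z||_F."""
    u, _, vt = np.linalg.svd(x.T @ z)
    return u @ vt


def self_learning(src, tgt, seed_pairs, n_iter=10):
    """Alternate between fitting a mapping and re-inducing the dictionary.

    src, tgt   : (n_words, dim) monolingual embedding matrices
    seed_pairs : list of (src_index, tgt_index) pairs (an assumption of
                 this sketch, not something the abstract describes)
    """
    src, tgt = normalize(src), normalize(tgt)
    w = np.eye(src.shape[1])
    pairs = list(seed_pairs)
    for _ in range(n_iter):
        s_idx = [i for i, _ in pairs]
        t_idx = [j for _, j in pairs]
        # Step 1: fit the mapping on the current dictionary.
        w = procrustes(src[s_idx], tgt[t_idx])
        # Step 2: induce a new dictionary by nearest-neighbour retrieval.
        sims = (src @ w) @ tgt.T
        new_pairs = [(i, int(j)) for i, j in enumerate(sims.argmax(axis=1))]
        if new_pairs == pairs:  # dictionary stopped changing
            break
        pairs = new_pairs
    return w, pairs
```

On real embeddings the induced dictionary is noisy, so practical systems typically add refinements such as frequency cutoffs or more robust retrieval criteria; those are omitted here for brevity.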
