Khalil A. Nagi


The study aims at performing an error analysis as well as providing an evaluation of the quality of neural machine translation (NMT) represented by Google Translate when translating relative clauses. The study uses two test suites are composed of sentences that contain relative clauses. The first test suite composes of 108 pair sentences that are translated from English to Arabic whereas the second composes of 72 Arabic sentences that are translated into English. Errors annotation is performed by 6 professional annotators. The study presents a list of the annotated errors divided into accuracy and fluency errors that occur based on MQM. Manual evaluation is also performed by the six professionals along with a BLEU automatic evaluation using the Tilde Me platform. The results show that fluency errors are more frequent than accuracy errors. They also show that the frequency of errors and MT quality when translating from English into Arabic is lower than the frequency of errors and MT quality when translating from Arabic into English is also presented. Based on the performed error analysis and both manual and automatic evaluation, it is pointed out that the gap between MT and professional human translation is still large.


Download data is not yet available.



fluency, accuracy, error analysis, evaluation, test suite.

English Language (Translation)