Early-Detection System for Cross-Language (Translated) Plagiarism

Khabib Mustofa; Yosua Albert Sir

doi:10.1007/978-3-642-36818-9_3

Conference paper, Year: 2013

Khabib Mustofa

Role: Author
PersonId: 993453

Universitas Gadjah Mada

Yosua Albert Sir

Role: Author
PersonId: 1003092

Universitas Gadjah Mada

Abstract

Internet applications now routinely cross language borders. This has brought many advantages, but it has also introduced some side effects. One negative effect is cross-language plagiarism, also known as translated plagiarism. In academic institutions, translated plagiarism can be found in various kinds of work, such as final projects, theses, and papers. In this paper, a model for a web-based early-detection system for translated plagiarism is proposed and a prototype is developed. The system works by translating the input document (written in Bahasa Indonesia) into English using Google Translate API components, and then searching the World Wide Web for documents whose content is similar to the translated document. If any are found, the system downloads them and performs preprocessing steps (removing punctuation, numbers, stop words, and repeated words, and lemmatizing the remaining words) before comparing the content of both documents using the modified sentence-based detection algorithm (SBDA). The results show that the proposed method has a smaller error rate, leading to the conclusion that it has better accuracy.
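The preprocessing and comparison stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' modified SBDA, whose details are not given in this record: the stop-word list, the function names, the threshold, and the use of Jaccard overlap as the sentence-similarity measure are all assumptions made for illustration. The translation and web-search steps are omitted because they depend on external services, and lemmatization is left out to keep the example dependency-free.

```python
import re

# Tiny illustrative stop-word list (an assumption; the paper's list is not given here).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "to", "and", "by"}


def split_sentences(text):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def preprocess(sentence):
    """Lowercase, then drop punctuation, numbers, stop words, and repeated words.

    Lemmatization is omitted in this sketch; a fuller system would add it here.
    """
    words = re.findall(r"[a-z]+", sentence.lower())
    kept = []
    for w in words:
        if w not in STOP_WORDS and w not in kept:
            kept.append(w)
    return kept


def sentence_similarity(a, b):
    """Jaccard overlap of two token lists (a stand-in, not the modified SBDA)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def suspicious_sentences(suspect_text, source_text, threshold=0.5):
    """Return (suspect sentence, best source sentence, score) for scores >= threshold."""
    source = [(s, preprocess(s)) for s in split_sentences(source_text)]
    hits = []
    for sent in split_sentences(suspect_text):
        tokens = preprocess(sent)
        best = max(
            ((sentence_similarity(tokens, t), s) for s, t in source),
            default=(0.0, ""),
        )
        if best[0] >= threshold:
            hits.append((sent, best[1], best[0]))
    return hits
```

In this sketch, `suspect_text` would be the English translation of the input document and `source_text` the content of a document retrieved from the web. Comparing sentence by sentence, rather than whole documents, lets the result point at the specific passages that overlap.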

Keywords

translated plagiarism; sentence-based detection algorithm (SBDA); modified-SBDA; Google API

Domains

Computer Science [cs]; Information and communication sciences

Main file

978-3-642-36818-9_3_Chapter.pdf (350.19 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01480193

Submitted on: Wednesday, March 1, 2017, 11:05:34

Last modified on: Thursday, March 2, 2017, 01:04:26

Long-term archiving on: Tuesday, May 30, 2017, 14:44:46

Dates and versions

hal-01480193, version 1 (01-03-2017)

License

Attribution

Identifiers

HAL Id: hal-01480193, version 1
DOI: 10.1007/978-3-642-36818-9_3

Cite

Khabib Mustofa, Yosua Albert Sir. Early-Detection System for Cross-Language (Translated) Plagiarism. 1st International Conference on Information and Communication Technology (ICT-EurAsia), Mar 2013, Yogyakarta, Indonesia. pp.21-30, ⟨10.1007/978-3-642-36818-9_3⟩. ⟨hal-01480193⟩

Collections

IFIP-LNCS IFIP IFIP-TC IFIP-TC5 IFIP-TC8 IFIP-ICT-EURASIA IFIP-LNCS-7804
