Journal of Data Mining and Digital Humanities, http://jdmdh.episciences.org, ISSN 2416-5999, an open-access journal
Another perspective is the study of the efficiency and scalability of statistico-semantic methods. We are currently working on implementing methods such as word2vec (Mikolov et al., 2013) and on improving the memory efficiency of the algorithm of (Mousselly-Sergieh et al., 2013), based on stream processing methods.
A fast document copy detection model, Soft Computing - A Fusion of Foundations, Methodologies and Applications, vol.10, pp.1-41, 2006.
DOI : 10.1007/s00500-005-0463-2
Unsupervised Detection and Visualisation of Textual Reuse on Ancient Greek Texts, Journal of the Chicago Colloquium on Digital Humanities and Computer Science, vol.1, pp.2-3, 2010.
Increasing Recall for Text Re-use in Historical Documents to Support Research in the Humanities, Lecture Notes in Computer Science, vol.7489, pp.95-100, 2012.
DOI : 10.1007/978-3-642-33290-6_11
Step Closer To Paraphrase Detection On Historical Texts: About The Quality of Text Re-use Techniques and the Ability to Learn Paradigmatic Relations, Journal of the Chicago Colloquium on Digital Humanities and Computer Science, 2011.
lemmatisation des textes grecs et byzantins : une approche particulière de la langue et des auteurs, Byzantion : revue internationale des études byzantines, pp.35-54, 1996.
Applying Similarity Measures for Automatic Lemmatization: A Case Study for Modern Greek and English, International Journal on Artificial Intelligence Tools, vol.18, issue.05, p.1043, 1142.
DOI : 10.1075/jgl.4.09ral
Identifying quotations in reference works and primary materials, Research and Advanced Technology for Digital Libraries, pp.78-87, 2008.
DOI : 10.1007/978-3-540-87599-4_9
Abdel-Aal, Analysis and extraction of sentence-level paraphrase sub-corpus in CS education, Proceedings of the 2012 ACM SIGITE Conference, pp.49-54.
New Functions for Unsupervised Asymmetrical Paraphrase Detection, Journal of Software, vol.2, issue.4, pp.4-12, 2007.
DOI : 10.4304/jsw.2.4.12-23
A computational model of text reuse in ancient literary texts, Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pp.472-479, 2007.
Meme-tracking and the dynamics of the news cycle, Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '09, pp.497-506.
DOI : 10.1145/1557019.1557077
Computer-based plagiarism detection methods and tools, Proceedings of the 2007 international conference on Computer systems and technologies, CompSysTech '07, pp.1-6, 2007.
DOI : 10.1145/1330598.1330642
Efficient Estimation of Word Representations in Vector Space, Computation and Language, 2013.
Distributed Representations of Words and their Compositionality, NIPS, pp.3111-3119, 2013.
Tag Similarity in Folksonomies, pp.319-334, 2013.
Detecting Text Reuse with Modified and Weighted N-grams, Proceedings of the ACM First Joint Conference on Lexical and Computational Semantics, pp.54-58, 2012.