R. Agrawal, S. Gollapudi, A. Kannan, and K. Kenthapadi, Enriching textbooks with images, Proceedings of the 20th ACM international conference on Information and knowledge management - CIKM '11, 2011.

N. Alm, M. Iwabuchi, P. N. Andreasen, and K. Nakamura, A Multi-lingual Augmentative Communication System, Lecture Notes in Computer Science, pp.398-408, 2003.

C. Stephanidis, ERCIM workshop on "User interfaces for all", ACM SIGCAPH Computers and the Physically Handicapped, issue.54, pp.20-23, 1996.
URL : https://hal.archives-ouvertes.fr/hal-00175659

K. Barnard, Computational Methods for Integrating Vision and Language, Synthesis Lectures on Computer Vision, vol.6, issue.1, pp.1-227, 2016.

M. Baroni, S. Bernardini, A. Ferraresi, and E. Zanchetta, The WaCky wide web: a collection of very large linguistically processed web-crawled corpora, Language Resources and Evaluation, vol.43, issue.3, pp.209-226, 2009.

D. M. Blei and M. I. Jordan, Variational methods for the Dirichlet process, Twenty-first international conference on Machine learning - ICML '04, 2004.

L. Cao and L. Fei-Fei, Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes, 2007 IEEE 11th International Conference on Computer Vision, 2007.

R. N. Carney and J. R. Levin, Pictorial illustrations still improve students' learning from text

F. Chollet, Keras

B. Coyne and R. Sproat, WordsEye, Proceedings of the 28th annual conference on Computer graphics and interactive techniques - SIGGRAPH '01, 2001.

B. Coyne, R. Sproat, and J. Hirschberg, Spatial relations in text-to-scene conversion, Computational Models of Spatial Language Interpretation, Workshop at Spatial Cognition

D. Pero, P. Luca, J. Lee, E. Magahern, K. Hartley et al., Fusing object detection and region appearance for image-text alignment, Proceedings of the th ACM international conference on Multimedia

D. Delgado, J. Magalhaes, and N. Correia, Assisted news reading with automated illustration, Proceedings of the international conference on Multimedia - MM '10, 2010.

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., Imagenet: A large-scale hierarchical image database, IEEE conference on computer vision and pattern recognition

P. Dwivedi, Understanding and Coding a ResNet in Keras

A. Ferraresi, S. Bernardini, G. Picci, and M. Baroni, Web corpora for bilingual lexicography: A pilot study of English/French collocation extraction and translation, Using Corpora in Contrastive and Translation Studies

C. J. Fillmore and C. Baker, A frames approach to semantic analysis, The Oxford handbook of linguistic analysis

N. Firoozeh, A. Nazarenko, F. Alizon, and B. Daille, Keyword extraction: Issues and methods, Natural Language Engineering, vol.26, issue.3, pp.259-291, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02401629

K. Fujii, H. Nanba, T. Takezawa, and A. Ishino, Enriching Travel Guidebooks with Travel Blog Entries and Archives of Answered Questions, Information and Communication Technologies in Tourism 2016, pp.157-171, 2016.

C. Goutte and E. Gaussier, A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation, Lecture Notes in Computer Science, pp.345-359, 2005.

K. He, X. Zhang, S. Ren, and J. Sun, Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

N. Hernandez, F. Poulard, M. Vernier, and J. Rocheteau, Bard, Nicolas Vernier, 2011.

M. Hodosh, P. Young, and J. Hockenmaier, Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics, Journal of Artificial Intelligence Research, vol.47, pp.853-899, 2013.

M. Honnibal and I. Montani, A PDP approach to deterministic natural language parsing, Neural Networks, vol.1, p.305, 1988.

A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang et al., MobileNets: Efficient convolutional neural networks for mobile vision applications

C. Huang, C. Li, and M. Shan, VizStory: Visualization of Digital Narrative for Fairy Tales, 2013 Conference on Technologies and Applications of Artificial Intelligence, 2013.

INRIA, Proceedings 1995 INRIA/IEEE Symposium on Emerging Technologies and Factory Automation. ETFA'95, 1995.

Y. Itabashi and Y. Masunaga, Correlating Scenes as Series of Document Sentences with Images, 21st International Conference on Data Engineering Workshops (ICDEW'05), 2005.

Y. Jiang, J. Liu, Z. Li, C. Xu, and H. Lu, Chat with illustration, Proceedings of the 4th International Conference on Internet Multimedia Computing and Service - ICIMCS '12, 2012.

R. Johansson, D. Williams, A. Berglund, and P. Nugues, Carsim, Proceedings of the 2nd Workshop on Text Meaning and Interpretation - TextMean '04, 2004.

K. Spärck Jones, A statistical interpretation of term specificity and its application in retrieval, Journal of Documentation, vol.60, issue.5, pp.493-502, 2004.

A. Kelman, M. Sofka, and C. V. Stewart, Keypoint Descriptors for Matching Across Multiple Image Modalities and Non-linear Intensity Variations, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007.

K. Kesorn and S. Chimlek,

K. Kesorn, S. Chimlek, S. Poslad, and P. Piamsa-nga, Visual content representation using semantically similar visual words, Expert Systems with Applications, vol.38, issue.9, pp.11472-11481, 2011.

K. Krawczak, M. Fabiszak, and M. Hilpert, A corpus-based, cross-linguistic approach to mental predicates and their complementation: Performativity and descriptivity vis-à-vis boundedness and picturability, Folia Linguistica, vol.50, issue.2, 2016.

M. Kunda, Visual mental imagery: A view from artificial intelligence, Cortex, vol.105, pp.155-172, 2018.

T. H. Lam and R. S. Lee, iJADE FreeWalker - An Intelligent Ontology Agent-based Tourist Guiding System, Studies in Computational Intelligence, pp.103-125, 2007.

C. Lau, D. Tjondronegoro, J. Zhang, S. Geva, and Y. Liu, Fusing Visual and Textual Retrieval Techniques to Effectively Search Large Collections of Wikipedia Images, Comparative Evaluation of XML Information Retrieval Systems, pp.345-357

H. Li, J. Tang, G. Li, and T. Chua, Word2Image, Proceeding of the 16th ACM international conference on Multimedia - MM '08, 2008.

E. Loper and S. Bird, NLTK, Proceedings of the ACL-02 Workshop on Effective tools and methodologies for teaching natural language processing and computational linguistics, 2002.

D. G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.

H. Ma, J. Zhu, M. R. Lyu, and I. King, Bridging the Semantic Gap Between Image Contents and Tags, IEEE Transactions on Multimedia, vol.12, issue.5, pp.462-473, 2010.

W. May, S. Fidler, A. Fazly, S. Dickinson, and S. Stevenson, Unsupervised disambiguation of image captions, Proceedings of the main conference and the shared task, and Volume : Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval )

R. McDonald, J. Nivre, Y. Quirmbach-Brundage, Y. Goldberg, D. Das et al., Universal dependency annotation for multilingual parsing, Proceedings of the st Annual Meeting of the Association for Computational Linguistics

A. Medelyan, A python implementation of the Rapid Automatic Keyword Extraction

I. Medhi, A. Sagar, and K. Toyama, Text-Free User Interfaces for Illiterate and Semi-Literate Users, 2006 International Conference on Information and Communication Technologies and Development, 2006.

R. Mihalcea and C. W. Leong, Toward communicating simple sentences using pictorial representations, Machine Translation, vol.22, issue.3, pp.153-173, 2008.

R. Mihalcea and P. Tarau, Textrank: Bringing order into text, Proceedings of the conference on empirical methods in natural language processing

G. A. Miller, R. Beckwith, C. Fellbaum, D. Gross, and K. J. Miller, Introduction to WordNet: An On-line Lexical Database, International Journal of Lexicography, vol.3, issue.4, pp.235-244, 1990.

A. Moro, A. Raganato, and R. Navigli, Entity Linking meets Word Sense Disambiguation: a Unified Approach, Transactions of the Association for Computational Linguistics, vol.2, pp.231-244, 2014.

R. Navigli and S. P. Ponzetto, BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network, Artificial Intelligence, vol.193, pp.217-250, 2012.

S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion et al., Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research
URL : https://hal.archives-ouvertes.fr/hal-00650905

F. Pezoa, J. L. Reutter, F. Suarez, M. Ugarte, and D. Vrgoč, Foundations of JSON Schema, Proceedings of the 25th International Conference on World Wide Web - WWW '16, 2016.

D. Qin, C. Wengert, and L. Van Gool, Query Adaptive Similarity for Large Scale Object Retrieval, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.

L. Richardson, Beautiful soup documentation

S. Rose, D. Engel, N. Cramer, and W. Cowley, Automatic Keyword Extraction from Individual Documents, Text Mining, pp.1-20, 2010.

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh et al., ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision, vol.115, issue.3, pp.211-252, 2015.

S. Santini and A. Gupta, A wavelet data model for image databases, IEEE International Conference on Multimedia and Expo (ICME 2001), 2001.

H. Sarma, R. Porzel, J. D. Smeddinck, R. Malaka, and A. B. Samaddar, A Text to Animation System for Physical Exercises, The Computer Journal, 2018.

N. Serrano, A. E. Savakis, and J. Luo, Improved scene classification using efficient low-level features and semantic cues, Pattern Recognition, vol.37, issue.9, pp.1773-1784, 2004.

E. B. Sudderth, A. Torralba, W. T. Freeman, and A. S. Willsky, Depth from Familiar Objects: A Hierarchical Model for 3D Scenes, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2 (CVPR'06)

E. B. Sudderth, A. Torralba, W. T. Freeman, and A. S. Willsky, Learning hierarchical models of scenes, objects, and parts, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, 2005.

E. Blakesley-Lindsay, Stands4.com (review of Yigal Ben Efraim, Stands4.com, URL: http://www.stands4.com, E-mail: yigal@stands4.com, gratis, last visited September 2002), Reference Reviews, vol.17, issue.1, pp.27-28, 2003.

N. Uzzaman, J. P. Bigham, and J. F. Allen, Multimodal summarization of complex sentences, Proceedings of the 15th international conference on Intelligent user interfaces - IUI '11, 2011.

Wikipedia

L. Yu, E. Park, A. C. Berg, and T. L. Berg, Visual Madlibs: Fill in the Blank Description Generation and Question Answering, 2015 IEEE International Conference on Computer Vision (ICCV), 2015.

X. Zhu, A. B. Goldberg, M. Eldawy, R. Charles, B. Dyer et al., A text-to-picture synthesis system for augmenting communication, In: AAAI

C. L. Zitnick, D. Parikh, and L. Vanderwende, Learning the Visual Interpretation of Sentences, 2013 IEEE International Conference on Computer Vision, 2013.