E. L. Moel-de, Expanding the usability of recorded lectures, p.59431, 2010.

M. M. Waldrop, Massive open online courses, aka MOOCs, transform higher education and science, Nature magazine, 2014.

D. C. Gibbon and Z. Liu, Introduction to Video Search Engines, 2008.

D. R. Cutting, J. O. Pedersen, D. R. Karger, and J. W. Tukey, Scatter/Gather: a cluster-based approach to browsing large document collections, Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '92, pp.318-329, 1992.
DOI : 10.1145/133160.133214

J. Yang, Q. Li, L. Wenyin, and Y. Zhuang, Searching for Flash Movies on the Web: A Content and Context Based Framework, World Wide Web, vol.1, issue.3, pp.495-517, 2005.
DOI : 10.1007/s11280-005-0905-x

S. C. Cheung and A. Zakhor, Efficient video similarity measurement with video signature, Proceedings. International Conference on Image Processing, pp.59-74, 2003.
DOI : 10.1109/ICIP.2002.1038101

A. Hindle, J. Shao, D. Lin, J. Lu, and R. Zhang, Clustering Web video search results based on integration of multiple features, World Wide Web, vol.31, issue.11-16, pp.53-73, 2011.
DOI : 10.1007/s11280-010-0097-x

A. Amine, Z. Elberrichi, and M. Simonet, Evaluation of text clustering methods using WordNet, Int. Arab J. Inf. Technol., vol.7, issue.4, pp.349-357, 2010.

R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu et al., Natural language processing (almost) from scratch, The Journal of Machine Learning Research, vol.12, pp.2493-2537, 2011.

A. D. Friederici, The Brain Basis of Language Processing: From Structure to Function, Physiological Reviews, vol.91, issue.4, pp.1357-1392, 2011.
DOI : 10.1152/physrev.00006.2011

M. Jin and M. Murakami, Authors' Characteristic Writing Styles as Seen Through Their Use of Commas, Behaviormetrika, vol.20, issue.1, pp.3-76, 1992.
DOI : 10.2333/bhmk.20.63

S. Shehata, F. Karray, and M. S. Kamel, An efficient concept-based mining model for enhancing text clustering, IEEE Transactions on Knowledge and Data Engineering, vol.22, issue.10, pp.1360-1371, 2010.

C. Passini, M. Luiza, K. B. Estébanez, G. P. Figueredo, F. Ebecken et al., A Strategy for Training Set Selection in Text Classification Problems, International Journal of Advanced Computer Science & Applications, vol.4, issue.6, 2013.

S. Kiritchenko and S. Matwin, Email classification with co-training, Proceedings of the 2011 Conference of the Center for Advanced Studies on Collaborative Research, pp.301-312, 2011.
DOI : 10.1007/978-3-540-39985-8_61
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.62.1192

K. Filippova and K. B. Hall, Improved video categorization from text metadata and user comments, Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, SIGIR '11, pp.835-842, 2011.
DOI : 10.1145/2009916.2010028

M. Gilroy, Higher education migrates to YouTube and social networks, Education Digest, vol.75, issue.7, pp.18-22, 2010.

N. Selwyn, Social media in higher education, The Europa World of Learning, 2012.

G. Chatzopoulou, C. Sheng, and M. Faloutsos, A First Step Towards Understanding Popularity in YouTube, 2010 INFOCOM IEEE Conference on Computer Communications Workshops, pp.1-6, 2010.
DOI : 10.1109/INFCOMW.2010.5466701

F. Figueiredo, J. M. Almeida, M. A. Gonçalves, and F. Benevenuto, On the Dynamics of Social Media Popularity, ACM Transactions on Internet Technology, vol.14, issue.4, pp.1402-1777, 2014.
DOI : 10.1145/2665065

B. Liu, Sentiment Analysis and Opinion Mining, Synthesis Lectures on Human Language Technologies, vol.5, issue.1, pp.1-167, 2012.
DOI : 10.2200/S00416ED1V01Y201204HLT016

H. Chen and D. Zimbra, AI and Opinion Mining, IEEE Intelligent Systems, vol.25, issue.3, pp.74-80, 2010.
DOI : 10.1109/MIS.2010.75

M. Wattenhofer, R. Wattenhofer, and Z. Zhu, The YouTube Social Network, ICWSM, 2012.

M. Moran, J. Seaman, and H. Tinti-Kane, Teaching, Learning, and Sharing: How Today's Higher Education Faculty Use Social Media, Babson Survey Research Group, 2011.

Weka, Data Mining Software in Java, 2014.

T. Kanungo, D. M. Mount, N. S. Netanyahu, C. Piatko, R. Silverman et al., The analysis of a simple k-means clustering algorithm, Proceedings of the sixteenth annual symposium on Computational geometry, SCG '00, pp.100-109, 2000.
DOI : 10.1145/336154.336189
URL : https://hal.archives-ouvertes.fr/in2p3-01325998

T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman et al., An efficient k-means clustering algorithm: Analysis and implementation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.24, issue.7, pp.881-892, 2002.

P. Vora and B. Oza, A Survey on K-mean Clustering and Particle Swarm Optimization, International Journal of Science and Modern Engineering, pp.24-26, 2013.

I. Färber, S. Günnemann, H. P. Kriegel, P. Kröger, E. Müller et al., On using class-labels in evaluation of clusterings, MultiClust: 1st International Workshop on Discovering, Summarizing and Using Multiple Clusterings Held in Conjunction with KDD, 2010.

N. Wiegand, Creating Complex Sentence Structure, Proceedings of the Annual Meeting of the Berkeley Linguistics Society, 2011.
DOI : 10.3765/bls.v10i0.1932

G. Beliakov, H. Bustince, and J. Fernandez, The median and its extensions, Fuzzy Sets and Systems, pp.36-47, 2011.

P. Grzybek, E. Stadlober, and E. Kelih, The Relationship of Word Length and Sentence Length: The Inter-Textual Perspective, Advances in Data Analysis, pp.611-618, 2007.
DOI : 10.1007/978-3-540-70981-7_70

K. Mahowald, E. Fedorenko, S. T. Piantadosi, and E. Gibson, Info/information theory: Speakers choose shorter words in predictive contexts, Cognition, vol.126, issue.2, pp.313-318, 2013.
DOI : 10.1016/j.cognition.2012.09.010

R. L. Hill and W. S. Murray, Commas and spaces: The point of punctuation, 11th Annual CUNY Conference on Human Sentence Processing, 1998.

D. D. Palmer, Tokenisation and sentence segmentation, Handbook of Natural Language Processing, chapter 2, 2000.