
Advanced Preprocessing for intersites Web Usage Mining

Doru Tanasa 1, Brigitte Trousse 1
1 AxIS - Usage-centered design, analysis and improvement of information systems
CRISAM - Inria Sophia Antipolis - Méditerranée, Inria Paris-Rocquencourt
Abstract : Web usage mining (WUM) applies data mining procedures to analyze user access to Web sites. As with any KDD (knowledge discovery and data mining) process, WUM contains three main steps: preprocessing, knowledge extraction, and results analysis. We focus on data preprocessing, a tedious and complex process. Analysts aim to determine the exact list of users who accessed the Web site and to reconstitute user sessions, that is, the sequence of actions each user performed on the Web site. Intersites WUM deals with Web server logs from several Web sites, generally belonging to the same organization. Thus, analysts must reassemble the users' paths through all the different Web servers that they visited. Our solution is to join all the log files and reconstitute the visit. Classical data preprocessing involves three steps: data fusion, data cleaning, and data structuration. Our solution for WUM adds what we call advanced data preprocessing. This consists of a data summarization step, which allows the analyst to select only the information of interest. We've successfully tested our solution in an experiment with log files from INRIA Web sites.
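
The three classical preprocessing steps named in the abstract (data fusion, data cleaning, data structuration) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Request` record and its fields, the identification of a user by the (IP address, user agent) pair, the list of file suffixes dropped during cleaning, and the 30-minute inactivity timeout are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from heapq import merge
from itertools import groupby


@dataclass(frozen=True)
class Request:
    """One log entry; the field names are hypothetical, not from the article."""
    time: datetime
    ip: str
    agent: str
    site: str   # which Web server logged the request
    url: str


def fuse(*logs):
    """Data fusion: merge per-server logs (each already time-ordered)."""
    return list(merge(*logs, key=lambda r: r.time))


def clean(requests, skip_suffixes=(".gif", ".jpg", ".png", ".css", ".js")):
    """Data cleaning: drop requests for embedded resources (assumed suffix list)."""
    return [r for r in requests if not r.url.lower().endswith(skip_suffixes)]


def structure(requests, timeout=timedelta(minutes=30)):
    """Data structuration: group requests per user, then split into visits.

    A user is assumed to be the (ip, agent) pair; a new visit starts when the
    gap between two consecutive requests exceeds `timeout` (assumed value).
    """
    def user_key(r):
        return (r.ip, r.agent)

    visits = []
    ordered = sorted(requests, key=lambda r: (user_key(r), r.time))
    for _, group in groupby(ordered, key=user_key):
        current = []
        for r in group:
            if current and r.time - current[-1].time > timeout:
                visits.append(current)
                current = []
            current.append(r)
        visits.append(current)
    return visits
```

Because fusion happens before structuration, a single visit can contain requests logged by different servers, which is the intersites case the abstract describes.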
Document type : Journal articles
Contributor : Brigitte Trousse
Submitted on : Saturday, February 22, 2014 - 3:46:31 PM
Last modification on : Wednesday, April 6, 2022 - 3:48:11 PM




Doru Tanasa, Brigitte Trousse. Advanced Preprocessing for intersites Web Usage Mining. IEEE Intelligent Systems, Institute of Electrical and Electronics Engineers, 2004, 19 (2), pp.59-65. ⟨10.1109/MIS.2004.1274912⟩. ⟨hal-00950763⟩


