Journal articles

Advanced Preprocessing for intersites Web Usage Mining

Doru Tanasa 1 Brigitte Trousse 1
1 AxIS - Usage-centered design, analysis and improvement of information systems
CRISAM - Inria Sophia Antipolis - Méditerranée, Inria Paris-Rocquencourt
Abstract: Web usage mining (WUM) applies data mining procedures to analyze user access to Web sites. As with any KDD (knowledge discovery and data mining) process, WUM contains three main steps: preprocessing, knowledge extraction, and results analysis. We focus on data preprocessing, a tedious, complex process. Analysts aim to determine the exact list of users who accessed the Web site and to reconstitute user sessions, that is, the sequence of actions each user performed on the Web site. Intersites WUM deals with Web server logs from several Web sites, generally belonging to the same organization. Thus, analysts must reassemble the users' path through all the different Web servers that they visited. Our solution is to join all the log files and reconstitute the visit. Classical data preprocessing involves three steps: data fusion, data cleaning, and data structuration. Our solution for WUM adds what we call advanced data preprocessing. This consists of a data summarization step, which will allow the analyst to select only the information of interest. We've successfully tested our solution in an experiment with log files from INRIA Web sites.
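The idea of joining the log files and reconstituting the visit can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the sample data, the use of (IP address, user agent) as a user key, and the 30-minute inactivity threshold are all assumptions made for illustration, and the data cleaning and summarization steps are left out.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-parsed log records from two Web servers:
# (server, ip, user_agent, timestamp, url)
SITE_A = [
    ("www", "10.0.0.1", "Mozilla", datetime(2004, 1, 1, 10, 0), "/index.html"),
    ("www", "10.0.0.1", "Mozilla", datetime(2004, 1, 1, 10, 50), "/news.html"),
]
SITE_B = [
    ("www-sop", "10.0.0.1", "Mozilla", datetime(2004, 1, 1, 10, 5), "/axis/"),
    ("www-sop", "10.0.0.2", "Opera", datetime(2004, 1, 1, 10, 6), "/axis/"),
]

# Assumed inactivity threshold; the abstract does not specify one.
TIMEOUT = timedelta(minutes=30)


def fuse(*logs):
    """Data fusion: join all log files into one list ordered by timestamp."""
    return sorted((rec for log in logs for rec in log), key=lambda r: r[3])


def visits_by_user(records, timeout=TIMEOUT):
    """Data structuration: group requests per assumed user key (ip, agent)
    and start a new visit whenever the gap between requests exceeds timeout."""
    by_user = defaultdict(list)
    for server, ip, agent, ts, url in records:
        visits = by_user[(ip, agent)]
        # visits[-1][-1][0] is the timestamp of the user's previous request.
        if not visits or ts - visits[-1][-1][0] > timeout:
            visits.append([])
        visits[-1].append((ts, server, url))
    return dict(by_user)


if __name__ == "__main__":
    for user, visits in visits_by_user(fuse(SITE_A, SITE_B)).items():
        for number, visit in enumerate(visits, start=1):
            path = [f"{server}{url}" for _, server, url in visit]
            print(user, number, path)
```

Because the logs are merged before grouping, a visit that moves from one server to another stays a single visit instead of being split per server.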
Document type :
Journal articles

https://hal.inria.fr/hal-00950763
Contributor : Brigitte Trousse
Submitted on : Saturday, February 22, 2014 - 3:46:31 PM
Last modification on : Friday, May 25, 2018 - 12:02:04 PM

Citation

Doru Tanasa, Brigitte Trousse. Advanced Preprocessing for intersites Web Usage Mining. IEEE Intelligent Systems, Institute of Electrical and Electronics Engineers, 2004, 19 (2), pp.59-65. ⟨10.1109/MIS.2004.1274912⟩. ⟨hal-00950763⟩
