Abstract: Every day the global media system produces an abundance of news stories, all containing many references to people. An important task is to automatically generate reliable lists of people by analysing news content. We describe a system that leverages large amounts of data for this purpose. The lack of structure in this data gives rise to a large number of ways to refer to any particular person. Entity matching attempts to connect references that refer to the same person, usually by employing some measure of similarity between references. We use information from multiple sources to produce a set of similarity measures with differing strengths and weaknesses. We show how combining these measures can improve precision without decreasing recall.
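To illustrate the general idea of fusing similarity measures, here is a minimal Python sketch that scores a pair of references with two measures and combines them. It is not the authors' system: the record layout (`name`, `context`), the two measures (character-level string similarity and Jaccard overlap of co-occurring terms), the equal weighting, and the example references are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Character-level similarity of two name strings, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_similarity(ref_a: dict, ref_b: dict, weight: float = 0.5) -> float:
    """Weighted average of name similarity and context overlap.

    `weight` is a hypothetical parameter: the share given to the name measure.
    """
    name_score = string_similarity(ref_a["name"], ref_b["name"])
    context_score = jaccard(ref_a["context"], ref_b["context"])
    return weight * name_score + (1 - weight) * context_score


# Hypothetical references: a name string plus terms seen near it in the news.
ref_1 = {"name": "Barack Obama", "context": {"president", "washington", "senate"}}
ref_2 = {"name": "B. Obama", "context": {"president", "washington", "election"}}

score = combined_similarity(ref_1, ref_2)
print(f"combined similarity: {score:.2f}")
```

In this sketch a pair would be treated as a match when the combined score exceeds a chosen threshold. A pair with similar names but unrelated contexts scores lower than under the name measure alone, which is one simple way that combining measures can reduce false matches.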
https://hal.inria.fr/hal-01060664
Contributor: Hal Ifip
Submitted on: Friday, November 17, 2017 - 3:59:28 PM
Last modification on: Thursday, December 9, 2021 - 10:44:07 AM
Long-term archiving on: Sunday, February 18, 2018 - 4:20:36 PM
Omar Ali, Nello Cristianini. Information Fusion for Entity Matching in Unstructured Data. 6th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations (AIAI), Oct 2010, Larnaca, Cyprus. pp.162-169, ⟨10.1007/978-3-642-16239-8_23⟩. ⟨hal-01060664⟩