Data cleaning: Problems and current approaches, IEEE Data Engineering Bulletin, vol.23, pp.3-13, 2000. ,
An Introduction to Duplicate Detection, Synthesis Lectures on Data Management, vol.2, issue.1, 2010. ,
DOI : 10.2200/S00262ED1V01Y201003DTM003
Eliminating Fuzzy Duplicates in Data Warehouses, Conference on Very Large Databases (VLDB), pp.586-597, 2002. ,
DOI : 10.1016/B978-155860869-6/50058-5
Domain-independent data cleaning via analysis of entity-relationship graph, ACM Transactions on Database Systems, vol.31, issue.2, pp.716-767, 2006. ,
DOI : 10.1145/1138394.1138401
DogmatiX tracks down duplicates in XML, Proceedings of the 2005 ACM SIGMOD international conference on Management of data , SIGMOD '05, pp.431-442, 2005. ,
DOI : 10.1145/1066157.1066207
Structure-based inference of xml similarity for fuzzy duplicate detection, Proceedings of the sixteenth ACM conference on Conference on information and knowledge management , CIKM '07, pp.293-302, 2007. ,
DOI : 10.1145/1321440.1321483
Matching XML documents in highly dynamic applications, Proceeding of the eighth ACM symposium on Document engineering, DocEng '08, pp.191-198, 2008. ,
DOI : 10.1145/1410140.1410178
Structure aware XML object identification, VLDB Workshop on Clean Databases (CleanDB), 2006. ,
An Overview of XML Duplicate Detection Algorithms, Soft Computing in XML Data Management, Studies in Fuzziness and Soft Computing, 2010. ,
DOI : 10.1007/978-3-642-14010-5_8
XML duplicate detection using sorted neigborhoods, Conference on Extending Database Technology (EDBT), pp.773-791, 2006. ,
Finding similar identities among objects from multiple web sources, Proceedings of the fifth ACM international workshop on Web information and data management , WIDM '03, pp.90-93, 2003. ,
DOI : 10.1145/956699.956719
The merge/purge problem for large databases, Conference on the Management of Data (SIGMOD), pp.127-138, 1995. ,
Probabilistic Reasoning in Intelligent Systems: Networks of plausible inference, 1988. ,
Duplicate detection through structure optimization, Proceedings of the 20th ACM international conference on Information and knowledge management, CIKM '11, pp.443-452, 2011. ,
DOI : 10.1145/2063576.2063644
Measurement of Diversity, Nature, vol.163, issue.4148, p.688, 1949. ,
DOI : 10.1038/163688a0
Support vector regression machines, Advances in Neural Information Processing Systems (NIPS), pp.155-161, 1996. ,
Optimization by Simulated Annealing, Science, vol.220, issue.4598, pp.671-680, 1983. ,
DOI : 10.1126/science.220.4598.671
Making large-scale support vector machine learning practical, pp.169-184, 1999. ,
Object-level ranking, Proceedings of the 14th international conference on World Wide Web , WWW '05, pp.567-574, 2005. ,
DOI : 10.1145/1060745.1060828
Ranking web objects from multiple communities, Proceedings of the 15th ACM international conference on Information and knowledge management , CIKM '06, pp.377-386, 2006. ,
DOI : 10.1145/1183614.1183670