Probabilistic Record Matching for Entity Resolution Using Markov Logic Networks

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Entity resolution (ER) is the problem of identifying objects that refer to the same real-world entity and merging them into a single representation. In the database context, ER is also known as record linkage, which determines the records that refer to the same entity; the statistical approach to this type of ER is called probabilistic record linkage (PRL). PRL has been applied to various ER problems, including variants that use machine learning as an improvement. However, the probabilistic approach has difficulty dealing with missing data, which commonly occur in unreliable datasets. Such unreliable data increase uncertainty and can reduce the quality of the final result. This paper discusses an alternative approach to PRL that uses Markov logic networks (MLNs) to infer the matching of record pairs in unreliable datasets, especially datasets with a high rate of missing data. The proposed approach was inspired by the model of matching dependencies (MDs), which was formally introduced to address unreliable datasets. Experiments on real-world datasets from the State Islamic University of Maulana Malik Ibrahim Malang, Indonesia, achieved an accuracy of 0.977, approaching the 0.986 obtained by the previous method.

Original language: English
Title of host publication: 2018 Electrical Power, Electronics, Communications, Controls and Informatics Seminar, EECCIS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 360-364
Number of pages: 5
ISBN (Electronic): 9781538652510
Publication status: Published - 2 Jul 2018
Event: 2018 Electrical Power, Electronics, Communications, Controls and Informatics Seminar, EECCIS 2018 - Batu, East Java, Indonesia
Duration: 9 Oct 2018 to 11 Oct 2018

Publication series

Name: 2018 Electrical Power, Electronics, Communications, Controls and Informatics Seminar, EECCIS 2018

Conference

Conference: 2018 Electrical Power, Electronics, Communications, Controls and Informatics Seminar, EECCIS 2018
Country/Territory: Indonesia
City: Batu, East Java
Period: 9/10/18 to 11/10/18

Keywords

  • data integration
  • entity resolution
  • markov logic networks
  • matching dependencies
  • probabilistic record linkage
