TY - GEN
T1 - Edit distance weighting modification using phonetic and typographic letter grouping over homomorphic encrypted data
AU - Ahmad, Tohari
AU - Indrayana, Kukuh
AU - Wibisono, Waskitho
AU - Ijtihadie, Royyana M.
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/1
Y1 - 2017/7/1
N2 - The Edit Distance string matching algorithm gives the same weight to every mismatching character. In practice, a mismatch can be caused by a phonetic error, a typing error, or an unknown error. Editex improves on that algorithm, but it tolerates only phonetic errors. In this paper, we increase its performance by proposing a new weighting and distance calculation for that algorithm. Here, the sources of mismatching are grouped into phonetic and typographic errors. Characters are divided into phonetic and typographic groups, each of which has its own weight. By using this letter grouping, our proposed method is also suitable for implementation over homomorphically encrypted data. Experimental results show that this method produces a lower false positive rate than the Edit Distance and Editex algorithms. The proposed method generates 2.2 false positives per experiment, while Edit Distance and Editex produce 8.24 and 3.12, respectively. It can be inferred that the proposed method is able to produce a relatively low error rate.
AB - The Edit Distance string matching algorithm gives the same weight to every mismatching character. In practice, a mismatch can be caused by a phonetic error, a typing error, or an unknown error. Editex improves on that algorithm, but it tolerates only phonetic errors. In this paper, we increase its performance by proposing a new weighting and distance calculation for that algorithm. Here, the sources of mismatching are grouped into phonetic and typographic errors. Characters are divided into phonetic and typographic groups, each of which has its own weight. By using this letter grouping, our proposed method is also suitable for implementation over homomorphically encrypted data. Experimental results show that this method produces a lower false positive rate than the Edit Distance and Editex algorithms. The proposed method generates 2.2 false positives per experiment, while Edit Distance and Editex produce 8.24 and 3.12, respectively. It can be inferred that the proposed method is able to produce a relatively low error rate.
KW - edit distance
KW - homomorphic encryption
KW - information security
KW - string matching
UR - http://www.scopus.com/inward/record.url?scp=85046644487&partnerID=8YFLogxK
U2 - 10.1109/ICSITech.2017.8257147
DO - 10.1109/ICSITech.2017.8257147
M3 - Conference contribution
AN - SCOPUS:85046644487
T3 - Proceeding - 2017 3rd International Conference on Science in Information Technology: Theory and Application of IT for Education, Industry and Society in Big Data Era, ICSITech 2017
SP - 408
EP - 412
BT - Proceeding - 2017 3rd International Conference on Science in Information Technology
A2 - Riza, Lala Septem
A2 - Pranolo, Andri
A2 - Wibawa, Aji Prasetyo
A2 - Junaeti, Enjun
A2 - Wihardi, Yaya
A2 - Hashim, Ummi Raba'ah
A2 - Horng, Shi-Jinn
A2 - Drezewski, Rafal
A2 - Lim, Heui Seok
A2 - Chakraborty, Goutam
A2 - Hernandez, Leonel
A2 - Nazir, Shah
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd International Conference on Science in Information Technology, ICSITech 2017
Y2 - 25 October 2017 through 26 October 2017
ER -