TY - GEN
T1 - Conflict of Interest based Features for Expert Classification in Bibliographic Network
AU - Purwitasari, Diana
AU - Ilmi, Akhmad Bakhrul
AU - Fatichah, Chastine
AU - Fauzi, Willy Achmat
AU - Sumpeno, Surya
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
N2 - Many feature extraction approaches for the expert classification problem use text content and network structure from the bibliographic metadata of published articles. The content part often uses titles and abstracts, while the structure part uses co-authorship and citations. On citation data, classifiers typically rely on a citation quantity feature, since a frequently cited author is presumed to have more expertise. Citation misconduct occurs when there is no subject relation between the citing and cited articles, which makes misconduct a challenge for evaluating citation quality. Here, the problem is to classify experts with features that can indicate citation misconduct. To address this problem, our contribution exploited both the quality and the quantity of citations in feature extraction designed for classifying experts. A situation in which co-authorship influences the misconduct is called a Conflict of Interest (CoI). Accordingly, the class labels are experts with or without CoI indication. We proposed three ratio features: (1) self-citation, to represent citation quantity; (2) subject similarity between author interests and article contents; and (3) subject similarity between citing and cited articles, with the latter two determining citation quality. Because subjects with similar contexts can be expressed with various word phrases, the proposed CoI-based features for citation quality used deep learning approaches for natural language understanding. Our experiments used a selection of data from AMiner, a common dataset for bibliography-related problems. We selected approximately 15K articles from the original data of approximately 2M articles. The results showed that our proposed features classified experts with CoI indication with an accuracy of approximately 60%. Although the first feature, citation quantity, was not significant for categorizing experts, the citation quality features provided stronger evidence.
AB - Many feature extraction approaches for the expert classification problem use text content and network structure from the bibliographic metadata of published articles. The content part often uses titles and abstracts, while the structure part uses co-authorship and citations. On citation data, classifiers typically rely on a citation quantity feature, since a frequently cited author is presumed to have more expertise. Citation misconduct occurs when there is no subject relation between the citing and cited articles, which makes misconduct a challenge for evaluating citation quality. Here, the problem is to classify experts with features that can indicate citation misconduct. To address this problem, our contribution exploited both the quality and the quantity of citations in feature extraction designed for classifying experts. A situation in which co-authorship influences the misconduct is called a Conflict of Interest (CoI). Accordingly, the class labels are experts with or without CoI indication. We proposed three ratio features: (1) self-citation, to represent citation quantity; (2) subject similarity between author interests and article contents; and (3) subject similarity between citing and cited articles, with the latter two determining citation quality. Because subjects with similar contexts can be expressed with various word phrases, the proposed CoI-based features for citation quality used deep learning approaches for natural language understanding. Our experiments used a selection of data from AMiner, a common dataset for bibliography-related problems. We selected approximately 15K articles from the original data of approximately 2M articles. The results showed that our proposed features classified experts with CoI indication with an accuracy of approximately 60%. Although the first feature, citation quantity, was not significant for categorizing experts, the citation quality features provided stronger evidence.
KW - bibliographic data
KW - citation analysis
KW - conflict of interest feature
KW - deep learning
KW - expert classification
KW - word embedding
UR - http://www.scopus.com/inward/record.url?scp=85066503487&partnerID=8YFLogxK
U2 - 10.1109/CENIM.2018.8710931
DO - 10.1109/CENIM.2018.8710931
M3 - Conference contribution
AN - SCOPUS:85066503487
T3 - 2018 International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2018 - Proceeding
SP - 54
EP - 59
BT - 2018 International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2018 - Proceeding
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2018
Y2 - 26 November 2018 through 27 November 2018
ER -