TY - GEN
T1 - Enhancing MBTI Personality Trait Prediction from Imbalanced Social Media Data Using Hybrid Query Expansion Ranking and GloVe-BiLSTM
AU - Pradnyana, Gede Aditra
AU - Anggraeni, Wiwik
AU - Yuniarno, Eko Mulyanto
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - The usefulness of information obtained from social media data using machine learning methods is increasingly widespread, including predicting a person's personality. One of the personality type theories that is often used today in describing a person's personality is the Myers-Briggs Type Indicator. The challenges faced in processing text data from social media by machine learning methods are the imbalanced data for each personality type and the high dimensional features extracted from the data. Handling the problem of imbalanced data with oversampling techniques will increase the high dimension of features, which has an impact on increasing computation time. On the other hand, reducing feature dimensions will affect the quality of the prediction results because the machine learning process requires an adequate amount of data. This study develops a hybrid QER and GloVe-BiLSTM model by combining the Bidirectional Long Short-Term Memory (BiLSTM) classifier layer with the Global Vectors for Word Representation (GloVe) and Query Expansion Ranking (QER) as an input layer. The model works on data that has previously gone through a balancing process using the Synthetic Minority Oversampling Technique (SMOTE). The experimental findings show that the proposed model can, in fact, significantly enhance personality prediction performance in terms of prediction accuracy and computation time.
AB - The usefulness of information obtained from social media data using machine learning methods is increasingly widespread, including predicting a person's personality. One of the personality type theories that is often used today in describing a person's personality is the Myers-Briggs Type Indicator. The challenges faced in processing text data from social media by machine learning methods are the imbalanced data for each personality type and the high dimensional features extracted from the data. Handling the problem of imbalanced data with oversampling techniques will increase the high dimension of features, which has an impact on increasing computation time. On the other hand, reducing feature dimensions will affect the quality of the prediction results because the machine learning process requires an adequate amount of data. This study develops a hybrid QER and GloVe-BiLSTM model by combining the Bidirectional Long Short-Term Memory (BiLSTM) classifier layer with the Global Vectors for Word Representation (GloVe) and Query Expansion Ranking (QER) as an input layer. The model works on data that has previously gone through a balancing process using the Synthetic Minority Oversampling Technique (SMOTE). The experimental findings show that the proposed model can, in fact, significantly enhance personality prediction performance in terms of prediction accuracy and computation time.
KW - bidirectional long short-term memory
KW - global vectors for word embedding
KW - personality prediction
KW - query expansion ranking
KW - social media data
UR - http://www.scopus.com/inward/record.url?scp=85178515006&partnerID=8YFLogxK
U2 - 10.1109/FUZZ52849.2023.10309718
DO - 10.1109/FUZZ52849.2023.10309718
M3 - Conference contribution
AN - SCOPUS:85178515006
T3 - IEEE International Conference on Fuzzy Systems
BT - 2023 IEEE International Conference on Fuzzy Systems, FUZZ 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE International Conference on Fuzzy Systems, FUZZ 2023
Y2 - 13 August 2023 through 17 August 2023
ER -