TY - GEN
T1 - Feature Location Using Extraction of Code Documentation
AU - Arwan, Achmad
AU - Rochimah, Siti
AU - Fatichah, Chastine
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/10/24
Y1 - 2023/10/24
AB - Feature location is a set of procedures for finding the pieces of source code that correspond to software artifacts. Code documentation acts as a trail of breadcrumbs that can help reveal the developer's actual intent behind a specific piece of code. Document tokens are extracted from class names, method names, and Javadoc using natural language processing (NLP). The class name, method name, and Javadoc are combined into documents, and Lucene's vector space model (VSM) is employed to index them. Nouns and verbs from each use case scenario were transformed, either through a generic-specific relationship or by replacement with a code name of the same meaning, and the result was used as a query to identify the feature location. Each token from the use case scenario is categorized as class, method, or Javadoc to determine the best field to match against. The query tokens are also preprocessed using NLP. The evaluation used 15 use case scenarios and about 150 queries. There were 93 relevant files and 259 retrieved files, of which 65 were both relevant and retrieved. The approach achieved an average precision of 24% and an average recall of 70%. The best precision was 50% and the best recall was 100%.
KW - Feature Location
KW - Information Retrieval
KW - Software Maintenance
KW - VSM Lucene
UR - http://www.scopus.com/inward/record.url?scp=85182396266&partnerID=8YFLogxK
U2 - 10.1145/3626641.3627149
DO - 10.1145/3626641.3627149
M3 - Conference contribution
AN - SCOPUS:85182396266
T3 - ACM International Conference Proceeding Series
SP - 481
EP - 488
BT - SIET 2023 - Proceedings of the 8th International Conference on Sustainable Information Engineering and Technology
PB - Association for Computing Machinery
T2 - 8th International Conference on Sustainable Information Engineering and Technology, SIET 2023
Y2 - 24 October 2023 through 25 October 2023
ER -