Application of the Nazief & Adriani Stemming Algorithm in the News Clustering Process Based on Thematics on the Directorate General of Human Rights Website Using Rapidminer
DOI: https://doi.org/10.35706/syji.v11i02.7192
Abstract. Websites are a medium for conveying information. News on the website of the Directorate General of Human Rights is currently not well categorized: the only categories are news highlights, news, activities, and regional office info, and none of them group news by theme. This study aims to cluster the news on the ham.go.id website by theme using RapidMiner. RapidMiner provides a Porter stemming feature, but it is not yet available for Indonesian, so the author performed stemming with the Nazief & Adriani algorithm to improve clustering performance. The best number of clusters was chosen by the lowest Davies-Bouldin Index (DBI) value, and external testing was performed using a confusion matrix. Without stemming, the DBI value was 4.351, with an accuracy of 81.58%, recall of 83.15%, and precision of 80.59%. After stemming with the Nazief & Adriani algorithm, the DBI value was 3.935, with an accuracy of 86.84%, recall of 85.71%, and precision of 82.50%.
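The abstract picks the number of clusters by the lowest DBI value. As an illustration of what that score measures, the following is a minimal Python sketch of the Davies-Bouldin Index. It is not the RapidMiner workflow used in the study. The function names `centroid` and `davies_bouldin` are hypothetical, and the sketch assumes Euclidean distance, centroid-based scatter, and clusters given as lists of numeric vectors (for example, term-weight vectors of news articles).

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (hypothetical helper).

    Each cluster is a non-empty list of numeric vectors. Assumes at least
    two clusters with distinct centroids; identical centroids would cause
    a division by zero.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: average distance from each point to its cluster centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    worst = []
    for i in range(k):
        # Similarity of cluster i to every other cluster; keep the worst case.
        ratios = [
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        ]
        worst.append(max(ratios))
    return sum(worst) / k


# Toy data: compact, well-separated clusters versus spread-out, close ones.
tight = [[[0, 0], [0, 1]], [[10, 0], [10, 1]]]
loose = [[[0, 0], [0, 4]], [[2, 0], [2, 4]]]
print(davies_bouldin(tight), davies_bouldin(loose))
```

A lower value indicates more compact and better separated clusters, which is why the lower DBI after stemming is read as an improvement in the abstract.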