Applying TF-IDF and K-NN for Clickbait Detection in Indonesian Online News Headlines

Muhammad Athallah Afif, Munirul Ula, Lidya Rosnita, Rizal Rizal

Abstract


This research applies TF-IDF (Term Frequency-Inverse Document Frequency) and K-Nearest Neighbor (K-NN) to build a clickbait detection system for Indonesian online news headlines. TF-IDF weights each word by its importance within a headline: the headlines are tokenized and converted into numeric vectors. The resulting TF-IDF matrix supplies the features for the K-NN classification model, which with k=1 assigns each headline the class of its single most similar training headline. In evaluation, the model reaches 1.0 on accuracy, precision, recall, and F1-score, and the confusion matrix shows no misclassifications, meaning every evaluated sample was classified correctly.
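The pipeline described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the four training headlines and their labels are invented for the example, the smoothed IDF formula and the use of cosine similarity as the nearness measure are assumptions (the abstract does not state which variants were used), and the names `tokenize`, `fit_idf`, `tfidf_vector`, `cosine`, and `classify` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy training set; these headlines are NOT from the paper's data.
TRAIN = [
    ("wow ternyata ini rahasia artis awet muda", "clickbait"),
    ("kamu tidak akan percaya apa yang terjadi selanjutnya", "clickbait"),
    ("pemerintah umumkan anggaran pendidikan tahun depan", "non-clickbait"),
    ("bank indonesia tahan suku bunga acuan", "non-clickbait"),
]

def tokenize(text):
    # Lowercase and split on whitespace (a deliberately simple tokenizer).
    return text.lower().split()

def fit_idf(docs_tokens):
    # Document frequency: number of headlines that contain each term.
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    # Smoothed IDF; the exact variant is an assumption of this sketch.
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf_vector(tokens, idf):
    # Sparse TF-IDF vector as a dict; terms unseen in training are ignored.
    tf = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in tf.items() if t in idf}

def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# "Training" a k-NN model only means storing the vectorized examples.
train_tokens = [tokenize(text) for text, _ in TRAIN]
IDF = fit_idf(train_tokens)
TRAIN_VECS = [tfidf_vector(tokens, IDF) for tokens in train_tokens]
TRAIN_LABELS = [label for _, label in TRAIN]

def classify(headline):
    # k=1: return the label of the single most similar training headline.
    query = tfidf_vector(tokenize(headline), IDF)
    best = max(range(len(TRAIN_VECS)), key=lambda i: cosine(query, TRAIN_VECS[i]))
    return TRAIN_LABELS[best]

print(classify("ternyata ini rahasia kulit sehat"))
print(classify("bank indonesia umumkan suku bunga"))
```

With k=1 there is no voting step: the predicted class is simply the label of the nearest stored headline, so the quality of the TF-IDF representation carries most of the weight.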

Keywords


TF-IDF; k-Nearest Neighbor; Clickbait; Online News Headlines; Indonesian

DOI: https://doi.org/10.29103/jacka.v1i2.15810

Copyright (c) 2024 Muhammad Athallah Afif, Munirul Ula, Lidya Rosnita, Rizal

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Journal of Advanced Computer Knowledge and Algorithms


Department of Informatics
Faculty of Engineering
Universitas Malikussaleh
Journal Email : jacka@unimal.ac.id

