K-NN with Purity Algorithm to Enhance the Classification of the Air Quality Dataset

Sujacka Retno, Novia Hasdyna, Balqis Yafis

Abstract


A large number of attributes in a large dataset can reduce classification accuracy. Attribute reduction can improve classification performance, particularly for the K-NN algorithm. This research examines the classification results of K-NN when attributes are reduced using Purity. In tests on the Air Quality Dataset, accuracy was 56.44% before attribute reduction and 70.71% after it, an increase of 14.27 percentage points. On this dataset, the proposed Purity method for attribute reduction increased the accuracy of the K-NN classification process.
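The abstract does not spell out how Purity scores an attribute, so the following is only a minimal sketch of one plausible reading. It assumes each attribute is discretized into equal-width bins and scored with the usual clustering purity measure (the fraction of samples that belong to the majority class of their bin). The function names, the bin count, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def purity_score(values, labels, bins=3):
    """Purity of the partition induced by equal-width binning of one attribute.

    Assumption: this binning scheme is illustrative, not the authors' procedure.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant attribute: every sample falls into bin 0
    groups = {}
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), bins - 1)
        groups.setdefault(b, []).append(y)
    # Count the majority-class members of each bin and normalize by N.
    majority = sum(Counter(ys).most_common(1)[0][1] for ys in groups.values())
    return majority / len(labels)


def select_attributes(X, y, keep):
    """Return the indices of the `keep` attributes with the highest purity."""
    scores = [purity_score([row[j] for row in X], y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


def project(X, cols):
    """Keep only the selected attribute columns."""
    return [[row[j] for j in cols] for row in X]


def knn_predict(X_train, y_train, x, k=3):
    """Plain K-NN with Euclidean distance and majority vote."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data (hypothetical): attribute 0 separates the classes, attribute 1 is
# large-scale noise that dominates the Euclidean distance.
X_train = [[1.0, 50.0], [1.2, 10.0], [0.9, 90.0],
           [3.0, 12.0], [3.1, 88.0], [2.9, 52.0]]
y_train = ["good", "good", "good", "poor", "poor", "poor"]
query = [1.1, 89.0]

cols = select_attributes(X_train, y_train, keep=1)
before = knn_predict(X_train, y_train, query, k=3)
after = knn_predict(project(X_train, cols), y_train, project([query], cols)[0], k=3)
print(cols, before, after)
```

On this toy data the noisy attribute pulls two "poor" samples into the query's neighborhood, while K-NN on the purity-selected attribute alone votes "good". That is the kind of effect attribute reduction is meant to produce, though the sketch says nothing about the accuracy figures reported above.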

Keywords


K-NN, Purity, Reduction, Attributes


DOI: https://doi.org/10.29103/jacka.v1i2.15890



Copyright (c) 2024 Sujacka Retno, Novia Hasdyna, Balqis Yafis

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Journal of Advanced Computer Knowledge and Algorithms


Department of Informatics
Faculty of Engineering
Universitas Malikussaleh
Website : UNIVERSITAS MALIKUSSALEH
Journal Email : jacka@unimal.ac.id



