Environmental health and protection have long been neglected. Recent phenomena such as climate change, however, are slowly raising public awareness of the environment. One of the main concerns today is air pollution. To this end, the U.S. Environmental Protection Agency (EPA) standardized air quality reporting through the air quality index (AQI). However, obtaining the AQI requires accurate sensor readings and complex calculations. Hence, the objective of this paper is to address that problem by characterizing air quality with regard to the AQI using the k-nearest neighbors (KNN) machine learning algorithm. The proposed methodology is implemented using a prototype of integrated gas sensors for data gathering. R programming, centered on the classification and regression training (caret) package, is used for data processing, model development, and algorithm tuning. The system is evaluated, and an accuracy of 99.56% is obtained.
Keywords: air quality characterization, AQI, KNN machine learning, sensor networks, R programming, caret
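To make the classification step in the abstract concrete, the following is a minimal sketch of k-nearest neighbors voting on gas sensor readings. It is written in plain Python for illustration and is not the authors' R/caret pipeline. The feature set (CO, NO2, O3), the sample values, the category labels, and the function names `fit_scaler`, `scale`, and `knn_predict` are all assumptions introduced here, not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical training rows: (CO ppm, NO2 ppb, O3 ppb) -> AQI category.
# All values are invented for illustration; they are not the paper's data.
TRAIN = [
    ((0.5, 20.0, 30.0), "Good"),
    ((0.8, 25.0, 35.0), "Good"),
    ((1.0, 30.0, 40.0), "Good"),
    ((5.0, 70.0, 60.0), "Moderate"),
    ((5.5, 80.0, 65.0), "Moderate"),
    ((6.0, 90.0, 68.0), "Moderate"),
    ((10.0, 200.0, 90.0), "Unhealthy"),
    ((11.0, 250.0, 100.0), "Unhealthy"),
    ((12.0, 300.0, 110.0), "Unhealthy"),
]


def fit_scaler(rows):
    """Return per-feature means and standard deviations of the rows."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(row, means, stds):
    """Center and scale one row so no single sensor dominates the distance."""
    return tuple((v - m) / s for v, m, s in zip(row, means, stds))


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    means, stds = fit_scaler([features for features, _ in train])
    q = scale(query, means, stds)
    # Sort training rows by Euclidean distance to the query in scaled space.
    nearest = sorted(
        train,
        key=lambda item: math.dist(scale(item[0], means, stds), q),
    )[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, Counter keeps insertion order, so the closest label wins.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict(TRAIN, (0.7, 22.0, 33.0)))
```

The value of k is fixed at 3 here only for brevity. In a setting like the one the abstract describes, k would normally be tuned by resampling (for example, cross-validation) instead of being chosen by hand.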