This paper presents the development of a multi-class vehicle and pedestrian detection and classification system using a convolutional neural network (CNN) for the analysis of traffic flow and congestion. The study analyzed traffic flow and volume at different time intervals in a microscopic-scale traffic network by decomposing the traffic into eight separate classes of vehicles and pedestrians. Traffic videos of a low-altitude view of a T-type intersection (with a pedestrian lane and yellow box area), a medium-altitude view of a bus stop area, and a high-altitude view of a wide intersection were used to analyze different traffic flow and congestion scenarios. On the eight-class multi-object classification task, the CNN model achieved 78.41% training accuracy with a loss of 0.6570, and 73.83% validation accuracy with a loss of 0.7083. The results also showed how each class contributes to the overall road traffic: private cars constitute about 55-70% of the total traffic volume at any given time, while public utility vehicles (PUVs: jeepneys and buses) account for only approximately 15%. These results demonstrate that CNN-based classification is effective for this application.
Keywords: convolutional neural network, computer vision, multi-class object classification, traffic flow and congestion analysis
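The per-class traffic composition reported above (e.g., private cars at 55-70% of total volume) can be derived directly from per-class detection counts. The sketch below illustrates this computation; the class names and counts are hypothetical placeholders, not figures from the study, and in practice the counts would come from the CNN detector's output over a time interval.

```python
# Sketch: compute each class's share of total traffic volume from
# per-class detection counts accumulated over a time interval.
# Class names and counts are hypothetical examples for illustration.
counts = {
    "private_car": 620,
    "jeepney": 90,
    "bus": 45,
    "truck": 60,
    "motorcycle": 120,
    "tricycle": 30,
    "bicycle": 15,
    "pedestrian": 70,
}

total = sum(counts.values())
# Percentage share of each class in the total observed volume.
shares = {cls: 100.0 * n / total for cls, n in counts.items()}

# Report classes from largest to smallest contribution.
for cls, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{cls:12s} {pct:5.1f}%")
```

Summing the shares over all eight classes recovers 100%, which serves as a quick consistency check when aggregating detections across intervals.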