Received date March 8, 2010

Accepted date October 24, 2010

Environment Recognition for Digital Audio Forensics Using MPEG-7 and Mel Cepstral Features

Authors Ghulam Muhammad and Khaled Alghathbar

Abstract Environment recognition from digital audio for forensics applications is a growing area of interest. However, compared to other branches of audio forensics, it is less researched. In particular, little attention has been given to detecting the environment from files in which foreground speech is present, as is the case in a forensics scenario. In this paper, we perform several experiments focusing on the problems of environment recognition from audio, particularly for forensics applications. Experimental results show that the task is easier when audio files contain only environmental sound than when they contain both foreground speech and background environment. We propose a full set of MPEG-7 audio features combined with Mel Frequency Cepstral Coefficients (MFCCs) to improve the accuracy. In the experiments, the proposed approach significantly increases the recognition accuracy of environmental sound even in the presence of a large amount of foreground human speech.
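The feature combination named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes NumPy only, and it substitutes two simplified linear-frequency descriptors (spectral centroid and spectral spread, loosely modeled on MPEG-7 low-level spectral descriptors) for the full MPEG-7 feature set. All function names, frame sizes, and filter counts are illustrative assumptions.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames of length frame_len."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale (assumed layout)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    hz_pts = mel_to_hz(mel_pts)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def dct_ii(n_out, n_in):
    """Unnormalized DCT-II basis used to decorrelate log mel energies."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def combined_features(x, sr, frame_len=512, hop=256, n_filters=26, n_mfcc=13):
    """Per-frame vector: n_mfcc MFCCs followed by centroid and spread (Hz)."""
    frames = frame_signal(x, frame_len, hop) * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)

    # MFCCs: mel filterbank energies -> log -> DCT
    mel_energy = power @ mel_filterbank(n_filters, frame_len, sr).T
    mfcc = np.log(mel_energy + 1e-10) @ dct_ii(n_mfcc, n_filters).T

    # Simplified stand-ins for MPEG-7 spectral descriptors (assumption)
    total = power.sum(axis=1, keepdims=True) + 1e-10
    centroid = (power * freqs).sum(axis=1, keepdims=True) / total
    spread = np.sqrt(
        (power * (freqs - centroid) ** 2).sum(axis=1, keepdims=True) / total
    )

    # Concatenate both feature families into one vector per frame
    return np.hstack([mfcc, centroid, spread])
```

Each row of the returned matrix is one frame's combined feature vector, which could then be passed to whatever classifier is being evaluated. Concatenation is the simplest way to combine the two feature families; a real system would likely normalize each dimension first, since the cepstral and spectral descriptors differ widely in scale.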

The International Arab Journal of Information Technology, Vol. 10, No. 1, January 2013

Ghulam Muhammad received his BSc degree in computer science and engineering in 1997 from Bangladesh University of Engineering and Technology, and his ME and PhD degrees in 2003 and 2006, respectively, from Toyohashi University of Technology, Japan. After serving as a Japan Society for the Promotion of Science (JSPS) fellow, he joined the College of Computer and Information Sciences at King Saud University, Saudi Arabia, as a faculty member. His research interests include automatic speech recognition, signal processing, and multimedia forensics.

Khaled Alghathbar, PhD, CISSP, CISM, PMP, BS7799 Lead Auditor, is an associate professor and the director of the Centre of Excellence in Information Assurance at King Saud University, Saudi Arabia. He received his PhD in Information Technology from George Mason University, USA. He is a security advisor for several government agencies. His main research interests are information security management, policies, biometrics, and design.