A Method of Objects Classification Based on Learning and Visual Perception -- Journal of Northwestern Polytechnical University, 2018, 36(2): 359-367

	Article: 2018, Vol. 36, Issue 2: 359-367
	Cite this article:
	Li Na, Zhao Xinbo, Yang Yongjia, Zou Xiaochun. A Method of Objects Classification Based on Learning and Visual Perception[J]. Journal of Northwestern Polytechnical University

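A Method of Objects Classification Based on Learning and Visual Perception

Li Na¹, Zhao Xinbo¹, Yang Yongjia¹, Zou Xiaochun²

1. School of Computer Science, Northwestern Polytechnical University, Xi'an 710029, China;
2. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710029, China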

Abstract:

Object classification is one of the fundamental problems in computer vision. To improve classification accuracy, we propose a new method, inspired by the complete process by which humans classify objects, that combines a visual attention model with a CNN. Compared with traditional object classification methods, its classification process is closer to human behavior and has clear biological advantages. First, we build a task-driven eye-tracking database that records people's visual behavior while they classify objects. Second, we use this database to train a learning-based visual attention model that combines low-level features (e.g., orientation, color, intensity) with high-level features (e.g., people, faces, cars) to predict the regions of interest (RoIs) that humans attend to when distinguishing objects. Finally, we design a CNN that classifies objects using the RoIs produced by the attention model. Experimental results show that the proposed attention model predicts human classification RoIs more accurately, and that the method markedly improves classification accuracy and converges faster.

Key words: visual attention model; CNN; object classification; SVM
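The data flow described in the abstract (eye-tracking fixations, then a saliency map, then an RoI that is handed to the classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `fixation_density_map`, `roi_from_saliency`, and `crop_roi` are hypothetical, the Gaussian smoothing width and the half-maximum threshold are assumed values, and the smoothed fixation map merely stands in for the output of the learned attention model.

```python
import numpy as np


def fixation_density_map(fixations, shape, sigma=3.0):
    """Turn (row, col) fixation points into a smoothed map scaled to [0, 1]."""
    density = np.zeros(shape, dtype=float)
    for r, c in fixations:
        if 0 <= r < shape[0] and 0 <= c < shape[1]:
            density[r, c] += 1.0  # count fixations per pixel

    # Truncated 1-D Gaussian kernel, kept no longer than the shorter image side
    # so that np.convolve(..., mode="same") preserves the line length.
    radius = min(int(3 * sigma), (min(shape) - 1) // 2)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()

    def blur(line):
        return np.convolve(line, kernel, mode="same")

    # Separable blur: filter along axis 0, then along axis 1.
    density = np.apply_along_axis(blur, 0, density)
    density = np.apply_along_axis(blur, 1, density)

    peak = density.max()
    return density / peak if peak > 0 else density


def roi_from_saliency(saliency, level=0.5):
    """Bounding box (top, bottom, left, right) of pixels >= level * max."""
    mask = saliency >= level * saliency.max()
    rows, cols = np.nonzero(mask)
    return int(rows.min()), int(rows.max()) + 1, int(cols.min()), int(cols.max()) + 1


def crop_roi(image, saliency, level=0.5):
    """Cut the salient bounding box out of the image (assumed CNN input)."""
    top, bottom, left, right = roi_from_saliency(saliency, level)
    return image[top:bottom, left:right]
```

In a pipeline of this kind, the cropped patch would typically be resized to the classifier's input size before being passed to the CNN. The bounding-box rule here is only one plausible way to turn a saliency map into an RoI; the paper's abstract does not specify how the authors do it.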

Received: 2017-04-12


Funding: Supported by the National Natural Science Foundation of China (NCYM0001, 61117115, 61201319)


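About the author: Li Na (1992-), female, master's student at Northwestern Polytechnical University; her research focuses on image processing.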



