Histograms of Oriented Gradients for Human Detection
Why this mattered
TBD
Abstract
We study the question of feature sets for robust visual object recognition, adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.
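The stages named in the abstract (fine-scale gradients, orientation binning over spatial cells, contrast normalization in overlapping blocks) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `hog_descriptor` is hypothetical, and the parameter values (8x8-pixel cells, 2x2-cell blocks, 9 unsigned orientation bins, plain L2 block normalization) are assumptions that the abstract does not state. It also uses hard bin assignment with no interpolation between bins or cells.

```python
import numpy as np

def hog_descriptor(image, cell=8, block=2, bins=9, eps=1e-5):
    """Simplified HOG sketch for a 2-D grayscale image (all defaults are assumptions)."""
    img = image.astype(np.float64)

    # Fine-scale gradients: centered [-1, 0, 1] differences, no smoothing.
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    # Unsigned orientation folded into [0, 180) degrees.
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0

    # Orientation binning: each pixel votes for one bin, weighted by magnitude.
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    n_cy, n_cx = img.shape[0] // cell, img.shape[1] // cell
    hist = np.zeros((n_cy, n_cx, bins))
    for cy in range(n_cy):
        for cx in range(n_cx):
            ys = slice(cy * cell, (cy + 1) * cell)
            xs = slice(cx * cell, (cx + 1) * cell)
            hist[cy, cx] = np.bincount(
                bin_idx[ys, xs].ravel(),
                weights=mag[ys, xs].ravel(),
                minlength=bins,
            )

    # Overlapping blocks (stride of one cell), each L2-normalized on its own.
    blocks = []
    for by in range(n_cy - block + 1):
        for bx in range(n_cx - block + 1):
            v = hist[by:by + block, bx:bx + block].ravel()
            blocks.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
    return np.concatenate(blocks)

# Usage: a random 128x64 window stands in for a real detection window.
window = np.random.default_rng(0).random((128, 64))
features = hog_descriptor(window)
print(features.shape)  # (3780,) = 15 * 7 blocks * 2 * 2 cells * 9 bins
```

Because adjacent blocks share cells, each cell histogram appears several times in the final vector, normalized against a different neighbourhood each time. The resulting vector is what a linear SVM would be trained on in a sliding-window detector.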
Related
- cite → Distinctive Image Features from Scale-Invariant Keypoints — HOG builds on SIFT's local gradient-orientation descriptor idea, adapting oriented gradient histograms from keypoints to dense human-detection windows.
- enables → Are we ready for autonomous driving? The KITTI vision benchmark suite — HOG pedestrian detection became a baseline vision feature for evaluating object-recognition performance in the KITTI autonomous-driving benchmark.
- enables → Selective Search for Object Recognition — HOG showed that gradient-orientation histograms capture object shape, a cue selective search incorporated among complementary region descriptors.
- enables → Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation — HOG demonstrated that oriented-gradient descriptors are effective for object detection, setting the feature-engineering context that R-CNN replaced with learned CNN representations.
- cite ← Are we ready for autonomous driving? The KITTI vision benchmark suite — The KITTI benchmark uses Histograms of Oriented Gradients as a baseline feature representation for object detection tasks such as pedestrian recognition.
- cite ← Selective Search for Object Recognition — Selective Search relates its object-region proposals to detection pipelines that use HOG features for recognizing object categories.
- cite ← Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation — R-CNN contrasts its learned region features with HOG descriptors used in earlier object detection pipelines.
- cite ← The Pascal Visual Object Classes (VOC) Challenge — The PASCAL VOC Challenge cites HOG as an influential gradient-feature method for pedestrian and object detection within benchmarked recognition pipelines.