Object detection is a challenging problem that builds upon methods for object recognition (e.g. where are they), object localization (e.g. what is their extent), and object classification (e.g. what are they). In recent years, deep learning techniques have achieved state-of-the-art results for object detection, both on standard benchmark datasets and in computer vision competitions. Notable among them is “You Only Look Once,” or YOLO, a family of convolutional neural networks that achieves near state-of-the-art results with a single end-to-end model that can perform object detection in real time. The topics here are the YOLO family of convolutional neural network models for object detection, the most recent variation at the time of writing, called YOLOv3, and the best-of-breed open-source library implementation of YOLOv3 for the Keras deep learning library.
YOLO for Object Detection:
Object detection is a computer vision task that involves both localizing one or more objects within an image and classifying each object in the image.
It is challenging because it requires both successful object localization, to locate and draw a bounding box around each object in an image, and object classification, to predict the correct class of the object that was localized.
“You Only Look Once,” or YOLO, is a family of end-to-end deep learning models designed for fast object detection, developed by Joseph Redmon, et al. and first described in the 2015 paper titled “You Only Look Once: Unified, Real-Time Object Detection.” The approach involves a single deep convolutional neural network (originally a version of GoogLeNet, later updated and called DarkNet, based on VGG) that splits the input into a grid of cells, with each cell directly predicting a bounding box and object classification. The result is a large number of candidate bounding boxes that are consolidated into a final prediction by a post-processing step.
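The consolidation of candidate boxes can be illustrated with a small sketch. The text above does not name the post-processing step; non-maximum suppression is a common choice and is assumed here. The function names, the (x1, y1, x2, y2) box format, and the 0.5 overlap threshold are all illustrative assumptions, not part of any YOLO library.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes that are kept, highest score first.

    Assumed behaviour: a box is dropped when it overlaps an already
    kept, higher-scoring box by more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical candidates: two heavily overlapping boxes and one separate box.
candidates = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
confidences = [0.9, 0.8, 0.7]
kept = non_max_suppression(candidates, confidences)
print(kept)
```

In practice this step would typically be run per class and after discarding low-confidence boxes; both details are left out to keep the sketch short.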
At the time of writing, there are three main variations of the approach: YOLOv1, YOLOv2, and YOLOv3. The first version proposed the general architecture, the second refined the design and made use of predefined anchor boxes to improve the bounding box proposals, and the third further refined the model architecture and training process.
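To make the role of anchor boxes more concrete, here is a minimal sketch of how a raw network output might be turned into a box relative to a predefined anchor. It assumes the parameterization commonly associated with YOLOv2 and YOLOv3 (a sigmoid offset inside the grid cell, and an exponential scaling of the anchor width and height). The function name `decode_box`, the argument layout, and the example numbers are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, anchor, grid_size):
    """Decode raw outputs (tx, ty, tw, th) into a normalized box.

    cell      -- (column, row) index of the grid cell making the prediction
    anchor    -- (width, height) of the predefined anchor, in grid-cell units
    grid_size -- number of cells along one side of the grid
    Returns (center_x, center_y, width, height) as fractions of the image.
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = anchor
    bx = (sigmoid(tx) + cx) / grid_size   # center stays inside its cell
    by = (sigmoid(ty) + cy) / grid_size
    bw = pw * math.exp(tw) / grid_size    # anchor size is scaled, not replaced
    bh = ph * math.exp(th) / grid_size
    return bx, by, bw, bh


# All-zero outputs give the anchor itself, centered in the chosen cell.
box = decode_box((0.0, 0.0, 0.0, 0.0), cell=(6, 6), anchor=(3.0, 2.0), grid_size=13)
print(box)
```

The design point is that the network only has to learn small adjustments to a sensible starting shape, rather than predict box sizes from scratch.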
Although the accuracy of the models is close to, but not as good as, that of Region-Based Convolutional Neural Networks (R-CNNs), they are popular for object detection because of their detection speed, often demonstrated in real time on video or with a camera feed input. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance.
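A quick way to see what "one evaluation" produces is to compute the size of the output tensor. The sketch below assumes the layout described in the original YOLO paper, where each grid cell predicts a fixed number of boxes (four coordinates plus one confidence score each) and one shared set of class probabilities. The function name `output_shape` is hypothetical.

```python
def output_shape(grid_size, boxes_per_cell, num_classes):
    """Shape of the prediction tensor for one image (assumed YOLOv1-style layout)."""
    values_per_box = 5  # x, y, width, height, confidence
    depth = boxes_per_cell * values_per_box + num_classes
    return (grid_size, grid_size, depth)


# Settings reported for the original model on a 20-class dataset:
# a 7x7 grid with 2 boxes per cell.
shape = output_shape(grid_size=7, boxes_per_cell=2, num_classes=20)
total_values = shape[0] * shape[1] * shape[2]
print(shape, total_values)
```

Every candidate box for the image comes out of this one tensor, which is why no separate region proposal stage is needed.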