object tracking

object tracking

Multiple Object Tracking(MOT) is the task of detecting various objects of interest in a video, tracking these detected objects in subsequent frames by assigning them a unique ID, and maintaining these unique IDs as the objects move around in a video in successive frames.Generally, multiple object tracking happens in two stages: object detection and object association. Object detection is the process of identifying all potential objects of interest in the current frame using object detectors such as Faster-RCNN or YOLO. Object association is the process of linking objects detected in the current frame with its corresponding objects from previous frames, referred to as tracklets. Object or instance association is usually done by predicting the object’s location at the current frame based on previous frames’ tracklets using the Kalman Filter followed by one-to-one linear assignment typically using the Hungarian Algorithm to minimise the total differences between the matching results.

Metrics

MOTP (Multiple Object Tracking Precision)

MOTP (Multi-Object Tracking Precision) expresses how well exact positions of the object are estimated. It is the total error in estimated position for matched ground truth-hypothesis pairs over all frames, averaged by the total number of matches made. This metric is not responsible for recognizing object configurations and evaluating object trajectories.

MOTA (Multiple Object Tracking Accuracy)

MOTA (Multi-Object Tracking Accuracy) shows how many errors the tracker system has made in terms of Misses, False Positives, Mismatch errors, etc. Therefore, it can be derived from three error ratios: the ratio of Misses, the ratio of False positives, and the ratio of Mismatches over all the frames.

IDF1 score (IDF1)

IDF1 score (IDF1) is the ratio of correctly identified detections over the average of ground truth and predicted detections.

Benchmarks

OTB

KITTI

MOT16

Methods(models)

IOU tracker

The Intersection-Over-Union (IOU) tracker uses the IOU values among the detector’s bounding boxes between the two consecutive frames to perform the association between them or assign a new target ID if no match found.

Simple Online And Realtime Tracking (SORT)

Simple Online And Realtime Tracking (SORT) is a lean implementation of a tracking-by detection framework.SORT uses the position and size of the bounding boxes for both motion estimation and data association through frames. SORT combines location and motion cues by adopting a Kalman filter to predict the location of the tracklets in the new frame, then computes the IoU between the detection boxes and the predicted boxes as the similarity.

DeepSORT

DeepSORT replaces the association metric with a more informed metric that combines motion and appearance information. In particular, a “deep appearance” distance metric is added. The core idea is to obtain a vector that can be used to represent a given image. DeepSort adopts a stand-alone RE-ID model to extract appearance features from the detection boxes. After similarity computation matching strategy assigns identities to the objects. This can be done by the Hungarian Algorithm or greedy assignment.

FairMOT

FairMOT is a new tracking approach built on top of the anchor-free object detection architecture CenterNet.It has a simple network structure that consists of two homogeneous branches for detecting objects and extracting re-ID features.

TransMOT

TransMOT is a new spatial-temporal graph Transformer that solves all these issues. It arranges the trajectories of all the tracked objects as a series of sparse weighted graphs that are constructed using the spatial relationships of the targets. TransMOT then uses these graphs to create a spatial graph transformer encoder layer, a temporal transformer encoder layer, and a spatial transformer decoder layer to model the spatial-temporal relationships of the objects.

ByteTrack

BYTE is an effective association method that utilizes all detection boxes from high scores to low ones in the matching process.BYTE is built on the premise that the similarity with tracklets provides a strong cue to distinguish the objects and background in low score detection boxes. BYTE first matches the high score detection boxes to the tracklets based on motion similarity. It uses Kalman Filter to predict the location of the tracklets in the new frame. The motion similarity is computed by the IoU of the predicted box and the detection box. Then, it performs the second matching between the unmatched tracklets.

The primary innovation of BYTETrack is keeping non-background low confidence detection boxes which are typically discarded after the initial filtering of detections and use these low-score boxes for a secondary association step. Typically, occluded detection boxes have lower confidence scores than the threshold, but still contain some information about the objects which make their confidence score higher than purely background boxes. Hence, these low confidence boxes are still meaningful to keep track of during the association stage.

Comparison of DeepSort and ByteTrack

DeepSort uses a pre-trained object detection model to detect objects in each frame and a Siamese network to match the detected objects based on their appearance features. It also uses Kalman filters to predict the locations of the objects in the next frame. ByteTrack, on the other hand, uses a lightweight Siamese network architecture that takes in two input frames and outputs a similarity score. It also uses a simple but effective data augmentation technique to improve its performance on challenging datasets.

using ByteTrack

ByteTracker initiates a new tracklet only if a detection is not matched with any previous tracklet and the bounding box score is higher than a threshold.

references

https://www.datature.io/blog/introduction-to-bytetrack-multi-object-tracking-by-associating-every-detection-box
https://pub.towardsai.net/multi-object-tracking-metrics-1e602f364c0c
https://learnopencv.com/object-tracking-and-reidentification-with-fairmot/
https://medium.com/augmented-startups/top-5-object-tracking-methods-92f1643f8435
https://medium.com/@pedroazevedo6/object-tracking-state-of-the-art-2022-fe9457b77382

Author

s-serenity

Posted on

2023-08-03

Updated on

2023-08-27

Licensed under

You need to set install_url to use ShareThis. Please set it in _config.yml.
You forgot to set the business or currency_code for Paypal. Please set it in _config.yml.

Comments

You forgot to set the shortname for Disqus. Please set it in _config.yml.