Multi-camera Tracklet association and fusion using ensemble of visual and geometric cues

Kanishka Nithin 1 François Bremond 1
1 STARS - Spatio-Temporal Activity Recognition Systems
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : Data association and fusion are pivotal for object tracking in a multi-camera network. We present a novel framework for online multi-object tracking in a partially overlapping multi-camera network. It models tracklet association as a combinatorial optimization problem formulated over an ensemble of cues such as appearance, motion and geometry information. Our method learns a discriminant weight that measures the consistency and discriminative power of feature patterns, and uses it to select and combine ensemble features across local and global tracking information. Our approach contributes uniquely in the way tracklet selection, association and fusion are done. Once multi-view correspondences are established using planar homography, the Dynamic Time Warping algorithm is used to select the tracklets for which similarity has to be computed, i.e., overlapping tracklets and sub-tracklets. Trajectory similarities are then computed for these selected tracklets and sub-tracklets using an ensemble of appearance and motion cues weighted by an online-learnt discriminative function. Next, we tackle the association problem by building a k-partite graph and association rules to match all the pair-wise tracklets. Finally, the trajectories associated by the Hungarian algorithm are fused, based on a reliability criterion computed for each individual tracklet. Experimental results demonstrate that our system achieves performance that significantly improves on the state of the art on PETS 2009.
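The Dynamic Time Warping step named in the abstract can be illustrated with a minimal sketch. This is the textbook DTW recurrence, not the authors' implementation: the function name `dtw_distance`, the representation of a tracklet as a list of (x, y) ground-plane points, and the Euclidean point cost are all assumptions made for illustration.

```python
import math


def dtw_distance(track_a, track_b):
    """Textbook DTW cost between two sequences of (x, y) points.

    Hypothetical helper: the tracklet representation and the Euclidean
    point-to-point cost are assumptions, not taken from the paper.
    """
    n, m = len(track_a), len(track_b)
    if n == 0 or m == 0:
        raise ValueError("both tracklets need at least one point")

    # cost[i][j] = minimal accumulated cost of aligning the first i points
    # of track_a with the first j points of track_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        ax, ay = track_a[i - 1]
        for j in range(1, m + 1):
            bx, by = track_b[j - 1]
            step = math.hypot(ax - bx, ay - by)
            # Extend the cheapest of the three admissible predecessors.
            cost[i][j] = step + min(
                cost[i - 1][j],      # track_a advances alone
                cost[i][j - 1],      # track_b advances alone
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Two views of the same path sampled at different rates align at zero cost.
view_1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
view_2 = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(dtw_distance(view_1, view_2))
```

A low cost suggests that two tracklets, once projected to a common plane, follow the same path even when they are sampled at different rates. How such a cost would be thresholded to pick candidate pairs is not specified in the abstract.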
Contributor : Soumik Mallick
Submitted on : Friday, July 27, 2018 - 5:24:53 PM
Last modification on : Wednesday, October 10, 2018 - 10:09:53 AM
Long-term archiving on : Sunday, October 28, 2018 - 12:37:48 PM






Kanishka Nithin, François Bremond. Multi-camera Tracklet association and fusion using ensemble of visual and geometric cues. IEEE Transactions on Circuits and Systems for Video Technology, Institute of Electrical and Electronics Engineers, 2017, 27 (3), pp.431 - 440. ⟨10.1109/TCSVT.2016.2615538⟩. ⟨hal-01849546⟩



