Commonly Used Datasets in Pedestrian Detection
Column: Industry News  Published: 2024-07-16

Human parsing has been extensively studied recently (Yamaguchi et al. 2012; Xia et al. 2017) due to its wide applications in many important scenarios. Mainstream fashion parsing models (i.e., parsers) focus on parsing high-resolution and clean images. However, directly applying parsers trained on benchmarks of high-quality samples to a particular application scenario in the wild, e.g., a canteen, airport or workplace, often gives unsatisfactory performance due to domain shift. In this paper, we explore a new and challenging cross-domain human parsing problem: taking the benchmark dataset with extensive pixel-wise labeling as the source domain, how can a satisfactory parser be obtained on a new target domain without any additional manual labeling? To this end, we propose a novel and efficient cross-domain human parsing model to bridge the cross-domain differences in visual appearance and environment conditions and fully exploit commonalities across domains. Our proposed model explicitly learns a feature compensation network, which is specialized for mitigating the cross-domain differences. A discriminative feature adversarial network is introduced to supervise the feature compensation and effectively reduce the discrepancy between the feature distributions of the two domains. Besides, our proposed model also introduces a structured label adversarial network to guide the parsing results of the target domain to follow the high-order relationships of the structured labels shared across domains. The proposed framework is end-to-end trainable, practical and scalable in real applications. Extensive experiments are conducted with the LIP dataset as the source domain and four different unannotated datasets, including surveillance videos, movies and runway shows, as target domains. The results consistently confirm the data efficiency and performance advantages of the proposed method on the challenging cross-domain human parsing problem.
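
The abstract describes the adversarial scheme only at a high level. Below is a minimal PyTorch-style sketch of what one training step of such a method could look like; all module names (compensate, feat_disc, label_disc), shapes, loss weights and optimizer settings are assumptions made for illustration, not the authors' released implementation.

```python
# Illustrative sketch of a cross-domain adversarial parsing training step.
# All names, shapes and hyperparameters are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

C, K = 64, 20                                    # assumed feature channels / parsing classes

backbone   = nn.Conv2d(3, C, 3, padding=1)       # stand-in for a real parsing backbone
parse_head = nn.Conv2d(C, K, 1)                  # pixel-wise classifier
compensate = nn.Sequential(                      # feature compensation network (target branch)
    nn.Conv2d(C, C, 3, padding=1), nn.ReLU(), nn.Conv2d(C, C, 3, padding=1))
feat_disc  = nn.Conv2d(C, 1, 3, padding=1)       # discriminative feature adversarial network
label_disc = nn.Conv2d(K, 1, 3, padding=1)       # structured label adversarial network

opt_main = torch.optim.Adam(
    list(backbone.parameters()) + list(parse_head.parameters()) + list(compensate.parameters()),
    lr=1e-4)
opt_disc = torch.optim.Adam(
    list(feat_disc.parameters()) + list(label_disc.parameters()), lr=1e-4)

def bce(logits, real):
    target = torch.ones_like(logits) if real else torch.zeros_like(logits)
    return F.binary_cross_entropy_with_logits(logits, target)

def train_step(src_img, src_label, tgt_img):
    # Source branch: ordinary supervised parsing loss on labeled benchmark images.
    f_src = backbone(src_img)
    loss_parse = F.cross_entropy(parse_head(f_src), src_label)

    # Target branch: compensated features should fool the feature discriminator,
    # and target parsing maps should fool the structured label discriminator.
    b_tgt = backbone(tgt_img)
    f_tgt = b_tgt + compensate(b_tgt)
    loss_adv = bce(feat_disc(f_tgt), real=True) + \
               bce(label_disc(F.softmax(parse_head(f_tgt), dim=1)), real=True)

    opt_main.zero_grad()
    (loss_parse + loss_adv).backward()
    opt_main.step()

    # Discriminators: distinguish source (real) from compensated target (fake).
    loss_d = bce(feat_disc(f_src.detach()), real=True) \
           + bce(feat_disc(f_tgt.detach()), real=False) \
           + bce(label_disc(F.softmax(parse_head(f_src), dim=1).detach()), real=True) \
           + bce(label_disc(F.softmax(parse_head(f_tgt), dim=1).detach()), real=False)
    opt_disc.zero_grad()
    loss_d.backward()
    opt_disc.step()

if __name__ == "__main__":
    # Toy tensors just to show the step runs end to end.
    train_step(torch.randn(2, 3, 64, 64),
               torch.randint(0, K, (2, 64, 64)),
               torch.randn(2, 3, 64, 64))
```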

Abstract—This paper presents a robust Joint Discriminative appearance model based Tracking method using online random forests and mid-level features (superpixels). To achieve superpixel-wise discriminative ability, we propose a joint appearance model that consists of two random-forest-based models, i.e., the Background-Target discriminative Model (BTM) and the Distractor-Target discriminative Model (DTM). More specifically, the BTM effectively learns discriminative information between the target object and the background. In contrast, the DTM is used to suppress distracting superpixels, which significantly improves the tracker’s robustness and alleviates the drifting problem. A novel online random forest regression algorithm is proposed to build the two models. The BTM and DTM are linearly combined into a joint model to compute a confidence map. Tracking results are estimated from the confidence map, where the position and scale of the target are estimated in order. Furthermore, we design a model updating strategy to adapt to appearance changes over time by discarding degraded trees of the BTM and DTM and initializing new trees as replacements. We test the proposed tracking method on two large tracking benchmarks, the CVPR2013 tracking benchmark and the VOT2014 tracking challenge. Experimental results show that the tracker runs at real-time speed and achieves favorable tracking performance compared with state-of-the-art methods. The results also suggest that the DTM improves tracking performance significantly and plays an important role in robust tracking.
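
The abstract only sketches how the two models are fused and how the target state is read off the confidence map. The NumPy sketch below illustrates one plausible realization of that final step, assuming the per-superpixel BTM/DTM scores have already been painted onto a pixel-level map; the equal-weight combination, the sliding-window search and the scale set are assumptions, not the authors' implementation.

```python
# Illustrative sketch: fuse BTM/DTM scores and estimate position, then scale.
# Weights, step sizes and scale set are assumptions made for this example.
import numpy as np

def joint_confidence(btm_score, dtm_score, alpha=0.5):
    """Linearly combine the two model scores into one joint confidence value.

    btm_score: target probability per pixel/superpixel from the BTM.
    dtm_score: target probability per pixel/superpixel from the DTM.
    """
    return alpha * btm_score + (1.0 - alpha) * dtm_score

def estimate_state(conf_map, prev_box, scales=(0.95, 1.0, 1.05)):
    """Estimate the target position first, then its scale, from a confidence map."""
    x, y, w, h = prev_box
    H, W = conf_map.shape
    integral = conf_map.cumsum(0).cumsum(1)       # integral image for fast window sums

    def window_sum(x0, y0, ww, hh):
        x1, y1 = min(x0 + ww, W) - 1, min(y0 + hh, H) - 1
        s = integral[y1, x1]
        if x0 > 0: s -= integral[y1, x0 - 1]
        if y0 > 0: s -= integral[y0 - 1, x1]
        if x0 > 0 and y0 > 0: s += integral[y0 - 1, x0 - 1]
        return s

    # 1) Position: slide a fixed-size window and keep the highest total confidence.
    best_pos, best_val = (x, y), -np.inf
    for yy in range(0, H - h + 1, 4):
        for xx in range(0, W - w + 1, 4):
            v = window_sum(xx, yy, w, h)
            if v > best_val:
                best_val, best_pos = v, (xx, yy)

    # 2) Scale: at the chosen position, pick the scale with the best mean confidence.
    x, y = best_pos
    best_scale, best_val = 1.0, -np.inf
    for s in scales:
        ww, hh = int(round(w * s)), int(round(h * s))
        v = window_sum(x, y, ww, hh) / max(ww * hh, 1)
        if v > best_val:
            best_val, best_scale = v, s
    return (x, y, int(round(w * best_scale)), int(round(h * best_scale)))

if __name__ == "__main__":
    # Toy example with random score maps, just to show the flow.
    conf = joint_confidence(np.random.rand(120, 160), np.random.rand(120, 160))
    print(estimate_state(conf, prev_box=(40, 30, 32, 48)))
```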