Our extensive experimental results on benchmark datasets and ablation studies demonstrate that the proposed LTI-ST method outperforms existing index methods by a large margin while providing the above new capabilities, which are highly desirable in practice.

This article proposes a hybrid multi-dimensional feature fusion structure for a spatial and temporal segmentation model for automated thermography defect detection. In addition, the newly designed attention block encourages local interaction among neighboring pixels to adaptively recalibrate the feature maps. A Sequence-PCA layer is embedded in the network to provide enhanced semantic information. The final model is a lightweight structure with a smaller number of parameters, yet it yields uncompromised performance after model compression. The proposed model better captures semantic information to improve the detection rate in an end-to-end procedure. Compared with current state-of-the-art deep semantic segmentation algorithms, the proposed model produces more accurate and robust results. In addition, the proposed attention module improves performance on two classification tasks compared with other prevalent attention blocks. To verify the effectiveness and robustness of the proposed model, experimental studies have been carried out on defect detection with four different datasets. The demo code of the proposed method will be available soon at http://faculty.uestc.edu.cn/gaobin/zh_CN/lwcg/153392/list/index.htm.
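As a rough illustration of the kind of attention block described above, the sketch below recalibrates feature maps through local interaction among neighboring pixels: channels are summarized by mean and max maps, and a small convolution produces a spatial attention map. This is an assumed CBAM-style design, not the authors' exact block; the 3x3 kernel and the mean/max channel summary are placeholders.

```python
# Minimal sketch of a local-interaction spatial attention block (assumed
# CBAM-style design, not the paper's exact block).
import torch
import torch.nn as nn

class LocalSpatialAttention(nn.Module):
    def __init__(self, k: int = 3):
        super().__init__()
        # small k x k convolution: each position only sees its neighbors
        self.conv = nn.Conv2d(2, 1, kernel_size=k, padding=k // 2, bias=False)
        self.gate = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        summary = torch.cat([x.mean(dim=1, keepdim=True),
                             x.max(dim=1, keepdim=True).values], dim=1)
        weights = self.gate(self.conv(summary))  # (batch, 1, H, W)
        return x * weights                       # recalibrated feature maps

feats = torch.randn(2, 64, 32, 32)
print(LocalSpatialAttention()(feats).shape)  # torch.Size([2, 64, 32, 32])
```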
High Efficiency Video Coding (HEVC) can significantly improve compression efficiency in comparison with the preceding H.264/Advanced Video Coding (AVC), but at the cost of extremely high computational complexity. Hence, it is challenging to realize live video applications on low-delay, power-constrained devices such as smart mobile devices. In this article, we propose an online learning-based multi-stage complexity control method for live video coding. The proposed method consists of three stages: multi-accuracy Coding Unit (CU) decision, multi-stage complexity allocation, and Coding Tree Unit (CTU) level complexity control. Consequently, the encoding complexity can be accurately controlled to match the computing capability of the video-capable device by replacing the traditional brute-force search with the proposed algorithm, which properly determines the optimal CU size. Specifically, the multi-accuracy CU decision model is obtained by an online learning approach to accommodate the varying characteristics of input videos. In addition, multi-stage complexity allocation is implemented to reasonably allocate complexity budgets to each coding level. To achieve a good trade-off between complexity control and rate-distortion (RD) performance, CTU-level complexity control is proposed to select the optimal accuracy of the CU decision model. Experimental results show that the proposed algorithm can accurately control coding complexity from 100% down to 40%. Furthermore, the proposed algorithm outperforms state-of-the-art algorithms in terms of both complexity control accuracy and RD performance.

Person re-identification (Re-ID) aims to match pedestrian images across various scenes in video surveillance. A few works use attribute information to boost Re-ID performance. Specifically, those methods introduce auxiliary tasks such as verifying the image-level attribute information of two pedestrian images or recognizing identity-level attributes. Identity-level attribute annotations cost less manpower and are better suited to the person re-identification task than image-level attribute annotations. However, identity attribute information may be very noisy due to incorrect attribute annotation or a lack of discriminativeness for distinguishing different persons, which is probably unhelpful for the Re-ID task. In this paper, we propose a novel Attribute Attentional Block (AAB), which can be integrated into any backbone network or framework. Our AAB adopts reinforcement learning to drop noisy attributes based on our designed reward, and then utilizes the aggregated attribute attention of the remaining attributes to facilitate the Re-ID task. Experimental results demonstrate that our proposed method achieves state-of-the-art results on three benchmark datasets.

Mismatches between the precisions of representing the disparity, the depth value, and the rendering position in 3D video systems cause redundancies in depth map representations. In this paper, we propose a highly efficient multiview depth coding scheme based on Depth Histogram Projection (DHP) and Allowable Depth Distortion (ADD) in view synthesis. First, DHP exploits the sparse representation of depth maps generated from stereo matching to reduce the residual error of INTER and INTRA predictions in depth coding. We provide a mathematical foundation for DHP-based lossless depth coding by theoretically analyzing its rate-distortion cost. Then, owing to the mismatch between depth value and rendering position, there is a many-to-one mapping between them in view synthesis, which induces the ADD model. Based on the ADD model and DHP, we propose depth coding with lossless view synthesis quality to further improve the compression performance of depth coding while maintaining the same synthesized video quality. Experimental results reveal that the proposed DHP-based depth coding achieves average bit rate savings of 20.66% to 19.52% for lossless coding on Multiview High Efficiency Video Coding (MV-HEVC) with different groups of pictures. In addition, our depth coding based on DHP and ADD achieves average depth bit rate reductions of 46.69%, 34.12%, and 28.68% for lossless view synthesis quality when the rendering precision varies from integer to half to quarter pixel, respectively. We obtain similar gains for lossless depth coding on the 3D-HEVC, HEVC Intra coding, and JPEG2000 platforms.
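To make the many-to-one mapping behind the ADD model concrete, here is a toy numerical sketch: depth levels are converted to disparities, disparities are quantized to a finite rendering precision, and all depth levels that land on the same rendering position form an interval within which depth distortion does not change the synthesized view. The camera numbers are invented and the depth-to-disparity conversion is the standard 8-bit formula, not taken from the paper.

```python
# Toy sketch of the Allowable Depth Distortion (ADD) many-to-one mapping.
from collections import defaultdict

f_times_b = 64.0            # hypothetical focal length * baseline (invented)
z_near, z_far = 2.0, 50.0   # hypothetical depth range in meters (invented)
precision = 0.25            # quarter-pixel rendering precision

def disparity(level: int) -> float:
    """Map an 8-bit depth level to a disparity in pixels."""
    z = 1.0 / (level / 255.0 * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far)
    return f_times_b / z

groups = defaultdict(list)
for level in range(256):
    # quantize the rendering position to the available precision
    pos = round(disparity(level) / precision) * precision
    groups[pos].append(level)

# Each group is an ADD interval: any depth level inside it yields the same
# synthesized pixel position, so the encoder may pick the cheapest one.
sizes = [len(levels) for levels in groups.values()]
print(f"{len(groups)} rendering positions cover all 256 depth levels")
print(f"largest ADD interval spans {max(sizes)} depth levels")
```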
Detection and analysis of informative keypoints is a fundamental problem in image analysis and computer vision. Keypoint detectors are omnipresent in visual automation tasks, and recent years have witnessed a significant surge in the number of such techniques. Evaluating the quality of keypoint detectors remains a challenging task owing to the inherent ambiguity over what constitutes a good keypoint. In this context, we introduce a reference-based keypoint quality index grounded in the theory of spatial pattern analysis. Unlike traditional correspondence-based quality evaluation, which counts the number of feature matches within a specified neighborhood, we present a rigorous mathematical framework to compute the statistical correspondence of the detections inside a set of salient zones (cluster cores) defined by the spatial distribution of a reference set of keypoints. We leverage the versatility of level sets to handle hypersurfaces of arbitrary geometry, and develop a mathematical framework to estimate the model parameters analytically, reflecting the robustness of a feature detection algorithm.
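As a loose illustration of the general idea (not the paper's actual statistical framework), the sketch below defines salient zones as a level set of a kernel density estimate built from reference keypoints and scores a detector by the fraction of its detections that fall inside those zones. All data and the threshold quantile are invented.

```python
# Toy reference-based keypoint quality score via density level sets.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 0.5, size=(200, 2))    # stand-in reference keypoints
detections = rng.normal(0.2, 0.8, size=(100, 2))   # stand-in detector output

kde = gaussian_kde(reference.T)                    # spatial density of references

# Level set: regions where the density exceeds a quantile of the density
# observed at the reference points themselves (the "cluster cores").
tau = np.quantile(kde(reference.T), 0.25)

inside = kde(detections.T) >= tau
print(f"keypoint quality index (toy): {inside.mean():.2f}")
```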