Kyedhansson6311


Recent advances in deep convolutional neural networks (CNNs) have boosted the development of video salient object detection (SOD), and many remarkable deep-CNN video SOD models have been proposed. However, many existing deep-CNN video SOD models still produce coarse salient object boundaries, which may be attributed to the loss of high-frequency information. Traditional graph-based video SOD models can preserve object boundaries well by performing superpixel/supervoxel segmentation in advance, but, limited by heuristic graph clustering algorithms, they are weaker than the latest deep-CNN models at highlighting the whole object. To tackle this problem, we find a new way forward under the framework of graph convolutional networks (GCNs), taking advantage of both graph models and deep neural networks. Specifically, a superpixel-level spatiotemporal graph is first constructed over multiple frame pairs by exploiting the motion cues implied in the frame pairs. The graph data is then imported into the devised multi-stream attention-aware GCN, where a novel Edge-Gated graph convolution (GC) operation is proposed to boost saliency information aggregation on the graph data (a minimal sketch of such gated aggregation appears after the next abstract). A novel attention module is designed to encode spatiotemporal semantic information via adaptive selection of graph nodes and fusion of the static-specific and motion-specific graph embeddings. Finally, a smoothness-aware regularization term is proposed to enhance the uniformity of the salient object, so that graph nodes (superpixels) inherently belonging to the same class are ideally clustered together in the learned embedding space. Extensive experiments have been conducted on three widely used datasets. Compared with fourteen state-of-the-art video SOD models, the proposed method retains salient object boundaries well and shows strong learning ability, which indicates that this work is a good practice for designing GCNs for video SOD.

Person re-identification (re-id) suffers from the significant challenge of occlusion, where an image contains occlusions and less discriminative pedestrian information. Much existing work attempts to design complex modules that capture implicit information (including human pose landmarks, mask maps, and spatial information), so that the network focuses on learning discriminative features from non-occluded body regions and achieves effective matching under spatial misalignment. Few studies have focused on data augmentation, given that existing single-based data augmentation methods bring limited performance improvement. To address the occlusion problem, we propose a novel Incremental Generative Occlusion Adversarial Suppression (IGOAS) network. It consists of (1) an incremental generative occlusion block that generates easy-to-hard occlusion data, making the network more robust to occlusion by gradually learning harder occlusions rather than the hardest occlusion directly, and (2) a global-adversarial suppression (G&A) framework with a global branch and an adversarial suppression branch. The global branch extracts steady global features of the images. The adversarial suppression branch, embedded with two occlusion suppression modules, minimizes the generated occlusion's response and strengthens attentive feature representation on non-occluded body regions. Finally, we obtain a more discriminative pedestrian feature descriptor, robust to occlusion, by concatenating the features of the two branches. Experiments on occluded datasets show the competitive performance of IGOAS: on Occluded-DukeMTMC, it achieves 60.1% Rank-1 accuracy and 49.4% mAP.
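
As an illustration of the edge-gated aggregation idea in the video SOD abstract above, the sketch below shows a generic gated graph convolution over superpixel features: each edge gets a learned gate in [0, 1] that scales how much a neighbouring node contributes. The class name, the dense adjacency representation, and the sigmoid gate computed from concatenated endpoint features are illustrative assumptions, not the paper's exact Edge-Gated GC operation.

import torch
import torch.nn as nn

class EdgeGatedGraphConv(nn.Module):
    # Gated message passing: gate(i, j) scales the contribution of node j to node i.
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.node_proj = nn.Linear(in_dim, out_dim)
        self.gate_proj = nn.Linear(2 * in_dim, 1)  # gate from concatenated endpoint features

    def forward(self, x, adj):
        # x: (N, in_dim) superpixel features; adj: (N, N) adjacency of the spatiotemporal graph
        N = x.size(0)
        xi = x.unsqueeze(1).expand(N, N, -1)   # receiving node i
        xj = x.unsqueeze(0).expand(N, N, -1)   # sending node j
        gates = torch.sigmoid(self.gate_proj(torch.cat([xi, xj], dim=-1))).squeeze(-1)
        weights = gates * adj                  # gated, graph-masked edge weights
        weights = weights / weights.sum(dim=1, keepdim=True).clamp(min=1e-6)
        return torch.relu(self.node_proj(weights @ x))  # aggregate neighbours, then transform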
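
The "easy-to-hard" idea in the IGOAS abstract can be caricatured as a curriculum version of random erasing, where the occluded area grows with the training epoch. The function name, the linear epoch schedule, and the noise-filled rectangle are assumptions made for this sketch; the paper's incremental generative occlusion block is more elaborate.

import torch

def incremental_occlusion(images, epoch, max_epoch, max_ratio=0.4):
    # Occlude a random rectangle whose size grows from easy (small) to hard (large).
    b, c, h, w = images.shape
    ratio = max_ratio * min(1.0, (epoch + 1) / max_epoch)  # current occlusion ratio
    oh, ow = max(1, int(h * ratio)), max(1, int(w * ratio))
    out = images.clone()
    for i in range(b):
        top = torch.randint(0, h - oh + 1, (1,)).item()
        left = torch.randint(0, w - ow + 1, (1,)).item()
        out[i, :, top:top + oh, left:left + ow] = torch.rand(c, oh, ow, device=images.device)
    return out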

The emergence of the single-chip polarized color sensor now allows simultaneously capturing chromatic and polarimetric information of a scene on a monochromatic image plane. However, unlike the usual camera with an embedded demosaicing method, the latest polarized color camera is not delivered with a built-in demosaicing tool. For demosaicing, users have to down-sample the captured images or use traditional interpolation techniques; neither performs well, since polarization and color are interdependent. Therefore, joint chromatic and polarimetric demosaicing is the key to obtaining high-quality polarized color images. In this paper, we propose a joint chromatic and polarimetric demosaicing model to address this challenging problem. Instead of mechanically demosaicing the multi-channel polarized color image, we present a sparse representation-based optimization strategy that uses chromatic and polarimetric information to jointly optimize the model (a toy per-patch sparse-coding sketch is given at the end of the page). To avoid interaction between color and polarization during demosaicing, we construct the corresponding dictionaries separately. We also build an optical data acquisition system to collect a dataset that contains various sources of polarization, such as illumination, reflectance, and birefringence. Both qualitative and quantitative experiments show that our method faithfully recovers full RGB information at four polarization angles for each pixel from a single mosaic input image. Moreover, the proposed method performs well not only on synthetic data but also on real captured data.

Using the viewpoint of modern computational algebraic geometry, we explore properties of the optimization landscapes of deep linear neural network models. After clarifying the various definitions of "flat" minima, we show that the geometrically flat minima, which are merely artifacts of residual continuous symmetries of deep linear networks, can be straightforwardly removed by a generalized L_2 regularization. We then establish upper bounds on the number of isolated stationary points of these networks with the help of algebraic geometry. Using these upper bounds and a numerical algebraic geometry method, we find all stationary points for modest depths and matrix sizes. We show that, in the presence of non-zero regularization, deep linear networks indeed possess local minima that are not global minima. We also show that, although the number of stationary points increases as the number of neurons increases (or as the regularization parameter decreases), higher-index saddles are surprisingly rare.
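
To make the sparse-representation step in the demosaicing abstract above concrete, the sketch below solves a per-patch LASSO problem with ISTA, treating the mosaic measurements y as y ≈ S D a for a sampling operator S, a patch dictionary D, and a sparse code a. The function name, the single combined operator A = S @ D, and the ISTA solver are illustrative assumptions; the paper uses separate chromatic and polarimetric dictionaries and its own optimization strategy.

import numpy as np

def ista_sparse_code(y, A, lam=0.1, n_iter=200):
    # Minimize 0.5*||A a - y||^2 + lam*||a||_1 over the sparse code a (ISTA).
    # Here A = S @ D: D is a learned patch dictionary and S keeps only the pixels
    # actually measured by the polarized color filter array.
    L = np.linalg.norm(A, 2) ** 2                # Lipschitz constant of the data-term gradient
    a = np.zeros(A.shape[1])
    for _ in range(n_iter):
        a = a - (A.T @ (A @ a - y)) / L          # gradient step on the data term
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft-thresholding (L1 prox)
    return a

# The full patch (all color channels at all four polarization angles) is then D @ a.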
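
For the deep linear network abstract, a tiny numerical illustration of the regularized loss it studies: with lam = 0 the loss is invariant under the rescaling W1 -> c*W1, W2 -> W2/c (the continuous symmetry behind geometrically flat minima), while any lam > 0 breaks that invariance. The two-layer setting, the matrix sizes, and all variable names are assumptions made for this sketch.

import numpy as np

def loss(W1, W2, X, Y, lam):
    # 0.5*||W2 W1 X - Y||_F^2 + 0.5*lam*(||W1||_F^2 + ||W2||_F^2)
    R = W2 @ W1 @ X - Y
    return 0.5 * np.sum(R**2) + 0.5 * lam * (np.sum(W1**2) + np.sum(W2**2))

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(3, 20)), rng.normal(size=(2, 20))
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
c = 3.0
# Without regularization the rescaling leaves the loss unchanged (a flat direction) ...
print(np.isclose(loss(W1, W2, X, Y, 0.0), loss(c * W1, W2 / c, X, Y, 0.0)))  # True
# ... while a non-zero L_2 term removes this flat direction.
print(np.isclose(loss(W1, W2, X, Y, 0.1), loss(c * W1, W2 / c, X, Y, 0.1)))  # False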

Article authors: Kyedhansson6311 (Wilson Nielsen)