Klineastrup4912

From Iurium Wiki

Furthermore, we employ a theory-based statistical framework to devise a consistent strategy for estimating all required parameters, including both the regularization parameters of the algorithm and the number of superpixels of the transformation, resulting in a truly blind (from the parameter-setting perspective) unmixing method. Experimental results attest to the superior performance of the proposed method compared with other state-of-the-art strategies.

Street Scene Parsing (SSP) is a fundamental and important step for autonomous driving and traffic scene understanding. Recently, Fully Convolutional Network (FCN) based methods have delivered impressive performance with the help of large-scale densely labeled datasets. However, in urban traffic environments, not all labels contribute equally to making control decisions. Labels such as pedestrian, car, bicyclist, road lane, or sidewalk are more important than labels for vegetation, sky, or building. Based on this observation, in this paper we propose a novel deep learning framework, named Residual Atrous Pyramid Network (RAPNet), for importance-aware SSP. More specifically, to incorporate the importance of the various object classes, we propose an Importance-Aware Feature Selection (IAFS) mechanism that automatically selects the important features for label prediction. The IAFS can operate in each convolutional block; the semantic features with different importance are captured in different channels, so that they are automatically assigned corresponding weights. To enhance labeling coherence, we also propose a Residual Atrous Spatial Pyramid (RASP) module to sequentially aggregate global-to-local context information in a residual refinement manner. Extensive experiments on two public benchmarks show that our approach achieves new state-of-the-art performance and consistently obtains more accurate results on the semantic classes with high importance levels.

In this paper, we propose a new end-to-end model, termed dual-discriminator conditional generative adversarial network (DDcGAN), for fusing infrared and visible images of different resolutions. Our method establishes an adversarial game between a generator and two discriminators. The generator aims to produce a realistic fused image, based on a specifically designed content loss, that fools the two discriminators, while the two discriminators aim to distinguish the structural differences between the fused image and the two source images, respectively, in addition to the content loss. Consequently, the fused image is forced to simultaneously keep the thermal radiation of the infrared image and the texture details of the visible image. Moreover, to fuse source images of different resolutions, e.g., a low-resolution infrared image and a high-resolution visible image, our DDcGAN constrains the downsampled fused image to have properties similar to those of the infrared image. This avoids the blurring of thermal radiation information and the loss of visible texture detail that typically occur in traditional methods. In addition, we also apply DDcGAN to fusing multi-modality medical images of different resolutions, e.g., a low-resolution positron emission tomography image and a high-resolution magnetic resonance image. Qualitative and quantitative experiments on publicly available datasets demonstrate the superiority of DDcGAN over the state of the art, in terms of both visual effect and quantitative metrics.
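
The resolution-mismatch constraint described above can be made concrete. Below is a minimal PyTorch-style sketch of such a content loss, not the authors' published formulation: the average-pooling downsampler, the gradient-matching texture term, and the weight lambda_grad are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def content_loss(fused, ir_lowres, visible, lambda_grad=1.0):
    """Hypothetical DDcGAN-style content loss (sketch).

    fused:     (B, 1, H, W) generator output at the visible resolution
    ir_lowres: (B, 1, h, w) low-resolution infrared image
    visible:   (B, 1, H, W) high-resolution visible image
    Assumes the two resolutions differ by an integer factor.
    """
    # Constrain the downsampled fused image to stay close to the
    # infrared image, preserving thermal radiation information.
    scale = fused.shape[-1] // ir_lowres.shape[-1]
    fused_down = F.avg_pool2d(fused, kernel_size=scale)
    loss_ir = torch.norm(fused_down - ir_lowres, p="fro")

    # Encourage the fused image to keep the visible image's texture
    # by matching image gradients (a simple stand-in for detail loss).
    def grad(img):
        dx = img[..., :, 1:] - img[..., :, :-1]
        dy = img[..., 1:, :] - img[..., :-1, :]
        return dx, dy

    fx, fy = grad(fused)
    vx, vy = grad(visible)
    loss_tex = torch.norm(fx - vx, p="fro") + torch.norm(fy - vy, p="fro")

    return loss_ir + lambda_grad * loss_tex
```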

Facial landmark detection aims at localizing multiple keypoints in a given facial image, and usually suffers from variations caused by arbitrary pose, diverse facial expressions, and partial occlusion. In this paper, we develop a two-stage regression network for facial landmark detection under unconstrained conditions. Our model consists of a Structural Hourglass Network (SHN), which detects the initial locations of all facial landmarks through heatmap generation, and a Global Constraint Network (GCN), which further refines the detected locations through offset estimation. Specifically, SHN introduces an improved Inception-ResNet unit as its basic building block, which effectively enlarges the receptive field and learns contextual feature representations. Meanwhile, a novel loss function with adaptive weights is proposed to make the whole model focus precisely on the hard landmarks. GCN explores the spatial contextual relationships between facial landmarks and refines their initial locations by optimizing a global constraint. Moreover, we develop a pre-processing network to generate features at different scales, which are transmitted to SHN and GCN for effective feature representation. Different from existing models, the proposed method realizes a heatmap-offset framework, which combines the heatmaps generated by SHN with the coordinates estimated by GCN to obtain an accurate prediction, as sketched below. Extensive experimental results on several challenging datasets, including 300W, COFW, AFLW, and 300-VW, confirm that our method achieves competitive performance compared with state-of-the-art algorithms.
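
The heatmap-offset combination admits a short illustration. The following NumPy snippet is a hypothetical reading of that final step, not the paper's implementation: the heatmap stride and the treatment of offsets as pixel displacements are assumptions.

```python
import numpy as np

def combine_heatmap_offset(heatmaps, offsets, stride=4):
    """Combine heatmap peaks with regressed offsets (sketch).

    heatmaps: (K, H, W) per-landmark heatmaps (e.g. from an hourglass net)
    offsets:  (K, 2) per-landmark (x, y) refinements in image pixels
    stride:   downsampling factor between the image and the heatmaps
    """
    K, H, W = heatmaps.shape
    coords = np.zeros((K, 2), dtype=np.float32)
    for k in range(K):
        # Initial location: peak of the k-th heatmap, mapped back
        # to image coordinates through the network stride.
        idx = np.argmax(heatmaps[k])
        y, x = divmod(idx, W)
        coords[k] = (x * stride, y * stride)
    # Refined location: add the globally constrained offsets.
    return coords + offsets
```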

Retinex theory was developed mainly to decompose an image into its illumination and reflectance components by analyzing local image derivatives. In this theory, larger derivatives are attributed to changes in reflectance, while smaller derivatives arise from smooth illumination. In this paper, we utilize exponentiated local derivatives (with an exponent γ) of an observed image to generate its structure map and texture map. The structure map is produced by amplifying the local derivatives with γ > 1, while the texture map is generated by shrinking them with γ < 1. To this end, we design exponential filters for the local derivatives and demonstrate their ability to extract accurate structure and texture maps under different choices of the exponent γ. The extracted structure and texture maps are employed to regularize the illumination and reflectance components in the Retinex decomposition. A novel Structure and Texture Aware Retinex (STAR) model is further proposed for illumination and reflectance decomposition of a single image.
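
The exponentiation step can be sketched directly. The NumPy snippet below illustrates the idea under stated assumptions: using the gradient magnitude (rather than signed derivatives) and the particular exponents are illustrative choices, not the exact STAR filters.

```python
import numpy as np

def exponentiated_derivative_maps(img, gamma_s=1.5, gamma_t=0.5, eps=1e-4):
    """Structure/texture maps from exponentiated local derivatives (sketch).

    img: 2-D grayscale image with values in [0, 1]
    gamma_s > 1 amplifies large derivatives -> structure map
    gamma_t < 1 shrinks large derivatives   -> texture map
    """
    # Local derivatives (forward differences, edge-replicated).
    dx = np.diff(img, axis=1, append=img[:, -1:])
    dy = np.diff(img, axis=0, append=img[-1:, :])
    mag = np.sqrt(dx**2 + dy**2) + eps

    # Exponentiating the derivative magnitude: gamma > 1 emphasizes
    # strong edges (structure), gamma < 1 emphasizes weak ones (texture).
    structure_map = mag ** gamma_s
    texture_map = mag ** gamma_t
    return structure_map, texture_map
```
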
Article authors: Klineastrup4912 (Cummings Butt)