Penningtonvaughan3489


In this way, these modules enable deep synergistic interaction between the two tasks. In addition, we introduce a task interaction loss to enhance the mutual supervision between the classification and segmentation tasks and to guarantee the consistency of their predictions. Relying on the proposed deep synergistic interaction mechanism, DSI-Net achieves superior classification and segmentation performance on a public dataset in comparison with state-of-the-art methods. The source code is available at https://github.com/CityU-AIM-Group/DSI-Net.

Graph convolutional networks (GCNs) are widely used in graph-based applications such as graph classification and segmentation. However, current GCNs face implementation limitations, such as constraints on network architectures, due to their irregular inputs. In contrast, convolutional neural networks (CNNs) are capable of extracting rich features from large-scale input data, but they do not support general graph inputs. To bridge the gap between GCNs and CNNs, in this paper we study the problem of how to effectively and efficiently map general graphs onto 2D grids, to which CNNs can be directly applied, while preserving graph topology as much as possible. We therefore propose two novel graph-to-grid mapping schemes, namely, graph-preserving grid layout (GPGL) and its computationally efficient extension, hierarchical GPGL (H-GPGL). We formulate the GPGL problem as an integer program and further propose an approximate yet efficient solver based on a penalized Kamada-Kawai method, a well-known optimization algorithm in 2D graph drawing. We propose a novel vertex separation penalty that encourages graph vertices to lie on the grid without any overlap. We demonstrate the empirical success of GPGL on general graph classification with small graphs and of H-GPGL on 3D point cloud segmentation with large graphs, based on 2D CNNs including VGG16, ResNet50 and a multi-scale-maxout CNN.
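The vertex separation penalty lends itself to a compact illustration. The sketch below is hypothetical and not the released GPGL code: it adds a quadratic hinge penalty, for vertex pairs closer than one grid cell, to the standard Kamada-Kawai stress, optimizes with L-BFGS, and rounds the result to integer grid coordinates. The function names, penalty weight, and the final snapping step are assumptions; the paper's solver resolves residual collisions more carefully.

```python
# A minimal sketch of a penalized Kamada-Kawai layout, assuming a quadratic
# hinge separation penalty; the exact penalty form in the paper may differ.
import numpy as np
from scipy.optimize import minimize
from scipy.sparse.csgraph import shortest_path

def layout_energy(flat_pos, graph_dist, penalty_weight=10.0, min_sep=1.0):
    """Kamada-Kawai stress plus a penalty for vertex pairs closer than min_sep."""
    pos = flat_pos.reshape(-1, 2)
    diff = pos[:, None, :] - pos[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(-1) + 1e-12)
    iu = np.triu_indices(len(pos), k=1)
    # Classic KK stress: normalized mismatch between layout and graph distances.
    stress = (((euclid[iu] - graph_dist[iu]) / graph_dist[iu]) ** 2).sum()
    # Separation penalty: hinge on pairs that fall within one grid cell.
    sep = np.maximum(0.0, min_sep - euclid[iu]) ** 2
    return stress + penalty_weight * sep.sum()

def gpgl_layout(adjacency, seed=0):
    """Optimize 2D positions for a connected graph, then snap to the grid."""
    graph_dist = shortest_path(adjacency, unweighted=True)
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(adjacency.shape[0] * 2)
    res = minimize(layout_energy, x0, args=(graph_dist,), method="L-BFGS-B")
    return np.rint(res.x.reshape(-1, 2)).astype(int)  # integer grid snapping
```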
Symmetric image registration estimates bi-directional spatial transformations between images while enforcing inverse-consistency. Its capability of eliminating the bias inevitably introduced by generic single-directional image registration allows more precise analysis in different interdisciplinary applications of image registration, e.g. computational anatomy and shape analysis. However, most existing symmetric registration techniques, especially for multimodal images, are limited by the low speed of the commonly used iterative optimization, the difficulty of exploring inter-modality relations, or the high labor cost of labeling data. We propose SymReg-GAN, a novel generative adversarial network (GAN) based approach to symmetric image registration, to overcome these limits. We formulate symmetric registration of unimodal/multimodal images as a conditional GAN and train it with a semi-supervised strategy. The registration symmetry is realized by introducing a loss encouraging that the cycle composed of the geometric transformation from one image to another and its reverse brings an image back to itself. The semi-supervised learning enables both the scarce labeled data and large amounts of unlabeled data to be fully exploited. Experimental results on 6 public brain magnetic resonance imaging (MRI) datasets and 1 in-house computed tomography (CT) & MRI dataset demonstrate the superiority of SymReg-GAN over several existing state-of-the-art methods.

End-to-end trained convolutional neural networks have led to a breakthrough in optical flow estimation. The most recent advances focus on improving optical flow estimation by improving the architecture and setting a new benchmark on the publicly available MPI-Sintel dataset. Instead, in this article, we investigate how deep neural networks estimate optical flow. A better understanding of how these networks function is important for (i) assessing their generalization capabilities to unseen inputs, and (ii) suggesting changes to improve their performance. For our investigation, we focus on FlowNetS, as it is the prototype of an encoder-decoder neural network for optical flow estimation. Furthermore, we use a filter identification method that has played a major role in uncovering the motion filters present in animal brains in neuropsychological research. The method shows that the filters in the deepest layer of FlowNetS are sensitive to a variety of motion patterns. Not only do we find translation filters, as demonstrated in animal brains, but, thanks to the easier measurements in artificial neural networks, we even unveil dilation, rotation, and occlusion filters. Furthermore, we find similarities between the refinement part of the network and the perceptual filling-in process which occurs in the mammalian primary visual cortex.

In this paper, we address the makeup transfer and removal tasks. Existing methods cannot transfer makeup well between images with large pose and expression differences, nor handle makeup details like blush or highlight. In addition, they cannot control the degree of makeup transfer. In this work, we propose a Pose and expression robust Spatial-aware GAN (PSGAN++), which can perform both detail-preserving makeup transfer and makeup removal. For makeup transfer, PSGAN++ uses a Makeup Distill Network (MDNet) to extract makeup information as spatial-aware makeup matrices. We also devise an Attentive Makeup Morphing (AMM) module that specifies how the makeup in the source image is morphed from the reference image, and a makeup detail loss to supervise the model within the selected makeup detail area. For makeup removal, PSGAN++ applies an Identity Distill Network (IDNet) to embed the identity information of with-makeup images into identity matrices. Finally, the makeup/identity matrices are fed into a Style Transfer Network (STNet) to achieve makeup transfer or removal. We collect a new Makeup Transfer In the Wild (MT-Wild) dataset and a Makeup Transfer High-Resolution (MT-HR) dataset. Experiments demonstrate that PSGAN++ achieves state-of-the-art results, preserving makeup details even in cases of large pose/expression differences.
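As a rough illustration of the spatial-aware makeup matrices, the hypothetical PyTorch snippet below morphs per-pixel scale (gamma) and shift (beta) maps from a reference face into the source layout via attention, then applies them as a pixel-wise affine modulation of the source features. The tensor names and this simplified attention stand in for MDNet and AMM, whose actual designs are not reproduced here.

```python
# A minimal, hypothetical sketch of applying spatial-aware makeup matrices:
# per-pixel scale/shift maps from a reference face modulate source features.
import torch

def morph_and_apply(src_feat, ref_feat, ref_gamma, ref_beta):
    """src_feat/ref_feat: (B, C, H, W); ref_gamma/ref_beta: (B, 1, H, W)."""
    b, c, h, w = src_feat.shape
    src = src_feat.flatten(2)                      # (B, C, H*W)
    ref = ref_feat.flatten(2)                      # (B, C, H*W)
    # Attention: how strongly each source pixel matches each reference pixel.
    attn = torch.softmax(src.transpose(1, 2) @ ref / c ** 0.5, dim=-1)
    # Morph the reference makeup maps into the source spatial layout.
    gamma = (attn @ ref_gamma.flatten(2).transpose(1, 2)).transpose(1, 2)
    beta = (attn @ ref_beta.flatten(2).transpose(1, 2)).transpose(1, 2)
    # Apply makeup as a pixel-wise affine modulation of source features.
    return gamma.view(b, 1, h, w) * src_feat + beta.view(b, 1, h, w)
```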
We study the problem of efficient semantic segmentation for large-scale 3D point clouds. Relying on expensive sampling techniques or computationally heavy pre/post-processing steps, most existing approaches can only be trained on and operated over small-scale point clouds. In this paper, we introduce RandLA-Net, an efficient and lightweight neural architecture that directly infers per-point semantics for large-scale point clouds. The key to our approach is to use random point sampling instead of more complex point selection approaches. Although remarkably computation- and memory-efficient, random sampling can discard key features by chance. To overcome this, we introduce a novel local feature aggregation module to progressively increase the receptive field for each 3D point, thereby effectively preserving geometric details. Comparative experiments show that our RandLA-Net can process 1 million points in a single pass, up to 200x faster than existing approaches. Moreover, extensive experiments on several large-scale point cloud datasets, including Semantic3D, SemanticKITTI, Toronto3D, S3DIS and NPM3D, demonstrate the state-of-the-art semantic segmentation performance of our RandLA-Net.
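To make the contrast with expensive point selection concrete, here is a small, hypothetical PyTorch sketch of the random-sampling-plus-neighborhood-aggregation idea. The function names and the simple max-pooled KNN aggregation are illustrative assumptions, not the paper's exact local feature aggregation module.

```python
# Hypothetical sketch: O(N) random downsampling plus a KNN feature
# aggregation step, illustrating the RandLA-Net idea in simplified form.
import torch

def random_sample(points, feats, ratio=4):
    """Randomly keep 1/ratio of the points; O(N), unlike farthest-point sampling."""
    n = points.shape[1]
    idx = torch.randperm(n, device=points.device)[: n // ratio]
    return points[:, idx], feats[:, idx]

def knn_aggregate(points, feats, k=16):
    """Max-pool features over each point's k nearest neighbors."""
    dist = torch.cdist(points, points)            # (B, N, N) pairwise distances
    knn = dist.topk(k, largest=False).indices     # (B, N, k) neighbor indices
    b_idx = torch.arange(points.shape[0])[:, None, None]
    neighbor_feats = feats[b_idx, knn]            # (B, N, k, C)
    return neighbor_feats.max(dim=2).values       # (B, N, C)

# Example: one encoder stage on a toy cloud.
pts = torch.randn(2, 1024, 3)                     # (B, N, 3) coordinates
fts = torch.randn(2, 1024, 32)                    # (B, N, C) per-point features
fts = knn_aggregate(pts, fts)                     # widen the receptive field first
pts, fts = random_sample(pts, fts)                # then cheaply downsample
```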

Article authors: Penningtonvaughan3489 (Finley Williamson)