Slothlau1886

Extensive experiments are performed on four machine learning applications using both synthetic and real-world data sets. Results show that the proposed algorithm is not only fast but also has better performance than the state-of-the-arts.Mammogram mass detection is crucial for diagnosing and preventing breast cancers in clinical practice. The complementary effect of multi-view mammogram images provides valuable information about the breast anatomical prior structure and is of great significance in digital mammography interpretation. However, unlike radiologists who can utilize reasoning ability to identify masses, how to endow existing models with capability of multi-view reasoning is vital in clinical diagnosis. In this paper, we propose an Anatomy-aware Graph convolutional Network (AGN), which is tailored for mammogram mass detection and endows existing methods with multi-view reasoning ability. The proposed AGN consists of three steps. Firstly, we introduce a Bipartite Graph convolutional Network (BGN) to model intrinsic geometric and semantic relations of ipsilateral views. Secondly, considering that visual asymmetry of bilateral views is widely adopted in clinical practice to assist the diagnosis of breast lesions, we propose an Inception Graph convolutional Network (IGN) to model structural similarities of bilateral views. Finally, based on the constructed graphs, the multi-view information is propagated through nodes methodically, which equips the learned features with multi-view reasoning ability. Experiments on two benchmarks reveal that AGN significantly exceeds the state-of-the-art performance. Visualization results show that AGN provides interpretable visual cues for clinical diagnosis.We present the first systematic study on concealed object detection (COD), which aims to identify objects that are ?perfectly? embedded in their background. The high intrinsic similarities between the concealed objects and their background make COD far more challenging than traditional object detection/segmentation. To better understand this task, we collect a large-scale dataset, called COD10K, which consists of 10,000 images covering concealed objects in diverse real-world scenarios from 78 object categories. Further, we provide rich annotations including object categories, object boundaries, challenging attributes, object-level labels, and instance-level annotations. Our COD10K enables comprehensive concealed object understanding and can even be used to help progress several other vision tasks, such as detection, segmentation, classification etc. We also design a simple but strong baseline for COD, termed the Search Identification Network (SINet). Without any bells and whistles, SINet outperform 12 cutting-edge baselines on all datasets tested, making them robust, general architectures that could serve as catalysts for future research in COD. Finally, we provide some interesting findings, and highlight several potential applications and future directions. To spark research in this new field, our code, dataset, and online demo are available at our project page http//mmcheng.net/cod.Visual dialog is a challenging task that requires the comprehension of the semantic dependencies among implicit visual and textual contexts. Crenolanib mw This task can refer to the relational inference in a graphical model with sparse contextual subjects (nodes) and unknown graph structure (relation descriptor); how to model the underlying context-aware relational inference is critical. To this end, we propose a novel Context-Aware Graph (CAG) neural network. We focus on the exploitation of fine-grained relational reasoning with object-level visual-historical co-reference nodes. The graph structure (relation in dialog) is iteratively updated using an adaptive top-K message passing mechanism. To eliminate sparse useless relations, each node has dynamic relations in the graph (different related K neighbor nodes), and only the most relevant nodes are attributive to the context-aware relational graph inference. In addition, to avoid negative performance caused by linguistic bias of history, we propose a pure visual-aware knowledge distillation mechanism named CAG-Distill, in which image-only visual clues are used to regularize the joint visual-historical contextual awareness. Experimental results on VisDial v0.9 and v1.0 datasets show that both CAG and CAG-Distill outperform comparative methods. Visualization results further validate the remarkable interpretability of our graph inference solution.Original k-means method using Lloyd algorithm partitions a data set by minimizing a sum of squares cost function to find local minima, which can be used for data analysis and machine learning that shows promising performance. However, Lloyd algorithm suffers from finding bad local minima. In this paper, we use coordinate descent (CD) method to solve the problem. First, we show that the k-means minimization problem can be reformulated as a trace maximization problem, a simple and very efficient coordinate descent scheme is proposed to solve this problem later. The effectiveness of our method is illustrated on several real-world data sets with varing number of clusters, varing number of samples and varing number of dimensionalty. Extensive experiments conducted show that CD performs better compared to Lloyd, i.e., lower objective value and better local minima. What's more, the results show that CD is more robust to initialization than Lloyd method whether the initialization strategy is random or k-means++. In addition, according to the computational complexity analysis, it is verified CD has the same time complexity with original k-means method.In this paper, we propose a novel deep Efficient Relational Sentence Ordering Network (referred to as ERSON) by leveraging pre-trained language model in both encoder and decoder architectures to strengthen the coherence modeling of the entire model. Specifically, we first introduce a divide-and-fuse BERT (referred to as DF-BERT), a new refactor of BERT network, where lower layers in the improved model encode each sentence in the paragraph independently, which are shared by different sentence pairs, and the higher layers learn the cross-attention between sentence pairs jointly. It enables us to capture the semantic concepts and contextual information between the sentences of the paragraph, while significantly reducing the runtime and memory consumption without sacrificing the model performance. Besides, a Relational Pointer Decoder (referred to as RPD) is developed, which utilizes the pre-trained Next Sentence Prediction (NSP) task of BERT to capture the useful relative ordering information between sentences to enhance the order predictions.

Autoři článku: Slothlau1886 (Hansson Nyholm)

Práce s článkem

Osobní nástroje

Navigace

Nástroje

Slothlau1886