Accurate and automated detection of anomalous samples in an image dataset can be accomplished with a probabilistic model. Such images have heterogeneous complexity, however, and a probabilistic model tends to overlook simply shaped objects with small anomalies. The reason is that a probabilistic model assigns undesirably low likelihoods to complexly shaped objects, even though they are consistent with the given standards. This difficulty is critical, especially for a defect-detection task, where the anomaly can be a small scratch or grime. To overcome this difficulty, we propose an unregularized score for deep generative models (DGMs). We found that the regularization terms of DGMs considerably influence the anomaly score depending on the complexity of the samples. By removing these terms, we obtain an unregularized score, which we evaluated on toy datasets, two in-house manufacturing datasets, and open manufacturing and medical datasets. The empirical results demonstrate that the unregularized score is robust to the apparent complexity of given samples and detects anomalies selectively.
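As a minimal sketch of the idea (not the paper's exact implementation): for a VAE-style DGM, the conventional anomaly score is the negative ELBO, which contains a KL regularization term that grows with the apparent complexity of a sample; the unregularized score keeps only the reconstruction term. The numbers below are made up purely to illustrate the effect.

```python
import numpy as np

def regularized_score(recon_loglik, kl):
    """Conventional DGM anomaly score: the negative ELBO.
    The KL regularization term grows with the apparent complexity
    of a sample, so complex-but-normal objects get penalized."""
    return -(recon_loglik - kl)

def unregularized_score(recon_loglik):
    """Unregularized score: drop the regularization term and keep
    only the reconstruction likelihood."""
    return -recon_loglik

# Made-up per-sample numbers purely for illustration:
# sample 0 = complex but normal, sample 1 = simple with a small defect.
recon_loglik = np.array([-100.0, -110.0])
kl           = np.array([  80.0,   10.0])

print(regularized_score(recon_loglik, kl))   # [180. 120.] -> normal sample flagged
print(unregularized_score(recon_loglik))     # [100. 110.] -> defect stands out
```

With these illustrative numbers, the regularized score ranks the complex-but-normal sample as the stronger anomaly, while the unregularized score lets the small defect stand out.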
Thanks to their low storage cost and high query speed, cross-view hashing (CVH) methods have been successfully used for similarity search in multimedia retrieval. However, most existing CVH methods use all views to learn a common Hamming space, making it difficult to handle data with an increasing number of views or a large number of views. To overcome these difficulties, we propose a decoupled CVH network (DCHN) approach that consists of a semantic hashing autoencoder module (SHAM) and multiple multiview hashing networks (MHNs). Specifically, SHAM adopts a hashing encoder and decoder to learn a discriminative Hamming space using either a few labels or the number of classes, that is, so-called flexible inputs. After that, each MHN independently projects all samples into the discriminative Hamming space, which is treated as an alternative ground truth. In brief, the Hamming space is learned from the semantic space induced by the flexible inputs and is then used to guide view-specific hashing in an independent fashion. Thanks to this independent/decoupled paradigm, our method enjoys high computational efficiency and can handle an increasing number of views using only a few labels or the number of classes. For a newly arriving view, we only need to add a view-specific network to our model, avoiding retraining the entire model on the new and previous views. Extensive experiments are carried out on five widely used multiview databases, comparing against 15 state-of-the-art approaches. The results show that the proposed independent hashing paradigm is superior to common joint paradigms while enjoying high efficiency and the capacity to handle newly arriving views.
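A minimal sketch of the decoupled paradigm, assuming the shared target codes B (the "alternative ground truth" produced by SHAM) are already available; the network shapes and the squared-error relaxation are illustrative choices of ours, not the paper's exact losses.

```python
import torch
import torch.nn as nn

def train_view_network(X, B, epochs=100, lr=1e-3):
    """Fit one view-specific hashing network so that its relaxed
    codes tanh(f(x)) match the shared target codes B in {-1, +1}.
    Because every view regresses onto the same fixed Hamming space,
    views can be trained independently -- the decoupled paradigm."""
    net = nn.Sequential(nn.Linear(X.shape[1], 256), nn.ReLU(),
                        nn.Linear(256, B.shape[1]))
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((torch.tanh(net(X)) - B) ** 2).mean()
        loss.backward()
        opt.step()
    return net

# Shared target codes from SHAM (random stand-ins here): n samples, b bits.
n, b = 512, 32
B = torch.randint(0, 2, (n, b)).float() * 2 - 1

# Each view gets its own network; adding a view later means training
# one more network against the same B, with no joint retraining.
views = {"image": torch.randn(n, 4096), "text": torch.randn(n, 300)}
nets = {name: train_view_network(X, B) for name, X in views.items()}
```

The point of the design is that no view's network ever sees another view's data: a newly arriving view only adds one more call to train_view_network.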
The least-squares support vector machine (LS-SVM) has been studied in depth in the machine-learning field and widely applied on a great many occasions. A disadvantage is that it is less efficient at dealing with non-Gaussian noise. In this article, a novel probabilistic LS-SVM is proposed to enhance modeling reliability even when the data are contaminated by non-Gaussian noise. The stochastic effect of noise on the kernel function and the regularization parameter is first analyzed and estimated. On this basis, a new objective function is constructed in a probabilistic sense. A probabilistic inference method is then developed to construct the distribution of the model parameters, including distribution estimates of both the kernel function and the regularization parameter from data. Using this distribution information, a solving strategy is then developed for the new objective function. Unlike the original LS-SVM, which obtains the model through a deterministic approach, the proposed method builds the distributional relation between the model and the noise and exploits this information during modeling; it is thus more robust when modeling noisy data. The effectiveness of the proposed probabilistic LS-SVM is demonstrated on both artificial and real cases.

The large data volume and high algorithmic complexity of hyperspectral image (HSI) problems pose big challenges for the efficient classification of massive HSI data repositories. Recently, cloud computing architectures have become more relevant for addressing the big computational challenges in the HSI field. This article proposes an acceleration method for HSI classification that relies on scheduling metaheuristics to automatically and optimally distribute the workload of HSI applications across multiple computing resources on a cloud platform. By analyzing the procedure of a representative classification method, we first develop its distributed and parallel implementation based on the MapReduce mechanism on Apache Spark. The subtasks of the processing flow that can be processed in a distributed way are identified as divisible tasks. The optimal execution of this application on Spark is further formulated as a divisible scheduling framework that accounts for both task execution precedence and task divisibility when allocating the divisible and indivisible subtasks to computing nodes. The formulated scheduling framework is an optimization procedure that searches for optimized task assignments and partition counts for the divisible tasks. Two metaheuristic algorithms are developed to solve this divisible scheduling problem. The scheduling results provide an optimized solution for the automatic processing of HSI big data on clouds, improving the computational efficiency of HSI classification by exploiting the parallelism in the processing flow. Experimental results demonstrate that our scheduling-guided approach achieves remarkable speedups by facilitating the automatic processing of HSI classification on Spark and scales to increasing HSI data volumes.
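Recalling the probabilistic LS-SVM paragraph above: the standard deterministic LS-SVM objective that the probabilistic variant builds on is the well-known formulation

$$
\min_{w,\,b,\,e}\ \tfrac{1}{2}\,\lVert w\rVert^{2} + \tfrac{\gamma}{2}\sum_{i=1}^{N} e_i^{2}
\quad \text{s.t.} \quad y_i = w^{\top}\varphi(x_i) + b + e_i,\quad i = 1,\dots,N,
$$

where $\varphi$ is the feature map induced by the kernel and $\gamma$ is the regularization parameter. The proposed method, as described above, treats the kernel and $\gamma$ as stochastic quantities whose distributions are estimated from the data rather than as fixed deterministic settings.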
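As a minimal sketch of the MapReduce-style parallelization step on Spark (not the paper's full scheduling framework): per-pixel classification is expressed as a map over a partitioned RDD, and the partition count plays the role of the partition counts that the scheduling metaheuristics optimize for divisible tasks. All names below are illustrative.

```python
from pyspark import SparkContext
import numpy as np

sc = SparkContext(appName="hsi-classification-sketch")

# Toy stand-in for an HSI cube: n pixels x d spectral bands.
pixels = np.random.rand(10000, 200)

def classify(pixel):
    """Stand-in classifier: in the real pipeline this would be the
    trained per-pixel model shipped to the workers."""
    return int(pixel.argmax() % 16)

# num_partitions stands in for the partition count that a scheduling
# metaheuristic would choose for this divisible task, rather than
# being fixed by hand as it is here.
num_partitions = 64
labels = (sc.parallelize(pixels.tolist(), num_partitions)
            .map(lambda p: classify(np.array(p)))
            .collect())
sc.stop()
```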
A growing number of clinical studies have provided substantial evidence of a close relationship between microbes and disease. It is therefore necessary to infer potential microbe-disease associations, but traditional approaches validate these associations experimentally, which often consumes considerable material and time. Hence, more reliable computational methods are needed to predict disease-associated microbes. In this article, an innovative means of predicting microbe-disease associations is proposed, based on network consistency projection and label propagation (NCPLP). Whereas most existing algorithms use the Gaussian interaction profile (GIP) kernel similarity as the similarity criterion between microbe pairs and between disease pairs, this model uses Medical Subject Headings descriptors to calculate disease semantic similarity. In addition, 16S rRNA gene sequences are used to calculate microbe functional similarity. Based on this gene sequence information, we use two conventional tools (BLAST+ and MEGA7) to assess the similarity between each pair of microbes from different perspectives.
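For contrast with the MeSH- and sequence-based similarities described above, here is a minimal sketch of the GIP kernel similarity that most existing algorithms use, computed from a binary microbe-disease association matrix; the bandwidth convention follows the usual GIP definition, and the variable names are ours.

```python
import numpy as np

def gip_kernel(A):
    """Gaussian interaction profile (GIP) kernel over the rows of a
    binary association matrix A (rows: microbes, columns: diseases).
    K[i, j] = exp(-gamma * ||A[i] - A[j]||^2), with gamma normalized
    by the average squared row norm, as in the usual GIP definition."""
    sq_norms = (A ** 2).sum(axis=1)
    gamma = 1.0 / sq_norms.mean()
    # Squared Euclidean distances between all row pairs.
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2 * A @ A.T
    return np.exp(-gamma * np.clip(d2, 0, None))

# Toy association matrix: 4 microbes x 3 diseases.
A = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 0]], dtype=float)
K_microbe = gip_kernel(A)       # microbe-microbe GIP similarity
K_disease = gip_kernel(A.T)     # disease-disease GIP similarity
```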