Therkildsenjonsson2165


Additionally, we propose an active learning scheme based on visual memory, which learns to recognize open classes in a data-efficient manner for future expansions. On three large-scale open long-tailed datasets we curated from ImageNet (object-centric), Places (scene-centric), and MS1M (face-centric) data, as well as three standard benchmarks (CIFAR-10-LT, CIFAR-100-LT, and iNaturalist-18), our approach, as a unified framework, consistently demonstrates competitive performance. Notably, our approach also shows strong potential for the active exploration of open classes and the fairness analysis of minority groups.

In this paper, we propose several efficient multi-view stereo methods for accurate and complete depth map estimation. We first present our basic methods with Adaptive Checkerboard sampling and Multi-Hypothesis joint view selection (ACMH & ACMH+). Building on these basic models, we develop two frameworks to handle the depth estimation of ambiguous regions (especially low-textured areas) from two different perspectives: multi-scale information fusion and planar geometric clue assistance. For the former, we propose a multi-scale geometric consistency guidance framework (ACMM) to obtain reliable depth estimates for low-textured areas at coarser scales and to guarantee that they can be propagated to finer scales. For the latter, we propose a planar prior assisted framework (ACMP), which utilizes a probabilistic graphical model to derive a novel multi-view aggregated matching cost. Finally, by taking advantage of the above frameworks, we further design a multi-scale geometric consistency guided and planar prior assisted multi-view stereo method (ACMMP). This greatly enhances the discrimination of ambiguous regions and improves their depth estimation. Experiments on extensive datasets show that our methods achieve state-of-the-art performance, recovering accurate depth not only in low-textured areas but also in fine details. Related code is available at https://github.com/GhiXu.

Semi-supervised learning is the learning setting in which we have both labeled and unlabeled data at our disposal. This survey covers theoretical results for this setting and maps out the benefits of unlabeled data in classification and regression tasks. Most methods that use unlabeled data rely on certain assumptions about the data distribution; when those assumptions are not met, including unlabeled data may actually decrease performance. For all practical purposes, it is therefore instructive to understand the underlying theory and the possible learning behavior that comes with it. This survey gathers results about the gains one can achieve when using semi-supervised learning, as well as results about the limits of such methods. Specifically, it aims to answer the following questions: What are, in terms of improving supervised methods, the limits of semi-supervised learning? What are the assumptions of different methods? What can we achieve if the assumptions are true? Since the precise assumptions made are of the essence, they receive the survey's particular attention.
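The survey above concerns theory rather than any single algorithm, but a concrete instance of a semi-supervised method makes the role of its assumptions tangible. Below is a minimal self-training sketch in Python (not taken from the survey; the confidence threshold and the choice of classifier are illustrative assumptions): unlabeled points the current model labels confidently are pseudo-labeled and added to the training set.

```python
# Minimal self-training loop: pseudo-label confident unlabeled points, retrain.
# Illustrative only; threshold and classifier are assumptions, not from the survey.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_training(X_lab, y_lab, X_unlab, threshold=0.95, max_rounds=10):
    X, y = np.asarray(X_lab), np.asarray(y_lab)
    remaining = np.asarray(X_unlab)
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    for _ in range(max_rounds):
        if len(remaining) == 0:
            break
        proba = clf.predict_proba(remaining)
        confident = proba.max(axis=1) >= threshold   # trust only confident predictions
        if not confident.any():
            break
        pseudo = clf.classes_[proba[confident].argmax(axis=1)]
        X = np.vstack([X, remaining[confident]])     # grow the training set
        y = np.concatenate([y, pseudo])
        remaining = remaining[~confident]
        clf = LogisticRegression(max_iter=1000).fit(X, y)
    return clf
```

If the underlying assumptions (e.g., that confident predictions are actually correct for the true data distribution) fail, the pseudo-labels reinforce errors, which is exactly the kind of performance degradation the survey warns about.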
Existing solutions to instance-level visual identification usually aim to learn faithful and discriminative feature extractors from offline training data and directly use them on unseen online testing data. However, their performance is largely limited by the severe distribution shift between training and testing samples. Therefore, we propose a novel online group-metric adaptation model that adapts the offline-learned identification models to the online data by learning a series of metrics for all sharing-subsets. Each sharing-subset is obtained from the proposed frequent sharing-subset mining module and contains a group of testing samples that share strong visual similarity with each other. Furthermore, to handle potentially large-scale testing samples, we introduce self-paced learning (SPL) to gradually include samples in the adaptation, from easy to difficult, mimicking the learning principle of humans. Unlike existing online visual identification methods, our model simultaneously takes into consideration both the sample-specific discriminant information and the set-based visual similarity among testing samples. Our method is applicable to any off-the-shelf offline-learned visual identification baseline for online performance improvement, as verified by extensive experiments on several widely used visual identification benchmarks.

How should we integrate representations from complementary sensors for autonomous driving? Geometry-based fusion has shown promise for perception (e.g., object detection, motion forecasting). However, in the context of end-to-end driving, we find that imitation learning based on existing sensor fusion methods underperforms in complex driving scenarios with a high density of dynamic agents. Therefore, we propose TransFuser, a mechanism to integrate image and LiDAR representations using self-attention. Our approach uses transformer modules at multiple resolutions to fuse perspective-view and bird's-eye-view feature maps. We experimentally validate its efficacy on a challenging new benchmark with long routes and dense traffic, as well as on the official leaderboard of the CARLA urban driving simulator. At the time of submission, TransFuser outperforms all prior work on the CARLA leaderboard in terms of driving score by a large margin. Compared to geometry-based fusion, TransFuser reduces the average collisions per kilometer by 48%.

The performance of deep networks for medical image analysis is often constrained by limited medical data, which is privacy-sensitive. Federated learning (FL) alleviates this constraint by allowing different institutions to collaboratively train a federated model without sharing data. However, the federated model is often suboptimal with respect to the characteristics of each client's local data. Instead of training a single global model, we propose Customized FL (CusFL), in which each client iteratively trains a client-specific/private model based on a federated global model aggregated from all private models trained in the immediately preceding iteration. Two overarching strategies employed by CusFL lead to its superior performance: 1) the federated model is mainly used for feature alignment and thus consists only of feature-extraction layers; 2) the federated feature extractor is used to guide the training of each private model. In this way, CusFL allows each client to selectively learn useful knowledge from the federated model to improve its personalized model. We evaluated CusFL on multi-source medical image datasets for the identification of clinically significant prostate cancer and the classification of skin lesions.
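As a rough illustration of how the two CusFL strategies could fit together, the following PyTorch sketch averages only the feature-extraction layers across clients and uses the resulting federated extractor to guide each private model. It is not the authors' implementation; the `.extractor`/`.classifier` split, the MSE feature-alignment term, and the weighting scheme are assumptions made for illustration.

```python
# Sketch of the two CusFL strategies: (1) federate only the feature extractor,
# (2) let the federated extractor guide each private model via feature alignment.
# The .extractor/.classifier split, the MSE term, and `lam` are assumptions.
import copy
import torch
import torch.nn.functional as F

def aggregate_extractors(private_models, client_weights):
    """Weighted average over the feature-extraction layers of all private models."""
    fed_extractor = copy.deepcopy(private_models[0].extractor)
    fed_state = fed_extractor.state_dict()
    for key in fed_state:
        fed_state[key] = sum(w * m.extractor.state_dict()[key].float()
                             for w, m in zip(client_weights, private_models))
    fed_extractor.load_state_dict(fed_state)
    return fed_extractor

def private_update(model, fed_extractor, x, y, optimizer, lam=0.1):
    """One private-model step: task loss plus alignment to the federated features."""
    feats = model.extractor(x)
    logits = model.classifier(feats)
    with torch.no_grad():
        fed_feats = fed_extractor(x)            # guidance from the federated model
    loss = F.cross_entropy(logits, y) + lam * F.mse_loss(feats, fed_feats)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The alignment weight `lam` controls how strongly each client follows the federated representation versus fitting its own local data, which is the personalization trade-off the abstract describes.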
Lung cancer has the highest mortality rate among all malignancies. Non-micro pulmonary nodules are the primary manifestation of early-stage lung cancer. If nodules can be detected at an early stage and patients receive timely treatment, their survival rate can be improved. Due to the large number of patients and limited medical resources, doctors take longer to make a diagnosis, which reduces efficiency and accuracy. Moreover, suitable approaches for developing countries are lacking. Therefore, we propose a 2.5D-based cascaded multi-stage framework for automatic detection and segmentation (DS-CMSF) of pulmonary nodules. The first three stages of the framework discover lesions, and the last stage segments them. The first, locating stage introduces the classical 2D Yolov5 model to roughly locate nodules on axial slices. The second, aggregation stage proposes a candidate nodule selection (CNS) algorithm to further localize nodules and reduce redundant candidates. The third, classification stage uses a multi-size 3D fusion model to accommodate nodules of varying sizes and shapes for false-positive reduction. The last, segmentation stage introduces multi-scale and attention modules into a 3D UNet autoencoder to finely segment the nodular regions. Our proposed framework achieves 95.95% sensitivity and 89.50% CPM for nodule detection on the LUNA16 dataset, and 86.75% DSC for nodule segmentation on the LIDC-IDRI dataset. Moreover, our approach achieves a favorable accuracy-complexity trade-off, which can effectively support the auxiliary diagnosis of pulmonary nodules in developing countries.

There is increasing interest in applications of 3D ultrasound imaging of the pelvic floor to improve the diagnosis, treatment, and surgical planning of female pelvic floor dysfunction (PFD). Pelvic floor biometrics are obtained on an oblique image plane known as the plane of minimal hiatal dimensions (PMHD). Identifying this plane requires the detection of two anatomical landmarks, the pubic symphysis and the anorectal angle. The manual detection of these anatomical landmarks and the PMHD in 3D pelvic ultrasound requires expert knowledge of pelvic floor anatomy and is challenging, time-consuming, and subject to human error. These challenges have hindered the adoption of such quantitative analysis in the clinic. This work presents an automatic approach to identify the anatomical landmarks and extract the PMHD from 3D pelvic ultrasound volumes. To demonstrate clinical utility and a complete automated clinical task, an automatic segmentation of the levator ani muscle on the extracted PMHD images was also performed. Experiments using 73 test images of patients during a pelvic-muscle resting state showed that the algorithm can accurately identify the PMHD, with an average Dice of 0.89 and an average mean boundary distance of 2.25 mm. Further evaluation of the PMHD detection algorithm using 35 images of patients performing pelvic-muscle contraction resulted in an average Dice of 0.88 and an average mean boundary distance of 2.75 mm. This work has the potential to pave the way towards the adoption of ultrasound in the clinic and the development of personalized treatment for PFD.
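For intuition on what extracting such an oblique plane involves geometrically, the sketch below resamples a slice through a 3D volume given the two landmark points. It is purely illustrative and not the paper's detection algorithm: the choice of the second in-plane axis and the sampling spacing are simplifying assumptions, since the true PMHD orientation is defined anatomically rather than by the two landmarks alone.

```python
# Resample an oblique slice through the two landmarks from a 3D volume.
# Illustrative only; the second in-plane axis and spacing are assumptions.
import numpy as np
from scipy.ndimage import map_coordinates

def extract_plane(volume, pubic_symphysis, anorectal_angle, size=128, spacing=1.0):
    p1 = np.asarray(pubic_symphysis, dtype=float)    # landmark voxel coordinates
    p2 = np.asarray(anorectal_angle, dtype=float)
    center = (p1 + p2) / 2.0
    u = p2 - p1
    u /= np.linalg.norm(u)                           # in-plane axis through the landmarks
    ref = np.array([0.0, 0.0, 1.0])
    if abs(np.dot(u, ref)) > 0.9:                    # avoid a degenerate cross product
        ref = np.array([0.0, 1.0, 0.0])
    v = np.cross(u, ref)
    v /= np.linalg.norm(v)                           # assumed second in-plane axis
    grid = (np.arange(size) - size / 2.0) * spacing
    gu, gv = np.meshgrid(grid, grid, indexing="ij")
    coords = (center[:, None, None]
              + u[:, None, None] * gu
              + v[:, None, None] * gv)               # (3, size, size) voxel coordinates
    return map_coordinates(volume, coords, order=1, mode="nearest")
```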
Recent research has shown the great potential of deep learning algorithms for the hyperspectral image (HSI) classification task. Nevertheless, training these models usually requires a large amount of labeled data. Since the collection of pixel-level annotations for HSI is laborious and time-consuming, developing algorithms that can yield good performance in small-sample-size situations is of great significance. In this study, we propose a robust self-ensembling network (RSEN) to address this problem. The proposed RSEN consists of two subnetworks: a base network and an ensemble network. With the constraint of both the supervised loss from the labeled data and the unsupervised loss from the unlabeled data, the base network and the ensemble network can learn from each other, realizing the self-ensembling mechanism. To the best of our knowledge, the proposed method is the first attempt to introduce the self-ensembling technique into the HSI classification task, providing a different view on how to utilize the unlabeled data in HSI to assist network training. We further propose a novel consistency filter to increase the robustness of self-ensembling learning. Extensive experiments on three benchmark HSI datasets demonstrate that the proposed algorithm yields competitive performance compared with state-of-the-art methods.
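The abstract does not give implementation details of RSEN, but the general self-ensembling recipe it refers to can be sketched as follows: a generic mean-teacher-style training step in PyTorch, not the authors' code. The EMA update rule, the MSE consistency term, and the confidence threshold standing in for the paper's consistency filter are all assumptions made for illustration.

```python
# Generic self-ensembling (mean-teacher style) training step: supervised loss on
# labeled data plus a consistency loss between base and ensemble networks on
# unlabeled data. EMA update, MSE term, and confidence threshold are assumptions.
import torch
import torch.nn.functional as F

def self_ensembling_step(base_net, ens_net, x_lab, y_lab, x_unlab,
                         optimizer, lam=1.0, ema=0.99, conf_thresh=0.8):
    # supervised loss on the few labeled samples
    sup_loss = F.cross_entropy(base_net(x_lab), y_lab)

    # consistency loss on unlabeled samples, keeping only confident targets
    with torch.no_grad():
        ens_prob = F.softmax(ens_net(x_unlab), dim=1)
        keep = ens_prob.max(dim=1).values > conf_thresh   # crude stand-in for a consistency filter
    base_prob = F.softmax(base_net(x_unlab), dim=1)
    if keep.any():
        cons_loss = F.mse_loss(base_prob[keep], ens_prob[keep])
    else:
        cons_loss = base_prob.new_zeros(())

    loss = sup_loss + lam * cons_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # the ensemble network tracks the base network as an exponential moving average
    with torch.no_grad():
        for p_e, p_b in zip(ens_net.parameters(), base_net.parameters()):
            p_e.mul_(ema).add_(p_b, alpha=1.0 - ema)
    return loss.item()
```

The unlabeled consistency term is what lets the two subnetworks learn from each other when labeled pixels are scarce, which is the mechanism the abstract highlights.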

Article authors: Therkildsenjonsson2165 (Leach Arildsen)