Both quantitative and qualitative results show that the proposed framework outperforms state-of-the-art human pose and attribute transfer methods. Detailed ablation studies verify the effectiveness of each contribution, demonstrating the robustness and efficacy of the proposed framework.

Unsupervised domain adaptation aims to learn a classification model for a target domain without any labeled samples by transferring knowledge from a source domain with sufficient labeled samples. The source and target domains usually share the same label space but have different data distributions. In this paper, we consider a more difficult and insufficiently explored problem, termed few-shot domain adaptation, in which a classifier should generalize well to the target domain given only a small number of examples in the source domain. For this problem, we recast the link between the source and target samples through a mixup optimal transport model. The mixup mechanism is integrated into optimal transport to perform few-shot adaptation by learning the cross-domain alignment matrix and a domain-invariant classifier simultaneously, thereby augmenting the source distribution and aligning the two probability distributions. Moreover, spectral shrinkage regularization is deployed to improve the transferability and discriminability of the mixup optimal transport model by utilizing all singular eigenvectors. Experiments on several domain adaptation tasks demonstrate the effectiveness of the proposed model for few-shot domain adaptation compared with state-of-the-art methods.

Segmenting the portal vein (PV) and hepatic vein (HV) from magnetic resonance imaging (MRI) scans is important for hepatic tumor surgery. Compared with single-phase methods, multi-phase methods scale better at distinguishing HV from PV by exploiting multi-phase information.
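The mixup optimal transport idea in the few-shot domain adaptation abstract above pairs two standard ingredients: mixup augmentation of the scarce source samples and an optimal transport plan aligning the source and target distributions. A minimal sketch with toy data (the entropic Sinkhorn solver stands in for whatever OT formulation the authors actually use, and the joint classifier learning is omitted):

```python
import numpy as np

def mixup(x, y, alpha=0.2, rng=None):
    """Convex-combine random pairs of (sample, one-hot label) to augment data."""
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    idx = rng.permutation(len(x))
    return lam * x + (1 - lam) * x[idx], lam * y + (1 - lam) * y[idx]

def sinkhorn(a, b, cost, eps=0.5, iters=200):
    """Entropy-regularized optimal transport plan between distributions a, b."""
    K = np.exp(-cost / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Toy use: augment a tiny source set, then align it with the target set.
rng = np.random.default_rng(0)
xs = rng.normal(0.0, 1.0, (5, 3))            # 5 labeled source samples
ys = np.eye(2)[rng.integers(0, 2, 5)]        # one-hot labels
xt = rng.normal(1.0, 1.0, (6, 3))            # 6 unlabeled target samples
xs_aug, ys_aug = mixup(xs, ys)
cost = ((xs_aug[:, None, :] - xt[None, :, :]) ** 2).sum(-1)  # squared distances
plan = sinkhorn(np.full(5, 1 / 5), np.full(6, 1 / 6), cost)  # alignment matrix
```

The transport plan `plan` plays the role of the cross-domain alignment matrix: entry (i, j) says how much augmented source sample i corresponds to target sample j.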
However, these methods extract HV and PV only coarsely from the different phase images. In this paper, we propose a unified framework to automatically and robustly segment 3D HV and PV from multi-phase MR images, which considers both the intensity changes and the appearance caused by vascular flow to improve segmentation performance. First, inspired by change detection, flow-guided change detection (FGCD) is designed to detect the changed voxels related to hepatic venous flow by generating a hepatic venous phase map and clustering the map. FGCD handles HV and PV clustering uniformly through the proposed shared clustering, so that structures whose appearance is correlated with portal venous flow are delineated robustly without increasing framework complexity. Then, to refine the vascular segmentation results produced by both HV and PV clustering, interclass decision making (IDM) is proposed, combining overlapping-region discrimination with neighborhood direction consistency. Finally, our framework is evaluated on multi-phase clinical MR images from a public dataset (TCGA) and a local hospital dataset. Quantitative and qualitative evaluations show that our framework outperforms existing methods.

Segmentation of curvilinear structures is important in many applications, such as retinal blood vessel segmentation for early detection of vessel diseases and pavement crack segmentation for road condition evaluation and maintenance. Deep learning-based methods currently achieve impressive performance on these tasks. Yet most of them focus on designing powerful deep architectures while ignoring the inherent properties of curvilinear structures (e.g., that they are darker than their context), which would allow a more robust representation. As a consequence, performance often drops considerably in cross-dataset evaluation, which poses great challenges in practice. In this paper, we aim to improve generalizability by introducing a novel local intensity order transformation (LIOT).
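LIOT encodes the relative intensity order of each pixel against its neighbors rather than raw intensities. A reduced sketch of that idea (the published transformation compares each pixel with several neighbors per direction; here a single-neighbor binary variant is used purely to illustrate the contrast invariance):

```python
import numpy as np

def liot_sketch(img):
    """Simplified intensity-order encoding: one binary channel per direction,
    comparing each pixel with its immediate neighbor. The published LIOT uses
    several neighbors per direction; this is a reduced illustration."""
    shifts = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
    channels = []
    for dy, dx in shifts:
        neighbor = np.roll(img, shift=(dy, dx), axis=(0, 1))  # wraps at borders
        channels.append((img > neighbor).astype(np.uint8))
    return np.stack(channels, axis=-1)  # H x W x 4 contrast-invariant encoding

img = np.arange(16, dtype=float).reshape(4, 4)
out = liot_sketch(img)
# Any monotone contrast change (e.g. img * 2 + 5) yields the same encoding,
# since only the order of intensities matters, not their magnitudes.
assert np.array_equal(out, liot_sketch(img * 2 + 5))
```

The invariance check at the end is the key property: a vessel that is darker than its surroundings keeps the same four-channel code under global contrast shifts.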
Specifically, we transform a gray-scale image into a contrast-invariant four-channel image based on the intensity order between each pixel and its nearby pixels along four (horizontal and vertical) directions. This yields a representation that preserves the inherent characteristics of curvilinear structures while being robust to contrast changes. Cross-dataset evaluation on three retinal blood vessel segmentation datasets demonstrates that LIOT improves the generalizability of several state-of-the-art methods. Additionally, cross-dataset evaluation between retinal blood vessel segmentation and pavement crack segmentation shows that LIOT preserves the inherent characteristics of curvilinear structures across large appearance gaps. An implementation of the proposed method is available at https://github.com/TY-Shi/LIOT.

Image-based age estimation aims to predict a person's age from facial images and is used in a variety of real-world applications. Although end-to-end deep models have achieved impressive results for age estimation on benchmark datasets, their in-the-wild performance still leaves much room for improvement due to large variations in head pose, facial expression, and occlusion. To address this issue, we propose a simple yet effective method that explicitly incorporates facial semantics into age estimation, so that the model learns to focus on the most informative facial components in unaligned facial images regardless of head pose and non-rigid deformation. To this end, we design a face parsing-based network to learn semantic information at different scales, and a novel face parsing attention module to leverage these semantic features for age estimation. To evaluate our method on in-the-wild data, we also introduce a new challenging large-scale benchmark called IMDB-Clean, created by semi-automatically cleaning the noisy IMDB-WIKI dataset with a constrained clustering method.
Through comprehensive experiments on IMDB-Clean and other benchmark datasets, under both intra-dataset and cross-dataset evaluation protocols, we show that our method consistently outperforms existing age estimation methods and achieves new state-of-the-art performance. To the best of our knowledge, this work is the first attempt to leverage face parsing attention for semantic-aware age estimation, and it may inspire other high-level facial analysis tasks.

Multi-label image recognition has attracted considerable research attention and achieved great success in recent years. Capturing label correlations is an effective way to advance multi-label image recognition. Two types of label correlations have principally been studied, namely spatial and semantic correlations; however, previous methods considered only one of the two. In this work, inspired by the great success of the Transformer, we propose a plug-and-play module, named Spatial and Semantic Transformers (SST), to simultaneously capture spatial and semantic correlations in multi-label images. Our proposal mainly comprises two independent transformers that capture the spatial and semantic correlations, respectively. Specifically, the Spatial Transformer models correlations between features at different spatial positions, while the Semantic Transformer captures the co-occurrence of labels without manually defined rules. Beyond these methodological contributions, we also show that spatial and semantic correlations complement each other and deserve to be leveraged simultaneously in multi-label image recognition. Benefiting from the Transformer's ability to capture long-range correlations, our method markedly outperforms state-of-the-art methods on four popular multi-label benchmark datasets.
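The two-branch SST design described above can be sketched with plain self-attention: one branch attends over flattened spatial positions of a feature map, the other over label embeddings. A minimal numpy illustration (shapes, toy data, and the bare scaled dot-product attention are assumptions; the paper's module presumably also includes learned projections):

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product self-attention, the core of both branches."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

H, W, C, L = 4, 4, 8, 5          # toy feature map size, channels, label count
rng = np.random.default_rng(0)

# Spatial branch: H*W positions of a feature map attend to each other,
# modeling correlations between features at different spatial positions.
feat = rng.normal(size=(H * W, C))
spatial_out = attention(feat, feat, feat)       # (H*W, C)

# Semantic branch: label embeddings attend to each other, letting the model
# capture label co-occurrence without manually defined rules.
labels = rng.normal(size=(L, C))
semantic_out = attention(labels, labels, labels)  # (L, C)
```

Because the attention weights connect every position (or label) to every other, both branches capture the long-range correlations the abstract attributes to the Transformer.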
In addition, extensive ablation studies and visualizations validate the essential components of our method.

Convex probes are widely used in clinical abdominal imaging because they provide deep penetration and a wide field of view. Ultrafast imaging modalities, in which broader wavefronts such as plane waves and spherical waves are used for transmission, have been studied extensively in the ultrasound community. For a convex array, a spherical wavefront can be synthesized simply by firing all elements simultaneously; however, due to the lack of a transmit focus, the image quality is suboptimal. One solution is to adopt virtual sources behind the transducer and compound the corresponding images. In this work, we propose two novel Fourier-domain beamformers (vs1 and vs2) for nonsteered diverging wave imaging, together with an explicit interpolation scheme for virtual-source-based steered diverging wave imaging with a convex probe. The received echoes are first beamformed using the proposed beamformers and then interpolated along the range axis. A total of 31 virtual sources located on a circular line are used. The lateral resolution, contrast (C), and contrast-to-noise ratio (CNR) are evaluated in simulations, phantom experiments, ex vivo imaging of a bovine heart, and in vivo imaging of the liver. The results show that the two proposed Fourier-domain beamformers yield higher contrast than dynamic receive focusing (DRF) with better resolution. In vitro results demonstrate CNR improvements of 6.7 dB by vs1 and 5.9 dB by vs2. Ex vivo imaging of the bovine heart confirms CNR enhancements of 8.4 dB (vs1) and 8.3 dB (vs2). In vivo imaging of the human liver likewise shows CNR improvements of 6.7 and 5.5 dB by vs1 and vs2, respectively.
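The virtual-source transmission underlying the diverging-wave scheme above can be illustrated geometrically: a diverging wave is emulated by delaying each element in proportion to its distance from a virtual source placed behind the array. A toy sketch (element count, radius, pitch, sound speed, and virtual source position are all illustrative placeholders, not the paper's configuration):

```python
import math

def convex_element_positions(n=64, radius=0.04, pitch_rad=0.005):
    """(x, z) element centers on a convex arc centered at the origin,
    with the tissue at larger z. Geometry values are made up."""
    half = (n - 1) / 2
    return [(radius * math.sin((i - half) * pitch_rad),
             radius * math.cos((i - half) * pitch_rad)) for i in range(n)]

def diverging_wave_delays(elements, virtual_source, c=1540.0):
    """Transmit delays emulating a spherical wave diverging from a virtual
    source behind the array: elements farther from it fire later."""
    d = [math.hypot(x - virtual_source[0], z - virtual_source[1])
         for x, z in elements]
    d_min = min(d)
    return [(di - d_min) / c for di in d]  # seconds, zero at nearest element

elems = convex_element_positions()
# Virtual source between the center of curvature and the array surface.
delays = diverging_wave_delays(elems, virtual_source=(0.0, 0.02))
```

Steering the diverging wave, or using the paper's 31 virtual sources, amounts to repeating this delay computation for virtual sources spread along a circular line behind the probe and compounding the resulting images.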
The computation time of vs1 and vs2, depending on the number of image pixels, is 2-73 and 4-216 times shorter, respectively, than that of DRF.

Because of the significant acoustic impedance contrast at cortical boundaries, high internal attenuation, and an unknown sound velocity distribution, accurate ultrasound imaging of cortical bone remains challenging, especially for traditional pulse-echo modalities that assume a single sound velocity. Moreover, the large amount of data recorded by a multielement probe makes the reconstruction process relatively time-consuming. To overcome these limitations, this article proposes an index-rotated fast ultrasound imaging method based on a predicted velocity model (IR-FUI-VP) for ultrasound tomography (UST) of cortical cross sections, utilizing ray-tracing synthetic aperture (RTSA). With a ring probe, the sound velocity model is first predicted using bent-ray inversion (BRI). Index-rotated fast ultrasound imaging (IR-FUI) is then applied with the predicted velocity model to image the cortical cross sections in the sectors corresponding to the dynamic apertures (DAs) and the ring center, and the final result is obtained by merging all sector images. One cortical bone phantom and two ex vivo bovine femurs were used to demonstrate the performance of the proposed method. Compared with conventional synthetic aperture (SA) imaging, the method accurately images not only the outer cortical boundary but also the inner cortical surface. The mean relative errors of the predicted sound velocity in the region of interest (ROI) were all smaller than 7%, and the mean errors of cortical thickness were all less than 0.31 mm. The reconstructed images of the bovine femurs agreed well with reference micro-computed tomography (μCT) scans in terms of morphology and thickness. IR-FUI is about 3.73 times faster than traditional SA.
These results demonstrate that the proposed IR-FUI-VP-based UST is an effective approach to fast and accurate cortical bone imaging.
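Bent-ray inversion, which supplies the predicted velocity model above, rests on the fact that rays refract at the soft tissue/bone interface according to Snell's law, so travel times depend strongly on the assumed velocities. A toy flat-interface example via Fermat's principle (the geometry, the golden-section solver, and the velocities of roughly 1540 m/s for soft tissue and 3000 m/s for cortical bone are illustrative placeholders, not the article's ring-probe setup):

```python
import math

def travel_time(x_cross, src, rec, z_int, v1, v2):
    """Time along a ray that bends at (x_cross, z_int): Fermat's principle."""
    t1 = math.hypot(x_cross - src[0], z_int - src[1]) / v1  # leg in medium 1
    t2 = math.hypot(rec[0] - x_cross, rec[1] - z_int) / v2  # leg in medium 2
    return t1 + t2

def fastest_crossing(src, rec, z_int, v1, v2):
    """Golden-section search for the interface crossing point that minimizes
    travel time; at the optimum the bent ray obeys Snell's law."""
    lo, hi = min(src[0], rec[0]), max(src[0], rec[0])
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(200):
        m1 = hi - phi * (hi - lo)
        m2 = lo + phi * (hi - lo)
        if travel_time(m1, src, rec, z_int, v1, v2) < travel_time(m2, src, rec, z_int, v1, v2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

src, rec, z_int = (0.0, 0.01), (0.02, -0.005), 0.0  # meters, interface at z=0
x = fastest_crossing(src, rec, z_int, 1540.0, 3000.0)
t_bent = travel_time(x, src, rec, z_int, 1540.0, 3000.0)
```

Predicting the velocity model first, as IR-FUI-VP does, is what lets the subsequent ray-tracing synthetic aperture place such bent rays correctly instead of assuming straight paths at a single sound speed.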