The regression relationship between an input test image and the entire training image collection in the target domain is inferred via a deep domain translation framework, in which a domain-wise adaptation term and a local consistency adaptation term are developed. To improve the robustness of the style transfer process, we propose a multiview domain translation method that flexibly combines a convolutional neural network representation with hand-crafted features in an optimal way. Qualitative and quantitative comparisons are provided under unconstrained conditions in which no training images from the source domain are available, demonstrating the effectiveness and superiority of our method for universal face photo-sketch style transfer.

Spectral clustering is a popular tool in many unsupervised computer vision and machine learning tasks. Recently, owing to the encouraging performance of deep neural networks, many conventional spectral clustering methods have been extended to deep frameworks. Although these deep spectral clustering methods are powerful and effective, learning the number of clusters from the data remains a challenge. In this paper, we tackle this problem by integrating spectral clustering, a generative adversarial network, and a low-rank model within a unified Bayesian framework. First, we adapt the low-rank method to the cluster number estimation problem. Then, an adversarial-learning-based deep clustering method is proposed and incorporated. When introducing spectral clustering into the clustering procedure of our model, a hidden-space structure preservation term is proposed. Via the Bayesian framework, the structure preservation term is embedded into the generative process, from which a spectral clustering step can be deduced in the optimization procedure. Finally, we derive a variational-inference-based method and embed it into the network optimization and learning procedure. Experiments on different datasets confirm that our model can estimate the cluster number and show that it outperforms many similar graph clustering methods.

If an object is photographed in motion in front of a static background, the object will be blurred while the background remains sharp and is partially occluded by the object. The goal is to recover the object appearance from such a blurred image. We adopt the image formation model for fast moving objects and consider objects undergoing 2D translation and rotation. For this scenario, we formulate the estimation of the object shape, appearance, and motion from a single image and a known background as a constrained optimization problem with appropriate regularization terms. Both similarities to and differences from blind deconvolution are discussed, the differences caused mainly by the coupling of the object appearance and shape in the acquisition model. Necessary conditions for solution uniqueness are derived, and a numerical solution based on the alternating direction method of multipliers is presented. The proposed method is evaluated on a new dataset.
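The acquisition model referred to above can be illustrated with a short sketch. Assuming the common fast-moving-object formation model, in which the observed image is the blurred object appearance plus the background attenuated by the blurred object mask, a simplified rendering function might look as follows. The variable names, the I = H*F + (1 - H*M)·B form, and the FFT-based convolution are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def fft_convolve(kernel, image):
    """Circular 2-D convolution via FFT (illustrative only; real code would pad)."""
    K = np.fft.rfft2(kernel, s=image.shape[:2], axes=(0, 1))
    if image.ndim == 3:  # convolve each color channel with the same kernel
        K = K[..., None]
    return np.fft.irfft2(np.fft.rfft2(image, axes=(0, 1)) * K,
                         s=image.shape[:2], axes=(0, 1))


def render_blurred_object(F, M, B, H):
    """Hypothetical fast-moving-object formation model:

        I = H * F + (1 - H * M) . B

    F : object appearance (HxWx3), assumed zero outside the object mask
    M : binary object mask (HxW)
    B : known sharp background (HxWx3)
    H : motion blur kernel (HxW), assumed normalized to sum to 1
    """
    blurred_obj = fft_convolve(H, F)    # blurred object appearance
    blurred_mask = fft_convolve(H, M)   # fraction of exposure time the object covers each pixel
    return blurred_obj + (1.0 - blurred_mask)[..., None] * B
```

The coupling of appearance F and shape M in this forward model is what distinguishes the problem from standard blind deconvolution, as noted in the paragraph above.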
Most current action localization methods follow an anchor-based pipeline: depicting action instances by pre-defined anchors, learning to select the anchors closest to the ground truth, and predicting the confidence of the anchors with refinements. Pre-defined anchors impose a prior on the location and duration of action instances, which facilitates the localization of common action instances but limits the flexibility to handle action instances with widely varying durations, especially extremely short or extremely long ones. To address this problem, this paper proposes a novel anchor-free action localization module that assists action localization with temporal points. Specifically, this module represents an action instance as a point together with its distances to the starting and ending boundaries, alleviating the pre-defined anchor restrictions on location and duration. The proposed anchor-free module is capable of predicting action instances whose duration is either extremely short or extremely long. By combining the proposed anchor-free module with a conventional anchor-based module, we propose a novel action localization framework, called A2Net. The cooperation between the anchor-free and anchor-based modules achieves performance superior to the state of the art on THUMOS14 (45.5% vs. 42.8%). Furthermore, comprehensive experiments demonstrate the complementarity between the anchor-free and anchor-based modules, making A2Net simple but effective.

Deep neural networks (DNNs) have been extensively applied in image processing, including visual saliency map prediction. A major difficulty in using a DNN for visual saliency prediction is the lack of labeled ground-truth saliency data. A powerful DNN usually contains a large number of trainable parameters, a condition that can easily lead to model overfitting. In this study, we develop a novel method that overcomes this difficulty by embedding hierarchical knowledge of existing visual saliency models in a DNN. We exploit the knowledge contained in existing visual saliency models by using saliency maps generated by local, global, and semantic models to tune and fix about 92.5% of the parameters in our network in a hierarchical manner. As a result, the number of trainable parameters that must be tuned with the ground truth is considerably reduced. This reduction enables us to fully utilize the power of a large DNN while avoiding overfitting. Furthermore, we introduce a simple but very effective center prior in designing the learning cost function of the DNN by attaching high importance to the errors around the image center. We also present extensive experimental results on four commonly used public databases to demonstrate the superiority of the proposed method over classical and state-of-the-art methods on various evaluation metrics.
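The center prior mentioned above can be expressed as a spatial weighting of the cost function. The minimal sketch below shows one possible realization: a Gaussian weight map peaking at the image center combined with a weighted mean-squared error. The Gaussian form, the bandwidth sigma, and the MSE loss are assumptions for illustration; the abstract does not specify the exact cost function.

```python
import numpy as np


def center_prior_weights(height, width, sigma=0.3):
    """Gaussian weight map peaking at the image center.

    sigma is a hypothetical bandwidth, expressed as a fraction of the
    image diagonal."""
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    d2 = ((ys - cy) ** 2 + (xs - cx) ** 2) / (sigma * np.hypot(height, width)) ** 2
    return np.exp(-0.5 * d2)


def center_weighted_mse(pred, target):
    """Mean-squared error that penalizes errors near the image center more
    heavily -- one possible form of a center-prior cost function."""
    w = center_prior_weights(*pred.shape)
    return np.sum(w * (pred - target) ** 2) / np.sum(w)
```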
Recent progress in vision-based fire detection is driven by convolutional neural networks. However, existing methods fail to achieve a good tradeoff among accuracy, model size, and speed. In this paper, we propose an accurate fire detection method that achieves a better balance among these aspects. Specifically, a multiscale feature extraction mechanism is employed to capture richer spatial details, which enhances the ability to discriminate fire-like objects. Then, an implicit deep supervision mechanism is utilized to enhance the interaction among information flows through dense skip connections. Finally, a channel attention mechanism is employed to selectively emphasize the contributions of different feature maps. Experimental results demonstrate that our method achieves 95.3% accuracy, outperforming the next-best method by 2.5%. Moreover, our method is 3.76% faster on the GPU and has a 63.64% smaller model size than the next-best method.

The goal of our work is to discover dominant objects in a very general setting where only a single unlabeled image is given. This is far more challenging than typical co-localization or weakly supervised localization tasks. To tackle this problem, we propose a simple but effective pattern-mining-based method, called Object Location Mining (OLM), which exploits the advantages of data mining and the feature representations of pretrained convolutional neural networks (CNNs). Specifically, we first convert the feature maps from a pretrained CNN model into a set of transactions and then discover frequent patterns in the transaction database through pattern mining techniques. We observe that the discovered patterns, i.e., co-occurring highlighted regions, typically exhibit appearance and spatial consistency. Motivated by this observation, we can easily discover and localize possible objects by merging relevant meaningful patterns. Extensive experiments on a variety of benchmarks demonstrate that OLM achieves localization performance competitive with state-of-the-art methods. We also compare our approach with unsupervised saliency detection methods and achieve competitive results on seven benchmark datasets. Moreover, we conduct experiments on fine-grained classification to show that the proposed method can locate the entire object and its parts accurately, which significantly benefits the classification results.

The recent development of high-frame-rate (HFR) imaging/Doppler methods based on the transmission of plane or diverging waves has posed new challenges to echographic data management and display. Owing to the huge amount of data that must be processed at very high speed, the pulse repetition frequency (PRF) is typically limited to a few hundred Hz or a few kHz. In Doppler applications, a PRF limitation may be unacceptable, since it inherently translates to a corresponding limitation of the maximum detectable velocity. In this paper, the ULA-OP 256 implementation of a novel ultrasound modality, called virtual real-time (VRT), is described. First, for a given HFR real-time modality, the scanner displays the processed results while saving channel data into an internal buffer. Then, ULA-OP 256 switches to VRT mode, in which the raw data stored in the buffer are immediately re-processed by the same hardware used in real time. In the two phases, the computing power of ULA-OP 256 can be distributed differently, favoring either the acquisition frame rate or the quality of the processed results. Here, VRT was used to extend the PRF limit in a multi-line vector Doppler application. In real time, the PRF was maximized at the expense of display quality; in VRT, the data were reprocessed at a lower rate in a high-quality display format that provides more detailed flow information. Experiments are reported in which the multi-line vector Doppler technique is shown to work at a 16 kHz PRF, so that flow jet velocities of up to 3 m/s can be detected.
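The link between PRF and the maximum detectable velocity mentioned above follows from the Doppler sampling (Nyquist) limit: the maximum unambiguous axial velocity is v_max = c·PRF/(4·f0), so raising the PRF directly raises the velocity ceiling. Vector Doppler combines several view angles and so has a different effective limit, but it scales with PRF in the same way. The sketch below only illustrates this standard relation; the 4 MHz transmit frequency is a made-up example value, not a parameter reported for the ULA-OP 256 experiments.

```python
def max_unambiguous_velocity(prf_hz, f0_hz, c=1540.0):
    """Aliasing-free axial velocity limit for pulsed-wave Doppler:

        v_max = c * PRF / (4 * f0)

    c is the speed of sound in soft tissue (m/s)."""
    return c * prf_hz / (4.0 * f0_hz)


# Illustration with a hypothetical 4 MHz transmit frequency:
# doubling the PRF from 8 kHz to 16 kHz doubles the detectable axial velocity.
for prf in (8e3, 16e3):
    print(f"PRF = {prf / 1e3:.0f} kHz -> v_max = {max_unambiguous_velocity(prf, 4e6):.2f} m/s")
```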
In an adhesively bonded structure, using the adhesive itself to monitor joint integrity can reduce labor, time, and potential human error while avoiding the problems associated with introducing a foreign sensor component. This work started with an examination of the effective piezoelectricity of commercial structural adhesives/sealants; five of them were found to possess an effective piezoelectric property, with effective piezoelectric coefficients d33 ranging from -0.11 to -1.77 pm/V, depending on frequency, under substrate-clamping conditions. With a piezoelectric response that remains stable at least up to the MHz range, an epoxy adhesive with an inorganic filler was selected to demonstrate SHM feasibility by generating and sensing guided ultrasonic Lamb waves. The presence of a disbond in the adhesive joint is detectable by comparing the Lamb wave signal with a reference baseline signal associated with the intact structure. The results show that the selected adhesive, with its piezoelectric response, can perform the dual roles of structural bonding and ultrasonic joint-integrity monitoring.

Ultrasonography and photoacoustic tomography provide complementary contrasts in preclinical studies, disease diagnosis, and imaging-guided interventional procedures. Here, we present a video-rate (20 Hz) dual-modality ultrasound and photoacoustic tomographic platform with high resolution, rich contrast, deep penetration, and a wide field of view. A three-quarter ring-array ultrasonic transducer is used for both ultrasound and photoacoustic imaging. A plane-wave transmission/reception approach is used for ultrasound imaging, which improves the imaging speed nearly twofold and reduces the RF data size compared with the sequential single-channel scanning approach. GPU-based image reconstruction is developed to accelerate computation. We demonstrate fast dual-modality imaging in phantom, mouse, and human finger-joint experiments. The results show respiration motion, the heartbeat, and detailed features of the internal organs of the mouse. To our knowledge, this is the first report of fast plane-wave ultrasound imaging and single-shot photoacoustic computed tomography in a ring-array system.
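The abstract does not specify the GPU reconstruction algorithm; a common choice for ring-array photoacoustic tomography is delay-and-sum back-projection. The following minimal NumPy sketch shows the basic idea of projecting each channel's RF samples back onto an image grid. The element geometry, sampling parameters, and the plain delay-and-sum formulation are assumptions for illustration, not the authors' pipeline.

```python
import numpy as np


def das_backproject(rf, elem_xy, grid_xy, fs, c=1540.0, t0=0.0):
    """Naive delay-and-sum back-projection for photoacoustic tomography.

    rf      : (n_elements, n_samples) received RF data
    elem_xy : (n_elements, 2) transducer element positions in metres
    grid_xy : (n_pixels, 2) image pixel positions in metres
    fs      : sampling rate in Hz; t0 : time of the laser pulse (s)
    Returns a (n_pixels,) image vector.
    """
    n_elements, n_samples = rf.shape
    image = np.zeros(len(grid_xy))
    for e in range(n_elements):
        # one-way propagation delay from every pixel to this element
        dist = np.linalg.norm(grid_xy - elem_xy[e], axis=1)
        idx = np.round((dist / c - t0) * fs).astype(int)
        valid = (idx >= 0) & (idx < n_samples)
        image[valid] += rf[e, idx[valid]]
    return image
```

In practice, a GPU implementation would parallelize the per-pixel delay lookups, which is what makes video-rate reconstruction feasible.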