Navarrosanchez4795


We present a method for predicting dense depth in scenarios where both a monocular camera and people in the scene are freely moving. Existing methods for recovering depth of dynamic, non-rigid objects from monocular video impose strong assumptions on the objects' motion and may recover only sparse depth. In this paper, we take a data-driven approach and learn human depth priors from a new source of data: thousands of Internet videos of people imitating mannequins, i.e., freezing in diverse, natural poses, while a hand-held camera tours the scene. Because the people are stationary, training data can be generated using multi-view stereo reconstruction. At inference time, our method uses motion parallax cues from the static areas of the scene to guide the depth prediction. We demonstrate our method on real-world sequences of complex human actions captured by a moving hand-held camera, show improvement over state-of-the-art monocular depth prediction methods, and show various 3D effects produced using our predicted depth.

Multi-label classification is an important research topic in machine learning, for which exploiting label dependency is an effective modeling principle. Recently, probabilistic models have shown great potential in discovering dependencies among labels. In this paper, motivated by the recent success of multi-view learning in improving generalization performance, we propose a novel multi-view probabilistic model named latent conditional Bernoulli mixture (LCBM) for multi-label classification. LCBM is a generative model that takes features from different views as inputs; conditional on the latent subspace shared by the views, a Bernoulli mixture model is adopted to capture label dependency. Inside each component of the mixture, the labels are only weakly correlated, which facilitates computational convenience. A mean field variational inference framework is used to carry out approximate posterior inference in the probabilistic model, where we propose a Gaussian mixture variational autoencoder (GMVAE) for effective posterior approximation. We further develop a scalable stochastic training algorithm for efficiently optimizing the model parameters and variational parameters, and derive an efficient prediction procedure based on greedy search. Experimental results on multiple benchmark datasets show that our approach outperforms other state-of-the-art methods under various metrics.

This paper introduces a novel depth recovery method based on light absorption in water. Water absorbs light at almost all wavelengths, with an absorption coefficient that depends on the wavelength. Based on the Beer-Lambert model, we introduce a bispectral depth recovery method that leverages the difference in light absorption between two near-infrared wavelengths captured with a distant point light source and orthographic cameras. Through extensive analysis, we show that accurate depth can be recovered irrespective of surface texture and reflectance, and we introduce algorithms to correct for non-idealities of a practical implementation, including tilted light source and camera placement, non-ideal bandpass filters, and the perspective effect of the camera with a diverging point light source. We construct a coaxial bispectral depth imaging system using low-cost off-the-shelf hardware and demonstrate its use for recovering the shapes of complex and dynamic objects in water. We also present a trispectral variant to further improve robustness to extremely challenging surface reflectance. Experimental results validate the theory and practical implementation of this novel depth recovery paradigm, which we refer to as shape from water.
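The bispectral idea above follows directly from the Beer-Lambert model, so a short worked sketch may help make the reflectance-independent depth estimate concrete. The version below is deliberately idealized: it assumes aligned, dark-corrected images, near-equal surface reflectance at the two wavelengths (so reflectance cancels in the ratio), a calibrated source intensity ratio, and none of the non-ideality corrections described above. The function name, wavelengths, and absorption coefficients are illustrative stand-ins, not values from the paper.

```python
import numpy as np

def bispectral_depth(img_l1, img_l2, alpha1, alpha2, source_ratio=1.0, eps=1e-8):
    """Per-pixel water path length from two aligned near-infrared images.

    Beer-Lambert sketch: I(lambda) ~ I0(lambda) * rho * exp(-alpha(lambda) * d).
    If the reflectance rho is (nearly) equal at both wavelengths, it cancels in
    the ratio, giving d = (ln(source_ratio) - ln(I1/I2)) / (alpha1 - alpha2),
    where source_ratio = I0(lambda1) / I0(lambda2) comes from calibration.
    """
    ratio = (img_l1 + eps) / (img_l2 + eps)
    return (np.log(source_ratio) - np.log(ratio)) / (alpha1 - alpha2)

# Illustrative absorption coefficients (1/mm) at two hypothetical NIR bands
alpha_905, alpha_950 = 0.0072, 0.0260
# Synthetic test: a flat 12 mm water path over a random-reflectance surface
i905 = np.random.rand(4, 4) * 0.5 + 0.3
i950 = i905 * np.exp(-(alpha_950 - alpha_905) * 12.0)
depth = bispectral_depth(i905, i950, alpha_905, alpha_950)  # ~12.0 everywhere
```

Because only the intensity ratio enters the formula, the surface reflectance drops out, which is the property the bispectral method relies on.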
Grounding referring expressions in images aims to locate the object instance in an image described by a referring expression. It involves a joint understanding of natural language and image content and is essential for a range of visual tasks related to human-computer interaction. As a language-to-vision matching task, the core of this problem is not only to extract all the necessary information from both the image and the referring expression, but also to make full use of context information to align cross-modal semantic concepts in the extracted information. In this paper, we propose a Cross-Modal Relationship Extractor (CMRE) that uses a cross-modal attention mechanism to adaptively highlight objects and relationships related to the given expression and represents the extracted information as language-guided visual relation graphs. In addition, we propose a Gated Graph Convolutional Network (GGCN) to compute multimodal semantic context by fusing information from different modalities and propagating multimodal information through the structured relation graphs. Experimental results on three common benchmark datasets show that our Cross-Modal Relationship Inference Network, which consists of the CMRE and the GGCN, greatly surpasses all existing state-of-the-art methods.

OBJECTIVE: Treatment of brain tumors requires high precision in order to ensure sufficient treatment while minimizing damage to surrounding healthy tissue. Ablation of such tumors using needle-based therapeutic ultrasound (NBTU) under real-time magnetic resonance imaging (MRI) can fulfill this need. However, the constrained space and strong magnetic field in the MRI bore restrict patient access, limiting precise placement of the NBTU ablation tool. A surgical robot compatible with use inside the bore of an MRI scanner can alleviate these challenges. METHODS: We present preclinical trials of a robotic system for NBTU ablation of brain tumors under real-time MRI guidance. The system comprises an updated robotic manipulator with corresponding control electronics, the NBTU ablation system, and applications for planning, navigation, and monitoring. RESULTS: The robotic system had a mean translational and rotational accuracy of 1.39±0.64 mm and 1.27±0.56° in gelatin phantoms and 3.13±1.41 mm and 5.58±3.59° in 10 porcine trials, while causing a maximum reduction in signal-to-noise ratio (SNR) of 10.3%. CONCLUSION: The integrated robotic system can place the NBTU ablator at a desired target location in porcine brain and monitor the ablation in real time via magnetic resonance thermal imaging (MRTI). SIGNIFICANCE: Further optimization of this system could result in a clinically viable system for use in human trials of various diagnostic or therapeutic neurosurgical interventions.
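For accuracy figures of the kind quoted in the RESULTS above, the sketch below shows one common way such per-trial errors are computed: the Euclidean distance between planned and achieved tip positions, and the angle between planned and achieved needle axes, summarized as mean ± standard deviation. All names and data here are hypothetical stand-ins; the paper's exact evaluation protocol is not reproduced.

```python
import numpy as np

def pose_errors(planned_pts, reached_pts, planned_dirs, reached_dirs):
    """Per-trial translational (mm) and rotational (deg) targeting errors.

    planned_pts / reached_pts: (N, 3) planned vs. achieved tip positions.
    planned_dirs / reached_dirs: (N, 3) unit vectors along the needle axis.
    """
    t_err = np.linalg.norm(reached_pts - planned_pts, axis=1)
    cosang = np.clip(np.sum(planned_dirs * reached_dirs, axis=1), -1.0, 1.0)
    r_err = np.degrees(np.arccos(cosang))
    return t_err, r_err

# Synthetic example with 10 trials (stand-ins for phantom or porcine data)
rng = np.random.default_rng(0)
planned_p = rng.uniform(-20.0, 20.0, (10, 3))
reached_p = planned_p + rng.normal(0.0, 1.0, (10, 3))   # ~mm-level tip error
planned_d = np.tile([0.0, 0.0, 1.0], (10, 1))
noisy_d = planned_d + rng.normal(0.0, 0.02, (10, 3))
reached_d = noisy_d / np.linalg.norm(noisy_d, axis=1, keepdims=True)

t_err, r_err = pose_errors(planned_p, reached_p, planned_d, reached_d)
print(f"translational: {t_err.mean():.2f} ± {t_err.std():.2f} mm; "
      f"rotational: {r_err.mean():.2f} ± {r_err.std():.2f} deg")
```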

Article authors: Navarrosanchez4795 (Pallesen Dean)