Goldmanbattle1101

Objective evaluation of audio processed with time-scale modification (TSM) remains an open problem. Recently, a dataset of time-scaled audio with subjective quality labels was published and used to create an initial objective measure of quality (OMOQ). In this paper, an improved OMOQ for time-scaled audio is proposed. The measure uses handcrafted features and a fully connected network to predict subjective mean opinion scores (SMOS). Basic and advanced perceptual evaluation of audio quality features are used in addition to nine features specific to TSM artefacts. Six methods of alignment are explored, with interpolation of the reference magnitude spectrum to the length of the test magnitude spectrum giving the best performance. The proposed measure achieves a mean root mean square error of 0.490 and a mean Pearson correlation of 0.864 to SMOS, equivalent to the 97th and 82nd percentiles of the subjective sessions, respectively. The proposed measure is used to evaluate TSM algorithms, finding that Elastique gives the highest objective quality for solo instrument and voice signals, whereas the identity phase-locking phase vocoder gives the highest objective quality for music signals and the best overall quality. The objective measure is available online at https://www.github.com/zygurt/TSM.

Two experiments quantitatively investigated the interaction of prosody and syntax in marking focus in English. A production study with 28 participants (analyzing 919 utterances) found that the acoustic marking of subject focus vs. broad focus, induced through a preceding context question, was generally the same in clefts as in sentences with unmarked syntax. Thus, the results suggested that prosody is independent of syntax rather than showing a trade-off (weaker prosodic marking for clefts). Focus was marked with f0 range, f0 maxima, f0 minima, duration, and intensity. Maxima of focused subjects were not significantly higher, but they occurred earlier than in broad focus.
In a perception experiment, 230 participants rated the suitability of 24 auditorily presented stimuli as answers to preceding context questions inducing subject focus or broad focus. Clefts and sentences prosodically marking the subject as focused were rated higher in subject focus than in broad focus contexts. Syntax and prosody did not interact, again suggesting the absence of a trade-off. Thus, both studies suggest an additive use of syntax and prosody. Prosodic focus marking was as extensive and effective in the presence of syntactic focus marking as in its absence.

The analysis of real-world conversational signal-to-noise ratios (SNRs) can provide insight into people's communicative strategies and difficulties and guide the development of hearing devices. However, measuring SNRs accurately is challenging in everyday recording conditions in which only a mixture of sound sources can be captured. This study introduces a method for accurate in situ SNR estimation where the speech signal of a target talker in natural conversation is captured by a cheek-mounted microphone, adjusted for free-field conditions and convolved with a measured impulse response to estimate its power at the receiving talker. A microphone near the receiver provides the noise-only component through voice activity detection. The method is applied to in situ recordings of conversations in two real-world sound scenarios. It is shown that the broadband speech level and SNR distributions are estimated more accurately by the proposed method than by a typical single-channel method, especially in challenging, low-SNR environments. The application of the proposed two-channel method may render more realistic estimates of conversational SNRs and provide valuable input to hearing instrument processing strategies whose operating points are determined by accurate SNR estimates.

Piles are the state-of-the-art foundation type for offshore structures like offshore wind turbines.
The pile driving process induces high sound pressure levels in the water, which are potentially harmful to the marine environment. To protect marine life, regulations for these levels apply in many regions of the world. Therefore, detailed pile driving noise models are necessary to allow for both a prognosis of the underwater noise levels and the dimensioning and optimization of possible noise mitigation systems. In this paper, an established model based on a finite element approach is validated by means of three measurement campaigns. These have been conducted at different sites in the North Sea and include piling with and without noise mitigation measures. The noise mitigation systems are modelled as fully absorbing by applying a mixed Dirichlet-Neumann boundary condition at their positions. Therefore, the computational results with noise mitigation measures are generally below the measured data and represent the highest achievable noise reduction. The measurement campaigns have been conducted with a big bubble curtain and a noise mitigation screen. The occurring differences between the modelled and measured results with and without noise mitigation are shown.

Older adults exhibit deficits in auditory temporal processing relative to younger listeners. These age-related temporal processing difficulties may be further exacerbated in older adults with cochlear implants (CIs) when CI electrodes poorly interface with their target auditory neurons. The aim of this study was to evaluate the potential interaction between chronological age and the estimated quality of the electrode-neuron interface (ENI) on psychophysical forward masking recovery, a measure that reflects single-channel temporal processing abilities. Fourteen CI listeners (age 15 to 88 years) with Advanced Bionics devices participated. Forward masking recovery was assessed on two channels in each ear (i.e., the channels with the lowest and highest signal detection thresholds).
Results indicated that the rate of forward masking recovery declined with advancing age, and that the effect of age was more pronounced on channels estimated to interface poorly with the auditory nerve. These findings indicate that the quality of the ENI can influence the time course of forward masking recovery for older CI listeners.

Article authors: Goldmanbattle1101 (Ellis Craven)
