
Efeito de diferentes estratégias de codificação dos processadores de fala na voz de crianças usuárias de implante coclear; Effect of different speech processors coding strategies on the voice of children with cochlear implants

Coelho, Ana Cristina de Castro
Source: Biblioteca Digital de Teses e Dissertações da USP Publisher: Biblioteca Digital de Teses e Dissertações da USP
Type: Master's Dissertation Format: application/pdf
Published on 29/06/2011 Portuguese
Search relevance
36.34%
The cochlear implant aims to provide auditory perception to individuals with severe to profound hearing loss. Its use optimizes the development of language, speech, and voice production in its users. The device has proven to be one of the most effective and promising technologies for remediating hearing loss, and its results depend heavily on the coding strategy selected in the speech processor. The aim of this study was to compare the perceptual and acoustic characteristics of the voices of hearing-impaired children with cochlear implants who use the Advanced Combination Encoder (ACE) and Fine Structure Processing (FSP) speech coding strategies, and to investigate whether the voices of these children differ from those of hearing children. Children aged 3 years to 5 years and 11 months were selected. Acoustic analysis of the vowel /a/ was performed with the Multi-Dimensional Voice Program (MDVP), and of connected speech and spontaneous conversation with Real Time Pitch (RTP); perceptual analysis of the same utterances was performed using visual analog scales for preselected parameters. Compared with users of the ACE strategy...

Correlação entre voz e processamento auditivo; Correlation between voice and auditory processing

Ramos, Janine Santos
Source: Biblioteca Digital de Teses e Dissertações da USP Publisher: Biblioteca Digital de Teses e Dissertações da USP
Type: Master's Dissertation Format: application/pdf
Published on 25/02/2015 Portuguese
Search relevance
36.47%
Introduction: The literature points to a possible relationship between auditory processing and dysphonia, mainly with regard to the acoustic parameters of the voice (frequency, intensity, and duration). Thus, a patient who has auditory difficulty analyzing and discriminating one of these parameters will probably also have difficulty reproducing it vocally, which would explain a lack of progress in therapy. In the voice clinic, assessing vocal pitch reproduction could help identify difficulties in dysphonic patients that may be related to auditory processing disorders, contributing to the differential speech-language pathology diagnosis. Objective: To compare the performance of dysphonic women and women without vocal disorders on auditory processing tests and a vocal pitch reproduction test, and to correlate the auditory processing tests used with the vocal pitch reproduction test. Methods: Forty women aged 18 to 44 years participated in the study, divided into two groups: a Dysphonic Group (20 dysphonic women) and a Non-Dysphonic Group (20 non-dysphonic women). After signing the Informed Consent Form...

Um estudo sobre processamento adaptativo de sinais utilizando redes neurais; A study about adaptive signal processing using neural nets

Dorneles, Ricardo Vargas
Source: Universidade Federal do Rio Grande do Sul Publisher: Universidade Federal do Rio Grande do Sul
Type: Dissertation Format: application/pdf
Portuguese
Search relevance
36.35%
In recent years there has been much research on parallel computer architectures, because performance gains in sequential architectures have not kept pace with the growing demand for processing capacity. Among parallel architectures, one group that has received special attention from researchers is neural networks. A neural network is an architecture based on massive parallelism: numerous simple processing elements interconnected according to a given topology and governed by a learning rule. Neural networks have been of great importance in pattern recognition, and many applications in character, image, and voice recognition have been developed. Another application area for neural networks is signal processing. Their adaptability makes neural networks well suited to applications in which the characteristics of the signal or of the medium vary or are not fully known, such as adaptive filters. The aim of this work is to show applications of neural networks in this area. In the first part of the work, applications of neural networks to filtering were implemented using several topologies and neuron models. The implemented models are presented here together with the simulation results. The second part of the work consists of applying a neural network model to a very specific problem...
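
The abstract above does not say which topologies or neuron models the dissertation implemented. As a minimal sketch of the general idea of a neural adaptive filter, the following uses a single linear neuron (ADALINE) trained with the LMS (Widrow-Hoff) rule to identify an unknown FIR system. The function name, tap count, learning rate, and the "unknown" system are all illustrative assumptions, not taken from the source.

```python
import random

def lms_filter(x, d, num_taps=4, mu=0.05):
    """Adapt a single linear neuron (ADALINE) with the LMS rule.

    x: input samples, d: desired samples, mu: learning rate (assumed value).
    Returns (final weights, per-sample errors).
    """
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Tap-delay line: most recent sample first, zeros before the signal starts.
        taps = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * tk for wk, tk in zip(w, taps))
        e = d[n] - y
        # Widrow-Hoff update: nudge each weight in proportion to error times input.
        w = [wk + mu * e * tk for wk, tk in zip(w, taps)]
        errors.append(e)
    return w, errors

# System identification demo: learn a hypothetical FIR response from input/output data.
random.seed(0)
unknown = [0.5, -0.3, 0.2, 0.1]  # hypothetical system, chosen only for illustration
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [sum(unknown[k] * (x[n - k] if n - k >= 0 else 0.0) for k in range(4))
     for n in range(len(x))]
weights, errors = lms_filter(x, d)
```

Because the weights are updated sample by sample, the same loop would keep tracking the system if its response drifted over time, which is the adaptability the abstract refers to.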

Audiovisual voice activity detection and localization of simultaneous speech sources; Detecção de atividade de voz e localização de fontes sonoras simultâneas utilizando informações audiovisuais

Minotto, Vicente Peruffo
Source: Universidade Federal do Rio Grande do Sul Publisher: Universidade Federal do Rio Grande do Sul
Type: Dissertation Format: application/pdf
Portuguese
Search relevance
36.35%
Given the trend toward human-machine interfaces that allow ever simpler means of interaction, it is natural that research is carried out on techniques that seek to simulate the most conventional means of human communication: speech. In the human auditory system, the voice is processed automatically, effectively, and easily by the brain, often aided by visual information such as lip movement and speaker location. This processing by the brain includes two important components that speech-based communication requires: Voice Activity Detection (VAD) and Sound Source Localization (SSL). Consequently, VAD and SSL also serve as mandatory preprocessing tools in Human-Computer Interface (HCI) applications such as automatic speech recognition and speaker identification. However, VAD and SSL remain challenging problems in realistic acoustic scenarios, particularly in the presence of noise, reverberation, and simultaneous speakers. In this work, approaches are proposed to address these problems...
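
To make the VAD task concrete, here is a minimal sketch of a much simpler, audio-only baseline: a frame-energy detector. It is not the audiovisual method the dissertation proposes, and the frame length, threshold, and function name are assumptions chosen for illustration.

```python
import math

def frame_energy_vad(samples, frame_len=160, threshold_db=-30.0):
    """Flag each frame as voice-active when its RMS level exceeds threshold_db (dBFS)."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        # Floor the RMS so that digital silence does not produce log10(0).
        level_db = 20.0 * math.log10(max(rms, 1e-12))
        flags.append(level_db > threshold_db)
    return flags

# Synthetic signal: one silent frame, then one frame of a 200 Hz tone sampled at 8 kHz.
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]
flags = frame_energy_vad(silence + tone)
```

A fixed energy threshold like this tends to fail under exactly the conditions the abstract lists (noise, reverberation, simultaneous speakers), which is one reason more elaborate approaches are studied.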

Evolvable hardware applied to voice recognition

Mantovani, Suely Cunha Amaro; De Oliveira, José Raimundo
Source: Universidade Estadual Paulista Publisher: Universidade Estadual Paulista
Type: Conference Paper or Conference Object Format: 321-326
Portuguese
Search relevance
36.42%
This paper presents some results of the application of Evolvable Hardware (EHW) in the area of voice recognition. Evolvable Hardware is able to change its inner connections using genetic learning techniques, adapting its own functionality to changing external conditions. This technique became feasible through improvements in Programmable Logic Devices. Nowadays, it is possible for a single device to change part of its own circuit online and in real time. This work proposes a reconfigurable architecture for a system that is able to receive voice commands to execute special tasks, such as helping handicapped persons in their daily home routines. The idea is to collect several voice samples and process them through algorithms based on Mel-cepstral theory to obtain numerical coefficients for each sample, which compose the search universe used by the genetic algorithm. The voice patterns considered are limited to seven sustained Portuguese vowel phonemes (a, eh, e, i, oh, o, u).
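
The abstract does not describe how the Mel-cepstral coefficients were computed. As a hedged illustration of one ingredient common to such pipelines, the sketch below converts between Hz and the mel scale using the widely used 2595 * log10(1 + f/700) formula and spaces filter-bank edges evenly in mel. The function names, band count, and frequency range are assumptions.

```python
import math

def hz_to_mel(freq_hz):
    """Map a frequency in Hz to the mel scale (2595 * log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(low_hz, high_hz, num_bands):
    """Return num_bands + 2 edge frequencies (Hz) spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_bands + 1)
    return [mel_to_hz(low_mel + i * step) for i in range(num_bands + 2)]

# Edges for a hypothetical 10-band filter bank covering 0 to 4000 Hz.
edges = mel_band_edges(0.0, 4000.0, 10)
```

The edges come out closely spaced at low frequencies and widely spaced at high frequencies, mirroring the finer pitch resolution of human hearing in the low range.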

What does voice-processing technology support today?

Nakatsu, R; Suzuki, Y
Source: PubMed Publisher: PubMed
Type: Scientific Journal Article
Published on 24/10/1995 Portuguese
Search relevance
46.09%
This paper describes the state of the art in applications of voice-processing technologies. In the first part, technologies concerning the implementation of speech recognition and synthesis algorithms are described. Hardware technologies such as microprocessors and DSPs (digital signal processors) are discussed. Software development environment, which is a key technology in developing applications software, ranging from DSP software to support software also is described. In the second part, the state of the art of algorithms from the standpoint of applications is discussed. Several issues concerning evaluation of speech recognition/synthesis algorithms are covered, as well as issues concerning the robustness of algorithms in adverse conditions.

The Developmental Origins of Voice Processing in the Human Brain

Grossmann, Tobias; Oberecker, Regine; Koch, Stefan Paul; Friederici, Angela D.
Source: Cell Press Publisher: Cell Press
Type: Scientific Journal Article
Published on 25/03/2010 Portuguese
Search relevance
46.35%
In human adults, voices are processed in specialized brain regions in superior temporal cortices. We examined the development of this cortical organization during infancy by using near-infrared spectroscopy. In experiment 1, 7-month-olds but not 4-month-olds showed increased responses in left and right superior temporal cortex to the human voice when compared to nonvocal sounds, suggesting that voice-sensitive brain systems emerge between 4 and 7 months of age. In experiment 2, 7-month-old infants listened to words spoken with neutral, happy, or angry prosody. Hearing emotional prosody resulted in increased responses in a voice-sensitive region in the right hemisphere. Moreover, a region in right inferior frontal cortex taken to serve evaluative functions in the adult brain showed particular sensitivity to happy prosody. The pattern of findings suggests that temporal regions specialize in processing voices very early in development and that, already in infancy, emotions differentially modulate voice processing in the right hemisphere.

Voice processing in dementia: a neuropsychological and neuroanatomical analysis

Hailstone, Julia C.; Ridgway, Gerard R.; Bartlett, Jonathan W.; Goll, Johanna C.; Buckley, Aisling H.; Crutch, Sebastian J.; Warren, Jason D.
Source: Oxford University Press Publisher: Oxford University Press
Type: Scientific Journal Article
Portuguese
Search relevance
46.51%
Voice processing in neurodegenerative disease is poorly understood. Here we undertook a systematic investigation of voice processing in a cohort of patients with clinical diagnoses representing two canonical dementia syndromes: temporal variant frontotemporal lobar degeneration (n = 14) and Alzheimer’s disease (n = 22). Patient performance was compared with a healthy matched control group (n = 35). All subjects had a comprehensive neuropsychological assessment including measures of voice perception (vocal size, gender, speaker discrimination) and voice recognition (familiarity, identification, naming and cross-modal matching) and equivalent measures of face and name processing. Neuroanatomical associations of voice processing performance were assessed using voxel-based morphometry. Both disease groups showed deficits on all aspects of voice recognition and impairment was more severe in the temporal variant frontotemporal lobar degeneration group than the Alzheimer’s disease group. Face and name recognition were also impaired in both disease groups and name recognition was significantly more impaired than other modalities in the temporal variant frontotemporal lobar degeneration group. The Alzheimer’s disease group showed additional deficits of vocal gender perception and voice discrimination. The neuroanatomical analysis across both disease groups revealed common grey matter associations of familiarity...

Voices to reckon with: perceptions of voice identity in clinical and non-clinical voice hearers

Badcock, Johanna C.; Chhabra, Saruchi
Source: Frontiers Media S.A. Publisher: Frontiers Media S.A.
Type: Scientific Journal Article
Published on 03/04/2013 Portuguese
Search relevance
36.41%
The current review focuses on the perception of voice identity in clinical and non-clinical voice hearers. Identity perception in auditory verbal hallucinations (AVH) is grounded in the mechanisms of human (i.e., real, external) voice perception, and shapes the emotional (distress) and behavioral (help-seeking) response to the experience. Yet, the phenomenological assessment of voice identity is often limited, for example to the gender of the voice, and has failed to take advantage of recent models and evidence on human voice perception. In this paper we aim to synthesize the literature on identity in real and hallucinated voices and begin by providing a comprehensive overview of the features used to judge voice identity in healthy individuals and in people with schizophrenia. The findings suggest some subtle, but possibly systematic biases across different levels of voice identity in clinical hallucinators that are associated with higher levels of distress. Next we provide a critical evaluation of voice processing abilities in clinical and non-clinical voice hearers, including recent data collected in our laboratory. Our studies used diverse methods, assessing recognition and binding of words and voices in memory as well as multidimensional scaling of voice dissimilarity judgments. The findings overall point to significant difficulties recognizing familiar speakers and discriminating between unfamiliar speakers in people with schizophrenia...

Emotional Voice Processing: Investigating the Role of Genetic Variation in the Serotonin Transporter across Development

Grossmann, Tobias; Vaish, Amrisha; Franz, Janett; Schroeder, Roland; Stoneking, Mark; Friederici, Angela D.
Source: Public Library of Science Publisher: Public Library of Science
Type: Scientific Journal Article
Published on 08/07/2013 Portuguese
Search relevance
46.29%
The ability to effectively respond to emotional information carried in the human voice plays a pivotal role for social interactions. We examined how genetic factors, especially the serotonin transporter genetic variation (5-HTTLPR), affect the neurodynamics of emotional voice processing in infants and adults by measuring event-related brain potentials (ERPs). The results revealed that infants distinguish between emotions during an early perceptual processing stage, whereas adults recognize and evaluate the meaning of emotions during later semantic processing stages. While infants do discriminate between emotions, only in adults was genetic variation associated with neurophysiological differences in how positive and negative emotions are processed in the brain. This suggests that genetic association with neurocognitive functions emerges during development, emphasizing the role that variation in serotonin plays in the maturation of brain systems involved in emotion recognition.

On the definition and interpretation of voice selective activation in the temporal cortex

Bethmann, Anja; Brechmann, André
Source: Frontiers Media S.A. Publisher: Frontiers Media S.A.
Type: Scientific Journal Article
Published on 08/07/2014 Portuguese
Search relevance
36.42%
Regions along the superior temporal sulci and in the anterior temporal lobes have been found to be involved in voice processing. It has even been argued that parts of the temporal cortices serve as voice-selective areas. Yet, evidence for voice-selective activation in the strict sense is still missing. The current fMRI study aimed at assessing the degree of voice-specific processing in different parts of the superior and middle temporal cortices. To this end, voices of famous persons were contrasted with widely different categories, which were sounds of animals and musical instruments. The argumentation was that only brain regions with statistically proven absence of activation by the control stimuli may be considered as candidates for voice-selective areas. Neural activity was found to be stronger in response to human voices in all analyzed parts of the temporal lobes except for the middle and posterior STG. More importantly, the activation differences between voices and the other environmental sounds increased continuously from the mid-posterior STG to the anterior MTG. Here, only voices but not the control stimuli excited an increase of the BOLD response above a resting baseline level. The findings are discussed with reference to the function of the anterior temporal lobes in person recognition and the general question on how to define selectivity of brain regions for a specific class of stimuli or tasks. In addition...

Investigating the Neural Correlates of Voice versus Speech-Sound Directed Information in Pre-School Children

Raschle, Nora Maria; Smith, Sara Ashley; Zuk, Jennifer; Dauvermann, Maria Regina; Figuccio, Michael Joseph; Gaab, Nadine
Source: Public Library of Science Publisher: Public Library of Science
Type: Scientific Journal Article
Published on 22/12/2014 Portuguese
Search relevance
36.48%
Studies in sleeping newborns and infants propose that the superior temporal sulcus is involved in speech processing soon after birth. Speech processing also implicitly requires the analysis of the human voice, which conveys both linguistic and extra-linguistic information. However, due to technical and practical challenges when neuroimaging young children, evidence of neural correlates of speech and/or voice processing in toddlers and young children remains scarce. In the current study, we used functional magnetic resonance imaging (fMRI) in 20 typically developing preschool children (average age  = 5.8 y; range 5.2–6.8 y) to investigate brain activation during judgments about vocal identity versus the initial speech sound of spoken object words. FMRI results reveal common brain regions responsible for voice-specific and speech-sound specific processing of spoken object words including bilateral primary and secondary language areas of the brain. Contrasting voice-specific with speech-sound specific processing predominantly activates the anterior part of the right-hemispheric superior temporal sulcus. Furthermore, the right STS is functionally correlated with left-hemispheric temporal and right-hemispheric prefrontal regions. This finding underlines the importance of the right superior temporal sulcus as a temporal voice area and indicates that this brain region is specialized...

Ontogénèse et spécificité de la voix humaine

Beauchemin, Maude
Source: Université de Montréal Publisher: Université de Montréal
Type: Electronic Thesis or Dissertation
Portuguese
Search relevance
36.47%
The voice is an omnipresent auditory stimulus in our sound environment. It not only enables speech but is also thought to be the equivalent of an auditory face, conveying important identity and affective information. Our ability to discriminate and recognize voices is socially and biologically important, and it ranks among the most important functions of the human auditory system. This thesis examined the ontogeny and specificity of the cortical response to the human voice and had three objectives: (1) to establish an electrophysiological protocol for objectively measuring the processing of voice familiarity in adults; (2) to determine whether the same protocol could also objectively demonstrate, in 24-hour-old newborns, preferential processing of a familiar voice, namely the mother's voice; and (3) to test the robustness of an electrophysiological measure, the Fronto-Temporal Positivity to Voices, which concerns pre-attentional discrimination between vocal and non-vocal stimuli. The results of the three experimental studies that make up this thesis made it possible (1) to identify electrophysiological components (Mismatch Negativity and P3a) sensitive to the processing of voice familiarity; (2) to reveal a pattern of cortical activation unique to the mother's voice in newborns...

Cerebral processing of emotional prosody — influence of acoustic parameters, arousal and the role of cross-gender interactions; Zerebrale Verarbeitung emotionaler Sprachmelodie - Einfluss akustischer Parameter, des Arousals und die Rolle geschlechtsspezifischer Interaktionen

Wiethoff, Sarah
Source: University of Tübingen Publisher: University of Tübingen
Type: Dissertation
Portuguese
Search relevance
36.35%
Nonverbal signals play an important role in the way humans communicate with each other. Body movements like gestures and facial expressions are only one part of it; another important factor is prosody, first defined in the clinical context by Monrad-Krohn (1947) as the faculty of language that, independently of semantics, creates different meanings through modulation of speech rhythm, loudness, frequency, and stress patterns. Only about seven percent of the information about a speaker's emotional state is inferred from semantics, that is, the content of the words or “what” he or she says. Fifty-five percent is conveyed by body language, and the rest, an impressive 38 percent, is carried by prosody, i.e., “how” one says what one says (Mehrabian, 1972). Therefore, prosody, and its adequate interpretation, represents a vital tool in everyday human life. So far, a lot of research has been carried out to further disentangle the contribution of different acoustic parameters to the expression of emotional prosody. Numerous scientists have tried to clarify the influence and importance of single acoustic features in the creation of different emotional intonations (such as anger, happiness, disgust...

The early spatio-temporal correlates and task independence of cerebral voice processing studied with MEG

Capilla González, Almudena; Belin, Pascal; Gross, Joachim
Source: Oxford University Press Publisher: Oxford University Press
Type: Scientific Journal Article
Portuguese
Search relevance
46.35%
OXFORD UNIVERSITY PRESS: This is a pre-copyedited, author-produced PDF of an article accepted for publication in Cerebral Cortex following peer review. The version of record Cerebral Cortex 23.6 (2013): 1388-1395 is available online at: http://cercor.oxfordjournals.org/; Functional magnetic resonance imaging studies have repeatedly provided evidence for temporal voice areas (TVAs) with particular sensitivity to human voices along bilateral mid/anterior superior temporal sulci and superior temporal gyri (STS/STG). In contrast, electrophysiological studies of the spatio-temporal correlates of cerebral voice processing have yielded contradictory results, finding the earliest correlates either at ∼300–400 ms, or earlier at ∼200 ms (“fronto-temporal positivity to voice”, FTPV). These contradictory results are likely the consequence of different stimulus sets and attentional demands. Here, we recorded magnetoencephalography activity while participants listened to diverse types of vocal and non-vocal sounds and performed different tasks varying in attentional demands. Our results confirm the existence of an early voice-preferential magnetic response (FTPVm, the magnetic counterpart of the FTPV) peaking at about 220 ms and distinguishing between vocal and non-vocal sounds as early as 150 ms after stimulus onset. The sources underlying the FTPVm were localized along bilateral mid-STS/STG...

Implementing voice recognition and natural language processing in the NPSNET networked virtual environment

DeVilliers, Edward Michael.
Source: Monterey, California. Naval Postgraduate School Publisher: Monterey, California. Naval Postgraduate School
Type: Doctoral Thesis Format: xiii, 178 p.
Portuguese
Search relevance
36.38%
Interfaces to military Virtual Reality (VR) systems, such as NPSNET IV.9, have been limited mainly to keyboard, mouse, and joystick devices. This presents two major problems: remembering how to access all the functionality of the system, and using the interface when the user is otherwise physically constrained. This can occur during the use of body-position tracking devices and Heads-Up-Displays (HUD). Voice recognition and Natural Language Processing (NLP) were used as a solution to both problems. The approach taken was to develop a networked Spoken Language System (SLS) using a Commercial-Off-The-Shelf (COTS) voice recognition and NLP system. The Nuance Speech Recognition System from Nuance Communications was chosen after analyzing the special requirements of NPSNET. Implementing the SLS occurred in four phases. First, vocabularies and grammars were developed to simulate the 108 keyboard commands, focusing on flexibility and decreased response latency. Second, new C++ classes were written to ease reuse of the Nuance APIs. Third, a control panel was written to manage the voice processing, and fourth, the code was integrated into NPSNET. As a result of this effort, a new voice-enabled interface exists for NPSNET. In addition, C++ classes exist to ease future use of the Nuance API in other software systems. All of the 108 keyboard commands are executable through voice control with an 83.8% sentence understanding rate in a noisy background environment. U.S. Marine Corps (U.S.M.C.) author

Corrélats neuronaux de l'expertise auditive

Chartrand, Jean-Pierre
Source: Université de Montréal Publisher: Université de Montréal
Type: Electronic Thesis or Dissertation
Portuguese
Search relevance
36.55%
The human voice is the dominant part of our auditory environment. Humans not only use the voice for speech but are equally skilled at extracting from it a wealth of relevant information about the speaker. This universal expertise for the human voice is reflected in the presence of voice-preferential areas along the superior temporal sulci. To date, few data inform us about the nature and development of this selective response to the voice. In the visual domain, a vast literature addresses a similar question with regard to face perception. The study of visual experts has made it possible to identify the processes and regions involved in their expertise and has shown a strong resemblance to those used for faces. In the auditory domain, very few studies have compared expertise for the voice with expertise for other auditory categories, even though such comparisons could contribute to a better understanding of vocal and auditory perception. This thesis aims to clarify the specificity of the processes and regions involved in voice processing. To do so...

A perspective on early commercial applications of voice-processing technology for telecommunications and aids for the handicapped.

Seelbach, C
Source: PubMed Publisher: PubMed
Type: Scientific Journal Article
Published on 24/10/1995 Portuguese
Search relevance
46.19%
The Colloquium on Human-Machine Communication by Voice highlighted the global technical community's focus on the problems and promise of voice-processing technology, particularly, speech recognition and speech synthesis. Clearly, there are many areas in both the research and development of these technologies that can be advanced significantly. However, it is also true that there are many applications of these technologies that are capable of commercialization now. Early successful commercialization of new technology is vital to ensure continuing interest in its development. This paper addresses efforts to commercialize speech technologies in two markets: telecommunications and aids for the handicapped.

Voice-processing technologies--their application in telecommunications.

Wilpon, J G
Source: PubMed Publisher: PubMed
Type: Scientific Journal Article
Published on 24/10/1995 Portuguese
Search relevance
46.16%
As the telecommunications industry evolves over the next decade to provide the products and services that people will desire, several key technologies will become commonplace. Two of these, automatic speech recognition and text-to-speech synthesis, will provide users with more freedom on when, where, and how they access information. While these technologies are currently in their infancy, their capabilities are rapidly increasing and their deployment in today's telephone network is expanding. The economic impact of just one application, the automation of operator services, is well over $100 million per year. Yet there still are many technical challenges that must be resolved before these technologies can be deployed ubiquitously in products and services throughout the worldwide telephone network. These challenges include: (i) High level of accuracy. The technology must be perceived by the user as highly accurate, robust, and reliable. (ii) Easy to use. Speech is only one of several possible input/output modalities for conveying information between a human and a machine, much like a computer terminal or Touch-Tone pad on a telephone. It is not the final product. Therefore, speech technologies must be hidden from the user. That is, the burden of using the technology must be on the technology itself. (iii) Quick prototyping and development of new products and services. The technology must support the creation of new products and services based on speech in an efficient and timely fashion. In this paper I present a vision of the voice-processing industry with a focus on the areas with the broadest base of user penetration: speech recognition...