Hearing aid research and development abroad is currently paying close attention to China, in particular to the study of Chinese language and speech and to the speech recognition technologies and products built on it. Research on Chinese hearing is no exception. Auditory science is a rapidly evolving field whose core subject is human hearing, and scientists and audiologists are increasingly concerned with how people hear. This article looks at how that science is being applied to improve the hearing and speech comprehension of Chinese speakers.
Chinese is a tonal language, and it differs clearly in its phonetics from language families written with phonetic alphabets, such as the Slavic languages. The differences show up not only in linguistic features but also in practical use. Whether these phonetic differences affect how well hearing-impaired patients understand speech, especially when they wear hearing aids developed from research on other languages, has become a hot topic in recent academic discussion. For example, one feature of domestically developed cochlear implants is that their algorithms were designed with the characteristics of Chinese speech in mind. Foreign hearing aid manufacturers are expected to introduce hearing aids tailored to Chinese speech in the near future.
After years of research and experimentation, a Canadian speech laboratory in China used advanced digital signal processing (DSP) technology in 2000 to build Chinese speech algorithms into its digital hearing aids, and applied for the relevant patents. The resulting product, Intelligia, is currently the first digital hearing aid based on Chinese speech processing technology, and it has been recognized in clinical trials. Preliminary evidence suggests that this new type of hearing aid benefits Chinese-speaking patients.
Current research indicates that languages from different families, such as Chinese and English, show distinct characteristics in auditory perception, and that English and Chinese differ in important ways in both speech and spoken language. Ming-Xi Tsai et al. (2000) argue that Chinese and English speech differ significantly in structure: Chinese words, syllables, and finals carry several levels of information and stand in complex relationships to one another. The pronunciation of spoken Chinese also differs, and under different conversational conditions it is affected by the varying levels of information within these structures.
Research on Chinese speech recognition and tone is reflected in the algorithms of cochlear implants. Speech processing strategies are the core technology that helps patients understand language, and they have been studied extensively. However, fewer studies address tone and intonation, as in Chinese. In a recent trial, Australian cochlear implants were used to observe their effect on the comprehension of Chinese speech. The results showed that, for Chinese, some speech processing strategies were more effective than other time-based strategies. The researchers believe that raising the stimulation rate can improve the understanding of speech and tone, and that the choice of speech processing strategy also contributes to Chinese comprehension. The research again indicates that Chinese calls for speech processing designed around its own characteristics, especially for hearing-impaired listeners.
Michael Qin, a researcher at MIT, studied the relationship between Mandarin Chinese recognition and noise in his experiment "Identification of Noise Background Pronunciation and Tone." He notes that languages use tone in different ways to give spoken words different meanings, and that these meaningful tones are degraded in a noisy environment, so he set out to determine how well Chinese speakers recognize tones under such conditions. The experiment used six vowels and the four Mandarin tones (yinping, yangping, shangsheng, and qusheng). The results show that recognition of Chinese tones and vowels drops sharply as the signal-to-noise ratio decreases, which impairs the listener's ability to recover the speech. The signal-to-noise ratio is therefore an important factor in understanding Chinese. This finding matters for hearing rehabilitation and for the design of targeted hearing aids.
In addition, a comprehensive expert research group has been established in the United States to develop hearing aids suited to Chinese speech. The team includes the renowned House Institute and the ENT department of the Chinese University of Hong Kong. In line with the study above, they believe that in languages where tone is used to recognize speech and meaning, such as Mandarin, Cantonese, and Thai, information related to the fundamental frequency may be more important for understanding than it is in other languages. Hearing aids should therefore be developed with the language characteristics of these patients in mind.
Of particular interest to the author is a recent trial sponsored by the Wellcome Trust, titled "Chinese Mandarin Speakers Use Their Brain More Than English Conversers When Understanding Language." Using imaging technology, the researchers compared the brain activity of native Chinese and native English speakers. Dr. Sophie Gault, who led the study, found that when English-speaking subjects heard English, their left temporal lobe became very active; the researchers think this area is responsible for combining speech sounds into individual words. When Chinese subjects heard Mandarin, however, both the left and right temporal lobes became active at the same time. Evidently, speakers of different languages use different regions of the brain to decode their language. This finding has had a big impact on our understanding of how language is processed. The researchers further believe that in Chinese subjects the left temporal lobe processes the speech signal while the right temporal lobe processes tone to derive meaning. Speech is a very complex sound, and to understand it correctly the brain must make full use of the speaker's rising and falling intonation, turning the spoken language into a meaningful signal.
The auditory regions of the brain are easily shaped by external influences, which change the ability to distinguish sounds. Once hearing is impaired, rehabilitation is necessary, and the brain has to be rewired and recoded. The brain is highly plastic. Understanding how the brain responds to different languages can effectively help hearing-impaired patients regain their understanding of language. More importantly, these studies clearly point toward hearing rehabilitation devices designed around the phonetic features of Chinese. I remember that at the opening ceremony of the Speech and Hearing Center of Peking University and the China Disabled Persons' Federation in 2002, Mr. Deng Pufang addressed this specifically in his speech: it was the first time he had heard of the influence of Chinese speech processing features on hearing aid users, he considered it an important topic requiring a great deal of work, and he said that developing an auditory rehabilitation device built around Chinese speech would have important implications. Applying the internationally recognized incidence of hearing loss, 10% of China's population, or 130 million people, have some degree of hearing loss. Using Chinese speech processing technology to help hearing-impaired listeners is therefore very important.
Algorithms are central to Chinese speech processing technology. The English terms in use include "Chinese speech processing strategy," "Chinese speech recognition," and "hearing aid algorithm." Of these, "algorithm" is used most often, especially in the development of digital hearing aids, where it denotes the core of a particular technology. An algorithm can be viewed simply as a sequence of instructions that implements a signal processing function, and processing tailored to Chinese speech features is arrived at through algorithm research. Together, the digital signal processor and the algorithm make up the signal processing chain of a digital hearing aid. The algorithm includes multi-channel dynamic range compression, noise attenuation, and other processing, and its main design goal is to use Chinese speech processing technology to keep speech audible and comfortable across different listening environments. At the same time, using digital hearing aids to improve Chinese intelligibility makes it easier for Chinese patients with hearing loss to understand Chinese.
Chinese is a tonal, monosyllabic language, and tone is one of its important phonetic features. Tone is reflected mainly in the way the fundamental frequency of the voice changes over time. Eady (1982) examined how the fundamental frequency patterns of a tone language (Chinese) differ from those of a stress language (English). In Chinese, tone serves to distinguish the meanings of words. In everyday life everyone also finds that tone helps us understand what other people say, and that speech in a mixed southern and northern accent is often hard to understand.
For continuous speech, the long-term average positive and negative tremor factors are similar across languages and across male and female speakers. The negative tremor, however, is always larger than the positive tremor and also occurs more often. Eady's measurements show that Chinese is spoken more slowly than English. This may be because a speaker of Chinese has to work harder to control vocal cord movement on each syllable; in other words, laryngeal control of each syllable carries a larger linguistic load in a tone language and therefore takes more time, so speech is slower.
Tone information therefore resides mainly in the change of the fundamental frequency over time; intensity changes provide a compensating cue to tone, and the presence or absence of a voiceless consonant has some influence on tone clarity.
Principles
This paper introduces a speech processing method that can be applied in digital hearing aids to improve Chinese intelligibility. The goal is to make it easier for Chinese-speaking people with hearing loss to understand speech. The idea of enhancing intelligibility comes from practical experience. Recall what you do to help a hearing-impaired person understand what you are saying: you not only raise your voice, you also change the way you speak, making it slower and clearer. Some studies have shown that speaking nonsense sentences clearly can raise word intelligibility by about 17% compared with everyday conversational speech. "Clearer" here refers to cues in the speech signal that take many forms, such as the duration of a particular segment, the formant positions of a vowel, or the transitions between phonemes.
Not everyone can readily speak "clearly" to patients with hearing loss. Our approach to speech enhancement is therefore to place a processing model between the speaker and the listener. The model emphasizes specific components of an utterance so that it sounds clearer.
Speech sounds can express meaning because individual sounds differ from one another. These differences arise from differences in the place and manner of articulation, that is, in how the articulatory organs and muscles act; they are determined by the activity inside the vocal tract and show up as differences in the acoustic characteristics of the speech. The speech enhancement method proposed in this paper strengthens these differences by restructuring the speech signal. Restructuring here means identifying the components of the speech signal that differ in nature and processing each in a targeted way, emphasizing the features that matter to human perception, in order to improve speech clarity. The method can be summarized as amplifying consonants, emphasizing stress, and highlighting tones.
Perceived characteristics of Chinese speech signals
Tone
Tone contour.
The perception of tone:
Based mainly on changes in the fundamental frequency.
Changes in pitch may also affect both length and intensity.
Stress
The acoustic properties of light (unstressed) and heavy (stressed) sounds:
Closely related to the actual sound intensity, but not identical to it.
Also dependent on tone, pitch, and length.
Perceptual characteristics: when distinguishing light from heavy sounds, sound intensity is often not the decisive factor.
Consonant Amplification
Psychological experiments on speech perception confirm the following: listeners differ markedly in how well they perceive the information a speech signal carries about the manner of articulation and the information it carries about the place of articulation. In general, people distinguish manner of articulation better than place of articulation, and the clarity of manner tracks the clarity of consonants very closely. The perceptually important manner contrasts among Chinese consonants are strong versus weak, voiceless versus voiced, aspirated versus unaspirated, and fricative versus non-fricative. Studies have shown that relatively enhancing consonants helps improve speech intelligibility.
Kates describes how to amplify consonants, and Figure 1 shows a widely used model. The system decomposes the signal into several bands, detects the short-term spectral shape in each band, classifies the sound as a vowel or a consonant from that shape, and amplifies the consonants. It should also be noted that Du Limin et al. proposed the concept of Chinese phonetic guidance features and, from the perspective of computing and detecting acoustic information, provided an auxiliary matching structure for Chinese automatic speech recognition systems.
Figure 1 Consonant enhancement system
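As a rough illustration of the idea of classifying frames by spectral shape and boosting the consonant-like ones, the following Python sketch compares high-band and low-band energy in each frame. It is a simplified stand-in, not Kates's system or the model in Figure 1: the two-tap high/low split, the function name `enhance_consonants`, the 160-sample frame (10 ms at an assumed 16 kHz), the threshold, and the 6 dB gain are all assumptions made for the example.

```python
def enhance_consonants(samples, frame_len=160, ratio_threshold=1.0, gain_db=6.0):
    """Boost frames whose energy sits mostly at high frequencies.

    Hypothetical sketch: a first difference stands in for a high-pass band
    and a two-sample average for a low-pass band (crossover at fs / 4).
    """
    gain = 10 ** (gain_db / 20)
    out = []
    prev = 0.0  # last sample of the previous frame
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        high_energy = 0.0
        low_energy = 0.0
        for s in frame:
            high_energy += ((s - prev) / 2) ** 2
            low_energy += ((s + prev) / 2) ** 2
            prev = s
        # Consonant-like (fricative-like) frames are dominated by the high band.
        is_consonant_like = high_energy > ratio_threshold * low_energy
        out.extend(s * gain if is_consonant_like else s for s in frame)
    return out


# Usage: a slowly varying (vowel-like) stretch followed by a rapidly
# alternating (fricative-like) stretch; only the second is amplified.
vowel_like = [n / 320 for n in range(320)]
consonant_like = [0.1 if n % 2 == 0 else -0.1 for n in range(320)]
processed = enhance_consonants(vowel_like + consonant_like)
```

A real hearing aid would use a proper filter bank with several bands and smooth the gain between frames to avoid audible steps; the sketch only shows the detect-then-amplify structure.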
Stress
The syllables that make up a stream of speech are not all equally prominent. Some syllables sound heavier than others in the stream; these are the stressed syllables. Some stress is closely tied to semantics and grammar, such as word stress in Mandarin Chinese, where the position of the stressed syllable varies with the meaning of the word. For example, in "Technology" and "Count" the stress falls on the first and the second syllable, respectively. This semantic difference is expressed by suprasegmental features.
In Chinese, the influence of stress on prosodic parameters has received much attention. Prosody in the speech stream is expressed by changes in pitch, length, and intensity, that is, by the suprasegmental features. On a spectrogram, an expanded pitch range is a clear mark of stress. Gao Mingming studied the acoustic realization of emphatic stress in Putonghua sentences, pointing out:
(1) "The rise in pitch is an important prosodic feature that emphasizes accent in Mandarin sentences."
(2) Pitch and duration play equally important roles in realizing emphatic stress, and the relationship between them is complementary.
Experience with speech synthesis shows that pitch is the most effective parameter for adjusting stress, so stress is enhanced mainly by raising the pitch.
Tone and Intonation
Besides the vowels and consonants arranged in time as a series of sound quality units, a syllable must also have a certain pitch, intensity, and length. In some languages, pitch is as important to the syllable as vowels and consonants are. Pitch that distinguishes the meaning of syllables is called tone. By the presence or absence of tone, the world's languages can be divided into two major categories: tonal and non-tonal. Tone is one of the most prominent features of the Sino-Tibetan languages.
In Mandarin Chinese, tone plays a role in forming words: a syllable with the same pinyin spelling can have different meanings under different tones. A single syllable can follow one of four tone patterns. In terms of speech parameters, the tones differ in the trajectory of the pitch (fundamental) frequency. Using rules derived from experimental observation, a syllable can be assigned to a tone type when a particular parameter of its pitch trajectory exceeds a predetermined threshold. On this basis, the recognition method proposed by Huang Zezhen and Yang Xingjun uses the first and second slopes, the valley point, and the flatness of the pitch trajectory curve, which discriminate strongly among the four tones. Experiments show that the recognition rate of this algorithm can reach 99%.
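The threshold-rule idea can be sketched in Python as follows. This is only an illustration under stated assumptions, not the algorithm of Huang Zezhen and Yang Xingjun: the function names, the definitions of the four features, the decision order, and the threshold values are hypothetical. The pitch contour is assumed to be already normalized in time to the range 0..1 and in height to the five-level scale commonly used to describe Mandarin tones (tone 1 = 55, tone 2 = 35, tone 3 = 214, tone 4 = 51).

```python
def fit_quadratic(ts, ys):
    """Least-squares fit y ~ a*t^2 + b*t + c, solved from the normal equations."""
    s = [sum(t ** k for t in ts) for k in range(5)]
    r = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    # Augmented 3x4 system for the unknowns (c, b, a).
    m = [[s[k], s[k + 1], s[k + 2], r[k]] for k in range(3)]
    for i in range(3):  # Gaussian elimination with partial pivoting
        p = max(range(i, 3), key=lambda row: abs(m[row][i]))
        m[i], m[p] = m[p], m[i]
        for k in range(i + 1, 3):
            f = m[k][i] / m[i][i]
            for j in range(i, 4):
                m[k][j] -= f * m[i][j]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    c, b, a = x
    return a, b, c


def contour_features(ts, ys):
    """Slope, second slope, valley point and flatness of the fitted curve."""
    a, b, c = fit_quadratic(ts, ys)
    fitted = [a * t * t + b * t + c for t in ts]
    slope = fitted[-1] - fitted[0]             # overall rise over the syllable
    second_slope = 2 * a                       # curvature of the fitted parabola
    valley = -b / (2 * a) if a > 1e-6 else None  # time of the minimum, if any
    flatness = max(fitted) - min(fitted)       # small value = level contour
    return slope, second_slope, valley, flatness


def classify_tone(ts, ys, flat_threshold=0.5):
    """Rule-based tone decision; thresholds are illustrative assumptions."""
    slope, _, valley, flatness = contour_features(ts, ys)
    if flatness < flat_threshold:
        return 1                               # level: yinping
    if valley is not None and 0.2 < valley < 0.8:
        return 3                               # dipping: shangsheng
    return 2 if slope > 0 else 4               # rising: yangping, falling: qusheng


# Usage with idealized contours on the five-level scale.
times = [i / 10 for i in range(11)]
rising = [3.0 + 2.0 * t for t in times]            # 35
dipping = [8 * t * t - 6 * t + 2 for t in times]   # passes through 2, 1, 4
print(classify_tone(times, rising), classify_tone(times, dipping))
```

Real pitch tracks are noisy and speaker dependent, so any practical thresholds would have to be tuned on measured data.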
Lin Maocan pointed out that tone information resides mainly in the main vowel (and its acoustic transitions). Changes in pitch may also affect the length and intensity of a sound: qusheng is the shortest and strongest, shangsheng is the longest and weakest, yinping and yangping lie in between, and yangping is often slightly longer than yinping. Tone enhancement therefore cannot simply amplify the main vowel; different tones call for different processing. In practice, we adopt the following strategies:
(1) Increase the sound intensity of shangsheng.
(2) Increase the length of qusheng.
(3) Leave yinping and yangping unchanged.
The four curves in Figure 2 show how the frequency of each of the four tones changes over time.
Figure 2: Acoustic characteristics of the four Chinese tones
Methodology
The core of the digital hearing aid is the gain calculation. Working in the frequency domain, it sets the gain of each frequency band as a function of that band's instantaneous input energy. As shown in Figure 3, the instantaneous energy of each band is accumulated over a short time, together with the slowly varying long-term averages needed for signal identification and classification. The steps are:
(1) E_j(n) = a E_j(n-1), where a is a time constant.
(2) Extract the fundamental frequency with the cepstrum algorithm, using a 512-point FFT and a 40 ms Hamming window shifted in 10 ms steps.
(3) Smooth the fundamental frequency measured for each syllable with a simple moving average, and discard values in the smoothed segment that deviate too much.
(4) Normalize pitch and length separately.
(5) Approximate the pitch trajectory with a quadratic curve in the minimum mean square error sense, and compute the slope, second slope, valley point, and flatness of the curve.
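Steps (2) and (3) can be illustrated with a short Python sketch. It is not the TOCCATA assembly implementation. Several details are assumptions: the analysis rate of 12.8 kHz (chosen so that a 40 ms window is exactly 512 samples; the text does not say how the 32 kHz input is reduced to a 512-point frame), the 60 to 400 Hz search range, the rule of rejecting values that deviate from the syllable median by more than 20% before averaging, and all function names.

```python
import cmath
import math
import statistics


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def cepstral_f0(frame, fs, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency of one frame from its real cepstrum."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]  # Hamming window
    log_mag = [math.log(abs(v) + 1e-12) for v in fft(windowed)]
    # log_mag is real and symmetric, so a forward FFT divided by n
    # gives the same result as the inverse FFT.
    cepstrum = [v.real / n for v in fft(log_mag)]
    q_lo, q_hi = int(fs / f_max), int(fs / f_min)
    q_peak = max(range(q_lo, q_hi + 1), key=lambda q: cepstrum[q])
    return fs / q_peak


def smooth_f0(track, width=3, max_rel_dev=0.2):
    """Drop outliers relative to the syllable median, then moving-average."""
    median = statistics.median(track)
    kept = [f for f in track if abs(f - median) <= max_rel_dev * median]
    half = width // 2
    return [sum(kept[max(0, i - half):i + half + 1]) /
            len(kept[max(0, i - half):i + half + 1]) for i in range(len(kept))]


def harmonic_frame(f0, fs, n=512, harmonics=30):
    """Synthetic voiced frame used only to exercise the sketch."""
    return [sum(math.cos(2 * math.pi * f0 * k * i / fs)
                for k in range(1, harmonics + 1)) for i in range(n)]


# Usage: one 40 ms frame of a synthetic 200 Hz voice at the assumed 12.8 kHz rate.
fs = 12800
print(cepstral_f0(harmonic_frame(200.0, fs), fs))
print(smooth_f0([200, 202, 400, 204, 206]))
```

In a running system the frame would advance by 10 ms per step, and the smoothed track for each syllable would then be normalized and passed to the quadratic fit of step (5).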
The above algorithm is implemented in assembly language for the TOCCATA instruction set. The A/D converter is 14-bit, with the sampling rate set to 32 kHz.
Figure 3. Chinese speech enhancement system processing structure
Speech division (Classification of Phonemes)
A speech sound has four components: sound quality (timbre), pitch, intensity, and length. These play different roles in speech, but they coexist in time.
Sound quality components: the segments into which syllables divide, such as vowels and consonants.
Suprasegmental components: pitch, intensity, and length, which attach to a syllable or segment.
Acoustically, pitch is determined from the fundamental frequency, intensity from the amplitude, and length from the duration.
Principles of Processing (Algorithm Principles)
Chinese speech processing is reflected mainly in three places:
In fitting, the target curve is weighted by the frequency coverage of the long-term spectrum of Chinese speech, and the speech-frequency portion of the target curve is raised, which improves speech understanding.
In the hearing aid's signal processing program, the compression controller is set so that the attack and release times of high-frequency compression are short, which makes consonants clearer and improves the user's understanding of speech.
In noise reduction, a strategy optimized for Chinese speech is derived from sampling and analysis of Chinese speech in noisy environments. Experiments have confirmed that this strategy can raise the signal-to-noise ratio by 18 dB.
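To put the 18 dB figure in perspective, the short Python sketch below shows how a signal-to-noise ratio is computed in decibels and what a given change in dB means as a power ratio. The function names and the sample values are assumptions for illustration only; the text does not describe how the 18 dB was measured.

```python
import math


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from the mean power of each sequence."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum(v * v for v in noise) / len(noise)
    return 10 * math.log10(signal_power / noise_power)


def db_to_power_ratio(db):
    """Convert a level difference in dB to a power ratio."""
    return 10 ** (db / 10)


# Usage: noise at one tenth of the signal amplitude gives an SNR of about 20 dB.
print(snr_db([1.0, -1.0, 1.0, -1.0], [0.1, -0.1, 0.1, -0.1]))
print(db_to_power_ratio(18.0))
```

An 18 dB improvement corresponds to the noise power falling by a factor of roughly 63 relative to the speech power.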
Chinese speech processing technology in applications involving hearing aids
The following is a specific example of applying Chinese speech technology to designing a hearing aid. This technology uses the world's most advanced DSP digital technology, including low-power digital chips.
TOCCATA digital signal processing system
The Toccata™ system is a miniature, ultra-low-power, high-efficiency digital signal processing system. It includes a high-fidelity weighted overlap-add (WOLA) filter bank, a 16-bit DSP core, two 14-bit A/D converters, a 14-bit D/A converter, and other peripherals. Toccata™ technology provides a standard software-programmable DSP development platform and a miniature VLSI chip fabricated in a 0.18 μm process. It serves not only manufacturers of audio processing systems but also developers of other miniature, low-power DSP-based products.
Hardware Structure
Figure 4 Hardware system structure
The TOCCATA system consists of three chips: an "analog" chip (ALPHA), a "digital" chip (DELTA), and an EEPROM chip for non-volatile storage.
ALPHA chip
The ALPHA chip includes input and output amplifiers, two A/D converters, a D/A converter, and a master clock and power supply system.
DELTA chip
The DELTA chip includes a 16-bit software-programmable DSP core, a WOLA filter bank coprocessor, a DMA controller (the input/output processor, or IOP), and memory (RAM and ROM). Combining a programmable core with a flexible filter bank lets the signal processing be defined in software. The architecture can therefore run a conventional audio processing scheme (for example, dual-channel compression) as well as more powerful schemes through the DSP core (for example, compression in 16 or more channels, noise reduction, and feedback suppression).
DSP core and instruction system (DSP Core)
RCORE is a flexible DSP core that uses a dual Harvard architecture with a single-cycle multiply-accumulate operation and a 40-bit accumulator. Peripherals are accessed through a combination of extended registers, memory-mapped registers, and shared memory.
Signal path
Figure 5. Signal path provided by the Toccata system
Intelligia digital hearing aid structure
The Intelligia all-digital hearing aid is designed around the chip set described above; its structure is illustrated in Figure 6. Like analog hearing aids, digital hearing aids use a microphone and a receiver as transducers, but in the digital signal processor the electrical signal has already been converted to digital code by A/D sampling. The digital code can be manipulated very flexibly to provide gain, shape the frequency response, or otherwise process the signal to meet the patient's hearing needs. When the DSP algorithm has finished, the digital code is converted back to an electrical signal by the D/A converter and then to sound by the receiver.
The key to a digital hearing aid is its information processing system. The all-digital hearing aid Intelligia, built on the Toccata™ digital signal processing system, has a unique Chinese speech processing function. In this design, the hearing aid filters the signal into 16 bands and then combines the 16 bands into 10 channels. Each channel compresses the signal independently using input automatic gain control (AGCi), and each channel has two level detectors, one fast and one slow. The fast detector tracks rapid changes in the signal, while the slow detector tracks slower changes, that is, changes at the syllable level. Compression attack and release time constants matched to the variation of Chinese speech give a better hearing result.
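The dual-detector compression in one channel can be sketched in Python as below. This is a minimal illustration, not the Intelligia implementation: the way the fast and slow level estimates are combined (taking the larger one), the attack and release times, the compression threshold and ratio, and all names are assumptions, and the 16-band filter bank and the grouping into 10 channels are left out.

```python
import math


def smoothing_coeff(time_ms, fs):
    """One-pole smoothing coefficient for a time constant given in milliseconds."""
    return math.exp(-1.0 / (time_ms * 1e-3 * fs))


class LevelDetector:
    """Envelope follower with separate attack and release time constants."""

    def __init__(self, attack_ms, release_ms, fs):
        self.attack = smoothing_coeff(attack_ms, fs)
        self.release = smoothing_coeff(release_ms, fs)
        self.level = 0.0

    def update(self, sample):
        magnitude = abs(sample)
        coeff = self.attack if magnitude > self.level else self.release
        self.level = coeff * self.level + (1.0 - coeff) * magnitude
        return self.level


def gain_db(level_db, threshold_db, ratio):
    """Static compression curve: no change below threshold, ratio:1 above it."""
    if level_db <= threshold_db:
        return 0.0
    return (threshold_db - level_db) * (1.0 - 1.0 / ratio)


def compress_channel(samples, fs, threshold_db=-30.0, ratio=3.0):
    """Input-controlled AGC for one channel using a fast and a slow detector."""
    fast = LevelDetector(attack_ms=1.0, release_ms=20.0, fs=fs)    # rapid changes
    slow = LevelDetector(attack_ms=20.0, release_ms=200.0, fs=fs)  # syllable-scale changes
    out = []
    for x in samples:
        level = max(fast.update(x), slow.update(x))  # assumed combination rule
        level_db = 20.0 * math.log10(level + 1e-9)
        out.append(x * 10.0 ** (gain_db(level_db, threshold_db, ratio) / 20.0))
    return out


# Usage: a loud square wave is attenuated, a very quiet one passes unchanged.
fs = 16000
loud = [0.5 if (n // 8) % 2 == 0 else -0.5 for n in range(1600)]
quiet = [0.001 if (n // 8) % 2 == 0 else -0.001 for n in range(1600)]
loud_out = compress_channel(loud, fs)
quiet_out = compress_channel(quiet, fs)
```

Matching the time constants to Chinese speech, as the text describes, would amount to choosing the slow detector's attack and release times from measured syllable durations rather than the placeholder values used here.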
Features of the all-digital hearing aid technology:
1) Chinese speech signal processing
Building on in-depth study of the acoustic characteristics of Chinese and other tonal languages, the hearing aid applies original Chinese speech processing technology, which can greatly improve intelligibility when listening in a Chinese-language environment.
2) Faster
TOCCATA, a third-generation processing system designed for digital hearing aids, has the computing power to process a wide range of voice signals quickly.
3) More power saving
The operating current is less than 1 mA, and the device enters power-saving mode automatically when there is no input signal. The low energy consumption spares the wearer frequent battery changes.
4) Fully programmable
Because the device is programmable, the most suitable hearing compensation program and parameters can be configured for each hearing-impaired user, so that the wearer gets the best listening experience.
5) Multi-channel independent compression
Incoming sound is divided by frequency into multiple bands and channels, and the signal in each band and channel is processed separately, so that the wearer hears clearer and more realistic sound.
6) Noise reduction processing
Environmental noise is effectively suppressed and speech discrimination is improved, so that the wearer can hear clearly on noisy streets or in noisy supermarkets.
7) Directional processing
A directional microphone system and matching software can be added to improve noise reduction further, so that the wearer hears clearer, more natural sound.
8) Acoustic feedback suppression
Hearing aids are prone to whistling during use, which is caused by acoustic feedback. Acoustic feedback suppression technology effectively prevents it, giving the wearer a more comfortable sound.
9) Easy to upgrade
The fully open TOCCATA digital signal processing (DSP) platform is programmable, adaptable, and upgradeable, so wearers can get the latest features through a software update.
The following is a comparison of the technical indicators of this Chinese speech processing:
Table 1 Technical comparison of Chinese speech technology for hearing aids and other hearing aids
In the laboratory, preliminary experiments with digital hearing aids using the Chinese speech enhancement method show that Chinese speech processing technology can help Chinese-speaking patients understand speech better and can improve rehabilitation. In clinical use, patients wearing Intelligia hearing aids do well, especially in noisy environments, where speech intelligibility is enhanced. Of course, we must recognize that the use of Chinese speech processing technology in all-digital hearing aids is still at an early stage of research. The author believes that audiologists and hearing aid experts should pursue more in-depth research in the following areas:
English-based and Chinese-based speech processing techniques should be compared in depth, especially in noisy environments, observing how each technique handles each of the two languages. Ideally, the experiments would involve bilingual subjects.
Chinese speech processing technology should be combined with the nonlinear hearing aid fitting methods currently in use, to observe whether English-based fitting methods, supported by Chinese speech processing technology, are more effective in improving the everyday speech comprehension of Chinese-speaking patients.
Chinese speech processing technology is currently one of the research hotspots in human-machine dialogue, and its algorithms are complex and diverse. Hearing aid algorithms tailored to Chinese should be studied in depth so as to make full use of the potential of digital chips.
The application of Chinese speech processing technology to hearing devices has only just begun. It is a very complicated subject involving many unsolved technical problems. However, the author believes that only hearing aids developed around the phonetic features of Chinese can help the many Chinese-speaking listeners with hearing loss more effectively.