Gespeaker is a free, open-source software for gesture control and speech recognition. It allows users to control their computer and applications using hand gestures and voice commands. Gespeaker works with standard webcams and microphones.
Gespeaker: Free Gesture Control & Speech Recognition Software
A free, open-source software for controlling computers using hand gestures and voice commands, Gespeaker works with standard webcams and microphones
What is Gespeaker?
Gespeaker is a free and open-source software application that enables gesture and voice control of a computer. It allows users to interact with their computer using intuitive hand gestures and voice commands for a more natural user experience.
With Gespeaker, users can launch applications, navigate menus, control media playback, dictate text, and more using just a webcam and microphone. No additional hardware is required. It works by using computer vision and speech recognition technology to interpret the user's hand movements and spoken commands.
Some key features of Gespeaker include:
Hand gesture recognition for common controls like left/right click, scroll, zoom, rotate, etc.
Custom voice commands for launching apps, automation routines, dictation, and more
Point and click functionality for precise on-screen control
Head tracking allows mouse movement by turning your head
Open-source codebase that is customizable by developers
By providing hands-free control of a computer, Gespeaker makes using a PC more accessible for those with limited mobility. It also creates a more engaging and futuristic user experience. With its customizable nature, Gespeaker serves as a platform for developers to experiment with innovative interfaces.
Gespeaker Features
Features
Hand gesture recognition
Speech recognition
Mouse control using hand gestures
Perform actions like left click, right click, scroll using gestures
Nuance Dragon is a advanced speech recognition software that allows users to dictate text and control their computer using only their voice. It provides capabilities like:Accurately transcribing audio recordings and live speech into text documents or formats like Microsoft Word.Controlling computer functions completely hands-free using speech commands, like opening files,...
eSpeak is an open source, compact, multi-lingual software speech synthesizer for Linux, Windows, and other platforms. It was released under the GNU General Public License in 2005. eSpeak uses a "formant synthesis" method, which allows it to generate speech quickly and use little memory. It supports over 70 languages and...
Voice Dream Reader is a popular text-to-speech app that makes digital content more accessible for people who struggle with reading. It can convert documents, ebooks, web articles and more into natural sounding speech so users can listen instead of read.Some key features of Voice Dream Reader include:High-quality text-to-speech voices with...
Loquendo TTS is a powerful text-to-speech (TTS) software that converts text into human-like synthesized speech. It utilizes advanced linguistic analysis and speech synthesis technologies to produce high-quality and natural sounding voices.Some key features of Loquendo TTS include:Supports over 30 languages including English, Spanish, French, German, Italian and more.Provides a wide...
RHVoice is an open-source speech synthesis platform for Linux, Windows, Android, iOS, and other operating systems. It uses statistical parametric speech synthesis to generate natural-sounding vocal output from text input in over 30 languages and 100 voices.Key features of RHVoice include:Support for many languages including English, Russian, Italian, German, French,...
ReadSpeaker is a customizable text-to-speech (TTS) software used to convert written content into natural sounding speech. It can be integrated into websites, mobile apps, e-learning platforms, e-books and documents to make them more accessible for people with reading difficulties like dyslexia or visual impairments.Some key features of ReadSpeaker include:High-quality voices...
Verbify-TTS is an open-source neural text-to-speech engine capable of generating human-like speech from text. Developed by Verbify Labs, it utilizes state-of-the-art deep learning techniques such as Tacotron 2 and WaveRNN to synthesize natural sounding voices that adapt to the input text.Key features of Verbify-TTS include:Production quality voices that sound human-like,...
NeoSpeech is a professional-grade text-to-speech (TTS) software that converts text into high-quality, human-like voice audio. It utilizes advanced speech synthesis technologies to generate natural sounding vocal audio for a wide range of applications.Some key features of NeoSpeech include:Over 140 text-to-speech voices supporting more than 60 languagesNeural network voices that deliver...
eSpeak NG is an open source, text-to-speech synthesizer that can be used to hear typed words aloud. It supports over 100 different languages and accents and is highly customizable, allowing users to adjust parameters like voice pitch, speed, volume, and more to fit their needs.Some key features of eSpeak NG...
Mycroft Mimic is an open-source text-to-speech (TTS) engine developed by Mycroft AI, an open-source voice assistant project. It is designed to generate natural sounding speech from text input using deep learning techniques.Unlike traditional TTS systems that use pre-recorded speech fragments, Mimic utilizes end-to-end deep neural networks to learn the mapping...
Speech Note is voice recognition software that utilizes advanced speech-to-text technology to convert spoken words into digital text quickly and accurately. It is an invaluable productivity tool for anyone who needs to generate written documents and notes without typing.With Speech Note, users can dictate naturally using their voice and see...
The Read Aloud Extension is a useful browser extension that reads text on web pages out loud using text-to-speech technology. It works in Chrome, Firefox, and Edge to make website content more accessible for people who benefit from listening rather than reading text.Some key features of the Read Aloud Extension...
Festival is an open-source software speech synthesis system developed at the University of Edinburgh. It supports text-to-speech conversion for multiple languages and includes several voices. Festival is used for research and development of speech synthesis techniques. Some key features of Festival include:Supports multiple languages including English, Spanish, Welsh, and othersModular...
KMouth is an open-source, cross-platform text-to-speech (TTS) software for Linux and other Unix-like operating systems. Developed by KDE, KMouth utilizes speech synthesizers to convert text into audible, natural-sounding speech in various languages.Some key features of KMouth include:Support for dozens of languages and accentsCustomizable voice speed and pitchPhonetic spelling inputAudio output...