Google Speech-to-Text lets developers convert audio to text by applying neural network models through an easy-to-use API. The API recognizes more than 120 languages and variants to support a global user base. You can use it to enable voice command-and-control, transcribe audio from call centers, and more. It processes both real-time streaming and prerecorded audio using Google's machine learning technology.

Note that google-cloud-speech-v2 is a version-specific client library. For most uses, we recommend installing the main client library, google-cloud-speech, instead. See the readme for more details.
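The installation recommendation above can be sketched as a Gemfile entry. This is a minimal example, assuming a Bundler-managed project that pulls gems from rubygems.org; the version constraint in the commented-out alternative is an illustration based on the latest release listed on this page, not a recommendation from the authors.

```ruby
# Gemfile (sketch; assumes Bundler and the default rubygems.org source)
source "https://rubygems.org"

# Main client library, recommended for most uses.
gem "google-cloud-speech"

# Alternative: depend on the version-specific library directly.
# The "~> 1.8" constraint is an assumption based on the 1.8.0 release.
# gem "google-cloud-speech-v2", "~> 1.8"
```

After editing the Gemfile, running `bundle install` resolves and installs the gem.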
Required Ruby Version
>= 3.2
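As a quick way to confirm that an interpreter meets this requirement before installing, the constraint can be checked with the version classes that ship with RubyGems. This is a sketch only; the constant `REQUIRED_RUBY` and the helper `ruby_supported?` are hypothetical names introduced for illustration.

```ruby
require "rubygems"

# Requirement string copied from the "Required Ruby Version" field above.
REQUIRED_RUBY = Gem::Requirement.new(">= 3.2")

# Hypothetical helper: returns true when the given Ruby version string
# satisfies the requirement. Defaults to the running interpreter.
def ruby_supported?(version = RUBY_VERSION)
  REQUIRED_RUBY.satisfied_by?(Gem::Version.new(version))
end

puts "Ruby #{RUBY_VERSION} supported: #{ruby_supported?}"
```

RubyGems performs the same check automatically at install time, so this is mainly useful in setup scripts or CI preflight steps.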
Authors
Google LLC
Versions
- 1.8.0 June 11, 2026 (99 KB)
- 1.7.1 April 03, 2026 (99 KB)
- 1.7.0 April 02, 2026 (99 KB)
- 1.6.0 March 20, 2026 (98 KB)
- 1.5.0 January 13, 2026 (97.5 KB)