### Install Riva Python Client via Pip

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Install the nvidia-riva-client package directly using pip for a quick setup.

```bash
pip install nvidia-riva-client
```

--------------------------------

### Install Riva Python Clients from Source

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Clone the repository, initialize and update submodules, install dependencies, build the wheel, and then install the wheel package. This method ensures you have the latest code and all necessary components.

```bash
git clone https://github.com/nvidia-riva/python-clients.git
cd python-clients
git submodule init
git submodule update --remote --recursive
pip install -r requirements.txt
python3 setup.py bdist_wheel
pip install --force-reinstall dist/*.whl
```

--------------------------------

### Install NLP Evaluation Libraries

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Install scikit-learn and transformers using pip for NLP evaluation tasks.

```bash
pip install -U scikit-learn
pip install -U transformers
```

--------------------------------

### NLP Service Client Setup

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Initializes the NLP service client. Ensure the Riva server is running and accessible.

```python
from time import time

import riva.client

nlp_service = riva.client.NLPService(riva.client.ChannelCredentials())
```

--------------------------------

### Install Dependencies for Realtime ASR/TTS

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Install numpy, requests, and websockets using conda for Realtime ASR or TTS functionalities that rely on WebSocket connections.

```bash
conda install -c anaconda numpy
conda install -c anaconda requests
conda install -c anaconda websockets
```

--------------------------------

### Install PyAudio for Audio Output

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Install the PyAudio library using Conda, which is necessary for handling audio input and output devices, including playing synthesized speech.

```bash
conda install -c anaconda pyaudio
```

--------------------------------

### Get WAV File Parameters

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Retrieves audio parameters (sample rate, channels) from a WAV file. This information is needed for streaming recognition setup.

```python
wav_parameters = riva.client.get_wav_file_parameters(my_wav_file)
```

--------------------------------

### NLP NER Client with Label Output

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Extract Named Entity Recognition (NER) information from text. This example shows how to get label names.

```bash
python scripts/nlp/ner_client.py \
    --query "Where is San Francisco?" "Jensen Huang is the CEO of NVIDIA Corporation." \
    --test label
```

--------------------------------

### NLP NER Client with Span Start Output

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Extract NER information and output the starting position of the identified entity spans.

```bash
python scripts/nlp/ner_client.py \
    --query "Where is San Francisco?" "Jensen Huang is the CEO of NVIDIA Corporation." \
    --test span_start
```

--------------------------------

### Iterate and Print Audio Chunks

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Iterates through audio chunks from a file and prints their index and length. This is a basic setup for processing audio files.

```python
chunk_size = wav_parameters['framerate']
with riva.client.AudioChunkFileIterator(
    my_wav_file, chunk_size, delay_callback=riva.client.sleep_audio_length,
) as audio_chunk_iterator:
    for i, chunk in enumerate(audio_chunk_iterator):
        print(i, len(chunk))
```

--------------------------------

### Play Audio During Transcription

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Configures and starts streaming transcription while simultaneously playing audio through a specified output device. It uses `SoundCallBack` to manage audio playback.

```python
output_device = None  # use default device
wav_parameters = riva.client.get_wav_file_parameters(my_wav_file)
sound_callback = riva.client.audio_io.SoundCallBack(
    output_device, wav_parameters['sampwidth'], wav_parameters['nchannels'], wav_parameters['framerate'],
)
audio_chunk_iterator = riva.client.AudioChunkFileIterator(my_wav_file, 4800, sound_callback)
response_generator = asr_service.streaming_response_generator(audio_chunk_iterator, streaming_config)
riva.client.print_streaming(response_generator, show_intermediate=True)
sound_callback.close()
```

--------------------------------

### Define Repetitions for Asynchronous Calls

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Set the number of times to repeat the synthesis operation for performance comparison. This is used in the asynchronous call example.

```python
from time import time

num_repeats = 10
```

--------------------------------

### Perform Text Classification

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Calls the `classify_text` method of the NLP service to get text classification results. The response is typed as `TextClassResponse`.

```python
response: TextClassResponse = nlp_service.classify_text(text_class_queries, text_class_model)
```

--------------------------------

### Get Next Streaming Response

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Retrieves the next response from the streaming response generator. This is typically used within a loop to process results as they arrive.

```python
streaming_response = next(response_generator)
```

--------------------------------

### List Available Audio Devices

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Use `realtime_tts_client.py` to list all available audio output devices.

```bash
python scripts/tts/realtime_tts_client.py --list-devices
```

--------------------------------

### Initialize Riva Authentication

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Establish a connection with the Riva server by creating an authentication instance. This is required before instantiating any Riva service.

```python
import riva.client

uri = "localhost:50051"  # Default value

auth = riva.client.Auth(uri=uri)
```

--------------------------------

### Configure Unix Audio Permissions

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

On Unix systems, add your user to the 'audio' and 'pulse-access' groups and restart to enable microphone and audio output device usage.

```bash
adduser $USER audio
adduser $USER pulse-access
```

--------------------------------

### Configure Recognition Settings

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Sets up recognition configurations for offline and streaming ASR. Includes options for audio encoding, punctuation, and interim results.

```python
from copy import deepcopy

offline_config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    max_alternatives=1,
    enable_automatic_punctuation=True,
    verbatim_transcripts=False,
)
streaming_config = riva.client.StreamingRecognitionConfig(config=deepcopy(offline_config), interim_results=True)
```

--------------------------------

### List Available Audio Devices

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Query the system to list all available audio input devices for real-time ASR.

```bash
python scripts/asr/realtime_asr_client.py --list-devices
```

--------------------------------

### Import Audio IO Module

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Imports the necessary audio input/output module from the Riva client library.

```python
import riva.client.audio_io
```

--------------------------------

### List Available Audio Output Devices

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Display a list of available audio output devices on the system using the `riva.client.audio_io` module. This helps in selecting a specific device if needed.

```python
import riva.client.audio_io

# show available output devices
riva.client.audio_io.list_output_devices()
```

--------------------------------

### Instantiate NLP Service

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Initializes the NLP service client by passing the authentication instance. This service is used for all subsequent NLP operations.

```python
nlp_service = riva.client.NLPService(auth)
```

--------------------------------

### List Available Audio Output Devices

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Lists all available audio output devices on the system. This helps in selecting a specific device for audio playback.

```python
# show available output devices
riva.client.audio_io.list_output_devices()
```

--------------------------------

### Instantiate ASR Service

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Creates an instance of the ASRService by passing the authentication object. This service is used for performing speech recognition tasks.

```python
asr_service = riva.client.ASRService(auth)
```

--------------------------------

### Real-time ASR from Audio File

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Transcribe an audio file in real-time using the ASR client. This simulates live transcription.

```bash
python scripts/asr/realtime_asr_client.py \
    --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav
```

--------------------------------

### List Available Audio Input Devices

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Lists all available audio input devices (microphones) on the system. This is useful for selecting a specific microphone for live transcription.

```python
riva.client.audio_io.list_input_devices()
```

--------------------------------

### Basic TTS with Audio Playback

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Use the `talk.py` script for basic text-to-speech synthesis and play the synthesized audio directly.

```bash
python scripts/tts/talk.py --play-audio
```

--------------------------------

### Run All Integration Tests

Source: https://github.com/nvidia-riva/python-clients/blob/main/tests/integration/README.md

Execute all integration tests for the Python Riva clients. Assumes Riva server is running at the default address.

```bash
bash tests/integration/launch_all_scripts_testing.sh
```

--------------------------------

### Play Synthesized Audio File

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Use IPython.display.Audio to play the generated WAV file directly in an environment like a Jupyter notebook. Ensure the file path is correct.

```python
import IPython

IPython.display.Audio(offline_output_file)
```

--------------------------------

### Prepare Punctuation Queries

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Defines a list of text queries for punctuation and capitalization. Note that batch sizes greater than 1 may not work with certain Riva versions.

```python
# Batches of sizes greater than 1 are not working with riva v2.2.0 and v2.2.1
punctuation_queries = [
    "by the early 20th century the gar complained more and more about the younger generation",
    # "boa Vista is the capital of the brazilian state of roraima situated on the western bank of "
    # "the branco river the city lies 220 km from brazil's border with venezuela.",
]
```

--------------------------------

### NLP QA Client

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Use the Question Answering (QA) client by providing both a query and relevant context. The client finds answers within the context.

```bash
python scripts/nlp/qa_client.py \
    --query "How many gigatons of carbon dioxide was released in 2005?" \
    --context "In 2010 the Amazon rainforest experienced another severe drought, in some ways "\
"more extreme than the 2005 drought. The affected region was approximate 1,160,000 square "\
"miles (3,000,000 km2) of rainforest, compared to 734,000 square miles (1,900,000 km2) in "\
"2005. The 2010 drought had three epicenters where vegetation died off, whereas in 2005 the "\
"drought was focused on the southwestern part. The findings were published in the journal "\
"Science. In a typical year the Amazon absorbs 1.5 gigatons of carbon dioxide; during 2005 "\
"instead 5 gigatons were released and in 2010 8 gigatons were released."
```

--------------------------------

### Run Tests with Custom Server Address

Source: https://github.com/nvidia-riva/python-clients/blob/main/tests/integration/README.md

Run all integration tests, specifying a custom server address and port using the SERVER environment variable.

```bash
SERVER=YOUR_SERVER_ADDRESS_AND_PORT bash tests/integration/launch_all_scripts_testing.sh
```

--------------------------------

### Instantiate Speech Synthesis Service

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Create an instance of the SpeechSynthesisService to perform TTS operations. Pass the authenticated client to the constructor.

```python
tts_service = riva.client.SpeechSynthesisService(auth)
```

--------------------------------

### Prepare Text Classification Queries

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Defines sample queries and the model name for text classification. Ensure the model name matches a deployed model on the Riva server.

```python
text_class_queries = ["A hurricane is approaching Japan.", "What is the weather on Wednesday in Moscow?"]
text_class_model = "riva_intent_weather"
```

--------------------------------

### Real-time ASR from Microphone

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Capture audio from the microphone and transcribe it in real-time. Specify duration and output file.

```bash
python scripts/asr/realtime_asr_client.py \
    --mic \
    --duration 30 \
    --output-text transcript.txt
```

--------------------------------

### Real-time ASR with Specific Audio Device

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Use a specific audio input device for real-time microphone transcription. Requires PyAudio.

```bash
python scripts/asr/realtime_asr_client.py \
    --mic \
    --input-device 1 \
    --duration 30 \
    --output-text transcript.txt
```

--------------------------------

### Create Streaming Response Generator

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Creates a generator for streaming ASR responses from an audio iterator and streaming configuration. Ensure `streaming_config` is properly defined.

```python
audio_chunk_iterator = riva.client.AudioChunkFileIterator(my_wav_file, 4800)
response_generator = asr_service.streaming_response_generator(audio_chunk_iterator, streaming_config)
```

--------------------------------

### Print Streaming Recognition Configuration

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Displays the current streaming recognition configuration settings. Useful for debugging and verifying settings.

```python
print(streaming_config)
```

--------------------------------

### Display Output File in Bash

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Uses the 'cat' command in a bash shell to display the content of the specified output file.

```bash
!cat $output_file
```

--------------------------------

### Define TTS Parameters

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Set up the text content, language code, and audio parameters for synthesis. These variables are used in subsequent synthesis calls.

```python
language_code = 'en-US'
sample_rate_hz = 16000
nchannels = 1
sampwidth = 2
text = (
    "The United States of America, commonly known as the United States or America, "
    "is a country primarily located in North America. It consists of 50 states, "
    "a federal district, five major unincorporated territories, 326 Indian reservations, "
    "and nine minor outlying islands."
)
```

--------------------------------

### Add Audio File Specifications to Config

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Applies audio file specifications (like sample rate and channels) to the recognition configuration. This is necessary for processing audio files.

```python
my_wav_file = '../data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav'
riva.client.add_audio_file_specs_to_config(offline_config, my_wav_file)
riva.client.add_audio_file_specs_to_config(streaming_config, my_wav_file)
```

--------------------------------

### GoogleTest License

Source: https://github.com/nvidia-riva/python-clients/blob/main/Acknowledgments.txt

This is the copyright and license notice for the GoogleTest framework. It specifies redistribution terms and disclaims warranties.

```text
Copyright 2008, Google Inc.
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of Google Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
```

--------------------------------

### Play Streaming Synthesized Audio File

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Play the WAV file generated from streaming synthesis using IPython.display.Audio. This allows for immediate playback and verification of the audio quality.

```python
import IPython

IPython.display.Audio(streaming_output_file)
```

--------------------------------

### TTS with File Output

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Save synthesized speech to a WAV file using the `talk.py` script.

```bash
python scripts/tts/talk.py --output 'my_synth_speech.wav'
```

--------------------------------

### Display Output File in CMD

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Uses the 'type' command in a Windows CMD shell to display the content of the specified output file.

```cmd
!type $output_file
```

--------------------------------

### NLP Intent Slot Filling Client

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Use the intent slot filling client to process natural language queries. Can be run interactively or with direct queries.

```bash
python scripts/nlp/intentslot_client.py --query "What is the weather tomorrow?"
```

```bash
python scripts/nlp/intentslot_client.py --interactive
```

--------------------------------

### Google Logging Library (glog) License

Source: https://github.com/nvidia-riva/python-clients/blob/main/Acknowledgments.txt

This section provides the copyright and license details for the Google Logging Library (glog). It covers redistribution conditions and liability limitations.

```text
Copyright (c) 2008, Google Inc.
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of Google Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
```

--------------------------------

### Realtime TTS from Text File

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Perform realtime TTS synthesis from a text file and save the output to a WAV file.

```bash
python scripts/tts/realtime_tts_client.py \
    --input-file input.txt \
    --output output.wav
```

--------------------------------

### Apache License 2.0 for abseil-cpp

Source: https://github.com/nvidia-riva/python-clients/blob/main/Acknowledgments.txt

This is the Apache License, Version 2.0, which governs the use, reproduction, and distribution of the abseil-cpp library. It defines terms such as License, Licensor, Legal Entity, You, Source, Object, Work, and Derivative Works.

```text
Apache License
Version 2.0, January 2004
https://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.

"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the
```

--------------------------------

### Print Offline Recognition Configuration

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Displays the current offline recognition configuration settings. Useful for debugging and verifying settings.

```python
print(offline_config)
```

--------------------------------

### Run ASR Service Tests

Source: https://github.com/nvidia-riva/python-clients/blob/main/tests/integration/README.md

Execute integration tests specifically for the Automatic Speech Recognition (ASR) service.

```bash
bash tests/integration/asr.sh
```

--------------------------------

### NLP Punctuation Client

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Apply punctuation to text using the punctuation client. Supports interactive mode or direct query input.

```bash
python scripts/nlp/punctuation_client.py --query "can you prove that you are self aware"
```

```bash
python scripts/nlp/punctuation_client.py --interactive
```

--------------------------------

### Analyze Intent with Riva

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Use this snippet to analyze user input and extract the most likely intent and relevant slots. Ensure the 'riva.client' library is imported and 'nlp_service' is initialized.

```python
options = riva.client.AnalyzeIntentOptions(lang='en-US')
intent_query = "How is the weather today in New England?"
response: AnalyzeIntentResponse = nlp_service.analyze_intent(intent_query, options)
```

--------------------------------

### List Available TTS Voices

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Use `realtime_tts_client.py` to list all available text-to-speech voices.

```bash
python scripts/tts/realtime_tts_client.py --list-voices
```

--------------------------------

### Add Word Boosting to Config

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Configures word boosting parameters to enhance the recognition of specific terms. This can improve accuracy for domain-specific vocabulary.

```python
boosted_lm_words = ['AntiBERTa', 'ABlooper']
boosted_lm_score = 20.0
riva.client.add_word_boosting_to_config(offline_config, boosted_lm_words, boosted_lm_score)
riva.client.add_word_boosting_to_config(streaming_config, boosted_lm_words, boosted_lm_score)
```

--------------------------------

### Stream Audio from Microphone for Transcription

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Captures audio from the default microphone and streams it for real-time transcription. The transcription results are displayed as they are generated.

```python
input_device = None  # default device
with riva.client.audio_io.MicrophoneStream(
    rate=streaming_config.config.sample_rate_hertz,
    chunk=streaming_config.config.sample_rate_hertz // 10,
    device=input_device,
) as audio_chunk_iterator:
    riva.client.print_streaming(
        responses=asr_service.streaming_response_generator(
            audio_chunks=audio_chunk_iterator,
            streaming_config=streaming_config,
        ),
        show_intermediate=True,
    )
```

--------------------------------

### Write Offline Synthesized Audio to WAV File

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Save the synthesized audio data to a WAV file. This involves setting the audio format parameters such as channels, sample width, and frame rate.

```python
import wave

offline_output_file = "my_offline_synthesized_speech.wav"
with wave.open(offline_output_file, 'wb') as out_f:
    out_f.setnchannels(nchannels)
    out_f.setsampwidth(sampwidth)
    out_f.setframerate(sample_rate_hz)
    out_f.writeframesraw(resp.audio)
```

--------------------------------

### Print Streaming Results with Intermediate Transcripts

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Enables displaying intermediate transcription results as they are formed, useful for real-time feedback. Requires setting a delay callback in the audio iterator.

```python
audio_chunk_iterator = riva.client.AudioChunkFileIterator(my_wav_file, 4800, riva.client.sleep_audio_length)
response_generator = asr_service.streaming_response_generator(audio_chunk_iterator, streaming_config)
riva.client.print_streaming(response_generator, show_intermediate=True)
```

--------------------------------

### Realtime TTS with Specific Voice and Language

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Synthesize speech using a specific voice and language code, save the output to a WAV file, and play the audio.

```bash
python scripts/tts/realtime_tts_client.py \
    --text "Hello world" \
    --language-code en-US \
    --voice English-US.Female-1 \
    --output output.wav \
    --play-audio
```

--------------------------------

### Extract and Print First Punctuation Result

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Extracts the transformed text for the first query from the response and prints it. This shows the result of applying punctuation and capitalization.

```python
first_query_result = response.text[0]
print(first_query_result)
```

--------------------------------

### Play Audio During Streaming Synthesis

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Synthesize speech in streaming mode and play the audio chunks as they arrive using `riva.client.audio_io.SoundCallBack`. This provides real-time audio feedback.

```python
output_device = None  # use default device
sound_stream = riva.client.audio_io.SoundCallBack(
    output_device, nchannels=nchannels, sampwidth=sampwidth, framerate=sample_rate_hz
)
for resp in tts_service.synthesize_online(text, language_code=language_code, sample_rate_hz=sample_rate_hz):
    sound_stream(resp.audio)
sound_stream.close()  # release the output device once playback is done
```

--------------------------------

### Offline Transcription with Word Boosting

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Improve transcription accuracy for specific words by using the word boosting feature. Specify words and a boost score.

```bash
python scripts/asr/transcribe_file_offline.py \
    --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav \
    --boosted-lm-words AntiBERTa \
    --boosted-lm-words ABlooper \
    --boosted-lm-score 20.0
```

--------------------------------

### Realtime TTS with Direct Text Input

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Synthesize speech in realtime from direct text input using `realtime_tts_client.py` and play the audio.

```bash
python scripts/tts/realtime_tts_client.py \
    --text "Hello, this is a text to speech example." \
    --play-audio
```

--------------------------------

### gflags License

Source: https://github.com/nvidia-riva/python-clients/blob/main/Acknowledgments.txt

This is the copyright and license notice for the gflags library. It outlines the terms for redistribution and use, along with warranty disclaimers.

```text
Copyright (c) 2006, Google Inc.
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of Google Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
``` -------------------------------- ### zlib License Source: https://github.com/nvidia-riva/python-clients/blob/main/Acknowledgments.txt This is the license for the zlib compression library. It grants permission to use, modify, and redistribute the software freely, with certain restrictions regarding origin and modification claims. ```text Copyright (C) 1995-2017 Jean-loup Gailly and Mark Adler This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions: 1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required. 2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software. 3. This notice may not be removed or altered from any source distribution. Jean-loup Gailly Mark Adler jloup@gzip.org madler@alumni.caltech.edu The data format used by the zlib library is described by RFCs (Request for Comments) 1950 to 1952 in the files http://tools.ietf.org/html/rfc1950 (zlib format), rfc1951 (deflate format) and rfc1952 (gzip format). ``` -------------------------------- ### Streaming TTS with Audio Playback Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md Utilize streaming mode with `talk.py` to receive and play audio fragments as they become available. 
```bash python scripts/tts/talk.py --stream --play-audio ``` -------------------------------- ### Run Specific Script Test (TTS) Source: https://github.com/nvidia-riva/python-clients/blob/main/tests/integration/README.md Execute integration tests for a specific script, such as 'test_talk.sh' within the Text-to-Speech (TTS) service. ```bash bash tests/integration/tts/test_talk.sh ``` -------------------------------- ### NLP Text Classification Client Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md Perform text classification using the client. This requires only a query string as input. ```bash python scripts/nlp/text_classify_client.py --query "How much sun does california get?" ``` -------------------------------- ### Print All Probable Intents and Confidences Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Displays the lists of most probable intents and their corresponding confidence scores for all processed queries. ```python print(classes) print(probs) ``` -------------------------------- ### Stream Audio File for Transcription Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md Use this script to transcribe an audio file in streaming mode. It shows how the transcript grows over time. ```bash python scripts/asr/transcribe_file.py \ --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav ``` ```bash python scripts/asr/transcribe_file.py \ --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav \ --simulate-realtime \ --show-intermediate ``` ```bash python scripts/asr/transcribe_file.py \ --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav \ --play-audio \ --show-intermediate ``` -------------------------------- ### Write Streaming Synthesized Audio to WAV File Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb Save the concatenated audio data from streaming synthesis to a WAV file. 
Ensure the audio format parameters match the synthesis settings.

```python
import wave

streaming_output_file = "my_streaming_synthesized_speech.wav"
with wave.open(streaming_output_file, 'wb') as out_f:
    out_f.setnchannels(nchannels)
    out_f.setsampwidth(sampwidth)
    out_f.setframerate(sample_rate_hz)
    out_f.writeframesraw(streaming_audio)
```

--------------------------------

### Perform Text Transformation (Punctuation)

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Calls the `punctuate_text` method to add punctuation and capitalization to the input text. The response is typed as `TextTransformResponse`.

```python
response: TextTransformResponse = nlp_service.punctuate_text(punctuation_queries)
```

--------------------------------

### Prepare Queries for Asynchronous Processing

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Defines lists of queries intended for asynchronous processing, such as text classification or punctuation tasks. Note that batching is not supported in Riva v2.2.0 and v2.2.1.

```python
text_class_queries = ["A hurricane is approaching Japan.", "What is the weather on Wednesday in Moscow?"]
text_class_model = "riva_intent_weather"
punctuation_queries = [
    "by the early 20th century the gar complained more and more about the younger generation",
    # "boa Vista is the capital of the brazilian state of roraima situated on the western bank of "
]
```

--------------------------------

### Print All Token Classification Predictions

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Displays the tokens of the first batch element, the class name of its first token, and the confidences and span positions returned for the batch. Note that spans may not work correctly for batches with more than one element.
```python print("First batch element tokens:", tokens[0]) print("First batch element first token class name:", class_names[0][0]) print(confidences) print(starts) print(ends) ``` -------------------------------- ### Perform Streaming TTS Synthesis Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb Initiate streaming TTS synthesis, where audio is returned in multiple chunks as it becomes available. This is efficient for longer texts or real-time applications. ```python responses = tts_service.synthesize_online(text, language_code=language_code, sample_rate_hz=sample_rate_hz) ``` -------------------------------- ### Print Audio Length Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb Display the length of the synthesized audio in bytes. This can be useful for debugging or verifying the output. ```python print(len(audio)) ``` -------------------------------- ### Print Streaming Results to Multiple Outputs Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb Directs streaming ASR results to both standard output (stdout) and a specified file. Useful for logging and real-time viewing. ```python import sys output_file = "my_results.txt" audio_chunk_iterator = riva.client.AudioChunkFileIterator(my_wav_file, 4800) response_generator = asr_service.streaming_response_generator(audio_chunk_iterator, streaming_config) riva.client.print_streaming(response_generator, additional_info='confidence', output_file=[sys.stdout, output_file]) ``` -------------------------------- ### Define NLP Task References Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Defines reference outputs for various NLP tasks to be used in assertions. 
```python text_class_reference = ['weather.alert', 'weather.weather'] token_class_reference = ['wednesday', 'moscow', '?'] punctuation_reference = ( "By the early 20th century, the Gar complained more and more about the younger generation." ) intent_analysis_reference = "weather.weather" natural_query_reference = "5" ``` -------------------------------- ### Perform Natural Language Query Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Use this snippet to ask a question against a provided context. The 'nlp_service.natural_query' method returns a NaturalQueryResponse object. ```python qa_query = "How many gigatons of carbon dioxide was released in 2005?" qa_context = ( "In 2010 the Amazon rainforest experienced another severe drought, in some ways more extreme than the " "2005 drought. The affected region was approximate 1,160,000 square miles (3,000,000 km2) of " "rainforest, compared to 734,000 square miles (1,900,000 km2) in 2005. The 2010 drought had three " "epicenters where vegetation died off, whereas in 2005 the drought was focused on the southwestern " "part. The findings were published in the journal Science. In a typical year the Amazon absorbs 1.5 " "gigatons of carbon dioxide; during 2005 instead 5 gigatons were released and in 2010 8 gigatons were " "released." ) response: NaturalQueryResponse = nlp_service.natural_query(qa_query, qa_context) ``` -------------------------------- ### Import NLP Protobuf Messages Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Imports necessary protobuf message types for NLP API responses. These are used for type hinting and parsing results. 
```python from riva.client.proto.riva_nlp_pb2 import ( AnalyzeIntentResponse, NaturalQueryResponse, TextClassResponse, TextTransformResponse, TokenClassResponse, ) ``` -------------------------------- ### Extract Audio and Metadata from Offline Response Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb Access the raw audio data and associated metadata from the offline synthesis response. The audio is typically a byte string. ```python audio = resp.audio meta = resp.meta ``` -------------------------------- ### Print Natural Query Answer Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Prints the extracted answer from the natural query results. ```python print(answer) ``` -------------------------------- ### Extract Intent and Slot Details Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Extracts and prints specific details from the AnalyzeIntentResponse, including intent name, score, domain name, score, and the first slot's token and label. ```python print("intent name:", response.intent.class_name) print("intent score:", response.intent.score) print("domain name:", response.domain.class_name) print("domain score:", response.domain.score) print("first slot token:", response.slots[0].token) print("first slot most probable label name:", response.slots[0].label[0].class_name) print("first slot most probable label score:", response.slots[0].label[0].score) ``` -------------------------------- ### Import Time Module for Asynchronous Calls Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Imports the 'time' module, typically used for measuring latency in asynchronous operations. ```python from time import time ``` -------------------------------- ### Perform Asynchronous Offline Recognition Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb Recognizes speech asynchronously by setting `future=True`. 
This allows other operations to proceed while recognition is in progress.

```python
from time import time

num_repeats = 10

# Baseline: each call blocks until its transcript is returned.
sync_transcripts = []
start_time = time()
for _ in range(num_repeats):
    sync_transcripts.append(
        asr_service.offline_recognize(data, offline_config).results[0].alternatives[0].transcript
    )
print(f"Time spent on synchronous recognition: {time() - start_time:.2f}")

# With future=True every request is submitted first, then the results are collected.
async_transcripts = []
start_time = time()
futures = []
for _ in range(num_repeats):
    futures.append(asr_service.offline_recognize(data, offline_config, future=True))
for f in futures:
    async_transcripts.append(f.result().results[0].alternatives[0].transcript)
print(f"Time spent on async recognition: {time() - start_time:.2f}")

assert sync_transcripts == async_transcripts
```

--------------------------------

### Print Detected Intent

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Displays the extracted most probable intent class name for the first query.

```python
print(detected_intent)
```

--------------------------------

### Synchronous NLP Task Execution

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Executes multiple NLP tasks synchronously in a loop and measures the total time taken. This is suitable for sequential processing where immediate results are needed.
```python
start_time = time()
for repeat_idx in range(num_repeats):
    text_class_response = nlp_service.classify_text(text_class_queries, text_class_model)
    check_text_classification(text_class_reference, text_class_response, repeat_idx)
    token_class_response = nlp_service.classify_tokens(text_class_queries[1], text_class_model)
    check_token_classification(token_class_reference, token_class_response, repeat_idx)
    punctuation_response = nlp_service.punctuate_text(punctuation_queries)
    check_punctuation(punctuation_reference, punctuation_response, repeat_idx)
    intent_analysis_response = nlp_service.analyze_intent(intent_query, options)
    check_intent_analysis(intent_analysis_reference, intent_analysis_response, repeat_idx)
    natural_query_response = nlp_service.natural_query(qa_query, qa_context)
    check_natural_query(natural_query_reference, natural_query_response, repeat_idx)
print(f"time spent on synchronous calls: {time() - start_time:.2f} sec")
```

--------------------------------

### Read Audio Data for Offline Recognition

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Reads the contents of a WAV file as raw bytes. This data will be used as input for the offline recognition service.

```python
with open(my_wav_file, 'rb') as fh:
    data = fh.read()
```

--------------------------------

### Asynchronous NLP Task Execution

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Executes multiple NLP tasks asynchronously using futures and measures the total time taken. This is beneficial for improving throughput by overlapping I/O operations.
```python
start_time = time()
futures = []
# Submit every request first; each call returns a future immediately.
for _ in range(num_repeats):
    repeat_futures = []
    repeat_futures.append(nlp_service.classify_text(text_class_queries, text_class_model, future=True))
    repeat_futures.append(nlp_service.classify_tokens(text_class_queries[1], text_class_model, future=True))
    repeat_futures.append(nlp_service.punctuate_text(punctuation_queries, future=True))
    repeat_futures.append(nlp_service.analyze_intent(intent_query, options, future=True))
    repeat_futures.append(nlp_service.natural_query(qa_query, qa_context, future=True))
    futures.append(repeat_futures)
# Collect and validate the results in submission order.
for repeat_idx, repeat_futures in enumerate(futures):
    check_text_classification(text_class_reference, repeat_futures[0].result(), repeat_idx)
    check_token_classification(token_class_reference, repeat_futures[1].result(), repeat_idx)
    check_punctuation(punctuation_reference, repeat_futures[2].result(), repeat_idx)
    check_intent_analysis(intent_analysis_reference, repeat_futures[3].result(), repeat_idx)
    check_natural_query(natural_query_reference, repeat_futures[4].result(), repeat_idx)
print(f"time spent on async calls: {time() - start_time:.2f}sec")
```

--------------------------------

### Print Offline Recognition Response

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Displays the full response object from the offline recognition service. This includes all results and metadata.

```python
print(response)
```

--------------------------------

### Print Predicted Durations

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Show the number of predicted durations and the value of the first duration. This information relates to the timing of speech synthesis components.
```python print(len(predicted_durations)) print(predicted_durations[0]) ``` -------------------------------- ### Extract All Probable Intents and Confidences Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Uses a utility function to extract the most probable class and its confidence score for all queries in the batch. Returns two lists: one for classes and one for probabilities. ```python classes, probs = riva.client.extract_most_probable_text_class_and_confidence(response) ``` -------------------------------- ### Perform Asynchronous TTS Synthesis Multiple Times Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb Measure the time taken to perform multiple asynchronous TTS synthesis calls using futures. This demonstrates how to achieve concurrency for potentially faster processing. ```python start_time = time() async_audio = [] futures = [] for _ in range(num_repeats): futures.append( tts_service.synthesize( text, language_code=language_code, sample_rate_hz=sample_rate_hz, future=True ) ) for f in futures: async_audio.append(f.result().audio) print(f"Async calls time: {time() - start_time:.2f}") ``` -------------------------------- ### Print Streaming Results Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb A utility function to print streaming ASR results. It can display additional information like timestamps. ```python riva.client.print_streaming(response_generator, additional_info='time') ``` -------------------------------- ### Print Natural Query Score Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Prints the score associated with the answer from the natural query results. 
```python
print(score)
```

--------------------------------

### Concatenate Streaming Audio Chunks

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Collect all audio chunks received from the streaming synthesis response into a single byte string. This combined audio can then be saved or played.

```python
streaming_audio = b''
for resp in responses:
    streaming_audio += resp.audio
```

--------------------------------

### Print Single Token Classification Prediction

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Displays the extracted token, its class name, and confidence score.

```python
print(token, class_name, class_score)
```

--------------------------------

### NLP Task Helper Functions

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Provides helper functions to check the results of different NLP tasks against expected references. These functions are used for validating both synchronous and asynchronous API calls.

```python
from typing import List

import riva.client
from riva.client.proto.riva_nlp_pb2 import (
    AnalyzeIntentResponse,
    NaturalQueryResponse,
    TextClassResponse,
    TextTransformResponse,
    TokenClassResponse,
)


def check_text_classification(ref: List[str], resp: TextClassResponse, repeat_idx: int) -> None:
    labels, _ = riva.client.extract_most_probable_text_class_and_confidence(resp)
    assert labels == ref, f"On repeat {repeat_idx} expected text classification results {ref}, but got {labels}."


def check_token_classification(ref: List[str], resp: TokenClassResponse, repeat_idx: int) -> None:
    # [0][0] selects the token list (first return value) of the first batch element.
    tokens = riva.client.extract_most_probable_token_classification_predictions(resp)[0][0]
    # Compare against the `ref` argument, not the module-level `token_class_reference`.
    assert tokens == ref, (
        f"On repeat {repeat_idx} expected to find token classification tokens {ref}, but got {tokens}."
    )


def check_punctuation(ref: str, resp: TextTransformResponse, repeat_idx: int) -> None:
    output = resp.text[0]
    assert output == ref, (
        f"On repeat {repeat_idx} expected punctuated output '{ref}', but got '{output}'."
    )


def check_intent_analysis(ref: str, resp: AnalyzeIntentResponse, repeat_idx: int) -> None:
    output = resp.intent.class_name
    assert output == ref, f"On repeat {repeat_idx} expected intent is '{ref}', but got '{output}'."


def check_natural_query(ref: str, resp: NaturalQueryResponse, repeat_idx: int) -> None:
    answer = resp.results[0].answer
    assert answer == ref, f"On repeat {repeat_idx} expected answer is '{ref}', but got '{answer}'."
```

--------------------------------

### Offline Transcription of Audio File

Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md

Perform offline transcription of an audio file. This is suitable for processing complete audio files.

```bash
python scripts/asr/transcribe_file_offline.py \
    --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav
```

--------------------------------

### Extract All Token Classification Predictions

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Uses a utility function to extract detailed token classification predictions for all elements in a batch. This includes tokens, class names, confidences, and span start/end positions.

```python
tokens, class_names, confidences, starts, ends = riva.client.extract_most_probable_token_classification_predictions(
    response
)
```

--------------------------------

### Perform Synchronous TTS Synthesis Multiple Times

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Measure the time taken to perform multiple synchronous TTS synthesis calls. This serves as a baseline for comparing asynchronous performance.
```python start_time = time() sync_audio = [] for _ in range(num_repeats): sync_audio.append( tts_service.synthesize(text, language_code=language_code, sample_rate_hz=sample_rate_hz).audio ) print(f"Synchronous calls time: {time() - start_time:.2f}") ``` -------------------------------- ### NLP NER Client with Span End Output Source: https://github.com/nvidia-riva/python-clients/blob/main/README.md Extract NER information and output the ending position of the identified entity spans. ```bash python scripts/nlp/ner_client.py \ --query "Where is San Francisco?" "Jensen Huang is the CEO of NVIDIA Corporation." \ --test span_end ``` -------------------------------- ### Perform Token Classification Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb Calls the `classify_tokens` method for a specific query to classify individual tokens. The response is typed as `TokenClassResponse`. ```python response: TokenClassResponse = nlp_service.classify_tokens(text_class_queries[1], text_class_model) ``` -------------------------------- ### Perform Offline TTS Synthesis Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb Synthesize speech from text in offline mode, where the entire audio result is returned in a single response. This is suitable for shorter texts or when immediate full audio is needed. ```python resp = tts_service.synthesize(text, language_code=language_code, sample_rate_hz=sample_rate_hz) ``` -------------------------------- ### Extract Confidence from Offline Response Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb Retrieves the confidence score for the top transcript alternative from the offline recognition response. 
```python
print(response.results[0].alternatives[0].confidence)
```

--------------------------------

### Extract Answer and Score from Natural Query

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/NLP.ipynb

Extracts the answer and its associated score from the first result of a NaturalQueryResponse.

```python
answer = response.results[0].answer
score = response.results[0].score
```

--------------------------------

### Remove Output File in CMD

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Uses the `del` command in a Windows CMD shell to delete the specified output file. The leading `!` is the notebook's shell escape, and `$output_file` is expanded from the Python variable of that name.

```cmd
!del $output_file
```

--------------------------------

### Perform Offline Speech Recognition

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Sends audio data to the ASR service for offline recognition. The response contains the recognized transcript and confidence scores.

```python
response = asr_service.offline_recognize(data, offline_config)
```

--------------------------------

### Extract Transcript from Offline Response

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Retrieves the most likely transcript from the offline recognition response by accessing the first result and its top alternative.

```python
print(response.results[0].alternatives[0].transcript)
```

--------------------------------

### Print Processed Text

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Display the text after it has been processed by the TTS engine. This may differ from the input text due to normalization or other transformations.

```python
print(processed_text)
```

--------------------------------

### Extract Processed Text and Durations

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/TTS.ipynb

Retrieve the text that was actually processed by the TTS engine and the predicted durations for each word or phoneme.
This metadata can be useful for alignment or analysis.

```python
processed_text = meta.processed_text
predicted_durations = meta.predicted_durations
```

--------------------------------

### Remove Output File in Bash

Source: https://github.com/nvidia-riva/python-clients/blob/main/tutorials/ASR.ipynb

Uses the `rm` command in a bash shell to delete the specified output file. The leading `!` is the notebook's shell escape, and `$output_file` is expanded from the Python variable of that name.

```bash
!rm $output_file
```
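
The two cleanup cells above are shell-specific (`del` on Windows, `rm` in bash). A possible portable alternative is to delete the file from Python directly. The sketch below is not taken from the Riva tutorials: `remove_output_file` is a hypothetical helper name, it uses only the standard library, and the temporary directory stands in for wherever `output_file` was actually written.

```python
import tempfile
from pathlib import Path


def remove_output_file(path: str) -> bool:
    """Delete `path` if it is a file; return True when something was removed.

    Hypothetical helper for illustration, not part of riva.client.
    """
    target = Path(path)
    existed = target.is_file()
    # missing_ok=True (Python 3.8+) turns the call into a no-op if the file is already gone.
    target.unlink(missing_ok=True)
    return existed


# Demonstration with a throwaway file instead of real ASR results.
with tempfile.TemporaryDirectory() as tmp_dir:
    output_file = str(Path(tmp_dir) / "my_results.txt")
    Path(output_file).write_text("placeholder transcript\n")
    print(remove_output_file(output_file))  # file existed and is deleted
    print(remove_output_file(output_file))  # nothing left to delete
```

Returning a boolean instead of raising keeps the cleanup cell safe to re-run, which matters in a notebook where cells are often executed more than once.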