Audio Control Service
2025-05-29
Provides the voice service controller for the robot system. AudioController uses RPC to send audio commands to the robot and to query audio status.
Interface Definition
AudioController is a C++ class that encapsulates the audio control functions: audio playback control, TTS playback, volume setting and query, and subscription to raw voice data.
AudioController
Item | Description |
---|---|
Function Name | AudioController |
Declaration | AudioController(); |
Overview | Creates the audio controller object and allocates the resources it needs. |
Note | Constructs internal state. |
~AudioController
Item | Description |
---|---|
Function Name | ~AudioController |
Declaration | ~AudioController(); |
Overview | Releases audio controller resources, ensures playback is stopped and underlying resources are cleaned up. |
Note | Ensures resources are safely released. |
Initialize
Item | Description |
---|---|
Function Name | Initialize |
Declaration | bool Initialize(); |
Overview | Initializes the audio control module, prepares playback resources and devices. |
Return Value | true for success, false for failure. |
Note | Pair every successful call with a call to Shutdown(). |
Shutdown
Item | Description |
---|---|
Function Name | Shutdown |
Declaration | void Shutdown(); |
Overview | Shuts down the audio controller and releases resources. |
Note | Call Shutdown() before the object is destroyed. |
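The Initialize()/Shutdown() pairing described above can be enforced with a small scope guard. The following is a minimal sketch, not part of the SDK: `ScopedAudioSession` and `FakeController` are hypothetical names, and the guard only assumes a type with `bool Initialize()` and `void Shutdown()` as documented here.

```cpp
#include <stdexcept>

// Scope guard: calls Initialize() on construction and Shutdown() on
// destruction. Controller is any type exposing `bool Initialize()` and
// `void Shutdown()`, which is what this page documents for AudioController.
template <typename Controller>
class ScopedAudioSession {
 public:
  explicit ScopedAudioSession(Controller& controller)
      : controller_(controller) {
    if (!controller_.Initialize()) {
      // The destructor does not run when the constructor throws,
      // so Shutdown() is never called on a failed Initialize().
      throw std::runtime_error("audio controller failed to initialize");
    }
  }
  ~ScopedAudioSession() { controller_.Shutdown(); }

  ScopedAudioSession(const ScopedAudioSession&) = delete;
  ScopedAudioSession& operator=(const ScopedAudioSession&) = delete;

 private:
  Controller& controller_;
};

// Hypothetical stand-in used only to exercise the guard without the SDK.
struct FakeController {
  bool init_ok = true;
  int init_calls = 0;
  int shutdown_calls = 0;
  bool Initialize() { ++init_calls; return init_ok; }
  void Shutdown() { ++shutdown_calls; }
};
```

With the real class, the guard would presumably be declared right after the AudioController object, so that Shutdown() runs before the controller's destructor.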
GetVoiceConfig
Item | Description |
---|---|
Function Name | GetVoiceConfig |
Declaration | Status GetVoiceConfig(GetSpeechConfig& config); |
Overview | Gets the complete configuration information of the voice system. |
Parameter | config : Returns the voice system configuration by reference. |
Return Value | Status::OK for success, others for failure. |
Note | Blocking interface. The configuration includes all sub-configs such as speaker, bot, wakeup, dialog, and the current TTS model type. |
SwitchTtsVoiceModel
Item | Description |
---|---|
Function Name | SwitchTtsVoiceModel |
Declaration | Status SwitchTtsVoiceModel(TtsType tts_type, GetSpeechConfig& config); |
Overview | Switches the TTS model and gets the updated configuration. |
Parameter | tts_type : TTS model type to switch to; config : Returns the updated voice system configuration by reference. |
Return Value | Status::OK for success, others for failure. |
Note | Blocking interface. The configuration is automatically updated after switching the TTS model. |
SetVoiceConfig
Item | Description |
---|---|
Function Name | SetVoiceConfig |
Declaration | Status SetVoiceConfig(const SetSpeechConfig& config); |
Overview | Sets the complete configuration information of the voice system. |
Parameter | config : The voice system configuration to set. |
Return Value | Status::OK for success, others for failure. |
Note | Blocking interface. The set configuration will completely overwrite all current voice-related configurations. |
Play
Item | Description |
---|---|
Function Name | Play |
Declaration | Status Play(const TtsCommand& cmd); |
Overview | Plays a TTS (Text-to-Speech) command. |
Parameter | cmd : TTS command, including text, speed, pitch, etc. |
Return Value | Status::OK for success, others for failure. |
Note | Blocking interface. Make sure the module is initialized before calling. |
Stop
Item | Description |
---|---|
Function Name | Stop |
Declaration | Status Stop(); |
Overview | Stops the current audio playback. |
Return Value | Status::OK for success, others for failure. |
Note | Blocking interface. Usually used to interrupt current speech. |
SetVolume
Item | Description |
---|---|
Function Name | SetVolume |
Declaration | Status SetVolume(int volume); |
Overview | Sets the audio output volume. |
Parameter | volume : Volume value, usually in the range 0~100. |
Return Value | Status::OK for success, others for failure. |
Note | Blocking interface. Takes effect immediately after setting. |
GetVolume
Item | Description |
---|---|
Function Name | GetVolume |
Declaration | Status GetVolume(int& volume); |
Overview | Gets the current audio output volume. |
Parameter | volume : Returns the current volume value by reference. |
Return Value | Status::OK for success, others for failure. |
Note | Blocking interface. Check the return value before using volume. |
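The two volume calls can be wrapped so that out-of-range requests are clamped and a failed read never leaks an unset value. This is a sketch under assumptions: `Status` here is a local stand-in (only `Status::OK` is named on this page, `FAIL` is invented), and `ClampVolume`, `VolumeOr`, and `FakeVolumeController` are hypothetical helpers.

```cpp
#include <algorithm>

// Local stand-in for the SDK's Status type. Only Status::OK appears in
// this document; FAIL is a placeholder for "any other value".
enum class Status { OK, FAIL };

// Clamps a requested volume into the 0..100 range mentioned for SetVolume.
int ClampVolume(int requested) {
  return std::max(0, std::min(requested, 100));
}

// Reads the volume and returns `fallback` when the call does not succeed,
// so the out-parameter is only used after the status has been checked.
template <typename Controller>
int VolumeOr(Controller& controller, int fallback) {
  int volume = 0;
  if (controller.GetVolume(volume) != Status::OK) {
    return fallback;
  }
  return volume;
}

// Hypothetical in-memory controller with the documented signatures.
struct FakeVolumeController {
  int stored = 50;
  bool fail = false;
  Status SetVolume(int volume) {
    if (fail) return Status::FAIL;
    stored = volume;
    return Status::OK;
  }
  Status GetVolume(int& volume) {
    if (fail) return Status::FAIL;
    volume = stored;
    return Status::OK;
  }
};
```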
ControlVoiceStream
Item | Description |
---|---|
Function Name | ControlVoiceStream |
Declaration | Status ControlVoiceStream(bool raw_data, bool bf_data); |
Overview | Controls the voice data stream. |
Parameter | raw_data : Whether to send raw data; bf_data : Whether to send BF data. |
Return Value | Status::OK for success, others for failure. |
Note | Blocking interface. Used to control the opening and closing of the voice data stream. |
SubscribeOriginVoiceData
Item | Description |
---|---|
Function Name | SubscribeOriginVoiceData |
Declaration | void SubscribeOriginVoiceData(const RawVoiceDataCallback callback); |
Overview | Subscribes to raw voice data. |
Parameter | callback : Callback to handle received raw voice data. |
Note | Non-blocking interface. The callback will be called when data is updated. |
SubscribeBfVoiceData
Item | Description |
---|---|
Function Name | SubscribeBfVoiceData |
Declaration | void SubscribeBfVoiceData(const BfVoiceDataCallback callback); |
Overview | Subscribes to BF voice data. |
Parameter | callback : Callback to handle received BF voice data. |
Note | Non-blocking interface. The callback will be called when data is updated. |
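Because the subscription callbacks fire whenever data is updated, a common pattern is to keep the callback short and hand the bytes to a thread-safe buffer. The sketch below illustrates that pattern only. The callback signature is an assumption (this page does not define RawVoiceDataCallback), `ByteMultiArray` is trimmed to its data field, and `VoiceBuffer` and `MakeBufferingCallback` are hypothetical names.

```cpp
#include <cstdint>
#include <functional>
#include <memory>
#include <mutex>
#include <vector>

// Trimmed stand-in for ByteMultiArray (layout field omitted).
struct ByteMultiArray {
  std::vector<uint8_t> data;
};

// ASSUMPTION: the real callback type is not defined on this page; a
// std::function taking a shared pointer to the message is a guess.
using RawVoiceDataCallback =
    std::function<void(std::shared_ptr<ByteMultiArray>)>;

// Thread-safe byte accumulator shared between the callback and a consumer.
class VoiceBuffer {
 public:
  void Append(const ByteMultiArray& chunk) {
    std::lock_guard<std::mutex> lock(mutex_);
    bytes_.insert(bytes_.end(), chunk.data.begin(), chunk.data.end());
  }
  // Returns everything collected so far and leaves the buffer empty.
  std::vector<uint8_t> Drain() {
    std::lock_guard<std::mutex> lock(mutex_);
    std::vector<uint8_t> out;
    out.swap(bytes_);
    return out;
  }

 private:
  std::mutex mutex_;
  std::vector<uint8_t> bytes_;
};

// Builds a callback that only copies bytes; null messages are ignored.
// The buffer must outlive the returned callback.
RawVoiceDataCallback MakeBufferingCallback(VoiceBuffer& buffer) {
  return [&buffer](std::shared_ptr<ByteMultiArray> msg) {
    if (msg) buffer.Append(*msg);
  };
}
```

With the SDK, the returned callback would presumably be passed to SubscribeOriginVoiceData, with ControlVoiceStream used to open the stream. The required call order is not stated here.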
Type Definitions
TtsPriority
— TTS Playback Priority Level
Used to control interruption behavior between different TTS tasks. Higher priority tasks will interrupt the playback of current lower priority tasks.
Enum Value | Value | Description |
---|---|---|
TtsPriority::HIGH | 0 | Highest priority, e.g., low battery alert, emergency reminder |
TtsPriority::MIDDLE | 1 | Medium priority, e.g., system prompt, status broadcast |
TtsPriority::LOW | 2 | Lowest priority, e.g., daily voice dialogue, background broadcast |
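Note that the highest priority has the smallest numeric value. A minimal sketch of the interruption rule, with the enum mirrored from the table above and `Interrupts` as a hypothetical helper, not an SDK function:

```cpp
// Mirrors the enum values listed in the table above.
enum class TtsPriority { HIGH = 0, MIDDLE = 1, LOW = 2 };

// True when `incoming` outranks `current`. A smaller numeric value means a
// higher priority, so the comparison is "less than". Equal priorities do
// not interrupt here; that case is governed by TtsMode.
bool Interrupts(TtsPriority incoming, TtsPriority current) {
  return static_cast<int>(incoming) < static_cast<int>(current);
}
```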
TtsMode
— Task Scheduling Policy at the Same Priority
Used to refine the playback order and clearing logic of multiple TTS tasks under the same priority.
Enum Value | Value | Description |
---|---|---|
TtsMode::CLEARTOP | 0 | Clear all tasks of the current priority (including playing and waiting queue), play this request immediately |
TtsMode::ADD | 1 | Add this request to the end of the current priority queue, play in order (do not interrupt current playback) |
TtsMode::CLEARBUFFER | 2 | Clear unplayed requests in the queue, keep current playback, then play this request |
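The three modes can be pictured as operations on one queue per priority level. The toy model below is an illustration of the table above, not the service's actual scheduler. `PriorityLane` and `Submit` are hypothetical, and an empty `playing` string stands for "idle".

```cpp
#include <deque>
#include <string>

// Mirrors the enum values listed in the table above.
enum class TtsMode { CLEARTOP = 0, ADD = 1, CLEARBUFFER = 2 };

// Toy model of a single priority level: the utterance being played
// (empty string means idle) plus the requests waiting behind it.
struct PriorityLane {
  std::string playing;
  std::deque<std::string> waiting;
};

// Applies one request to the lane according to its mode.
void Submit(PriorityLane& lane, const std::string& text, TtsMode mode) {
  if (mode == TtsMode::CLEARTOP) {
    // Drop both the current playback and the queue, play immediately.
    lane.waiting.clear();
    lane.playing = text;
    return;
  }
  if (mode == TtsMode::CLEARBUFFER) {
    // Keep the current playback, drop everything still waiting.
    lane.waiting.clear();
  }
  // ADD and CLEARBUFFER both enqueue without interrupting playback.
  if (lane.playing.empty()) {
    lane.playing = text;
  } else {
    lane.waiting.push_back(text);
  }
}
```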
Struct Definitions
TtsCommand
— TTS Playback Command Structure
Describes the complete information of a TTS playback request, supporting unique ID, text content, priority control, and scheduling mode under the same priority.
Field Name | Type | Description |
---|---|---|
id | std::string | Unique TTS task ID, e.g., "id_01", used to track playback status |
content | std::string | Text content to play, e.g., "Hello, welcome to the intelligent voice system." |
priority | TtsPriority | Playback priority, controls whether to interrupt lower priority speech |
mode | TtsMode | Scheduling policy under the same priority, controls whether to append, overwrite, etc. |
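As an illustration of how the four fields combine, the sketch below builds two commands: an urgent alert and an ordinary dialogue line. The types are local mirrors of the tables on this page, and `MakeAlert` and `MakeChat` are hypothetical helpers rather than SDK functions.

```cpp
#include <string>

// Local mirrors of the enums and struct documented on this page.
enum class TtsPriority { HIGH = 0, MIDDLE = 1, LOW = 2 };
enum class TtsMode { CLEARTOP = 0, ADD = 1, CLEARBUFFER = 2 };

struct TtsCommand {
  std::string id;
  std::string content;
  TtsPriority priority;
  TtsMode mode;
};

// Urgent alert: highest priority, and CLEARTOP so that it also replaces
// any other alert that is playing or queued at the same priority.
TtsCommand MakeAlert(const std::string& id, const std::string& text) {
  TtsCommand cmd;
  cmd.id = id;
  cmd.content = text;
  cmd.priority = TtsPriority::HIGH;
  cmd.mode = TtsMode::CLEARTOP;
  return cmd;
}

// Ordinary dialogue: lowest priority, appended behind earlier requests.
TtsCommand MakeChat(const std::string& id, const std::string& text) {
  TtsCommand cmd;
  cmd.id = id;
  cmd.content = text;
  cmd.priority = TtsPriority::LOW;
  cmd.mode = TtsMode::ADD;
  return cmd;
}
```

With the SDK, a command built this way would be passed to Play(), and the returned Status compared against Status::OK.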
CustomBotInfo
— Custom Bot Configuration Structure
Describes the basic configuration information of a custom bot.
Field Name | Type | Description |
---|---|---|
name | std::string | Bot name |
workflow | std::string | Workflow ID |
token | std::string | User authorization token |
CustomBotMap
— Custom Bot Mapping Table
Item | Description |
---|---|
Type | std::map<std::string, CustomBotInfo> |
Description | Custom bot mapping table, key is bot ID, value is bot configuration info |
SetSpeechConfig
— Speech Configuration Parameter Structure
Used to set various configuration parameters of the voice system.
Field Name | Type | Description |
---|---|---|
speaker_id | std::string | Speaker ID |
region | std::string | Speaker region |
bot_id | std::string | Mode ID |
is_front_doa | bool | Force recognition from the front |
is_fullduplex_enable | bool | Natural conversation switch |
is_enable | bool | Voice switch |
is_doa_enable | bool | Enable wakeup direction turning |
speaker_speed | float | TTS playback speed, range [1,2] |
wakeup_name | std::string | Wakeup name |
custom_bot | CustomBotMap | Custom bot configuration |
SpeakerConfigSelected
— Selected Speaker Configuration Structure
Current speaker configuration
Field Name | Type | Description |
---|---|---|
region | std::string | Selected region |
speaker_id | std::string | Selected speaker ID |
SpeakerConfig
— Speaker Configuration Structure
Speaker configuration
Field Name | Type | Description |
---|---|---|
data | std::map<std::string, std::vector<std::array<std::string, 2>>> | Speaker data: region -> list of (speaker ID, speaker name) pairs |
selected | SpeakerConfigSelected | Currently selected speaker configuration |
speaker_speed | float | Speech speed |
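The nested type of the data field is easier to read with a lookup example. The sketch below assumes that element 0 of each two-element array is the speaker ID and element 1 is the speaker name. That order is inferred from the field description, not confirmed. `SpeakerTable` and `FindSpeakerName` are hypothetical names.

```cpp
#include <array>
#include <map>
#include <string>
#include <vector>

// Same type as SpeakerConfig::data in the table above.
using SpeakerTable =
    std::map<std::string, std::vector<std::array<std::string, 2>>>;

// Returns the display name for (region, speaker_id), or an empty string
// when the region or the speaker is unknown.
// ASSUMPTION: entry[0] is the speaker ID and entry[1] is the speaker name.
std::string FindSpeakerName(const SpeakerTable& data,
                            const std::string& region,
                            const std::string& speaker_id) {
  const auto it = data.find(region);
  if (it == data.end()) return "";
  for (const auto& entry : it->second) {
    if (entry[0] == speaker_id) return entry[1];
  }
  return "";
}
```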
BotInfo
— Bot Configuration Information Structure
Describes the basic information of a standard bot.
Field Name | Type | Description |
---|---|---|
name | std::string | Work scenario name |
workflow | std::string | Workflow ID |
BotConfigSelected
— Selected Bot Structure
Field Name | Type | Description |
---|---|---|
bot_id | std::string | Selected bot ID |
BotConfig
— Bot Configuration Structure
Describes bot-related configuration information, including standard bots and custom bots.
Field Name | Type | Description |
---|---|---|
data | std::map<std::string, BotInfo> | Standard bot data: bot ID->bot info |
custom_data | std::map<std::string, CustomBotInfo> | Custom bot data: bot ID->custom bot info |
selected | BotConfigSelected | Currently selected bot configuration |
WakeupConfig
— Wakeup Configuration Structure
Describes wakeup-related configuration information.
Field Name | Type | Description |
---|---|---|
name | std::string | Wakeup name |
data | std::map<std::string, std::string> | Wakeup word data: wakeup word->pinyin |
DialogConfig
— Dialog Configuration Structure
Describes dialog-related configuration parameters.
Field Name | Type | Description |
---|---|---|
is_front_doa | bool | Force enhanced pickup from the front |
is_fullduplex_enable | bool | Enable full-duplex dialog |
is_enable | bool | Enable voice |
is_doa_enable | bool | Enable wakeup direction turning |
TtsType
— TTS Model Enum
Describes the available TTS model types.
Enum Value | Value | Description |
---|---|---|
TtsType::NONE | 0 | No TTS model |
TtsType::DOUBAO | 1 | Doubao TTS model |
TtsType::GOOGLE | 2 | Google TTS model |
GetSpeechConfig
— Complete Voice System Configuration Structure
Describes the complete configuration information of the voice system, including all sub-configuration modules.
Field Name | Type | Description |
---|---|---|
speaker_config | SpeakerConfig | Speaker configuration |
bot_config | BotConfig | Bot configuration |
wakeup_config | WakeupConfig | Wakeup configuration |
dialog_config | DialogConfig | Dialog configuration |
tts_type | TtsType | TTS model, default is TtsType::NONE |
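Because SetVoiceConfig overwrites every voice-related setting, changing a single value safely means reading the current configuration, copying it into a SetSpeechConfig, and editing only the field of interest. The sketch below shows one plausible field mapping. The structs are trimmed local mirrors with invented default values, `ToSetConfig` is a hypothetical helper, and the mapping itself is an assumption based on matching field names.

```cpp
#include <map>
#include <string>

struct CustomBotInfo {
  std::string name;
  std::string workflow;
  std::string token;
};
using CustomBotMap = std::map<std::string, CustomBotInfo>;

struct SetSpeechConfig {
  std::string speaker_id;
  std::string region;
  std::string bot_id;
  bool is_front_doa = false;
  bool is_fullduplex_enable = false;
  bool is_enable = false;
  bool is_doa_enable = false;
  float speaker_speed = 1.0f;
  std::string wakeup_name;
  CustomBotMap custom_bot;
};

// Trimmed mirrors of the read-side structs; the `data` maps are omitted
// because nothing on the write side corresponds to them.
struct SpeakerConfigSelected { std::string region; std::string speaker_id; };
struct SpeakerConfig { SpeakerConfigSelected selected; float speaker_speed = 1.0f; };
struct BotConfigSelected { std::string bot_id; };
struct BotConfig { CustomBotMap custom_data; BotConfigSelected selected; };
struct WakeupConfig { std::string name; };
struct DialogConfig {
  bool is_front_doa = false;
  bool is_fullduplex_enable = false;
  bool is_enable = false;
  bool is_doa_enable = false;
};
struct GetSpeechConfig {
  SpeakerConfig speaker_config;
  BotConfig bot_config;
  WakeupConfig wakeup_config;
  DialogConfig dialog_config;
};

// Copies every writable setting out of the configuration that was read.
SetSpeechConfig ToSetConfig(const GetSpeechConfig& current) {
  SetSpeechConfig out;
  out.speaker_id = current.speaker_config.selected.speaker_id;
  out.region = current.speaker_config.selected.region;
  out.bot_id = current.bot_config.selected.bot_id;
  out.is_front_doa = current.dialog_config.is_front_doa;
  out.is_fullduplex_enable = current.dialog_config.is_fullduplex_enable;
  out.is_enable = current.dialog_config.is_enable;
  out.is_doa_enable = current.dialog_config.is_doa_enable;
  out.speaker_speed = current.speaker_config.speaker_speed;
  out.wakeup_name = current.wakeup_config.name;
  out.custom_bot = current.bot_config.custom_data;
  return out;
}
```

A caller would then change one field, for example speaker_speed, and pass the result to SetVoiceConfig.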
MultiArrayDimension
— Multi-dimensional Array Dimension Description
Field Name | Type | Description |
---|---|---|
label | std::string | Dimension label |
size | int32_t | Dimension size |
stride | int32_t | Stride |
MultiArrayLayout
— Multi-dimensional Array Layout Description
Field Name | Type | Description |
---|---|---|
dim_size | int32_t | Number of dimensions |
dim | std::vector<MultiArrayDimension> | Dimension array |
data_offset | int32_t | Data offset |
ByteMultiArray
— Byte Array Data Structure
Field Name | Type | Description |
---|---|---|
layout | MultiArrayLayout | Array layout information |
data | std::vector<uint8_t> | Byte data array |
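The data field carries opaque bytes, so the consumer has to know the sample format. This page does not state it. Purely as an illustration, and assuming 16-bit signed little-endian PCM (verify this against the actual stream), the bytes could be decoded as follows. `DecodePcm16Le` is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// ASSUMPTION: the payload is 16-bit signed little-endian PCM. The format
// is not documented on this page, so treat this as a placeholder decoder.
// A trailing odd byte, if any, is ignored.
std::vector<int16_t> DecodePcm16Le(const std::vector<uint8_t>& bytes) {
  std::vector<int16_t> samples;
  samples.reserve(bytes.size() / 2);
  for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
    // Low byte first, then high byte shifted into the upper 8 bits.
    const uint16_t raw =
        static_cast<uint16_t>(bytes[i] | (bytes[i + 1] << 8));
    samples.push_back(static_cast<int16_t>(raw));
  }
  return samples;
}
```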