Audio Control Service

2025-01-27

Provides robot system audio control, including TTS synthesis, audio playback, and audio configuration.

API Definition

AudioController is responsible for robot audio control, including TTS synthesis, audio playback, and audio configuration.

`AudioController` — Audio Controller

Item	Description
Class Name	`AudioController`
Overview	Robot audio control, including TTS synthesis, audio playback, and audio configuration
Main Features	Audio playback, volume control, TTS model switching, raw audio data subscription
Use Cases	Voice interaction, audio playback, speech recognition

initialize

Item	Description
Method Name	`initialize`
Declaration	`bool initialize()`
Overview	Initialize the audio controller.
Return Value	`true` for success, `false` for failure.
Note	Must be called before first use.

shutdown

Item	Description
Method Name	`shutdown`
Declaration	`void shutdown()`
Overview	Shut down the audio controller.
Note	Call once when finished; every successful `initialize` should be paired with a `shutdown`.
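The pairing of `initialize` and `shutdown` can be sketched as a context manager. This is a minimal illustration, not the SDK's own helper: `FakeAudioController` is a hypothetical stand-in that only mimics the two documented methods, and `audio_session` is a name introduced here.

```python
from contextlib import contextmanager


class FakeAudioController:
    """Hypothetical stand-in for AudioController (not the real binding)."""

    def __init__(self, init_ok=True):
        self._init_ok = init_ok
        self.calls = []

    def initialize(self):
        self.calls.append("initialize")
        return self._init_ok

    def shutdown(self):
        self.calls.append("shutdown")


@contextmanager
def audio_session(controller):
    """Initialize on entry and always shut down on exit."""
    if not controller.initialize():
        raise RuntimeError("AudioController.initialize() returned False")
    try:
        yield controller
    finally:
        controller.shutdown()


controller = FakeAudioController()
with audio_session(controller) as audio:
    pass  # call audio.play(...), audio.set_volume(...), and so on here
```

With the real controller substituted for the stand-in, `shutdown` still runs if the body of the `with` block raises.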

play

Item	Description
Method Name	`play`
Declaration	`Status play(TtsCommand command)`
Overview	Play a TTS (Text-to-Speech) command.
Parameters	`command`: TTS command, including the command ID, text content, priority, and mode (see `TtsCommand`).
Return Value	`Status::OK` for success, others for failure.
Note	Blocking interface. Ensure the module is initialized before calling.
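A rough sketch of how a `TtsCommand` might be assembled and passed to `play`. The classes below are pure-Python stand-ins that mirror the field names and enum values documented on this page; they are not the real bindings. The default priority and mode, the `say` helper, and the `"OK"` placeholder return value are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


# Hypothetical pure-Python mirrors of the documented enums and struct.
class TtsPriority(IntEnum):
    HIGH = 0
    MIDDLE = 1
    LOW = 2


class TtsMode(IntEnum):
    CLEARTOP = 0
    ADD = 1
    CLEARBUFFER = 2


@dataclass
class TtsCommand:
    id: int = 0
    content: str = ""
    priority: TtsPriority = TtsPriority.MIDDLE  # default is an assumption
    mode: TtsMode = TtsMode.ADD  # default is an assumption


class FakeAudioController:
    """Stand-in that records commands; the real play() returns a Status."""

    def __init__(self):
        self.played = []

    def play(self, command):
        self.played.append(command)
        return "OK"  # placeholder for Status::OK


def say(controller, command_id, text,
        priority=TtsPriority.MIDDLE, mode=TtsMode.ADD):
    """Build a TtsCommand from plain arguments and hand it to play()."""
    if not text.strip():
        raise ValueError("TTS content must not be empty")
    command = TtsCommand(id=command_id, content=text,
                         priority=priority, mode=mode)
    return controller.play(command)


controller = FakeAudioController()
result = say(controller, 1, "Hello", priority=TtsPriority.HIGH)
```

Rejecting empty text before calling `play` is a design choice of this sketch; the page does not say how the service treats empty content.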

stop

Item	Description
Method Name	`stop`
Declaration	`Status stop()`
Overview	Stop current audio playback.
Return Value	`Status::OK` for success, others for failure.
Note	Blocking interface. Typically used to interrupt the current speech.

set_volume

Item	Description
Method Name	`set_volume`
Declaration	`Status set_volume(int volume)`
Overview	Set the audio output volume.
Parameters	`volume`: Volume value, usually in the range 0 to 100.
Return Value	`Status::OK` for success, others for failure.
Note	Blocking interface. The new volume takes effect immediately.

get_volume

Item	Description
Method Name	`get_volume`
Declaration	`int get_volume()`
Overview	Get the current audio output volume.
Return Value	Current volume value, returns -1 on failure.
Note	Non-blocking interface.
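The two volume calls can be combined into a relative adjustment. The sketch below assumes the usual 0 to 100 range mentioned above and treats the documented `-1` return of `get_volume` as a failure. `FakeAudioController`, `clamp_volume`, and `change_volume` are hypothetical names introduced here, and the stand-in returns a plain string where the real `set_volume` returns a `Status`.

```python
class FakeAudioController:
    """Hypothetical stand-in exposing only the two documented volume calls."""

    def __init__(self, volume=50):
        self._volume = volume

    def set_volume(self, volume):
        self._volume = volume
        return "OK"  # placeholder for Status::OK

    def get_volume(self):
        return self._volume


def clamp_volume(volume, low=0, high=100):
    """Clamp to the assumed valid range before sending it to the robot."""
    return max(low, min(high, int(volume)))


def change_volume(controller, delta):
    """Raise or lower the volume by delta and return the value that was set."""
    current = controller.get_volume()
    if current < 0:  # get_volume() is documented to return -1 on failure
        raise RuntimeError("get_volume() failed")
    target = clamp_volume(current + delta)
    controller.set_volume(target)
    return target


controller = FakeAudioController(volume=90)
new_volume = change_volume(controller, 25)
```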

switch_tts_voice_model

Item	Description
Method Name	`switch_tts_voice_model`
Declaration	`Status switch_tts_voice_model(TtsType tts_type, GetSpeechConfig config)`
Overview	Switch TTS voice model and get updated configuration.
Parameters	`tts_type`: TTS model type to switch to; `config`: Speech configuration object
Return Value	`Status::OK` for success, others for failure.
Note	Blocking interface. The configuration is updated automatically after switching.

get_voice_config

Item	Description
Method Name	`get_voice_config`
Declaration	`GetSpeechConfig get_voice_config()`
Overview	Get the complete configuration of the speech system.
Return Value	Speech system configuration, returns empty config object on failure.
Note	Blocking interface. The returned configuration includes all sub-configs, such as speaker, bot, wakeup, and dialog.

set_voice_config

Item	Description
Method Name	`set_voice_config`
Declaration	`Status set_voice_config(SetSpeechConfig config)`
Overview	Set the complete configuration of the speech system.
Parameters	`config`: Speech system configuration to set.
Return Value	`Status::OK` for success, others for failure.
Note	Blocking interface. The supplied configuration completely overwrites all current speech-related settings.
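Because `set_voice_config` overwrites every speech-related setting, one cautious pattern is read-modify-write: start from the values returned by `get_voice_config` and change only what is needed. The sketch below uses pure-Python stand-ins that mirror a subset of the documented field names; they are not the real bindings. The field defaults and the `to_set_config` helper are assumptions. `bot_id` is passed explicitly because this page does not show where the current bot ID appears in `GetSpeechConfig`, and `custom_bot` is omitted for brevity.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins mirroring a subset of the documented structs.
@dataclass
class SpeakerConfigSelected:
    speaker_id: str = ""
    region: str = ""


@dataclass
class SpeakerConfig:
    selected: SpeakerConfigSelected = field(default_factory=SpeakerConfigSelected)
    speaker_speed: float = 1.0  # default is an assumption


@dataclass
class WakeupConfig:
    name: str = ""


@dataclass
class DialogConfig:
    is_front_doa: bool = False
    is_fullduplex_enable: bool = False
    is_enable: bool = False
    is_doa_enable: bool = False


@dataclass
class GetSpeechConfig:
    speaker_config: SpeakerConfig = field(default_factory=SpeakerConfig)
    wakeup_config: WakeupConfig = field(default_factory=WakeupConfig)
    dialog_config: DialogConfig = field(default_factory=DialogConfig)


@dataclass
class SetSpeechConfig:
    speaker_id: str = ""
    region: str = ""
    bot_id: str = ""
    is_front_doa: bool = False
    is_fullduplex_enable: bool = False
    is_enable: bool = False
    is_doa_enable: bool = False
    speaker_speed: float = 1.0
    wakeup_name: str = ""


def to_set_config(current, bot_id, **overrides):
    """Copy current settings into a SetSpeechConfig, then apply overrides."""
    speaker = current.speaker_config
    dialog = current.dialog_config
    config = SetSpeechConfig(
        speaker_id=speaker.selected.speaker_id,
        region=speaker.selected.region,
        bot_id=bot_id,
        is_front_doa=dialog.is_front_doa,
        is_fullduplex_enable=dialog.is_fullduplex_enable,
        is_enable=dialog.is_enable,
        is_doa_enable=dialog.is_doa_enable,
        speaker_speed=speaker.speaker_speed,
        wakeup_name=current.wakeup_config.name,
    )
    for name, value in overrides.items():
        if not hasattr(config, name):
            raise AttributeError(f"unknown SetSpeechConfig field: {name}")
        setattr(config, name, value)
    return config


current = GetSpeechConfig()
current.speaker_config.selected.speaker_id = "speaker_1"  # made-up value
current.dialog_config.is_enable = True
new_config = to_set_config(current, bot_id="bot_1", speaker_speed=1.2)
```

Only `speaker_speed` differs from the current state in `new_config`; every other field is carried over, so the overwrite does not silently reset it.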

control_voice_stream

Item	Description
Method Name	`control_voice_stream`
Declaration	`Status control_voice_stream(bool raw_data, bool bf_data)`
Overview	Control the voice data stream.
Parameters	`raw_data`: Whether to send raw data; `bf_data`: Whether to send BF data
Return Value	`Status::OK` for success, others for failure.
Note	Blocking interface. Used to start and stop the voice data stream.

subscribe_origin_voice_data

Item	Description
Method Name	`subscribe_origin_voice_data`
Declaration	`void subscribe_origin_voice_data(callback)`
Overview	Subscribe to raw voice data.
Parameters	`callback`: Callback to handle received raw voice data. Function signature: `callback(data: ByteMultiArray) -> None`
Note	Non-blocking interface. The callback is invoked whenever data is updated.

subscribe_bf_voice_data

Item	Description
Method Name	`subscribe_bf_voice_data`
Declaration	`void subscribe_bf_voice_data(callback)`
Overview	Subscribe to BF voice data.
Parameters	`callback`: Callback to handle received BF voice data. Function signature: `callback(data: ByteMultiArray) -> None`
Note	Non-blocking interface. The callback is invoked whenever data is updated.
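A subscription callback is usually best kept short, handing the data to another part of the program. The sketch below copies each chunk into a bounded queue. It assumes the `ByteMultiArray` payload is exposed through a `.data` attribute, which this page does not confirm. `FakeByteMultiArray`, `FakeAudioController`, and its `emit` method are hypothetical stand-ins used only to drive the callback. With the real controller, the stream would presumably need to be enabled through `control_voice_stream` first.

```python
import queue


class FakeByteMultiArray:
    """Hypothetical stand-in; assumes the payload is exposed as `.data`."""

    def __init__(self, data):
        self.data = data


class FakeAudioController:
    """Stand-in that stores callbacks and lets the example push data."""

    def __init__(self):
        self._origin_callbacks = []

    def subscribe_origin_voice_data(self, callback):
        self._origin_callbacks.append(callback)

    def emit(self, payload):  # helper for this sketch, not a documented method
        for callback in self._origin_callbacks:
            callback(FakeByteMultiArray(payload))


chunks = queue.Queue(maxsize=256)


def on_origin_voice_data(msg):
    """Copy the bytes out and return quickly; never block in the callback."""
    try:
        chunks.put_nowait(bytes(msg.data))
    except queue.Full:
        pass  # drop the chunk rather than stall the delivery thread


controller = FakeAudioController()
controller.subscribe_origin_voice_data(on_origin_voice_data)
controller.emit(b"\x01\x02")
controller.emit([3, 4])
```

A consumer thread can then drain `chunks` at its own pace. The same pattern would apply to `subscribe_bf_voice_data`.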

Enum Type Definitions

`TtsPriority` — TTS Priority Enum

Enum Value	Value	Description
`HIGH`	0	High priority
`MIDDLE`	1	Medium priority
`LOW`	2	Low priority

`TtsMode` — TTS Mode Enum

Enum Value	Value	Description
`CLEARTOP`	0	Clear top
`ADD`	1	Add
`CLEARBUFFER`	2	Clear buffer

`TtsType` — TTS Type Enum

Enum Value	Value	Description
`NONE`	0	None
`DOUBAO`	1	Doubao
`GOOGLE`	2	Google

Data Structure Definitions

`S2DString` — 2D String Array

Item	Description
Type	Python binding of `std::array<std::string, 2>`
Overview	Fixed-size array holding exactly two strings (a string pair)
Main Methods	Supports index access, iteration, length query, fixed length 2
Use Cases	Key-value storage, configuration parameters, status identification

`S2DStringVector` — 2D String Array Vector

Item	Description
Type	Python binding of `std::vector<std::array<std::string, 2>>`
Overview	Variable-length 2D string array, each element is an array of two strings
Main Methods	Supports index access, iteration, length query, dynamic add/delete, each element length is 2
Use Cases	Batch key-value storage, configuration parameter sets, status identification lists

`String2DStringVectorMap` — String to 2D String Array Vector Map

Item	Description
Type	Python binding of `std::map<std::string, std::vector<std::array<std::string, 2>>>`
Overview	Map from string key to 2D string array vector, used for configuration parameters
Main Methods	Supports standard Python dict operations: key-value access, iteration, length query, etc.
Use Cases	Speaker configuration, bot configuration, system parameter storage
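Since these containers support index access, iteration, and length queries, they can be converted to plain Python structures for easier handling. The sketch below uses ordinary lists and dicts in place of the bound types and treats each two-string element as a (key, value) pair. That reading is an assumption, as are the sample values and the `pairs_to_dict` helper.

```python
def pairs_to_dict(pairs):
    """Convert an S2DStringVector-like sequence of 2-element items to a dict."""
    result = {}
    for item in pairs:
        if len(item) != 2:
            raise ValueError("each element must contain exactly two strings")
        key, value = item[0], item[1]  # uses only index access
        result[key] = value
    return result


# Plain-Python stand-in for a String2DStringVectorMap; the values are made up.
speaker_data = {
    "region_a": [["id_1", "Voice One"], ["id_2", "Voice Two"]],
    "region_b": [["id_3", "Voice Three"]],
}

by_region = {region: pairs_to_dict(pairs)
             for region, pairs in speaker_data.items()}
```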

`StringBotInfoMap` — String to Bot Info Map

Item	Description
Type	Python binding of `std::map<std::string, magic::dog::BotInfo>`
Overview	Map from string key to bot info, used to manage multiple bot configurations
Main Methods	Supports standard Python dict operations, each value is a BotInfo struct
Use Cases	Bot management, configuration storage, multi-bot systems

`StringCustomBotMap` — Custom Bot Map

Item	Description
Type	Python binding of `std::map<std::string, magic::dog::CustomBotInfo>`
Overview	Map from string key to custom bot info, used to manage user-defined bots
Main Methods	Supports standard Python dict operations, each value is a CustomBotInfo struct
Use Cases	Custom bot management, user configuration storage

`StringStringMap` — String to String Map

Item	Description
Type	Python binding of `std::map<std::string, std::string>`
Overview	Simple mapping from string key to string value, used for configuration parameters
Main Methods	Supports standard Python dict operations, both key and value are strings
Use Cases	Simple configuration storage, parameter mapping, status identification

`TtsCommand` — TTS Command Struct

Field Name	Type	Description
`id`	`int`	Command ID
`content`	`str`	Content
`priority`	`TtsPriority`	Priority
`mode`	`TtsMode`	Mode

`GetSpeechConfig` — Speech Config Get Struct

Field Name	Type	Description
`tts_type`	`TtsType`	TTS Type
`speaker_config`	`SpeakerConfig`	Speaker configuration
`bot_config`	`BotConfig`	Bot configuration
`wakeup_config`	`WakeupConfig`	Wakeup configuration
`dialog_config`	`DialogConfig`	Dialog configuration

`SetSpeechConfig` — Speech Config Set Struct

Field Name	Type	Description
`speaker_id`	`str`	Speaker ID
`region`	`str`	Speaker region
`bot_id`	`str`	Bot ID
`is_front_doa`	`bool`	Enable front DOA
`is_fullduplex_enable`	`bool`	Enable full-duplex dialog
`is_enable`	`bool`	Enable dialog function
`is_doa_enable`	`bool`	Enable DOA
`speaker_speed`	`float`	Speaker speed
`wakeup_name`	`str`	Wakeup word name
`custom_bot`	`CustomBotInfo`	Custom bot info

`SpeakerConfig` — Speaker Config Struct

Field Name	Type	Description
`data`	`MapStringVectorArray2DString`	Speaker data map
`selected`	`SpeakerConfigSelected`	Selected speaker config
`speaker_speed`	`float`	Speaker speed

`BotConfig` — Bot Config Struct

Field Name	Type	Description
`data`	`MapStringBotInfo`	Bot data map
`custom_data`	`CustomBotMap`	Custom bot data
`selected`	`BotInfo`	Selected bot

`WakeupConfig` — Wakeup Config Struct

Field Name	Type	Description
`name`	`str`	Wakeup word name
`data`	`MapStringString`	Wakeup data map

`DialogConfig` — Dialog Config Struct

Field Name	Type	Description
`is_front_doa`	`bool`	Enable front DOA
`is_fullduplex_enable`	`bool`	Enable full-duplex dialog
`is_enable`	`bool`	Enable dialog function
`is_doa_enable`	`bool`	Enable DOA

`SpeakerConfigSelected` — Speaker Config Selected Struct

Field Name	Type	Description
`speaker_id`	`str`	Speaker ID
`region`	`str`	Speaker region

`BotInfo` — Bot Info Struct

Field Name	Type	Description
`name`	`str`	Bot name
`workflow`	`str`	Workflow ID

`CustomBotInfo` — Custom Bot Info Struct

Field Name	Type	Description
`name`	`str`	Custom bot name
`workflow`	`str`	Custom workflow ID
`token`	`str`	Access token