Introduction
Smart home systems that combine voice, gesture, and sensor inputs face a fundamental integration challenge: each modality produces different output formats, but actuators need simple, consistent commands.
The solution is token normalization: every input is mapped to a canonical vocabulary before it reaches the dispatcher.
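As a rough illustration of what normalization can look like, the sketch below maps raw recognizer output to canonical tokens with plain lookup tables. The function name `normalize`, the spoken phrases, and the gesture labels are all assumptions made for this example; only the tokens `fan_on`, `mode_away`, and `door_open` come from the system described here.

```python
from typing import Optional

# Hypothetical raw-input tables; phrases and gesture labels are illustrative only.
VOICE_PHRASES = {
    "turn on the fan": "fan_on",
    "fan on": "fan_on",
    "i am leaving": "mode_away",
    "open the door": "door_open",
}
GESTURE_LABELS = {
    "open_palm": "fan_on",
    "wave": "mode_away",
    "thumbs_up": "door_open",
}


def normalize(source: str, raw: str) -> Optional[str]:
    """Map a raw recognizer output to a canonical token, or None if unknown."""
    # Lowercase and collapse whitespace so minor transcript variations still match.
    key = " ".join(raw.lower().split())
    if source == "voice":
        return VOICE_PHRASES.get(key)
    if source == "gesture":
        return GESTURE_LABELS.get(key)
    return None  # unrecognized modality
```

Returning `None` for unknown input keeps the decision about unrecognized commands with the caller, so nothing outside the canonical vocabulary reaches the dispatcher.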
Architecture
In the Sentinel Edge system, voice (Vosk) and gesture (MediaPipe) publishers both emit tokens like fan_on, mode_away, and door_open to the same MQTT topic. The dispatcher's ActionRouter FSM consumes these tokens and routes them to actuator-specific topics.
import json
# Both publishers emit the same format.
# `client` is assumed to be an already-connected MQTT client.
client.publish('sentinel/commands', json.dumps({
    'token': 'fan_on',
    'source': 'voice',  # or 'gesture'
    'confidence': 0.95
}))
Guardian FSM
The ActionRouter implements six modes: HOME, AWAY, SLEEP, GUEST, ALERT, and EMERGENCY. Mode transitions are triggered by tokens (such as mode_away) or by sensor events (such as gas_detected). On entering ALERT, the router issues an automatic response (fan_on) and starts a 30-second countdown; if the alert is not cancelled in that time, it escalates to EMERGENCY. Four of the modes behave as follows:
- HOME: all commands allowed
- AWAY: security-focused, limited actuators
- ALERT: auto-response + escalation countdown
- EMERGENCY: all safety actuators activated
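The mode logic above can be sketched as a small state machine. This is a minimal sketch, not the project's actual ActionRouter: the method names, the `alert_cancel` token, the mode tokens other than `mode_away`, and the choice to return to the previous mode on cancel are assumptions. Time is passed in explicitly so the 30-second countdown can be exercised without waiting.

```python
import enum


class Mode(enum.Enum):
    HOME = "home"
    AWAY = "away"
    SLEEP = "sleep"
    GUEST = "guest"
    ALERT = "alert"
    EMERGENCY = "emergency"


ALERT_TIMEOUT_S = 30.0  # countdown length stated in the text


class ActionRouter:
    """Sketch of the mode FSM; names and cancel behavior are assumptions."""

    def __init__(self):
        self.mode = Mode.HOME
        self.previous = Mode.HOME
        self.deadline = None  # time at which ALERT escalates
        self.emitted = []     # tokens the router issued on its own

    def on_sensor(self, event, now):
        # Only gas_detected is handled here; other sensor events are ignored.
        if event == "gas_detected" and self.mode not in (Mode.ALERT, Mode.EMERGENCY):
            self.previous = self.mode
            self.mode = Mode.ALERT
            self.deadline = now + ALERT_TIMEOUT_S
            self.emitted.append("fan_on")  # automatic response

    def on_token(self, token, now):
        self.tick(now)  # apply any pending escalation first
        if token == "alert_cancel" and self.mode is Mode.ALERT:  # hypothetical token
            self.mode = self.previous
            self.deadline = None
        elif token.startswith("mode_") and self.mode not in (Mode.ALERT, Mode.EMERGENCY):
            name = token[len("mode_"):].upper()
            if name in ("HOME", "AWAY", "SLEEP", "GUEST"):
                self.mode = Mode[name]

    def tick(self, now):
        """Escalate to EMERGENCY once the ALERT countdown has expired."""
        if self.mode is Mode.ALERT and now >= self.deadline:
            self.mode = Mode.EMERGENCY
            self.deadline = None
```

In this sketch, mode tokens are ignored while the router is in ALERT or EMERGENCY, so a stray mode_away cannot silence an active alert. Whether the real system behaves that way is not stated here.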
E2E Testing
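Integration tests publish tokens to MQTT and verify both the messages that appear on actuator topics and the resulting FSM state transitions. Because they exercise the full path from publisher to actuator, they catch routing bugs that unit tests miss.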
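The shape of such a test might look like the sketch below. So that it runs without a broker, it uses an in-memory stand-in (`FakeBroker`) and a deliberately minimal `Dispatcher`. Both classes, the `ROUTES` table, and the `sentinel/actuators/...` topic names are assumptions for illustration; only the `sentinel/commands` topic and the message format come from the system described here.

```python
import json
from collections import defaultdict


class FakeBroker:
    """In-memory stand-in for an MQTT broker (exact-match topics only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)
        self.log = []  # every (topic, payload) published, in order

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload):
        self.log.append((topic, payload))
        for callback in list(self.subscribers[topic]):
            callback(topic, payload)


# Hypothetical token -> (actuator topic, command) table.
ROUTES = {
    "fan_on": ("sentinel/actuators/fan", "on"),
    "door_open": ("sentinel/actuators/door", "open"),
}


class Dispatcher:
    """Minimal dispatcher under test: tracks mode and forwards routable tokens."""

    def __init__(self, broker):
        self.broker = broker
        self.mode = "HOME"
        broker.subscribe("sentinel/commands", self.on_message)

    def on_message(self, topic, payload):
        token = json.loads(payload)["token"]
        if token == "mode_away":
            self.mode = "AWAY"
        elif token in ROUTES:
            actuator_topic, command = ROUTES[token]
            self.broker.publish(actuator_topic, json.dumps({"command": command}))


def test_fan_on_is_routed():
    broker = FakeBroker()
    dispatcher = Dispatcher(broker)
    broker.publish("sentinel/commands", json.dumps(
        {"token": "fan_on", "source": "voice", "confidence": 0.95}))
    # The token should surface as a command on the fan's actuator topic.
    assert ("sentinel/actuators/fan", json.dumps({"command": "on"})) in broker.log
    assert dispatcher.mode == "HOME"


def test_mode_away_changes_state_only():
    broker = FakeBroker()
    dispatcher = Dispatcher(broker)
    broker.publish("sentinel/commands", json.dumps(
        {"token": "mode_away", "source": "gesture", "confidence": 0.9}))
    # A mode token changes FSM state but produces no actuator message.
    assert dispatcher.mode == "AWAY"
    assert [topic for topic, _ in broker.log] == ["sentinel/commands"]
```

Against a live broker the assertions stay the same, but the test would subscribe to the actuator topic first and wait for the message with a timeout, since delivery is asynchronous.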
