Apex Aide

From Audio to Action: How Speech Invocable Action Powers Native AI Automation Across Salesforce

By Yaheli Salina · Salesforce Engineering Blog · Advanced · Developer · 7 min read
Summary

Salesforce’s Agentforce Speech Foundations team created the Speech Invocable Action, a native AI-powered speech automation tool that integrates speech-to-text, text-to-speech, and translation directly within the Salesforce platform. Because audio is processed inside Salesforce’s trust boundary rather than sent to third-party services, the approach suits compliance-sensitive enterprise environments. The solution manages multi-tenant resource constraints, relies on structured error handling for reliability, and makes voice automation accessible to admins and developers through Flow and Agentforce actions without complex coding. AI development tools also accelerated work within Salesforce’s complex architecture, enabling a smaller team to deliver robust native speech capabilities quickly.

Takeaways
  • Use Speech Invocable Action to embed secure native speech automation in Salesforce Flows.
  • Design speech automation with structured error handling for reliable synchronous execution.
  • Manage multi-tenant resource limits by tuning speech actions for efficient platform usage.
  • Leverage AI coding tools to accelerate development in complex Salesforce platform codebases.
  • Integrate speech-to-text, text-to-speech, and translation as standard reusable actions.

By Yaheli Salina, Karthik Prabhu, and Omri Alon.

In our Engineering Energizers Q&A series, we highlight the engineering minds driving innovation across Salesforce. Today, we spotlight Software Engineer Yaheli Salina. She and her Agentforce Speech Foundations team developed Speech Invocable Action, a new AI tool that packages speech capabilities as standardized, repeatable actions and makes secure, native speech automation available across the Salesforce ecosystem from inside the platform trust boundary. Explore how the team built speech-to-text as a primary action under rigid multi-tenant constraints, engineered protective barriers to stop automation errors from impacting Flows and Agentforce actions, and used AI tools to speed up architectural exploration while maintaining high security standards.
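To make the idea of "protective barriers" around a synchronous speech action more concrete, here is a minimal Python sketch of structured error handling. It is not Salesforce's implementation or API: `SpeechResult`, `SpeechErrorCode`, `transcribe_action`, the `engine` callable, and the `MAX_AUDIO_BYTES` cap are all hypothetical names chosen for illustration. The point is only that the action returns a typed result instead of raising, so a calling automation can branch on the outcome.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class SpeechErrorCode(Enum):
    """Hypothetical error categories a caller could branch on."""
    NONE = "NONE"
    INVALID_INPUT = "INVALID_INPUT"
    LIMIT_EXCEEDED = "LIMIT_EXCEEDED"
    SERVICE_ERROR = "SERVICE_ERROR"


@dataclass(frozen=True)
class SpeechResult:
    """Structured outcome: always returned, never raised."""
    success: bool
    transcript: Optional[str] = None
    error_code: SpeechErrorCode = SpeechErrorCode.NONE
    error_message: str = ""


# Assumed per-request cap standing in for a multi-tenant resource limit.
MAX_AUDIO_BYTES = 1_000_000


def transcribe_action(audio: bytes, engine: Callable[[bytes], str]) -> SpeechResult:
    """Run a speech-to-text engine and convert every failure into a result.

    `engine` is a placeholder for whatever performs the transcription.
    """
    if not audio:
        return SpeechResult(
            False,
            error_code=SpeechErrorCode.INVALID_INPUT,
            error_message="Audio payload is empty.",
        )
    if len(audio) > MAX_AUDIO_BYTES:
        return SpeechResult(
            False,
            error_code=SpeechErrorCode.LIMIT_EXCEEDED,
            error_message=f"Audio exceeds {MAX_AUDIO_BYTES} bytes.",
        )
    try:
        transcript = engine(audio)
    except Exception as exc:  # barrier: engine failures never reach the caller
        return SpeechResult(
            False,
            error_code=SpeechErrorCode.SERVICE_ERROR,
            error_message=str(exc),
        )
    return SpeechResult(True, transcript=transcript)
```

A caller would inspect `result.success` and `result.error_code` to decide whether to continue, retry, or follow a fault branch, rather than wrapping each invocation in its own exception handling.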

Agentforce · Autonomous & Agentic Workflows · Flow Builder · Artificial Intelligence