How to set up speech recognition in Swift

A flower that represents Swift next to one that represents Xcode. Beneath it sits the text "Speech Recognition."

A step-by-step tutorial on setting up an iOS app to listen to and transcribe words in real time.

The following tutorial uses Apple's Speech framework and is based on Apple's Recognizing Speech in Live Audio Open Source project, whose link can be found below.

We recommend that you clone our Open Source Swift Starter Project, check out the main branch, and carry out the steps below. The finished changes can be found on the tutorial/voice branch.

git clone git@github.com:delasign/swift-starter-project.git

Step One: Create the SpeechCoordinator

A screenshot of Xcode showing the SpeechCoordinator.swift, SpeechCoordinator+Availability.swift and SpeechCoordinator+Transcription.swift.

The following step walks you through how to create the necessary files and structure for the SpeechCoordinator.

A | Create the SpeechCoordinator folder

A screenshot of Xcode showing the SpeechCoordinator folder under the Coordinators folder.

Under the Coordinators folder, create a new group (folder) called SpeechCoordinator by right-clicking the Coordinators folder and selecting New Group.

B | Create the SpeechCoordinator base file

A screenshot of Xcode showing the SpeechCoordinator.swift under the SpeechCoordinator folder. The code is available below.

In the newly created SpeechCoordinator folder, create a new file called SpeechCoordinator.swift and paste in the code below.
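As a rough guide to what this base file could hold, here is a minimal sketch modeled on the properties used in Apple's Recognizing Speech in Live Audio sample. Only SpeechCoordinator.shared and initialize() come from this tutorial; the property names, the "en-US" locale and the identifier constant are assumptions, so the file in the starter project may look different.

```swift
import Foundation
import Speech
import AVFoundation

// Hypothetical sketch of SpeechCoordinator.swift; not the starter project's actual file.
final class SpeechCoordinator: NSObject {
    // Singleton, matching the SpeechCoordinator.shared call used in this tutorial.
    static let shared = SpeechCoordinator()

    // Log prefix (assumed) so console output can be filtered by "[SpeechCoordinator]".
    static let identifier = "[SpeechCoordinator]"

    // The initializer returns nil when the locale is unsupported. "en-US" is an assumption.
    let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))

    // Supplies microphone buffers to the recognition request.
    let audioEngine = AVAudioEngine()

    // Live request and task; both stay nil while nothing is being transcribed.
    var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    var recognitionTask: SFSpeechRecognitionTask?

    // Private so that the shared instance is the only one created.
    private override init() {
        super.init()
    }

    // Called once from setupCoordinators in the view controller.
    func initialize() {
        print("\(SpeechCoordinator.identifier) Initialized. Recognizer created: \(speechRecognizer != nil)")
    }
}
```

The stored properties are left at the default internal access level so that extensions in other files can reach them.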

C | Add the Availability extension

A screenshot of Xcode showing the SpeechCoordinator+Availability.swift extension. Code is available below.

In the SpeechCoordinator folder, create a new file called SpeechCoordinator+Availability.swift and paste in the code below.
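One way the availability extension might look is sketched below. It uses SFSpeechRecognizer.requestAuthorization and the SFSpeechRecognizerDelegate availability callback from Apple's Speech framework. The method names requestAuthorization(completion:) and describe(_:) are hypothetical, and a small stand-in class is included only so the snippet compiles on its own.

```swift
import Foundation
import Speech

// Stand-in for SpeechCoordinator.swift so this sketch compiles alone (assumed shape).
final class SpeechCoordinator: NSObject {
    static let shared = SpeechCoordinator()
    let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
}

extension SpeechCoordinator: SFSpeechRecognizerDelegate {
    // Hypothetical name: asks the user for speech recognition permission.
    func requestAuthorization(completion: @escaping (Bool) -> Void) {
        speechRecognizer?.delegate = self
        SFSpeechRecognizer.requestAuthorization { status in
            // The callback may arrive off the main queue, so hop back before reporting.
            DispatchQueue.main.async {
                print("[SpeechCoordinator] Authorization: \(SpeechCoordinator.describe(status))")
                completion(status == .authorized)
            }
        }
    }

    // Hypothetical helper: turns the status enum into readable text for logging.
    static func describe(_ status: SFSpeechRecognizerAuthorizationStatus) -> String {
        switch status {
        case .authorized: return "authorized"
        case .denied: return "denied"
        case .restricted: return "restricted"
        case .notDetermined: return "not determined"
        @unknown default: return "unknown"
        }
    }

    // Delegate callback fired when the recognizer becomes available or unavailable.
    func speechRecognizer(_ speechRecognizer: SFSpeechRecognizer, availabilityDidChange available: Bool) {
        print("[SpeechCoordinator] Recognizer available: \(available)")
    }
}
```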

D | Add the transcription extension

A screenshot of Xcode showing the Transcription Extension for the SpeechCoordinator. Code available below.

In the SpeechCoordinator folder, create a new file called SpeechCoordinator+Transcription.swift and paste in the code below.
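The transcription extension could follow the flow of Apple's Recognizing Speech in Live Audio sample: configure the audio session, create a buffer request, start a recognition task, and feed it microphone buffers through a tap on the input node. The sketch below is written under that assumption. The names startTranscribing() and stopTranscribing() are hypothetical, and the stand-in class is there only so the snippet compiles by itself.

```swift
import Foundation
import Speech
import AVFoundation

// Stand-in for SpeechCoordinator.swift so this sketch compiles alone (assumed shape).
final class SpeechCoordinator: NSObject {
    static let shared = SpeechCoordinator()
    let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    let audioEngine = AVAudioEngine()
    var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    var recognitionTask: SFSpeechRecognitionTask?
}

extension SpeechCoordinator {
    // Hypothetical name: starts streaming microphone audio to the recognizer.
    func startTranscribing() throws {
        // Do nothing if a session is already running; a second tap on the bus would fail.
        guard !audioEngine.isRunning else { return }
        guard let speechRecognizer = speechRecognizer, speechRecognizer.isAvailable else {
            print("[SpeechCoordinator] Recognizer unavailable")
            return
        }

        // Configure the shared audio session for recording.
        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
        try audioSession.setActive(true, options: .notifyOthersOnDeactivation)

        // Partial results are what make the transcription appear in real time.
        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true
        recognitionRequest = request

        recognitionTask = speechRecognizer.recognitionTask(with: request) { [weak self] result, error in
            if let result = result {
                print("[SpeechCoordinator] \(result.bestTranscription.formattedString)")
            }
            // Tear down on an error or once the final result has arrived.
            if error != nil || result?.isFinal == true {
                self?.stopTranscribing()
            }
        }

        // Forward every captured buffer to the recognition request.
        let inputNode = audioEngine.inputNode
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { [weak self] buffer, _ in
            self?.recognitionRequest?.append(buffer)
        }

        audioEngine.prepare()
        try audioEngine.start()
    }

    // Hypothetical name: stops the engine and releases the request and task.
    func stopTranscribing() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        recognitionRequest?.endAudio()
        recognitionRequest = nil
        recognitionTask = nil
    }
}
```

Because every transcription is printed with the [SpeechCoordinator] prefix, the output can be isolated with the console filter.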

Step Two: Initialize the Coordinator

A screenshot of Xcode with the ViewController.swift on display. We have highlighted that we have initialized the SpeechCoordinator under the setupCoordinators function. This code is available below.

Navigate to ViewController.swift, found under the RootViewController folder, and within the setupCoordinators function, initialize the SpeechCoordinator using the line below.

SpeechCoordinator.shared.initialize()

Step Three: Request Authorization

A screenshot of Xcode with the ViewController.swift file on display. We have highlighted that we have created a new viewDidAppear function and requested authorization from the SpeechCoordinator every time it is called.

Navigate to ViewController.swift and, below the viewDidLoad function, add a viewDidAppear function that requests authorization from the SpeechCoordinator, using the code found below.
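A minimal sketch of such a viewDidAppear override follows. The requestAuthorization(completion:) method is a hypothetical name, and the stand-in SpeechCoordinator is included only so the snippet compiles on its own; the starter project's view controller will contain more than this.

```swift
import UIKit
import Speech

// Stand-in for the coordinator so this sketch compiles alone (assumed shape).
final class SpeechCoordinator {
    static let shared = SpeechCoordinator()

    // Hypothetical name: reports whether speech recognition was authorized.
    func requestAuthorization(completion: @escaping (Bool) -> Void) {
        SFSpeechRecognizer.requestAuthorization { status in
            DispatchQueue.main.async { completion(status == .authorized) }
        }
    }
}

final class ViewController: UIViewController {
    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)
        // The system shows its permission prompt only while the status is undetermined;
        // later calls simply report the stored decision.
        SpeechCoordinator.shared.requestAuthorization { granted in
            print("[SpeechCoordinator] Speech recognition authorized: \(granted)")
        }
    }
}
```

Requesting in viewDidAppear rather than viewDidLoad means the view is already on screen when the permission prompt is presented.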

Step Four: Update Permissions

A screenshot of Xcode showing the Info.plist. We have highlighted the two items that you must add. These are detailed below.

Open the project's Info.plist and add two rows: one for the NSMicrophoneUsageDescription key and one for the NSSpeechRecognitionUsageDescription key. Give each a description explaining why the app needs the microphone and speech recognition. Both keys are required for speech recognition to work.
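To catch a missing or empty row early, a small debug check along these lines may help. It is only a sketch: the function names are hypothetical and nothing like it is part of the starter project as far as this tutorial shows.

```swift
import Foundation

// The two Info.plist keys named in this step.
let requiredUsageKeys = [
    "NSMicrophoneUsageDescription",
    "NSSpeechRecognitionUsageDescription"
]

// Hypothetical helper: returns the required keys that are missing or blank
// in the given Info.plist dictionary.
func missingUsageDescriptions(in info: [String: Any]) -> [String] {
    requiredUsageKeys.filter { key in
        guard let text = info[key] as? String else { return true }
        return text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
    }
}

// Hypothetical helper: logs any problem found in the running app's Info.plist.
func logMissingUsageDescriptions() {
    let missing = missingUsageDescriptions(in: Bundle.main.infoDictionary ?? [:])
    if !missing.isEmpty {
        print("[SpeechCoordinator] Missing Info.plist keys: \(missing)")
    }
}
```

Calling logMissingUsageDescriptions() before requesting authorization would surface the problem in the console instead of at the moment the microphone is first used.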

Optional: Mac Catalyst Entitlement

A screenshot of Xcode showing how to activate the Audio Input entitlement so that speech recognition works. Details below.

Please note that this screenshot is from our Sans Hands project.

If you want to run speech recognition on Mac Catalyst, make sure that the Audio Input checkbox is ticked. It can be found in the Signing & Capabilities tab, in the App Sandbox section under Hardware.

Step Five: Verify

A screenshot of Xcode showing "Hello World" on the terminal. This was created using the code in this tutorial and demonstrates that the SpeechCoordinator works.

Run the project and speak; you should see the audio transcribed in real time in the console.

To find these logs more easily, filter the console by [SpeechCoordinator].

Any Questions?

We are actively looking for feedback on how to improve this resource. Please send us a note to inquiries@delasign.com with any thoughts or feedback you may have.