How to use OpenAI Whisper API in Retool
With OpenAI Whisper, converting audio speech to text becomes easy and opens up opportunities for building powerful applications. Whisper can transcribe audio into raw text, which can then be stored in a database, searched, and further manipulated. Integrating this capability in Retool creates even more potential by combining data and automation in one place. Let's explore how to build a simple app using Whisper's speech-to-text API in Retool.
Prerequisites
Before we begin, you'll need the following:
- An OpenAI API key
- A Retool account
Create a Retool REST resource
Start by creating a REST API resource in Retool. The Whisper API will be the base URL, and you'll need to authenticate using your OpenAI API key.
- Go to Retool Resources, and add a new REST API resource.
- Set the Base URL to `https://api.openai.com/v1/audio/transcriptions`.
- For the Method, select `POST`.
- Add an Authorization header whose value is `Bearer` followed by a valid OpenAI API key.
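For reference, the resource settings above correspond roughly to the request configuration sketched below. This is a minimal illustration in plain JavaScript, written outside Retool; `buildWhisperRequestConfig` and the placeholder key are hypothetical names, not part of Retool or the OpenAI SDK.

```javascript
// Hypothetical helper: mirrors the REST resource settings as a plain object.
function buildWhisperRequestConfig(apiKey) {
  if (!apiKey) {
    throw new Error("An OpenAI API key is required");
  }
  return {
    url: "https://api.openai.com/v1/audio/transcriptions",
    method: "POST",
    headers: {
      // OpenAI expects the key to be sent as a Bearer token.
      Authorization: `Bearer ${apiKey}`,
    },
  };
}

// Placeholder value for illustration only, not a real key.
const whisperConfig = buildWhisperRequestConfig("sk-your-key-here");
console.log(whisperConfig.method, whisperConfig.url);
```

In Retool itself, the key belongs in the resource configuration rather than in app code, so that it is not exposed to end users.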

Create the minimal interface to upload an audio file
We'll now create a minimal interface that allows users to upload an audio file for transcription.
Add a File Picker component to your Retool app to let users upload an audio file, and name it `fileButton1`.
Whisper's API expects multipart form data, so the request body will need two fields, `model` and `file`:
- Set the `model` field to `whisper-1`.
- For the `file` field, bind it to the File Picker component using ``.
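To make the shape of that request body concrete, here is a minimal sketch of the same two fields built with the standard `FormData` API. It runs outside Retool (in a browser or Node.js 18+); `buildTranscriptionBody`, the dummy blob, and the file name are hypothetical and only stand in for the file the user uploads.

```javascript
// Hypothetical helper: builds a multipart body with the two fields
// described above. `audioBlob` stands in for the uploaded audio file.
function buildTranscriptionBody(audioBlob, fileName) {
  const body = new FormData();
  body.append("model", "whisper-1");
  // The third argument sets the file name sent with the multipart part.
  body.append("file", audioBlob, fileName);
  return body;
}

// Example with a dummy in-memory blob (not real audio).
const demoBlob = new Blob(["dummy audio bytes"], { type: "audio/mpeg" });
const demoBody = buildTranscriptionBody(demoBlob, "meeting.mp3");
console.log(demoBody.get("model"));
```

Retool assembles this body for you from the query's form fields; the sketch only shows what ends up on the wire.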

Test the connection
If all the parts are connected correctly, pressing "Run" on the query should return a successful response that contains the transcription in its `text` field.

Display the converted text
To display the converted text, we'll add a Text Area component that updates with the transcription result.
Drag a Text Area component into the interface.
Bind the component to the transcription response using:
{{ query1.data?.text }}
This will display the text returned from the Whisper API after a successful transcription.
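Before the query has run, `query1.data` is empty, so the binding evaluates to `undefined`. The sketch below shows one way to fall back to an empty string instead; `getTranscriptionText` is a hypothetical helper that mirrors the binding, and the sample objects are made up for illustration.

```javascript
// Hypothetical helper mirroring the binding above, with an empty-string
// fallback for the case where the query has not returned data yet.
function getTranscriptionText(data) {
  return data?.text ?? "";
}

console.log(getTranscriptionText({ text: "Hello world" }));
console.log(getTranscriptionText(null) === "");
```

The same fallback could presumably be written inline in the component as `{{ query1.data?.text ?? "" }}`.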

Handle file size limit errors
One common issue when working with audio files is hitting file size limits. The Whisper API restricts upload size (25 MB at the time of writing), and if you hit this limit, you'll need to split the file into smaller chunks and transcribe each part individually. You can then concatenate the transcribed results in order.
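The check-then-concatenate logic can be sketched as follows. This is a minimal illustration under stated assumptions: the 25 MB figure is taken from OpenAI's documentation at the time of writing, all function names are hypothetical, and the actual splitting is left out because it should be done with an audio tool so that each chunk remains a valid audio file.

```javascript
// Assumed limit: 25 MB, per OpenAI's documentation at the time of writing.
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;

// Returns true when the file is too large to upload in one request.
function needsSplitting(sizeInBytes, limit = MAX_UPLOAD_BYTES) {
  return sizeInBytes > limit;
}

// Lower bound on the number of chunks needed to stay within the limit.
function minimumChunkCount(sizeInBytes, limit = MAX_UPLOAD_BYTES) {
  return Math.max(1, Math.ceil(sizeInBytes / limit));
}

// Joins per-chunk transcripts in order, trimming whitespace and
// skipping chunks that produced no text.
function joinTranscripts(parts) {
  return parts
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join(" ");
}

const sizeInBytes = 60 * 1024 * 1024; // e.g. a 60 MB recording
if (needsSplitting(sizeInBytes)) {
  console.log(`Split into at least ${minimumChunkCount(sizeInBytes)} chunks`);
}
```

Splitting at pauses rather than at fixed offsets tends to avoid cutting words in half, which would otherwise show up as errors at the chunk boundaries of the joined transcript.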

Handle timeout errors
Another possible issue is the API request timing out for longer files. If this happens, you can increase the timeout setting in Retool's Advanced tab for your Whisper API resource. Set it to a higher value, such as 120 seconds, to handle longer processing times.

Going beyond
Once you have your raw text, the sky's the limit. You can add more components to your Retool app, such as buttons to store the transcribed text in a database or a search feature to browse through past transcriptions.
Here's a simple example:
- Add a Button that triggers a query to save the transcription to a database.
- Include a Dropdown Selector to categorize the transcription (e.g., meeting notes, interviews).
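As a rough sketch of what the save query might receive, the helper below assembles a record from the transcription and the selected category. Everything here is hypothetical: the function name, the field names (`text`, `category`, `created_at`), and the `"uncategorized"` default are illustrative choices, not a schema the article prescribes.

```javascript
// Hypothetical record shape for a table of saved transcriptions.
function buildTranscriptionRecord(text, category, now = new Date()) {
  const trimmed = (text ?? "").trim();
  if (!trimmed) {
    throw new Error("Nothing to save: the transcription is empty");
  }
  return {
    text: trimmed,
    // Fall back to a default when no category was selected.
    category: category || "uncategorized",
    created_at: now.toISOString(),
  };
}

const record = buildTranscriptionRecord("  Weekly sync notes  ", "meeting notes");
console.log(record);
```

Rejecting empty text before saving keeps failed or not-yet-run transcriptions out of the database.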

With these steps, you now have a fully functional Retool app that integrates OpenAI Whisper for speech-to-text conversion. From here, you can extend its functionality as needed, perhaps building a searchable repository of meeting notes or automating workflows that involve audio data.