Summarize Audio
Last updated
Last updated
Takomo offers a simple and efficient way to summarize audio. This guide will walk you through the process of creating an audio summarization pipeline using Takomo.
First, navigate to the template section inside Takomo. While there is an existing audio summarization pipeline available for immediate use, this guide will show you how to create a new one from scratch. Click on 'New Projects' to start a blank pipeline.
The pipeline has three main sections:
Input: This is where you define all the fields that your API will accept.
Output: This is where you specify what your API will return.
Middle blank space: This is where you build your pipeline.
You will need to add an audio input to your pipeline. You can leave the name as default for now. At the end of the pipeline, you will get some text as output, so add a text node at the end.
To build the pipeline, you will need to add two nodes:
Whisper Node: Found in the audio and speech section, the Whisper node comes in two variants - regular and advanced. The advanced node offers more options but takes longer to run. For a simple operation, the regular Whisper node is sufficient.
GPT Node: Under language processing, you will find GPT-3.5 and GPT-4. GPT-3.5 is faster and suitable for simple and straightforward tasks.
Connect the audio node to the audio input for the Whisper node. If you're using English, there's no need for translation. Then, take the full text from the Whisper node and send it into the GPT-3.5 node. The output of the GPT-3.5 node will go into your output section.
You will need a prompt that instructs GPT to summarize the audio. You can copy and paste the prompt from the ready-made pipeline. The prompt should instruct GPT to act as a text summarizer and come up with the main points of the provided text.
You can use this prompt as an example, or copy it directly:
“Act as a text summarizer and come up with the 5 points of the provided text”
Browse and add the audio file you want to summarize.
Once everything is set up, hit the 'Run Pipeline' button. The pipeline will process the audio and provide a summarized text as output.