Overview

The Transcribe API extracts text transcriptions from video and audio content across major social media platforms and podcast services. Simply provide a media URL, and the API will return a transcription of the audio.

Features

  • Markdown Format: Receive transcriptions in both raw text and formatted markdown, ideal for LLM applications.
  • Time-stamped Text: Get start and end times for each sentence in the transcription.
  • Paragraph Organization: Transcriptions are automatically structured into logical paragraphs.
  • Platform Metadata: Optionally retrieve media metrics such as upload time, like count, comment count, and thumbnail URL.
  • Webhook Support: Get real-time notifications when a transcription is complete.

Basic Usage

To transcribe content, pass the media URL in the post_url parameter, e.g. https://www.youtube.com/watch?v=example. You can optionally include a query to ask specific questions about the content, and a callback_url, e.g. https://your-domain.com/webhook, to receive a notification when processing is complete.

curl --request POST \
  --url https://api.scribesocial.ai/v1/transcribe \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "post_url": "<string>",
  "callback_url": "<string>"
}'
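
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names build_transcribe_request and submit are hypothetical, and the code assumes the response body is JSON (or empty) as in the example response in this document.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.scribesocial.ai/v1/transcribe"


def build_transcribe_request(token, post_url, callback_url=None):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {"post_url": post_url}
    if callback_url is not None:
        payload["callback_url"] = callback_url
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request, timeout=30):
    """Send the request; return (parsed JSON body or None, Location header or None).

    Assumption: the body, when present, is JSON.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = resp.read()
        return (json.loads(body) if body else None), resp.headers.get("Location")
```

A call might look like submit(build_transcribe_request("<token>", "https://www.youtube.com/watch?v=example")). Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.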

Checking Transcription Status (Async Only)

If you’re using a callback_url, the API response will include a Location header containing the operation polling URL, which you can use to check the status of the transcription.

Example Operation Polling URL: https://api.scribesocial.ai/v1/transcribe-result/{operation-id}
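
Polling that URL could look roughly like the following Python sketch. Several details are assumptions rather than documented behavior: that the polling endpoint accepts the same Bearer token, that it returns JSON with a top-level status field, and that any status other than "completed" means the operation is still running. The names fetch_operation and wait_for_completion are hypothetical.

```python
import json
import time
import urllib.request


def fetch_operation(polling_url, token, timeout=30):
    """GET the operation polling URL and return the parsed JSON body.

    Assumption: the endpoint accepts the same Bearer token as the
    original transcribe request.
    """
    request = urllib.request.Request(
        polling_url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())


def wait_for_completion(fetch, max_attempts=20, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch() until the result reports status "completed".

    Returns the completed result, or None if max_attempts is exhausted.
    Only "completed" appears in the documented example; treating every
    other status as "not finished yet" is an assumption.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if result.get("status") == "completed":
            return result
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    return None
```

With the Location header value in hand, usage would be wait_for_completion(lambda: fetch_operation(location, token)). Passing fetch as a callable keeps the retry loop independent of the HTTP details.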

When the callback request is sent to your endpoint, it will include a Scribe-Verification-Token header. This token corresponds to the API key ID used for the original request, allowing you to verify that the callback is legitimate and comes from Scribe AI.
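
A webhook handler might check that header along these lines. This is a sketch under assumptions: it treats the header value as an exact string match against the API key ID you expect, and the function name is_trusted_callback is hypothetical. The comparison uses hmac.compare_digest from the Python standard library so that it runs in constant time.

```python
import hmac


def is_trusted_callback(headers, expected_token):
    """Return True if the Scribe-Verification-Token header equals expected_token.

    headers is any mapping of header names to string values; names are
    matched case-insensitively, as HTTP header names are.
    """
    received = None
    for name, value in headers.items():
        if name.lower() == "scribe-verification-token":
            received = value
            break
    if received is None or not expected_token:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        received.encode("utf-8"), expected_token.encode("utf-8")
    )
```

Requests that fail this check can be rejected before the payload is parsed.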

Example Response:

{
  "status": "completed",
  "operation_id": "abc123xyz789",
  "data": [
    {
      "video_url": "https://www.youtube.com/watch?v=abcd1234xyz",
      "post_url": "https://www.youtube.com/watch?v=abcd1234xyz",
      "description": "A comprehensive tutorial on machine learning fundamentals",
      "transcription": "Welcome to this tutorial on machine learning...",
      "paragraphs": [
        {
          "sentences": [
            {
              "text": "Welcome to this tutorial on machine learning.",
              "start": 0,
              "end": 2.5
            },
            {
              "text": "Today we'll cover the basic concepts and practical applications.",
              "start": 2.5,
              "end": 5.8
            }
          ],
          "num_words": 15,
          "start": 0,
          "end": 5.8
        }
      ],
      "video_id": "abcd1234xyz",
      "transcribed_duration": 212,
      "total_duration": 212,
      "status": "completed",
      "platform": "youtube",
      "title": "Machine Learning Tutorial",
      "metadata": {
        "upload_time": "2023-12-25T10:30:00Z",
        "comment_count": 423,
        "like_count": 892,
        "thumbnail": "https://i.ytimg.com/vi/abcd1234xyz/maxresdefault.jpg",
        "retweet_count": null,
        "quote_count": null
      }
    }
  ],
  "markdown": "# Machine Learning Tutorial\n\n## Introduction (00:00)\nWelcome to this tutorial on machine learning...\n\n## Key Concepts (01:23)\nLet's discuss the fundamental principles..."
}
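
To illustrate how the nested structure above can be consumed, the following Python sketch walks data, then paragraphs, then sentences, and formats each start time. It assumes start and end are expressed in seconds, which the example values suggest but this page does not state; iter_sentences and format_timestamp are hypothetical helper names.

```python
def iter_sentences(response):
    """Yield (start, end, text) for every sentence in every transcribed item."""
    for item in response.get("data", []):
        for paragraph in item.get("paragraphs", []):
            for sentence in paragraph.get("sentences", []):
                yield sentence["start"], sentence["end"], sentence["text"]


def format_timestamp(seconds):
    """Render a time as MM:SS, truncating fractional seconds.

    Assumption: the API reports times in seconds.
    """
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def print_transcript(response):
    """Print one line per sentence, prefixed with its start time."""
    for start, _end, text in iter_sentences(response):
        print(f"[{format_timestamp(start)}] {text}")
```

Missing keys such as paragraphs are treated as empty, so the helpers also work on responses that carry only the flat transcription string.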

For more details about request parameters, response fields, and status codes, check out our Transcribe API Reference.

Common Use Cases

  1. LLM Applications: Enrich LLMs with high-quality transcriptions to improve contextual understanding, retrieval, and summarization.
  2. Social Listening & Brand Monitoring: Convert video content into text to track brand mentions, sentiment, and trends across media platforms.
  3. Content Analysis & Intelligence: Transcribe spoken content to structure and categorize information for easier insight extraction.
  4. Content Indexing & Searchability: Make video and audio content searchable by generating accurate transcriptions.
  5. Compliance & Moderation: Support content compliance by transcribing video/audio for policy checks and copyright monitoring.