1. Authentication
All API requests must be authenticated with an API key. You can generate and manage your API keys from the API Access tab in your dashboard settings.
2. Pricing & Wallet
Our API uses a pay-as-you-go prepaid wallet system. You must add funds to your API wallet before making requests. Funds are deducted based on usage:
- YouTube Transcriptions: $0.008 per video
- Media Files (Audio/Video): $0.0108 per minute of audio (about $0.65 per hour)
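The rates above can be turned into a quick pre-flight estimate before submitting a job. The sketch below is illustrative only: the function names are hypothetical, and it assumes media usage is prorated linearly by duration, since this document does not say whether partial minutes are rounded up.

```python
from decimal import Decimal

# Rates copied from the price list above.
YOUTUBE_RATE_PER_VIDEO = Decimal("0.008")
MEDIA_RATE_PER_MINUTE = Decimal("0.0108")


def estimate_media_cost(duration_seconds: int) -> Decimal:
    """Estimate the charge for a media file, in US dollars.

    Assumption: usage is prorated linearly by duration. The source does not
    say whether partial minutes are rounded up.
    """
    return MEDIA_RATE_PER_MINUTE * Decimal(duration_seconds) / Decimal(60)


def can_afford(wallet_balance: Decimal, estimated_cost: Decimal) -> bool:
    """Return True if a prepaid balance covers the estimated charge."""
    return wallet_balance >= estimated_cost


if __name__ == "__main__":
    print(estimate_media_cost(120))   # two minutes of audio
    print(estimate_media_cost(3600))  # one hour of audio
```

Decimal is used instead of float so that sub-cent amounts do not pick up binary rounding noise.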
3. Endpoints
/api/v1/transcribe
Submit a new media file URL or YouTube URL for transcription.
The url parameter must be a publicly accessible link (e.g., S3, Vercel Blob, Dropbox, or a public server path). Our processing engine securely streams and downloads the file for analysis automatically.
Request Body (JSON)
{
  "type": "youtube" | "media",
  "url": "https://youtube.com/watch?v=... or https://example.com/audio.mp3",
  "title": "Optional Title",
  "language": "auto",
  "identifySpeakers": false,
  "mainSpeakerName": "Optional Override Name",
  "webhookUrl": "https://your-domain.com/webhook"
}
Request Parameter Details
- type (required, string): Must be explicitly set to either "youtube" or "media".
- url (required, string): Accepts YouTube links or public, unauthenticated audio/video file links (.mp3, .mp4, .wav, .m4a, etc.).
- language (optional, string): Sets the spoken language used by the model. Defaults to "auto" (automatic detection). You can pass an explicit ISO language code (e.g., "en", "es", "fr", "de", "it") to lock the model to that language, which is mainly useful for non-English audio or video.
- identifySpeakers (optional, boolean): Matches the Identify Speakers switch in our UI. Set to true to enable diarization and split transcript lines by speaker. Defaults to false.
- mainSpeakerName (optional, string): Matches the Main Speaker Name input field. When identifySpeakers is true, this string replaces the default label for the dominant voice (e.g., "John Doe" instead of "Speaker A").
- webhookUrl (optional, string): URL that receives a POST notification when asynchronous processing completes or fails.
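As a rough illustration of how the parameters fit together, the sketch below assembles a request body and wraps it in an unsent HTTP request using only the Python standard library. Several things here are assumptions rather than documented behavior: the helper names, the base URL passed by the caller, the use of POST, and the authentication headers, which are supplied by the caller because this document does not say how the API key is transmitted.

```python
import json
import urllib.request
from typing import Optional

VALID_TYPES = {"youtube", "media"}


def build_payload(
    source_type: str,
    url: str,
    title: Optional[str] = None,
    language: str = "auto",
    identify_speakers: bool = False,
    main_speaker_name: Optional[str] = None,
    webhook_url: Optional[str] = None,
) -> dict:
    """Build the JSON body for /api/v1/transcribe from the documented fields."""
    if source_type not in VALID_TYPES:
        raise ValueError('type must be "youtube" or "media"')
    if not url:
        raise ValueError("url is required")
    payload = {
        "type": source_type,
        "url": url,
        "language": language,
        "identifySpeakers": identify_speakers,
    }
    # Optional fields are sent only when the caller supplies them.
    if title is not None:
        payload["title"] = title
    if main_speaker_name is not None:
        payload["mainSpeakerName"] = main_speaker_name
    if webhook_url is not None:
        payload["webhookUrl"] = webhook_url
    return payload


def build_request(
    base_url: str, payload: dict, auth_headers: dict
) -> urllib.request.Request:
    """Wrap the payload in an HTTP request without sending it.

    Assumptions: the endpoint accepts POST, and auth_headers carries the API
    key in whatever header the API expects (not specified in this document).
    """
    headers = {"Content-Type": "application/json", **auth_headers}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/transcribe",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

The request object can then be passed to urllib.request.urlopen, or the payload can be handed to any other HTTP client.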
Response (YouTube - Sync)
{
  "success": true,
  "source": "youtube",
  "transcript": {
    "id": "cm_yt_abc123x",
    "status": "completed",
    "content": "Full transcript text...",
    "durationSeconds": 300
  }
}
Response (Media - Async)
{
  "success": true,
  "message": "Media transcription started",
  "transcriptId": "cm_media_xyz789p",
  "status": "processing"
}
/api/v1/transcribe/:id
Retrieve the status and results of a specific transcription job.
Response Example
{
  "id": "cm_media_xyz789p",
  "status": "completed",
  "title": "API Media Transcript",
  "durationSeconds": 120,
  "content": "John Doe: Hello everyone, welcome. Speaker B: Glad to be here, thank you.",
  "speakerData": [
    { "speaker": "John Doe", "start": 0, "end": 4500, "text": "Hello everyone, welcome." },
    { "speaker": "Speaker B", "start": 4510, "end": 8200, "text": "Glad to be here, thank you." }
  ],
  "costUnitsCharged": 216
}
costUnitsCharged: 216 units = $0.0216 deducted from your wallet (in this example, 1 unit corresponds to $0.0001).
Status values: "completed", "processing", or "failed".
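Because media jobs start in the "processing" state, a client that does not use webhooks has to check the job until it reaches "completed" or "failed". The following is a minimal polling sketch under stated assumptions: fetch_job is a hypothetical callable that performs the request to /api/v1/transcribe/:id and returns the decoded JSON body, the interval and attempt limit are arbitrary defaults, and the unit conversion assumes the 1 unit = $0.0001 ratio implied by the example above.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def units_to_usd(cost_units: int) -> float:
    """Convert costUnitsCharged to dollars.

    Assumption: 1 unit = $0.0001, as implied by 216 units = $0.0216.
    """
    return cost_units / 10_000


def wait_for_transcript(
    fetch_job: Callable[[str], dict],
    transcript_id: str,
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
) -> dict:
    """Poll until the job reaches a terminal status or attempts run out.

    fetch_job is a hypothetical callable supplied by the caller; it should
    return the decoded JSON body for the given transcript id.
    """
    for attempt in range(max_attempts):
        job = fetch_job(transcript_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Sleep between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"transcript {transcript_id} is still processing")
```

Injecting fetch_job keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.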
4. Webhooks
When submitting asynchronous jobs (like type: "media"), you can provide a webhookUrl. Once the transcript finishes or fails, we will make a POST request to your webhook URL.
Webhook Payload
{
  "event": "transcript.completed",
  "transcript": {
    "id": "cm_media_xyz789p",
    "status": "completed",
    "durationSeconds": 120,
    "content": "John Doe: Hello everyone, welcome...",
    "speakerData": [
      { "speaker": "John Doe", "start": 0, "end": 4500, "text": "Hello everyone, welcome." }
    ]
  }
}
Possible event types: transcript.completed or transcript.failed.
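On the receiving side, a webhook handler mainly needs to decode the body and branch on the event field. The sketch below shows one possible shape for that logic as a plain function, so it can sit behind any web framework. The function name is hypothetical, and it assumes that a transcript.failed payload also carries a transcript object with an id, which this document does not show.

```python
import json


def handle_webhook(raw_body: bytes) -> str:
    """Parse a webhook POST body and return a short summary of the outcome."""
    payload = json.loads(raw_body)
    event = payload.get("event")
    # Assumption: failed events also include a "transcript" object with an id.
    transcript = payload.get("transcript") or {}
    transcript_id = transcript.get("id", "unknown")

    if event == "transcript.completed":
        # speakerData appears in the documented example; treat it as optional.
        segments = transcript.get("speakerData") or []
        return f"{transcript_id}: completed with {len(segments)} speaker segment(s)"
    if event == "transcript.failed":
        return f"{transcript_id}: failed"
    # Ignore events this document does not describe.
    return f"{transcript_id}: ignored event {event!r}"
```

In a real deployment the returned summary would typically be replaced by whatever follow-up work the application needs, such as storing the transcript or alerting on failure.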