Audio Moderation

BETA

Analyze audio content for potential moderation issues.

🎬 Video Audio Analysis

All audio moderation options documented here are also available when analyzing audio in videos via the Video Moderation API. The video API uses the same audio analysis pipeline with an additional channel option to select specific audio tracks.

POST https://api.safecomms.dev/moderation/audio

The audio moderation endpoint analyzes audio files for unsafe content such as hate speech, harassment, and other policy violations. It performs speech-to-text transcription and then moderates the content.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| audio | string | Yes | The audio to moderate. Can be a public URL or a Base64 encoded string. |
| language | string | No | The language to use for the explanation (default: "English"). |
| moderationProfileId | string | No | The ID of a custom moderation profile to apply. |
| extractMetadata | boolean | No | Extract metadata (format, duration, bitrate, etc.) from the audio file. Default: false. |
| transcribe | boolean | No | Provide a transcription of the audio. Requires Basic+ tier. Default: false. |
| transcriptionMode | string | No | "single" (Basic+) or "multiple" (Pro+). Defaults to tier capability. |
| analyzeSentiment | boolean | No | Analyze speaker sentiment (Basic+). |
| detectTopics | boolean | No | Categorize discussion topics (Basic+). |
| detectPii | boolean | No | Detect PII in audio (Pro+). |
| includeTimestamps | boolean | No | Include timestamps in transcription (Pro+). Requires transcribe=true. |
| customKeywords | array | No | List of keywords to flag (Business+). |
| detectDistress | boolean | No | Detect acoustic distress signals (Business+). |
| analyzeQuality | boolean | No | Analyze audio quality metrics (Business+). |
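
The constraints stated in the field table (a required audio field, the two transcriptionMode values, and includeTimestamps depending on transcribe) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `check_options` is a hypothetical helper, and it encodes only the rules written in the table. The service remains the source of truth for validation.

```python
# Hypothetical client-side check; not part of the SafeComms API.
VALID_TRANSCRIPTION_MODES = {"single", "multiple"}


def check_options(payload: dict) -> list:
    """Return a list of problems found in a request payload (empty if none)."""
    problems = []

    # "audio" is the only required field.
    if not payload.get("audio"):
        problems.append("audio is required")

    # includeTimestamps is documented as requiring transcribe=true.
    if payload.get("includeTimestamps") and not payload.get("transcribe"):
        problems.append("includeTimestamps requires transcribe=true")

    # transcriptionMode accepts only the two documented values.
    mode = payload.get("transcriptionMode")
    if mode is not None and mode not in VALID_TRANSCRIPTION_MODES:
        problems.append('transcriptionMode must be "single" or "multiple"')

    # customKeywords is documented as an array.
    keywords = payload.get("customKeywords")
    if keywords is not None and not isinstance(keywords, list):
        problems.append("customKeywords must be an array")

    return problems
```

An empty list means no problem was detected locally; tier requirements (Basic+, Pro+, Business+) are not checked here because they depend on the account.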

Language Support

Audio moderation currently supports English audio content only. Requests with non-English audio may be rejected or return an error. The language field selects the language of the explanation, not the language of the audio.

Example Request

curl -X POST https://api.safecomms.dev/moderation/audio \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{
    "audio": "https://example.com/audio.mp3",
    "extractMetadata": true,
    "transcribe": true,
    "transcriptionMode": "multiple"
  }'
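
The same request can be built from Python with the standard library alone. This is a sketch under stated assumptions rather than an official client: `build_audio_request` and `encode_audio_bytes` are hypothetical helper names, and the Base64 helper assumes the API accepts a plain Base64 string without a data-URI prefix.

```python
import base64
import json
import urllib.request

API_URL = "https://api.safecomms.dev/moderation/audio"


def encode_audio_bytes(data: bytes) -> str:
    """Base64-encode raw audio bytes for use in the "audio" field.

    Assumption: a plain Base64 string (no data-URI prefix) is accepted.
    """
    return base64.b64encode(data).decode("ascii")


def build_audio_request(audio: str, api_key: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request for the audio moderation endpoint.

    `audio` is either a public URL or a Base64 string; `options` are the
    optional body fields such as extractMetadata or transcribe.
    """
    payload = {"audio": audio, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


# Mirrors the curl example: same URL, headers, and body fields.
request = build_audio_request(
    "https://example.com/audio.mp3",
    "YOUR_API_KEY",
    extractMetadata=True,
    transcribe=True,
    transcriptionMode="multiple",
)
```

Passing `request` to `urllib.request.urlopen` sends it; building and sending are kept separate so the payload can be inspected first.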

URL Restrictions

When providing a URL for the audio field, please note the following restrictions:

  • Download Timeout: The audio must download within 30 seconds.
  • Redirect Limit: At most 3 redirects are followed.
  • Protocols: Only HTTP and HTTPS are supported.
  • Security: URLs that resolve to internal or private IP addresses are skipped and never fetched.

We enforce these restrictions to ensure a high level of quality and reliability for our service.
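
Two of these restrictions (protocol and private addresses) can be partly screened on the client before a URL is submitted. The sketch below is illustrative only: `precheck_audio_url` is a hypothetical helper, it inspects only hosts written as literal IP addresses, and it does not perform DNS resolution, so a hostname that resolves to a private address is not caught. The timeout and redirect limits are enforced by the service and are not modeled here.

```python
import ipaddress
from urllib.parse import urlparse


def precheck_audio_url(url: str) -> list:
    """Return reasons a URL would likely be refused (empty list if none found)."""
    problems = []
    parsed = urlparse(url)

    # Only HTTP and HTTPS are supported.
    if parsed.scheme not in ("http", "https"):
        problems.append("unsupported protocol")

    host = parsed.hostname
    if not host:
        problems.append("missing host")
        return problems

    # Only literal IP hosts are checked; hostnames would need a DNS lookup.
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return problems  # Not an IP literal, nothing more to check locally.

    if address.is_private or address.is_loopback or address.is_link_local:
        problems.append("internal or private IP address")
    return problems
```

For example, `precheck_audio_url("ftp://example.com/audio.mp3")` reports an unsupported protocol, while a public HTTPS URL yields an empty list.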

Token Usage & Costs

Audio moderation uses a base cost of 50 tokens per request.

| Feature | Cost | Tier | Condition |
| --- | --- | --- | --- |
| Base Request | 50 Tokens | Basic+ | Always applied. |
| Non-English Language | +1 | Pro+ | Applied when language is not 'English'. |
| Metadata Extraction | +1 | Basic+ | Applied when extractMetadata is true. |
| Transcription | +10 | Basic+ | Applied when transcribe is true. |
| Sentiment Analysis | +1 | Basic+ | Applied when analyzeSentiment is true. |
| Topic Detection | +1 | Basic+ | Applied when detectTopics is true. |
| PII Detection | +2 | Pro+ | Applied when detectPii is true. |
| Timestamps | +1 | Pro+ | Applied when includeTimestamps is true. |
| Custom Keywords | +2 | Business+ | Applied when customKeywords is provided. |
| Distress Detection | +2 | Business+ | Applied when detectDistress is true. |
| Quality Analysis | +2 | Business+ | Applied when analyzeQuality is true. |
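
The cost of a request is the base cost plus the surcharge of every enabled feature, so it can be estimated from the request body. The following sketch assumes a JSON-style payload with real booleans; `estimate_tokens` is a hypothetical helper, and it treats "customKeywords is provided" as meaning a non-empty list, which is an assumption.

```python
BASE_COST = 50

# Surcharges for boolean options, taken from the cost table.
FLAG_COSTS = {
    "extractMetadata": 1,
    "transcribe": 10,
    "analyzeSentiment": 1,
    "detectTopics": 1,
    "detectPii": 2,
    "includeTimestamps": 1,
    "detectDistress": 2,
    "analyzeQuality": 2,
}


def estimate_tokens(payload: dict) -> int:
    """Estimate the token cost of a request from its body fields."""
    total = BASE_COST
    total += sum(cost for field, cost in FLAG_COSTS.items() if payload.get(field) is True)

    # +1 when the explanation language is anything other than English.
    if payload.get("language", "English") != "English":
        total += 1

    # +2 when customKeywords is provided (assumed: a non-empty list).
    if payload.get("customKeywords"):
        total += 2
    return total
```

For a request with extractMetadata and transcribe enabled, this gives 50 + 1 + 10 = 61 tokens. The table lists no surcharge for transcriptionMode, so that field does not change the estimate.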

Response

The API returns a JSON object containing the moderation results.

{
  "id": "req_123456789",
  "isClean": false,
  "severity": "High",
  "categoryScores": {
    "HateSpeech": "High",
    "Harassment": "None",
    "SelfHarm": "None",
    "Sexual": "None",
    "Violence": "None"
  },
  "issues": [
    {
      "term": "offensive phrase",
      "context": "The user said 'offensive phrase' at the beginning.",
      "category": "HateSpeech",
      "severity": "High"
    }
  ],
  "metadata": {
    "Format": "mp3",
    "Duration": "00:00:15.5000000",
    "Bitrate": "128000",
    "Size": "248000",
    "AudioStreams": "1",
    "Codec": "mp3",
    "Channels": "2",
    "ChannelLayout": "stereo",
    "SampleRate": "44100",
    "Language": "Unknown",
    "Profile": "LC"
  }
}
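
A client typically needs only a few values from this object. The sketch below shows one way to pull out the flagged categories and the audio duration. The helper names are hypothetical, the sample is an abbreviated copy of the response above, and the duration parser assumes the "HH:MM:SS.fffffff" shape seen in the sample with no day component.

```python
import json
from typing import Optional


def flagged_categories(response: dict) -> dict:
    """Return the categoryScores entries whose value is not "None"."""
    scores = response.get("categoryScores", {})
    return {name: level for name, level in scores.items() if level != "None"}


def duration_seconds(response: dict) -> Optional[float]:
    """Convert metadata.Duration ("HH:MM:SS.fffffff") to seconds, if present."""
    raw = (response.get("metadata") or {}).get("Duration")
    if raw is None:
        return None
    hours, minutes, seconds = raw.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Abbreviated copy of the sample response.
sample = json.loads("""
{
  "isClean": false,
  "severity": "High",
  "categoryScores": {"HateSpeech": "High", "Harassment": "None"},
  "issues": [{"term": "offensive phrase", "category": "HateSpeech", "severity": "High"}],
  "metadata": {"Duration": "00:00:15.5000000"}
}
""")

flagged = flagged_categories(sample)
seconds = duration_seconds(sample)
```

Note that the metadata values in the sample are strings, so numeric fields such as Bitrate or SampleRate need an explicit conversion before arithmetic. The metadata object is expected only when extractMetadata is true, which is why the parser tolerates its absence.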

POST https://api.safecomms.dev/moderation/audio/upload

Alternatively, you can upload an audio file directly for moderation using `multipart/form-data`.

Form Data

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| audio | file | Yes | The audio file to upload (MP3, WAV, etc.). |
| language | string | No | The language to use for the explanation (default: "English"). |
| moderationProfileId | string | No | The ID of a custom moderation profile to apply. |
| extractMetadata | boolean | No | Extract metadata (format, duration, bitrate, etc.) from the audio file. Default: false. |
| transcribe | boolean | No | Provide a transcription of the audio. Requires Basic+ tier. Default: false. |
| transcriptionMode | string | No | "single" (Basic+) or "multiple" (Pro+). Defaults to tier capability. |
| analyzeSentiment | boolean | No | Analyze speaker sentiment (Basic+). |
| detectTopics | boolean | No | Categorize discussion topics (Basic+). |
| detectPii | boolean | No | Detect PII in audio (Pro+). |
| includeTimestamps | boolean | No | Include timestamps in transcription (Pro+). Requires transcribe=true. |
| customKeywords | array | No | List of keywords to flag (Business+). |
| detectDistress | boolean | No | Detect acoustic distress signals (Business+). |
| analyzeQuality | boolean | No | Analyze audio quality metrics (Business+). |

Example Request

curl -X POST https://api.safecomms.dev/moderation/audio/upload \
  -H "X-Api-Key: YOUR_API_KEY" \
  -F "audio=@/path/to/audio.mp3" \
  -F "extractMetadata=true" \
  -F "transcribe=true" \
  -F "transcriptionMode=multiple"
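
For clients without a multipart library, the upload body can be assembled by hand. The following is a minimal standard-library sketch, not an official client: `build_upload_request` is a hypothetical helper, and the `application/octet-stream` content type on the file part is an assumption (the endpoint may expect or infer a more specific audio type).

```python
import uuid
import urllib.request

UPLOAD_URL = "https://api.safecomms.dev/moderation/audio/upload"


def build_upload_request(
    audio_bytes: bytes, filename: str, api_key: str, fields: dict
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data upload request."""
    boundary = uuid.uuid4().hex
    parts = []

    # Plain form fields, sent as text, e.g. {"transcribe": "true"}.
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )

    # The file part; the form field name is "audio".
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"  # assumed content type
    ).encode("utf-8")
    parts.append(file_header + audio_bytes + b"\r\n")

    # Closing boundary marks the end of the body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        UPLOAD_URL,
        data=b"".join(parts),
        headers={
            "X-Api-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

To mirror the curl example, read the file with `open("/path/to/audio.mp3", "rb").read()` and pass `{"extractMetadata": "true", "transcribe": "true", "transcriptionMode": "multiple"}` as `fields`. Form values are strings here, matching what `curl -F` sends.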