Audio Moderation

BETA

Analyze audio content for potential moderation issues.

🎬 Video Audio Analysis

All audio moderation options documented here are also available when analyzing audio in videos via the Video Moderation API. The video API uses the same audio analysis pipeline with an additional channel option to select specific audio tracks.

POST https://api.safecomms.dev/moderation/audio

The audio moderation endpoint analyzes audio files for unsafe content such as hate speech, harassment, and other policy violations. It performs speech-to-text transcription and then moderates the content.

Request Body

Field	Type	Required	Description
audio	string	Yes	The audio to moderate. Either a public URL or a Base64-encoded string.
language	string	No	The language to use for the explanation (default: "English").
moderationProfileId	string	No	The ID of a custom moderation profile to apply.
extractMetadata	boolean	No	Extract metadata (format, duration, bitrate, etc.) from the audio file. Default: false.
transcribe	boolean	No	Provide a transcription of the audio. Requires Basic+ tier. Default: false.
transcriptionMode	string	No	"single" (Basic+) or "multiple" (Pro+). Defaults to tier capability.
analyzeSentiment	boolean	No	Analyze speaker sentiment (Basic+).
detectTopics	boolean	No	Categorize discussion topics (Basic+).
detectPii	boolean	No	Detect PII in audio (Pro+).
includeTimestamps	boolean	No	Include timestamps in transcription (Pro+). Requires transcribe=true.
customKeywords	array	No	List of keywords to flag (Business+).
detectDistress	boolean	No	Detect acoustic distress signals (Business+).
analyzeQuality	boolean	No	Analyze audio quality metrics (Business+).
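
The tier and dependency notes in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `check_options` is a hypothetical helper, and the ordering Basic < Pro < Business is an assumption about what the "+" suffix means.

```python
# Hypothetical client-side check of the option table above.
# Assumption: "Basic+" means Basic and every higher tier, in this order.
TIER_ORDER = ["Basic", "Pro", "Business"]

# Minimum tier per option, transcribed from the request-body table.
OPTION_MIN_TIER = {
    "transcribe": "Basic",
    "analyzeSentiment": "Basic",
    "detectTopics": "Basic",
    "detectPii": "Pro",
    "includeTimestamps": "Pro",
    "customKeywords": "Business",
    "detectDistress": "Business",
    "analyzeQuality": "Business",
}


def check_options(options: dict, tier: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found.

    Raises ValueError if `tier` is not one of TIER_ORDER.
    """
    rank = TIER_ORDER.index(tier)
    problems = []
    for name, min_tier in OPTION_MIN_TIER.items():
        if options.get(name) and TIER_ORDER.index(min_tier) > rank:
            problems.append(f"{name} requires {min_tier}+ (current tier: {tier})")
    # "multiple" transcription mode is listed as Pro+.
    if options.get("transcriptionMode") == "multiple" and rank < TIER_ORDER.index("Pro"):
        problems.append(f"transcriptionMode=multiple requires Pro+ (current tier: {tier})")
    # includeTimestamps only makes sense together with transcribe=true.
    if options.get("includeTimestamps") and not options.get("transcribe"):
        problems.append("includeTimestamps requires transcribe=true")
    return problems
```

For example, `check_options({"includeTimestamps": True}, "Pro")` reports the missing `transcribe` flag, while the same options with `"transcribe": True` pass. The server remains the authority on what a tier allows.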

Language Support

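Audio moderation currently supports English audio only. Requests containing non-English audio may be rejected with an error. Note that the `language` request field is separate: it sets the language of the explanation, not the language of the audio.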

Example Request

curl -X POST https://api.safecomms.dev/moderation/audio \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{
    "audio": "https://example.com/audio.mp3",
    "extractMetadata": true,
    "transcribe": true,
    "transcriptionMode": "multiple"
  }'
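
The same request can be built from Python using only the standard library. This is a sketch under assumptions: `build_audio_request` and `moderate_audio` are hypothetical helper names, the 60-second client timeout is arbitrary, and raw bytes are sent as a plain Base64 string with no data-URI prefix, which the field description suggests but does not spell out.

```python
import base64
import json
import urllib.request

API_URL = "https://api.safecomms.dev/moderation/audio"


def build_audio_request(audio, api_key, **options):
    """Build (but do not send) a POST request for the audio endpoint.

    `audio` is either a public URL (str) or raw audio bytes.
    Assumption: bytes are sent as a bare Base64 string.
    """
    if isinstance(audio, bytes):
        audio = base64.b64encode(audio).decode("ascii")
    payload = {"audio": audio, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


def moderate_audio(audio, api_key, **options):
    """Send the request and return the decoded JSON response."""
    req = build_audio_request(audio, api_key, **options)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

A call such as `moderate_audio("https://example.com/audio.mp3", "YOUR_API_KEY", extractMetadata=True, transcribe=True, transcriptionMode="multiple")` mirrors the curl example above.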

URL Restrictions

When providing a URL for the audio field, please note the following restrictions:

Download Timeout: The audio must download within 30 seconds.
Redirect Limit: A maximum of 3 redirects is followed.
Protocols: Only HTTP and HTTPS are supported.
Security: URLs resolving to internal or private IP addresses are immediately skipped.

We enforce these restrictions to ensure a high level of quality and reliability for our service.
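
Some of these restrictions can be caught before a request is sent. The sketch below is a hypothetical client-side pre-check, not the service's own validation: it covers only the protocol rule and literal IP addresses in the URL. It does not resolve hostnames, so a hostname that points at a private address would still pass here and be handled by the server.

```python
import ipaddress
from urllib.parse import urlsplit


def precheck_audio_url(url: str) -> list[str]:
    """Return a list of problems found in `url`; empty means none were found."""
    problems = []
    parts = urlsplit(url)
    # Only HTTP and HTTPS are supported.
    if parts.scheme not in ("http", "https"):
        problems.append(f"unsupported protocol: {parts.scheme or '(none)'}")
    host = parts.hostname
    if not host:
        problems.append("URL has no host")
        return problems
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Host is a name, not a literal IP; DNS is not resolved in this sketch.
        return problems
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        problems.append(f"host {host} is an internal or private address")
    return problems
```

The download timeout and redirect limit depend on the remote server's behavior at fetch time, so they are not checked here.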

Token Usage & Costs

Audio moderation uses a base cost of 50 tokens per request.

Feature	Cost (tokens)	Tier	Condition
Base Request	50	Basic+	Always applied.
Non-English Language	+1	Pro+	Applied when `language` is not 'English'.
Metadata Extraction	+1	Basic+	Applied when `extractMetadata` is true.
Transcription	+10	Basic+	Applied when `transcribe` is true.
Sentiment Analysis	+1	Basic+	Applied when `analyzeSentiment` is true.
Topic Detection	+1	Basic+	Applied when `detectTopics` is true.
PII Detection	+2	Pro+	Applied when `detectPii` is true.
Timestamps	+1	Pro+	Applied when `includeTimestamps` is true.
Custom Keywords	+2	Business+	Applied when `customKeywords` is provided.
Distress Detection	+2	Business+	Applied when `detectDistress` is true.
Quality Analysis	+2	Business+	Applied when `analyzeQuality` is true.
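
To make the arithmetic concrete, the table can be turned into a small estimator. This is a sketch with a hypothetical `estimate_tokens` helper. It assumes the surcharges are additive token amounts on top of the base cost, that `transcriptionMode` has no cost of its own (the table lists none), and that an empty `customKeywords` list does not count as "provided".

```python
BASE_COST = 50

# Surcharges for boolean flags, transcribed from the cost table.
FLAG_COSTS = {
    "extractMetadata": 1,
    "transcribe": 10,
    "analyzeSentiment": 1,
    "detectTopics": 1,
    "detectPii": 2,
    "includeTimestamps": 1,
    "detectDistress": 2,
    "analyzeQuality": 2,
}


def estimate_tokens(options: dict) -> int:
    """Estimate the token cost of one request from its options."""
    total = BASE_COST
    total += sum(cost for name, cost in FLAG_COSTS.items() if options.get(name) is True)
    # +1 when the explanation language is not English.
    if options.get("language", "English") != "English":
        total += 1
    # +2 when a non-empty keyword list is supplied (assumption, see lead-in).
    if options.get("customKeywords"):
        total += 2
    return total
```

Under these assumptions, the earlier example request (`extractMetadata` and `transcribe` both true) comes to 50 + 1 + 10 = 61 tokens, and a request with every feature enabled comes to 73.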

Response

The API returns a JSON object containing the moderation results.

{
  "id": "req_123456789",
  "isClean": false,
  "severity": "High",
  "categoryScores": {
    "HateSpeech": "High",
    "Harassment": "None",
    "SelfHarm": "None",
    "Sexual": "None",
    "Violence": "None"
  },
  "issues": [
    {
      "term": "offensive phrase",
      "context": "The user said 'offensive phrase' at the beginning.",
      "category": "HateSpeech",
      "severity": "High"
    }
  ],
  "metadata": {
    "Format": "mp3",
    "Duration": "00:00:15.5000000",
    "Bitrate": "128000",
    "Size": "248000",
    "AudioStreams": "1",
    "Codec": "mp3",
    "Channels": "2",
    "ChannelLayout": "stereo",
    "SampleRate": "44100",
    "Language": "Unknown",
    "Profile": "LC"
  }
}
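
A response like the one above can be reduced to the fields a client usually acts on. The following sketch uses hypothetical helpers (`parse_duration`, `summarize`) and relies only on what the sample shows: category scores are strings with "None" meaning not flagged, and metadata values are strings, including a duration in `hh:mm:ss.fffffff` form. Durations of a day or longer may use a different format and are not handled.

```python
def parse_duration(value: str) -> float:
    """Convert an "hh:mm:ss.fffffff" string to seconds."""
    hours, minutes, seconds = value.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def summarize(result: dict) -> dict:
    """Pull out the flagged categories, issues, and duration from a response."""
    scores = result.get("categoryScores", {})
    metadata = result.get("metadata") or {}
    duration = metadata.get("Duration")
    return {
        "is_clean": result.get("isClean"),
        "severity": result.get("severity"),
        # Every category whose score is anything other than "None".
        "flagged_categories": [name for name, score in scores.items() if score != "None"],
        # One (category, term, severity) tuple per reported issue.
        "issues": [
            (issue.get("category"), issue.get("term"), issue.get("severity"))
            for issue in result.get("issues", [])
        ],
        # Metadata is only present when extractMetadata was requested.
        "duration_seconds": parse_duration(duration) if duration else None,
    }
```

Because numeric metadata such as `Bitrate` and `SampleRate` arrive as strings in the sample, convert them explicitly (for example with `int(...)`) before doing arithmetic.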

POST https://api.safecomms.dev/moderation/audio/upload

Alternatively, you can upload an audio file directly for moderation using `multipart/form-data`.

Form Data

Field	Type	Required	Description
audio	file	Yes	The audio file to upload (MP3, WAV, etc.).
language	string	No	The language to use for the explanation (default: "English").
moderationProfileId	string	No	The ID of a custom moderation profile to apply.
extractMetadata	boolean	No	Extract metadata (format, duration, bitrate, etc.) from the audio file. Default: false.
transcribe	boolean	No	Provide a transcription of the audio. Requires Basic+ tier. Default: false.
transcriptionMode	string	No	"single" (Basic+) or "multiple" (Pro+). Defaults to tier capability.
analyzeSentiment	boolean	No	Analyze speaker sentiment (Basic+).
detectTopics	boolean	No	Categorize discussion topics (Basic+).
detectPii	boolean	No	Detect PII in audio (Pro+).
includeTimestamps	boolean	No	Include timestamps in transcription (Pro+). Requires transcribe=true.
customKeywords	array	No	List of keywords to flag (Business+).
detectDistress	boolean	No	Detect acoustic distress signals (Business+).
analyzeQuality	boolean	No	Analyze audio quality metrics (Business+).

Example Request

curl -X POST https://api.safecomms.dev/moderation/audio/upload \
  -H "X-Api-Key: YOUR_API_KEY" \
  -F "audio=@/path/to/audio.mp3" \
  -F "extractMetadata=true" \
  -F "transcribe=true" \
  -F "transcriptionMode=multiple"
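
For clients without curl, the upload can be assembled by hand with the Python standard library. This is a sketch, not an official client: `build_upload_request` is a hypothetical helper, the `audio/mpeg` content type is an assumption suited to MP3 files, and booleans are sent as the lowercase strings `true` and `false`, matching the curl example.

```python
import uuid
import urllib.request

UPLOAD_URL = "https://api.safecomms.dev/moderation/audio/upload"


def build_upload_request(filename, audio_bytes, api_key, fields=None,
                         content_type="audio/mpeg"):
    """Build (but do not send) a multipart/form-data upload request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields, one multipart section each.
    for name, value in (fields or {}).items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The file section, using the "audio" field name from the form table.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        f'Content-Type: {content_type}\r\n\r\n'.encode("utf-8")
    )
    body = b"".join(parts) + audio_bytes + b"\r\n" + f"--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        UPLOAD_URL,
        data=body,
        headers={
            "X-Api-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

The returned request can be sent with `urllib.request.urlopen(req)`. How `customKeywords` should be encoded as a form field is not stated here, so this sketch leaves array values to the caller.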