Filter profanity

Automatically filter profanity from streaming transcripts in real time.

Overview

Streaming profanity filtering lets you automatically mask profane words in your streaming transcripts in real time. When enabled, the API replaces profane words with asterisks in both partial and final turns before sending them to the client.

The mask keeps the first letter of the word and replaces the remaining n - 1 characters with asterisks, where n is the length of the word (for example, shit becomes s***). Apostrophes, capitalization, and surrounding punctuation are preserved (for example, shit's becomes s***'s).
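To make the mask shape concrete, here is a minimal client-side sketch of the rule described above. It is not the server's implementation: the function names `mask_word` and `mask_token` are hypothetical, the sketch handles ASCII letters only, and it masks whatever token it is given rather than deciding whether a word is profane.

```python
import re

def mask_word(word: str) -> str:
    """Keep the first character and replace the remaining n - 1 with asterisks."""
    if not word:
        return word
    return word[0] + "*" * (len(word) - 1)

# Leading non-letters, the first run of letters, then everything after it.
_TOKEN = re.compile(r"^([^A-Za-z]*)([A-Za-z]+)(.*)$", re.DOTALL)

def mask_token(token: str) -> str:
    """Mask the first alphabetic run of a token.

    Surrounding punctuation and any apostrophe suffix are left untouched,
    and the first letter keeps its original capitalization.
    """
    match = _TOKEN.match(token)
    if match is None:
        return token  # no letters to mask
    lead, core, tail = match.groups()
    return lead + mask_word(core) + tail

print(mask_token("shit"))
print(mask_token("shit's"))
```

Because only the first run of letters is masked, a possessive suffix such as 's passes through unchanged, which matches the s***'s example above.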

Profanity filtering supports all streaming models: u3-rt-pro, universal-streaming-english, and universal-streaming-multilingual. It also works alongside other features such as format_turns and PII redaction.

Pre-recorded profanity filtering

For profanity filtering on pre-recorded audio, see Filter profanity from transcripts.

Connection parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| filter_profanity | boolean | No | false | Enable real-time profanity filtering. When true, profane words in both partial and final turns are masked with asterisks (first letter preserved). The server accepts the truthy strings true, 1, and yes. Invalid values cause the WebSocket to close with code 3006. |
| include_partial_turns | boolean | No | true | When false, the API only sends final turns. Useful with filter_profanity: true if you display partials directly to end-users and want to avoid any unmasked profanity flashing during word completion. |
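As a rough illustration of how these parameters end up in the connection URL, the sketch below builds a query string from Python booleans. The helper names `build_endpoint` and `explain_close` are hypothetical. The table above lists only the lowercase strings true, 1, and yes as accepted truthy values, so the sketch lowercases booleans rather than relying on Python's default `True`/`False` rendering. Whether the server also accepts capitalized values is not stated here.

```python
from urllib.parse import urlencode

BASE_URL = "wss://streaming.assemblyai.com/v3/ws"
INVALID_VALUE_CLOSE_CODE = 3006  # close code for invalid parameter values, per the table above

def build_endpoint(filter_profanity: bool, include_partial_turns: bool = True, **extra) -> str:
    """Build the WebSocket URL, rendering booleans as lowercase 'true'/'false'."""
    params = {
        **extra,
        "filter_profanity": filter_profanity,
        "include_partial_turns": include_partial_turns,
    }
    encoded = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    return f"{BASE_URL}?{urlencode(encoded)}"

def explain_close(close_status_code) -> str:
    """Turn a WebSocket close code into a short hint for logs."""
    if close_status_code == INVALID_VALUE_CLOSE_CODE:
        return "Closed with 3006: check the values passed as connection parameters."
    return f"Closed with status {close_status_code}."

print(build_endpoint(True, include_partial_turns=False, sample_rate=16000))
```

A function like `explain_close` could be called from an `on_close` callback to make a rejected parameter value easier to spot.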

Quickstart

Get started with streaming profanity filtering using the code below. This example streams 16 kHz mono PCM audio from your microphone and prints each turn with profanity masked.

1. Install the required libraries:

   $ pip install websocket-client pyaudio

2. Create a new file main.py and paste the code below. Replace <YOUR_API_KEY> with your API key.

3. Run with python main.py and speak into your microphone.

import pyaudio
import websocket
import json
import threading
import time
from urllib.parse import urlencode

YOUR_API_KEY = "<YOUR_API_KEY>"
CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
    "format_turns": "true",
    "filter_profanity": "true",
}
API_ENDPOINT_BASE_URL = "wss://streaming.assemblyai.com/v3/ws"
API_ENDPOINT = f"{API_ENDPOINT_BASE_URL}?{urlencode(CONNECTION_PARAMS)}"

FRAMES_PER_BUFFER = 800  # 50 ms of audio at 16 kHz
SAMPLE_RATE = CONNECTION_PARAMS["sample_rate"]
CHANNELS = 1
FORMAT = pyaudio.paInt16

audio = None
stream = None
ws_app = None
audio_thread = None
stop_event = threading.Event()

def on_open(ws):
    print("WebSocket connection opened.")

    def stream_audio():
        # Read microphone audio and forward it as binary frames until stopped.
        global stream
        while not stop_event.is_set():
            try:
                audio_data = stream.read(FRAMES_PER_BUFFER, exception_on_overflow=False)
                ws.send(audio_data, websocket.ABNF.OPCODE_BINARY)
            except Exception as e:
                print(f"Error streaming audio: {e}")
                break

    global audio_thread
    audio_thread = threading.Thread(target=stream_audio)
    audio_thread.daemon = True
    audio_thread.start()

def on_message(ws, message):
    try:
        data = json.loads(message)
        msg_type = data.get("type")
        if msg_type == "Begin":
            print(f"Session began: ID={data.get('id')}")
        elif msg_type == "Turn":
            transcript = data.get("transcript", "")
            end_of_turn = data.get("end_of_turn", False)
            # Print only final turns; partial turns are ignored here.
            if end_of_turn:
                print(f"\r{' ' * 80}\r{transcript}")
        elif msg_type == "Termination":
            print(f"\nSession terminated: {data.get('audio_duration_seconds', 0)}s of audio")
    except Exception as e:
        print(f"Error handling message: {e}")

def on_error(ws, error):
    print(f"\nWebSocket Error: {error}")
    stop_event.set()

def on_close(ws, close_status_code, close_msg):
    print(f"\nWebSocket Disconnected: Status={close_status_code}")
    global stream, audio
    stop_event.set()
    # Release the microphone and PyAudio resources.
    if stream:
        if stream.is_active():
            stream.stop_stream()
        stream.close()
    if audio:
        audio.terminate()

def run():
    global audio, stream, ws_app
    audio = pyaudio.PyAudio()
    stream = audio.open(
        input=True,
        frames_per_buffer=FRAMES_PER_BUFFER,
        channels=CHANNELS,
        format=FORMAT,
        rate=SAMPLE_RATE,
    )
    print("Speak into your microphone. Press Ctrl+C to stop.")
    ws_app = websocket.WebSocketApp(
        API_ENDPOINT,
        header={"Authorization": YOUR_API_KEY},
        on_open=on_open,
        on_message=on_message,
        on_error=on_error,
        on_close=on_close,
    )
    # Run the WebSocket in a background thread so Ctrl+C is handled here.
    ws_thread = threading.Thread(target=ws_app.run_forever)
    ws_thread.daemon = True
    ws_thread.start()
    try:
        while ws_thread.is_alive():
            time.sleep(0.1)
    except KeyboardInterrupt:
        print("\nStopping...")
        stop_event.set()
        # Ask the server to end the session, then give it time to respond.
        if ws_app and ws_app.sock and ws_app.sock.connected:
            ws_app.send(json.dumps({"type": "Terminate"}))
            time.sleep(2)
        if ws_app:
            ws_app.close()
        ws_thread.join(timeout=2.0)

if __name__ == "__main__":
    run()

Suppress unmasked partials with include_partial_turns=false

Profanity filtering applies to both partial and final turns, but during word-completion an unmasked partial can briefly appear before the model resolves the word and applies the mask. If your application surfaces partials directly to end-users (for example a live caption stream or voice-agent UI), set include_partial_turns: false on the connection to suppress all partial turns and only receive masked finals. The default is true (partials enabled), so this requires an explicit opt-out.
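If you cannot change the connection parameters, or want a second line of defense, a client can also drop partial turns itself. The sketch below is one possible approach, not an official pattern: the helper name `final_transcript` is hypothetical, and it relies only on the `type`, `end_of_turn`, and `transcript` fields used in the quickstart code.

```python
import json

def final_transcript(message: str):
    """Return the transcript of a final turn, or None for anything else.

    Partial turns (end_of_turn is false or missing) and non-Turn messages
    are ignored, so only final turns ever reach the UI.
    """
    data = json.loads(message)
    if data.get("type") == "Turn" and data.get("end_of_turn", False):
        return data.get("transcript", "")
    return None

# Simulated messages: a partial turn followed by its final turn.
partial = json.dumps({"type": "Turn", "end_of_turn": False, "transcript": "sh"})
final = json.dumps({"type": "Turn", "end_of_turn": True, "transcript": "s*** happens."})

for raw in (partial, final):
    text = final_transcript(raw)
    if text is not None:
        print(text)
```

Setting include_partial_turns: false on the connection remains the simpler option, since the partials are then never sent at all.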

Example output

With filter_profanity=true, a final turn might look like:

s*** is what you say when you stub your toe.

The mask preserves word length, apostrophes, and surrounding punctuation, so a word like shit's is returned as s***'s and motherfucker becomes m***********.

Supported models

Streaming profanity filtering works with all streaming models on both the US and EU endpoints:

  • u3-rt-pro
  • universal-streaming-english
  • universal-streaming-multilingual

Troubleshooting

The streaming filter targets the same word list as pre-recorded profanity filtering and only masks words on that list. Some words you might consider profane, such as crap and damn, are intentionally not masked and pass through unchanged. If you need stricter filtering, apply your own post-processing on top of the masked transcript.
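One way to layer stricter filtering on top of the masked transcript is a small post-processing step with your own word list. The sketch below is an assumption about how you might do this, not part of the API: `EXTRA_WORDS` and `mask_extra` are hypothetical names, and the function reuses the first-letter-plus-asterisks format so the output stays visually consistent with the server's masks.

```python
import re

# Hypothetical custom list of additional words to mask.
EXTRA_WORDS = {"crap", "damn"}

def mask_extra(transcript: str, extra_words=EXTRA_WORDS) -> str:
    """Mask whole-word, case-insensitive matches of extra_words.

    Text that the server already masked (such as s***) is left untouched,
    because it does not match any entry in the list.
    """
    if not extra_words:
        return transcript
    alternatives = "|".join(re.escape(word) for word in sorted(extra_words))
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)

    def replace(match: re.Match) -> str:
        word = match.group()
        return word[0] + "*" * (len(word) - 1)

    return pattern.sub(replace, transcript)

print(mask_extra("Well damn, s*** happens."))
```

Word-boundary matching keeps longer words that merely contain a listed word (for example scrap) unmasked. Adjust the pattern if you need different matching behavior.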

Profanity masking applies during word classification, so an unmasked partial can briefly appear before the word is fully recognized and masked. If your UI surfaces partials directly to users, set include_partial_turns: false on the connection. Final turns are always masked.