Do you offer cross-file Speaker Identification?
Yes! Speaker Identification allows you to identify speakers by their actual names or roles, transforming generic labels like "A", "B", "C", ... into meaningful identifiers that you provide. Speaker identities are inferred based on the conversation content. For more information, see Speaker Identification.
To identify speakers across multiple recordings, you can build your own workflow on top of audio embeddings. First, submit your audio file to AssemblyAI for diarization with speaker labels. Next, use a model such as NVIDIA TitaNet to generate speaker embeddings from the audio. Then match these embeddings against a vector database of known speakers, and replace our generic labels ("Speaker A/B") with the actual names. Refer to our speaker identification cookbook for more details.
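The matching step can be illustrated with a minimal Python sketch. It is not the cookbook's implementation. It assumes you already have one embedding per diarized speaker label (for example, from a model like TitaNet), and it uses a plain dictionary in place of a vector database. The function names, the tiny 3-dimensional vectors, the simplified utterance shape, and the 0.7 similarity threshold are all hypothetical choices made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def match_speaker(embedding, known_speakers, threshold=0.7):
    """Return the best-matching known name, or None if nothing reaches the threshold."""
    best_name, best_score = None, threshold
    for name, reference in known_speakers.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

def relabel_utterances(utterances, label_embeddings, known_speakers, threshold=0.7):
    """Replace generic diarization labels with known names where a match exists."""
    mapping = {
        label: match_speaker(emb, known_speakers, threshold)
        for label, emb in label_embeddings.items()
    }
    relabeled = []
    for utt in utterances:
        name = mapping.get(utt["speaker"])
        # Keep the generic label when no known speaker is similar enough.
        relabeled.append({**utt, "speaker": name if name else utt["speaker"]})
    return relabeled

# Hypothetical data: a dict standing in for the vector database of known speakers.
known_speakers = {"Alice": [1.0, 0.0, 0.0], "Bob": [0.0, 1.0, 0.0]}

# Hypothetical embeddings, one per diarized label in a new recording.
label_embeddings = {"A": [0.9, 0.1, 0.0], "B": [0.0, 0.0, 1.0]}

# Simplified stand-in for diarized utterances (generic label plus text).
utterances = [
    {"speaker": "A", "text": "Thanks for joining."},
    {"speaker": "B", "text": "Happy to be here."},
]

relabeled = relabel_utterances(utterances, label_embeddings, known_speakers)
for utt in relabeled:
    print(f'{utt["speaker"]}: {utt["text"]}')
```

In this toy data, label "A" is close to the stored "Alice" vector and is renamed, while label "B" matches no known speaker and keeps its generic label. In practice you would tune the threshold on your own recordings and store reference embeddings in a real vector database.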