Identify speakers - talat docs

When more than one person is on the other side of a call, talat tries to tell their voices apart so you can follow who said what. Once you’ve named someone, talat will try to recognise that person in future meetings too. The feature is on by default; turn it off in Settings → Transcription.

How voices appear in the transcript

You are always shown on the right, in pink, labelled You.
People talat already knows (from a previous meeting where you named them) are matched automatically and shown with their name and a unique colour.
People talat hasn’t heard before are shown as System audio, in a neutral colour. talat does not guess names for new voices; that’s your job.

The first meeting with someone new will have a lot of System audio bubbles. That’s expected. Once you name them, future meetings get easier.

Naming someone

Hover over any utterance bubble to reveal a row of small action icons in its top-right corner. Click the speaker icon (a pencil-on-person), then either:

Type a name and press Enter to create a new person.
Pick someone you’ve named before from the list.

Names are global. A person you call Alice in one meeting will appear under Alice in every other meeting they’re in.

Clicking the speaker label itself (the name above the bubble) opens that person’s profile, not the picker. Use the speaker icon for naming and reassignment.

Helping talat recognise people next time

When you name or reassign an utterance, talat may store a short clip of that person’s voice as a voice reference. Each person has up to three reference slots. In future meetings, talat compares unfamiliar voices against every stored reference and auto-assigns a match when one is close enough.
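If you’re curious how that comparison might work, here is a minimal sketch. It is not talat’s actual implementation. It assumes each voice clip has already been reduced to a numeric fingerprint (a list of floats), that closeness is measured with cosine similarity, and that "close enough" means a score of at least 0.75. The names match_speaker and MATCH_THRESHOLD, and the threshold value, are hypothetical.

```python
import math

# Hypothetical value: the docs only say a match must be "close enough".
MATCH_THRESHOLD = 0.75


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_speaker(voice, references, threshold=MATCH_THRESHOLD):
    """Return the best-matching person's name, or None to leave the voice unnamed.

    `references` maps a person's name to their stored reference fingerprints
    (up to three per person). Every stored reference is checked.
    """
    best_name, best_score = None, threshold
    for name, fingerprints in references.items():
        for ref in fingerprints:
            score = cosine_similarity(voice, ref)
            if score >= best_score:
                best_name, best_score = name, score
    return best_name


# Example: one unfamiliar voice checked against two known people.
known = {"Alice": [[1.0, 0.0, 0.0]], "Bob": [[0.0, 1.0, 0.0]]}
print(match_speaker([0.9, 0.1, 0.0], known))
```

A voice that scores below the threshold against every reference returns None, which corresponds to staying labelled System audio until you name it.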

A few things you can do to make recognition more reliable:

Name people early in the call. The longer talat has a name attached to a voice, the more material it has to learn from.
Reassign mistakes as you spot them. Each correction can refresh or fill a reference slot, which sharpens future matching.
Give it a few seconds of clean speech. Voice references need at least three seconds of audio to be stored. Short utterances (“yeah”, “right”) won’t seed a reference.
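The storage rules above (three slots per person, at least three seconds per clip) can be sketched roughly like this. It is an illustration only, not talat’s code: the class and function names are hypothetical, and the "refresh the oldest slot when all three are full" policy is an assumption, since how talat chooses which slot to refresh isn’t described here.

```python
from dataclasses import dataclass, field

MAX_SLOTS = 3      # each person has up to three reference slots
MIN_SECONDS = 3.0  # references need at least three seconds of audio


@dataclass
class VoiceReference:
    duration_seconds: float
    captured_at: float  # e.g. a Unix timestamp


@dataclass
class Person:
    name: str
    references: list = field(default_factory=list)


def offer_reference(person, clip):
    """Try to store `clip` as a voice reference; return True if it was kept."""
    if clip.duration_seconds < MIN_SECONDS:
        return False  # too short ("yeah", "right") to seed a reference
    if len(person.references) < MAX_SLOTS:
        person.references.append(clip)  # fill an empty slot
        return True
    # Assumption: when every slot is full, refresh the oldest reference.
    oldest = min(range(MAX_SLOTS), key=lambda i: person.references[i].captured_at)
    person.references[oldest] = clip
    return True
```

Under these assumptions, a one-second interjection is ignored, while a five-second answer either fills an empty slot or replaces the oldest stored clip.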

Fixing incorrect attributions

Click the speaker icon on any misattributed bubble and pick (or create) the correct person. The transcript updates immediately, and talat may use that correction to improve a voice reference for the target person.

If talat keeps confusing two people, the cause is usually a single bad voice reference. Open one of the affected people, listen back to their references, and delete the worst-sounding one.

Managing voice references

Open People in the sidebar and pick a person to see their detail view. Their three voice-reference slots are listed there. Each filled slot shows:

A play button, to listen back and confirm it really is them.
The duration of the clip and when it was captured.
A trash icon to delete the reference.

Empty slots fill back up the next time talat captures eligible audio for that person, either from auto-matching or from a reassignment you make. Deleting a poor-quality reference (crosstalk, background noise, the wrong person) is almost always better than keeping it; one bad reference degrades matching more than a missing one.

References imported from older versions of talat may show as Old voice reference, no audio available. Matching still works for these (the underlying fingerprint is intact); you just can’t play them back to verify.

Merging two people into one

There’s no dedicated merge command, but the delete-with-reassign flow does the same job. If the same person has been split across two named people (a voice got matched to two different names, say), open the duplicate in People, click the trash icon, choose the correct person under Reassign segments to:, and confirm. The duplicate’s utterances move over and the duplicate is removed. Voice references on the deleted person are not carried over.

Deleting a person

On a person’s detail view, click the trash icon in the top right. talat asks where their utterances should go:

Reassign to another person. Pick someone from the list and every utterance moves to them.
No one (unlink). Their utterances stay in the transcript but go back to System audio.

Either way, the person’s voice references are removed with them.
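As a rough mental model of the two options, the sketch below treats people as a dictionary of name to voice references and utterances as small records with a speaker field. The data layout and the function name delete_person are hypothetical, chosen only to illustrate the behaviour described above.

```python
def delete_person(people, utterances, name, reassign_to=None):
    """Delete `name`; move their utterances to `reassign_to`, or unlink them if None.

    `people` maps a name to that person's voice references.
    `utterances` is a list of dicts with a "speaker" key.
    """
    if name not in people:
        raise KeyError(f"unknown person: {name}")
    if reassign_to is not None and reassign_to not in people:
        raise KeyError(f"unknown person: {reassign_to}")
    if reassign_to == name:
        raise ValueError("cannot reassign utterances to the person being deleted")

    for utterance in utterances:
        if utterance["speaker"] == name:
            # A speaker of None stands for an unlinked "System audio" bubble.
            utterance["speaker"] = reassign_to

    # The deleted person's voice references go with them; they are not merged
    # into the target person.
    del people[name]
```

Calling it with reassign_to set is the merge case: the duplicate’s utterances move to the surviving person, whose own references are left untouched.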

When recognition struggles

Heavily overlapping speech. When two or more people talk over each other, even known voices can be misattributed.
Bluetooth headsets in call mode. When the other person is on a Bluetooth headset in call mode, their mic drops to a low-fidelity codec, which makes them harder to recognise from one meeting to the next.
Very short utterances. One-word interjections don’t carry enough voice for a confident match.
Lower-spec machines. Speaker identification needs enough headroom to keep up with real-time audio. If your machine can’t keep up, the toggle in Settings → Transcription is force-disabled and tells you why. Transcription itself still works; you just won’t see the other side broken out into individual people.