I've been experimenting with this recently as well, but with an app on my Apple Watch. I'm looking for a method/model to split different speakers into separate tracks so I can focus only on audio from myself and certain people.
Ahh, I'm working on the exact same project. I applied to YC with the idea and was told that "nobody wants this" during the interview.
There are a ton of problems around privacy and UX. But I'm incredibly excited about projects in this space because in modern society we're basically surrounded by a million unhealthy things designed to tempt us. Logging forces you to stay honest. I've already been shocked by how many unhealthy habits I'd underestimated and how many healthy habits I'd overestimated.
My #1 priority is just to improve my own physical and mental health. Whether there’s a market for this stuff, who knows.
Check out this model; I've had limited success with it.
The best I've done so far is to attach the speaker labels it gives to whatever Whisper segments they overlap, which means some sentences end up with multiple speakers, though that's mostly due to cross-talk. I'd say it gets it right ~80% of the time with the 5 speakers I've tried it on across ~16 hours of audio. A rough sketch of that overlap step is below.
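To make the overlap idea concrete, here's a minimal sketch of that assignment step. It assumes you already have Whisper segments (dicts with "start", "end", "text", which is what whisper's transcribe output gives you) and diarization turns as (start, end, speaker) tuples; the function names and the turn format are illustrative, not from any particular library.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Seconds of overlap between two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(whisper_segments, diarization_turns):
    """Attach every speaker whose diarization turn overlaps a Whisper segment.

    Segments that overlap turns from several speakers (cross-talk) keep all
    of those labels, which matches the behaviour described above.
    """
    labelled = []
    for seg in whisper_segments:
        speakers = {}
        for turn_start, turn_end, speaker in diarization_turns:
            dur = overlap(seg["start"], seg["end"], turn_start, turn_end)
            if dur > 0:
                speakers[speaker] = speakers.get(speaker, 0.0) + dur
        labelled.append({
            "start": seg["start"],
            "end": seg["end"],
            "text": seg["text"],
            # speakers sorted by how much of the segment they cover
            "speakers": sorted(speakers, key=speakers.get, reverse=True),
        })
    return labelled


# Tiny made-up example: one segment covered by two speakers' turns,
# so it comes back labelled with both (the cross-talk case).
segments = [{"start": 0.0, "end": 4.2, "text": "hey, how's it going"}]
turns = [(0.0, 3.0, "SPEAKER_00"), (2.5, 4.2, "SPEAKER_01")]
print(label_segments(segments, turns))
```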
We're experimenting with building out a version of this too, but on desktop, at www.usebacktrack.com - we should have speaker/input splitting early next year and will see what that's like.