Anna Gát on What Makes Conversation Human
"Spectrograms of the Voice"
Today’s post is from Anna Gát, founder of Interintellect. Anna has built a third space for intellectually curious conversation, a venture that is both noble and hard to execute. I read her piece on X today and here on Substack, and wanted to share it with you. I’m particularly taken with her idea that humanity is expressed in the aesthetic dimensions of communication, including silence. It pairs with another tweet I recently read, one that made me laugh, about why the alpha in earnings calls comes less from the transcripts themselves and more from the “spectrograms of the voice.” - Zohar
____
Having run a conversation salon platform for 7 years, we've learned a lot about human communication that I don't (yet) see LLMs getting right.
1 - Musicality
Human conversation is incredibly musical in that it is all about the rhythm. After the entry point, people relax into the melody or get upset by it. The "music" can be a solo, a duet, or a symphony when it's a group conversation. A human discussion will be as positive or constructive as the "music" that it becomes allows.
As with music, a key element in human conversation is silence. When there is a gap, people can process, connect, think. In the 1970s the couples therapist John Gottman tried to mathematize his sessions with patients, and found something similar. Esther Perel also told me that in couples therapy (one of the highest-stakes conversations a person can have), the rhythm and musicality are more important than what is being said. Counterintuitive but true.
Even in text messages, people have learned instinctively how to create silent gaps -- those moments of not-speaking which you can use to make a point, to show dissatisfaction, or emphasize love and presence. I don't see LLMs daring to do this yet.
On Interintellect, my salon platform, one of the main things we teach new salon hosts is how to encourage, allow, and manage silence. It is counterintuitive, even scary, for humans too. But to anyone with a body -- for the body is pure rhythm -- the musicality of conversation is viscerally obvious.
2 - Priority
A challenge for anyone hosting a conversation -- or sometimes just participating in one -- is how much people can stay in their own heads while seemingly engaging with another human. How many times someone is talking and you're already fully focused on what *you* want to say next!
On Interintellect, which hosts fixed-time, fixed-theme, intentional gatherings, we help people come out of their shell by fostering an atmosphere of "easy mic" -- everybody knows they will get the mic soon, and so the impatience element is completely gone. In the case of online salons, we also use the chat a lot, where people can leave notes for others or for themselves. At IRL salons, I see people taking notes to free up mindspace.
When we have a big celeb on, we make sure it is never a 1:1 that only opens to the audience 50 minutes later. We tell attendees in advance that we will do only 10 mins of 1:1, then 10 mins audience, then 10 mins 1:1, etc.
This helps prevent the audience's mental constipation: everyone can just be fluid and present, playing with ideas, listening to each other real-time.
This I don't think LLMs have gotten right yet. It happens to me all the time that Claude or GPT starts talking, and I am already at my next question, and just skip or stop them.
3 - Phatic love
"Phatic" communication is what we call all the parts of speech that don't really convey information; they're just there to make us bond and feel better. From "how are you"s to jokes, small talk is not to be looked down upon! It serves an important physiological purpose: it puts us in the mood, it helps start the "music".
Phatic comms can be very formulaic, e.g., with a total stranger whose store you've just walked into. But with people we know it is full of context. Reminders, repetition, reassurance. The LLM experience would be much warmer if phatic elements were more integral to it. (Claude's warm, changing welcome is a good start.)
4 - Availability
The very first incarnation of Interintellect was an AI-powered chat app called Ixy (after "mutual information") aimed at making written communication between loved ones better. The two years of research that I conducted for it independently (this was ancient GPT-2 times) were instrumental for today's good vibes on Interintellect, and for the fact that after tens of thousands of conversations (across lockdowns, elections, wars) we have had zero toxic incidents at any of our live public salons even though most attendees are strangers.
One thing my old research focused on was asynchrony. A lot of our data pointed at how text conversations can go bad because they assume constant availability while being unable to guarantee it.
In linguistics, we always look at alignment. When two people are talking in a living room, they will make efforts to speak the same language, find the same volume, use a similar vocabulary. In short, they will try to maximize mutual information.
This is far more complicated over text, where we are both more and less honest and more and less present than in real life. My sense is that because LLMs are writing-based (even our audio is transcribed, and the AI "reads out" to us a text it generates in written form), they have inherited some of these issues from human texting.
Of course, LLMs are always available. With that, humans cannot compete. But so much of human communication is physical -- rhythm, sensation, excitement, goosebumps, sweat ... and *absence* which makes presence valuable -- that right now I am not worried the literary salon where people can come together to think together could be replaced anytime soon.
But building better communication tools for humans to use with each other -- powered by AI or just plain good human thinking -- remains an essential task ahead.





