The ear and brain use as many cues as possible to separate simultaneous speech. For example, if the direction of different syllables can be determined, each can be assigned to an independent sound stream. Likewise, if the timbres of the two voices are different, timbre can be used as a separation cue. The ability to localize sound and the ability to determine timbre are both compromised when syllables at the same pitch are spoken at precisely the same time. In the following examples we eliminate both spatial cues and timbre cues to explore the ability to separate sounds by pitch alone.
Two simultaneous monotone conversations can be separated from each other if the difference in pitch is more than one semitone. If they differ by precisely one semitone, the task is possible, but it is not easy. It may take some practice to do it reliably.
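To give a sense of how small a one-semitone difference is, the sketch below computes the frequencies involved. It assumes the equal-tempered semitone (a frequency ratio of 2^(1/12)) and a hypothetical lower voice at 100 Hz; neither value is taken from the examples themselves, and the function name is illustrative.

```python
# Assumption: an equal-tempered semitone, i.e. a frequency ratio of 2**(1/12).
SEMITONE_RATIO = 2 ** (1 / 12)


def shift_by_semitones(frequency_hz, semitones):
    """Return the frequency that lies `semitones` above `frequency_hz`."""
    return frequency_hz * SEMITONE_RATIO ** semitones


# Hypothetical monotone pitch for the lower voice (not from the original examples).
lower_voice_hz = 100.0
upper_voice_hz = shift_by_semitones(lower_voice_hz, 1)

# Harmonic k of each voice sits at k times its fundamental, so the gap
# between corresponding harmonics grows in proportion to k.
for k in (1, 5, 10):
    gap_hz = k * (upper_voice_hz - lower_voice_hz)
    print(f"harmonic {k}: {k * lower_voice_hz:.1f} Hz vs "
          f"{k * upper_voice_hz:.1f} Hz (gap {gap_hz:.1f} Hz)")
```

Under these assumptions the two fundamentals are only about 6 Hz apart, while corresponding upper harmonics are separated by proportionally more.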
The examples in the link below demonstrate both the ability and the difficulty of separating two conversations that differ in pitch by one semitone. Whether or not you succeed in doing this, please notice the enormous difference in subjective clarity between the examples where upper harmonics retain their phase relationships and those where the harmonic phases have been randomized by reflections.
The reflections used in the examples are "all-pass" reflections, which means they change phase without altering the spectrum of the signals. Although the spectrum is the same, the perceived timbre is quite different.
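The claim that phase can change while the spectrum stays the same can be checked numerically. The following is a minimal sketch, not the processing used to make the examples: it models the net effect of an all-pass system on a periodic tone by giving each harmonic a random phase, rather than implementing actual reflections. The fundamental (100 Hz), sample rate, harmonic count, and all function names are assumptions chosen for illustration.

```python
import cmath
import math
import random


def harmonic_tone(f0, phases, sample_rate, n_samples):
    """Sum of unit-amplitude cosine harmonics of f0, one phase per harmonic."""
    return [
        sum(math.cos(2 * math.pi * k * f0 * n / sample_rate + phase)
            for k, phase in enumerate(phases, start=1))
        for n in range(n_samples)
    ]


def harmonic_magnitudes(signal, f0, n_harmonics, sample_rate):
    """Amplitude of each harmonic, measured by a direct Fourier sum."""
    n_samples = len(signal)
    magnitudes = []
    for k in range(1, n_harmonics + 1):
        step = -2j * math.pi * k * f0 / sample_rate
        total = sum(x * cmath.exp(step * n) for n, x in enumerate(signal))
        magnitudes.append(2 * abs(total) / n_samples)
    return magnitudes


def crest_factor(signal):
    """Peak level divided by RMS level; high values mean a 'peaky' waveform."""
    peak = max(abs(x) for x in signal)
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return peak / rms


# Illustrative parameters (assumptions): 800 samples is exactly 10 periods.
F0, SAMPLE_RATE, N_SAMPLES, N_HARMONICS = 100.0, 8000, 800, 20

aligned_phases = [0.0] * N_HARMONICS
rng = random.Random(0)  # fixed seed so the run is repeatable
random_phases = [rng.uniform(0, 2 * math.pi) for _ in range(N_HARMONICS)]

aligned = harmonic_tone(F0, aligned_phases, SAMPLE_RATE, N_SAMPLES)
scrambled = harmonic_tone(F0, random_phases, SAMPLE_RATE, N_SAMPLES)

aligned_mags = harmonic_magnitudes(aligned, F0, N_HARMONICS, SAMPLE_RATE)
scrambled_mags = harmonic_magnitudes(scrambled, F0, N_HARMONICS, SAMPLE_RATE)

print("largest spectral difference:",
      max(abs(a - b) for a, b in zip(aligned_mags, scrambled_mags)))
print("crest factor, phases aligned:   ", crest_factor(aligned))
print("crest factor, phases randomized:", crest_factor(scrambled))
```

In this sketch the two signals have the same harmonic amplitudes to within rounding error, yet the phase-aligned waveform has a sharp peak once per fundamental period while the randomized one is spread out in time. The crest factor is used here only as a simple numerical indicator of that difference in waveform shape.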