The present invention is directed to voice communication devices in which an audio stream is divided into a sequence of individual packets, each of which is routed via pathways that can vary depending on the availability of network resources. All embodiments of the invention rely on an acoustic prioritization agent that assigns a priority value to the packets. The priority value is based on factors such as whether the packet contains
voice activity and the degree of acoustic similarity between this packet and adjacent packets in the sequence. A confidence level, associated with the priority value, may also be assigned. In one embodiment,
network congestion is reduced by deliberately failing to transmit packets that are judged to be acoustically similar to adjacent packets; the expectation is that, under these circumstances, traditional
packet loss concealment algorithms in the receiving device will construct an acceptably accurate replica of the missing packet. In another embodiment, the receiving device can reduce the number of packets stored in its jitter buffer, and therefore the latency of the speech signal, by selectively deleting one or more packets within sustained silences or non-varying speech events. In both embodiments, the ability of the
system to drop appropriate packets may be enhanced by taking into account the confidence levels associated with the priority assessments.
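
By way of illustration only, the following Python sketch shows one way the acoustic prioritization agent and the two drop policies described above might be arranged. It is not taken from the invention itself. The names `Packet`, `prioritize`, `droppable`, `select_for_transmission` and `trim_jitter_buffer`, the RMS-based silence test, the normalized-correlation similarity measure, the confidence heuristics, all threshold values, and the rule against dropping two consecutive packets are assumptions introduced for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Packet:
    seq: int
    samples: Sequence[float]
    priority: float = 1.0    # 0.0 = safest to drop, 1.0 = must keep
    confidence: float = 0.0  # 0.0 = pure guess, 1.0 = certain


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level, used here as a crude voice activity measure."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized correlation clamped to [0, 1] (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    return min(1.0, max(0.0, dot / norm))


def prioritize(packets: List[Packet], silence_rms: float = 0.01) -> None:
    """Assign a priority and a confidence to every packet, in place."""
    for i, p in enumerate(packets):
        energy = rms(p.samples)
        if energy < silence_rms:
            # No voice activity: lowest priority; more certain when quieter.
            p.priority = 0.0
            p.confidence = 1.0 - energy / silence_rms
            continue
        neighbors = [packets[j] for j in (i - 1, i + 1) if 0 <= j < len(packets)]
        # A packet must resemble every adjacent packet to be easy to conceal.
        sim = min((similarity(p.samples, n.samples) for n in neighbors), default=0.0)
        p.priority = 1.0 - sim
        # Assumed heuristic: similarity near 0.5 is treated as ambiguous.
        p.confidence = abs(2.0 * sim - 1.0)


def droppable(p: Packet, max_priority: float = 0.2, min_confidence: float = 0.5) -> bool:
    """Drop only low-priority packets whose assessment is trusted."""
    return p.priority <= max_priority and p.confidence >= min_confidence


def select_for_transmission(packets: List[Packet]) -> List[Packet]:
    """Sender side: withhold droppable packets, never two in a row (assumption)."""
    sent, previous_dropped = [], False
    for p in packets:
        if droppable(p) and not previous_dropped:
            previous_dropped = True
        else:
            sent.append(p)
            previous_dropped = False
    return sent


def trim_jitter_buffer(buffer: List[Packet], target_len: int) -> List[Packet]:
    """Receiver side: delete droppable packets until the buffer fits the target."""
    excess = len(buffer) - target_len
    if excess <= 0:
        return list(buffer)
    candidates = sorted((p for p in buffer if droppable(p)),
                        key=lambda p: (p.priority, -p.confidence))
    doomed = {p.seq for p in candidates[:excess]}
    return [p for p in buffer if p.seq not in doomed]
```

In this sketch, `prioritize` would be called once on a window of packets, after which either `select_for_transmission` (transmitting side) or `trim_jitter_buffer` (receiving side) consults the stored priority and confidence values. Packets that are not judged droppable are never removed, so the jitter buffer may remain above the target length when too few low-priority packets are available.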