Issues with VoIP
Packet loss occurs when a packet is dropped in transit. This can cause clipping and skipping in a voice conversation.
Delay is a major issue with voice traffic. A delay below 200 milliseconds is generally acceptable, but anything beyond that noticeably degrades a conversation. The two types of delay are propagation delay and handling delay. Propagation delay is how long it takes for a packet to go from one end of a wire (or fiber optic cable) to the other. Handling delay (also called processing delay) is any delay introduced by devices on the voice network, such as codecs and DSPs.
Jitter is the variation between a packet's expected receive time and its actual receive time. For instance, the average packet trip may take 150 ms, while an occasional packet takes only 50 ms. Voice devices compensate for jitter with a buffer.
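As a rough illustration of the definition above, the sketch below computes how far each packet's trip time deviates from the expected value. The function names and the sample trip times are hypothetical, and taking the largest deviation as the amount a buffer would need to absorb is a simplification.

```python
def arrival_deviations(trip_times_ms, expected_ms):
    """Return each packet's deviation from the expected trip time, in ms."""
    return [t - expected_ms for t in trip_times_ms]


def worst_case_jitter(trip_times_ms, expected_ms):
    """Largest absolute deviation: a rough lower bound on buffer depth."""
    return max(abs(d) for d in arrival_deviations(trip_times_ms, expected_ms))


# Hypothetical sample: most packets take 150 ms, one arrives after only 50 ms.
trips = [150, 150, 50, 150]
print(arrival_deviations(trips, expected_ms=150))  # [0, 0, -100, 0]
print(worst_case_jitter(trips, expected_ms=150))   # 100
```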
End-to-end delay is the total delay from one side to the other.
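A minimal sketch of how the two delay types add up to an end-to-end figure. The signal speed of roughly 200,000 km/s is an approximation for cable, and the distance and per-device handling delays are invented for illustration only.

```python
# Assumed signal speed in cable: roughly 200,000 km/s (about 2/3 of c).
SIGNAL_SPEED_KM_PER_S = 200_000


def propagation_delay_ms(distance_km):
    """Time for a signal to traverse the medium, in milliseconds."""
    return distance_km * 1000 / SIGNAL_SPEED_KM_PER_S


def end_to_end_delay_ms(distance_km, handling_delays_ms):
    """Propagation delay plus the sum of all per-device handling delays."""
    return propagation_delay_ms(distance_km) + sum(handling_delays_ms)


# Hypothetical path: 3000 km of cable, with made-up handling delays for
# the codec (20 ms), packetization (20 ms), and the jitter buffer (40 ms).
total = end_to_end_delay_ms(3000, [20, 20, 40])
print(total)  # 95.0
```

Under these assumptions the propagation component is small (15 ms), and most of the total comes from handling delay in the devices.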
Echo is the speaker's own voice fed back to them, and a small amount is normal in a conversation. However, if the feedback is delayed by more than 25 milliseconds, it can disrupt the conversation. In regular telephony, echo is created by an impedance mismatch where the four-wire network is converted to a two-wire local loop, and it is controlled by echo cancellers. In a VoIP network, echo cancellers are built into the low bit-rate codecs and operate on each DSP. An echo canceller is limited by the amount of time it will wait for reflected speech to be received. This is called the echo trail and is normally 32 milliseconds, though it can be configured to 8, 16, 24 or 32 milliseconds.
Several technologies are used to minimize these problems. These include QoS features such as classification, queueing, traffic shaping, Compressed Real-Time Transport Protocol (CRTP), and TCP header compression. There are several items to keep in mind when implementing QoS.
Latency is one issue. Different types of traffic have different levels of latency they can withstand. For instance, voice or video conferencing should have no more than 150-200ms latency one way, while streaming video can live with 4 or 5 seconds of delay. Similarly, jitter causes no issues for streaming video, while live video/voice should have no more than 30ms of jitter.
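The tolerances above can be written down as a small lookup table. This is only a sketch: the table keys and the function name are hypothetical, the upper end of each range in the text is used as the limit, and `None` stands for "no limit stated".

```python
# Limits taken from the text above; None means the text gives no limit.
LIMITS_MS = {
    "voice": {"latency": 200, "jitter": 30},
    "video_conference": {"latency": 200, "jitter": 30},
    "streaming_video": {"latency": 5000, "jitter": None},
}


def within_tolerance(traffic_type, latency_ms, jitter_ms):
    """Return True if measured one-way latency and jitter fit the limits."""
    limits = LIMITS_MS[traffic_type]
    if latency_ms > limits["latency"]:
        return False
    if limits["jitter"] is not None and jitter_ms > limits["jitter"]:
        return False
    return True


print(within_tolerance("voice", latency_ms=180, jitter_ms=20))             # True
print(within_tolerance("voice", latency_ms=100, jitter_ms=45))             # False
print(within_tolerance("streaming_video", latency_ms=3000, jitter_ms=500)) # True
```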
Bandwidth requirements are another thing to keep in mind. Voice requires very little guaranteed bandwidth (17-106 Kbps + 150 bps control overhead) while live video requires significantly more. Live video should have its bandwidth + 20% reserved, so a 768 Kbps video stream should have 922 Kbps available (768 x 1.2 = 921.6, rounded up).
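The 20% rule is simple arithmetic, sketched here with integer math so the result rounds up cleanly. The function name and the `headroom_percent` parameter are hypothetical.

```python
def reserved_kbps(stream_kbps, headroom_percent=20):
    """Stream rate plus headroom, rounded up to a whole Kbps."""
    scaled = stream_kbps * (100 + headroom_percent)
    return -(-scaled // 100)  # ceiling division without floats


print(reserved_kbps(768))  # 922
```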
The way to achieve QoS is through classification of traffic. This can happen at layer 2 or 3. An important concept here is the trust boundary. Do you trust the client to make the QoS determination? If so, your trust boundary encompasses all devices on the network. However, if you don't trust the end devices to handle QoS prioritization properly, the trust boundary moves back. First it moves to the access layer, but if your access layer hardware is unable to make the QoS determination either, it has to move back to the distribution layer.
At layer 2, a CoS (Class of Service) value can be assigned to a frame to weigh its relative importance. At layer 3, the ToS (Type of Service) field in the IP header serves a similar purpose.
By default, a Cisco IP phone sends its frames tagged with a CoS value of 5, plus an equal value in the ToS field of the IP packet. However, most PCs do not have NICs capable of doing 802.1Q tagging (layer 2 tagging), and thus leave that field set to 0. Specific applications running on a host can set this field, or the TCP/IP stack could be altered to change the default setting. However, if the phone is forwarding PC packets as well, the phone's default behaviour is to zero out the CoS field in all PC packets.
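To make the two markings concrete, the sketch below packs a CoS of 5 into the 16-bit 802.1Q tag control field (3 priority bits, 1 DEI bit, 12 VLAN ID bits) and places the same value in the top three bits of the ToS byte. It assumes the "equal value" above means IP precedence 5; the function names and the VLAN ID of 100 are hypothetical.

```python
def dot1q_tci(cos, vlan_id, dei=0):
    """Pack the 16-bit 802.1Q Tag Control Information field.

    Layout: 3-bit priority (CoS), 1-bit DEI, 12-bit VLAN ID.
    """
    if not 0 <= cos <= 7:
        raise ValueError("CoS must be 0-7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN ID must be 0-4095")
    return (cos << 13) | (dei << 12) | vlan_id


def tos_from_precedence(precedence):
    """Place a 3-bit IP precedence value in the top bits of the ToS byte."""
    if not 0 <= precedence <= 7:
        raise ValueError("precedence must be 0-7")
    return precedence << 5


VOICE_VLAN = 100  # hypothetical VLAN ID, for illustration only

print(hex(dot1q_tci(5, VOICE_VLAN)))  # 0xa064
print(hex(tos_from_precedence(5)))    # 0xa0
print(hex(dot1q_tci(0, VOICE_VLAN)))  # 0x64 (CoS zeroed, as for PC frames)
```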
Based on the layer 2 CoS and the layer 3 ToS, routers and switches should be configured with priority queuing and/or QoS ACLs to control traffic. This places delay-sensitive traffic at the head of any queue, with data and other less important traffic at the end. When transferring packets into a WAN, remember that layer 2 information will be lost. Thus, it's important that layer 3 ToS be implemented to maintain QoS across a WAN.
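The queuing behaviour described above can be sketched as a strict priority queue keyed on the CoS value. This is a simplified model rather than how any particular router or switch implements it; the class name and packet labels are hypothetical.

```python
import heapq
from itertools import count


class StrictPriorityQueue:
    """Dequeue the highest CoS first; FIFO among packets of equal CoS."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def enqueue(self, cos, packet):
        # heapq is a min-heap, so negate CoS to pop the highest value first.
        heapq.heappush(self._heap, (-cos, next(self._arrival), packet))

    def dequeue(self):
        _, _, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self):
        return len(self._heap)


q = StrictPriorityQueue()
q.enqueue(0, "data-1")
q.enqueue(5, "voice-1")
q.enqueue(0, "data-2")
q.enqueue(5, "voice-2")

order = [q.dequeue() for _ in range(len(q))]
print(order)  # ['voice-1', 'voice-2', 'data-1', 'data-2']
```

Strict priority keeps voice at the head of the queue, but sustained high-priority traffic can starve lower classes, which is one reason it is usually combined with other queuing or policing.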