Latency is the time delay between the moment something is initiated and the moment one of its effects begins. The word derives from the fact that during the period of latency the effects of an action are latent, meaning "potential" or "not yet observed". Even within an engineering context, latency has several meanings depending on the problem domain (i.e., communication, operational, or mechanical latency).
It is often thought that light travels so quickly that the time it takes to get from its source to its target is irrelevant. One would thus expect communication latency to be negligible. Unfortunately, that is not necessarily the case, as the following examples illustrate.
Latency in a packet-switched network is measured either one-way (the time from the source sending a packet to the destination receiving it), or round-trip (the time from the source sending a packet to the source receiving a response). Round-trip latency is more often quoted, because it can be measured from a single point, e.g. by using the ping service.
Where precision is important, one-way latency for a link can be more strictly defined as the time from the start of packet transmission to the start of packet reception. The time from the start of packet reception to the end of packet reception is measured separately and called "transmission delay". This definition of latency is independent of the link's throughput and the size of the packet, and is the absolute minimum delay possible with that link.
However, in a non-trivial network, a typical packet will be forwarded over many links via many gateways, each of which will not begin to forward the packet until it has been completely received. In such a network, the minimal latency is the sum of the minimum latency of each link, plus the transmission delay of each link except the final one, plus the forwarding latency of each gateway.
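The sum described above can be sketched in Python. This is only an illustration under stated assumptions: the `Link` class, the function names, and the example figures are hypothetical, and transmission delay is assumed to equal packet size divided by link throughput.

```python
from dataclasses import dataclass


@dataclass
class Link:
    # Hypothetical model of one link; the field names are illustrative.
    min_latency_s: float   # start of transmission to start of reception
    throughput_bps: float  # link throughput in bits per second


def transmission_delay_s(link, packet_bits):
    # Assumption: time to receive the whole packet = size / throughput.
    return packet_bits / link.throughput_bps


def min_path_latency_s(links, gateway_latencies_s, packet_bits):
    """Minimal latency over a store-and-forward path of several links."""
    if len(gateway_latencies_s) != len(links) - 1:
        raise ValueError("expected one gateway between each pair of links")
    total = sum(link.min_latency_s for link in links)
    # Every gateway waits for the complete packet, so each link except
    # the final one also contributes its transmission delay.
    total += sum(transmission_delay_s(link, packet_bits) for link in links[:-1])
    total += sum(gateway_latencies_s)
    return total


# Made-up example: three identical links joined by two gateways.
links = [Link(min_latency_s=0.001, throughput_bps=10_000_000) for _ in range(3)]
example_latency_s = min_path_latency_s(links, [0.0005, 0.0005], packet_bits=10_000)
print(f"{example_latency_s * 1000:.1f} ms")
```

With these made-up figures the three link latencies contribute 3 ms, the two intermediate transmission delays 2 ms, and the two gateways 1 ms.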
Although intercontinental television signals travel at the speed of light, they nevertheless develop a noticeable latency over long distances. This is best illustrated when a newsreader in a studio talks to a reporter halfway around the world. The signal travels via a communication satellite in geosynchronous orbit to the reporter and then goes all the way back to the studio, a journey of well over one hundred thousand kilometers. Not all of this latency is distance-induced, since there are latencies built into the equipment at each end and in the satellite itself. The total delay may not be much more than a second, but it is noticeable to humans.
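The distance-induced part of this delay can be estimated with a short calculation. The sketch below is a lower bound under simplifying assumptions that are not in the text above: a single satellite at the nominal geostationary altitude of about 35,786 km, both ground stations directly beneath it, and no equipment delay. The constant and function names are illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
GEO_ALTITUDE_KM = 35_786           # nominal geostationary altitude (assumed)


def propagation_delay_s(distance_km):
    # Time for a signal travelling at the speed of light to cover the distance.
    return distance_km / SPEED_OF_LIGHT_KM_S


# Studio -> satellite -> reporter is one hop: up and down once.
one_hop_km = 2 * GEO_ALTITUDE_KM
# The reply travels reporter -> satellite -> studio, so the hop is doubled.
round_trip_km = 2 * one_hop_km
round_trip_s = propagation_delay_s(round_trip_km)

print(f"{round_trip_km} km, about {round_trip_s:.2f} s")
```

Real paths are longer, because ground stations are rarely directly beneath the satellite and a link halfway around the world may need more than one satellite, so this figure should be read as a minimum.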
Any individual workflow within a system of workflows can be subject to some type of operational latency. It may even be the case that an individual system may have more than one type of latency, depending on the type of participant or goal-seeking behavior. This is best illustrated by the following two examples involving air travel.
From the point of view of a passenger, latency can be described as follows. Suppose John Doe flies from London to New York. The latency of his trip is the time it takes him to go from his house in England to the hotel where he is staying in New York. This is independent of the throughput of the London-New York air link: whether 100 passengers a day make the trip or 10,000, the latency of the trip remains the same.
From the point of view of flight operations personnel, latency can be entirely different. Consider the staff at the London and New York airports. There are only a limited number of planes able to make the transatlantic journey, so when one lands they must prepare it for the return trip as quickly as possible. It may take, for example:
- 30 minutes to clean a plane
- 15 minutes to refuel a plane
- 10 minutes to load the passengers
- 40 minutes to load the cargo
Assuming the above are done one after another, the minimum plane turnaround time is:
- 30 + 15 + 10 + 40 = 95 minutes
However, cleaning, refueling, and loading the cargo can be done at the same time, with passenger loading following once all three are finished, reducing the latency to:
- 40 + 10 = 50 minutes
- Minimum latency = 50 minutes
And if loading the passengers must happen after cleaning, but can happen during cargo loading:
- 30 + 10 = 40 minutes
- Minimum latency = 40 minutes
Each person involved in the turnaround is concerned only with the time their own task takes, not with the whole. However, when different tasks are done at the same time it may be possible, as in this case, to reduce the latency to the duration of the longest task.
However, the more prerequisites each step has, the harder it is to perform the steps in parallel. In the above example, if cleaning a plane took 35 minutes, then the minimum latency would be 35 (cleaning) + 10 (passenger loading) = 45 minutes, which is longer than the time of any single task.
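The turnaround arithmetic above amounts to finding the longest chain of dependent tasks. A minimal Python sketch follows, assuming the prerequisites form no cycles; the function name and the dictionary layout are illustrative choices, not taken from any cited source.

```python
from functools import lru_cache


def min_turnaround(durations, prerequisites):
    """Minimum latency in minutes when tasks run in parallel where allowed.

    durations maps each task to its length; prerequisites maps a task to
    the tasks that must finish before it can start (assumed acyclic).
    """
    @lru_cache(maxsize=None)
    def finish(task):
        # A task starts once its slowest prerequisite has finished.
        start = max((finish(p) for p in prerequisites.get(task, ())), default=0)
        return start + durations[task]

    return max(finish(task) for task in durations)


durations = {"clean": 30, "refuel": 15, "passengers": 10, "cargo": 40}

# Everything one after another: simply the sum of all tasks.
sequential = sum(durations.values())                      # 95
# Passengers board only after cleaning, refueling, and cargo loading.
after_all = min_turnaround(
    durations, {"passengers": ("clean", "refuel", "cargo")})   # 50
# Passengers board after cleaning, while cargo is still being loaded.
after_clean = min_turnaround(durations, {"passengers": ("clean",)})  # 40
# Same rule, but cleaning takes 35 minutes.
slow_clean = min_turnaround(
    {**durations, "clean": 35}, {"passengers": ("clean",)})    # 45

print(sequential, after_all, after_clean, slow_clean)
```

The four results correspond to the 95, 50, 40, and 45 minute cases worked through above.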
Any mechanical process encounters limitations modelled by Newtonian physics. The behaviour of disk drives provides an example of mechanical latency. Here, latency is the time needed for the data encoded on a platter to rotate from its current position to a position adjacent to the read-write head. This is also known as rotational delay, because the term latency is also applied to the time required by a computer's electronics and software to perform polling, interrupts, and direct memory access.
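Rotational delay follows directly from the spindle speed. The sketch below assumes the wanted data is equally likely to be anywhere on the track, so that the average wait is half a revolution and the worst case is a full one. The function name and the example spindle speeds are illustrative and do not come from the text above.

```python
def rotational_latency_ms(rpm):
    """Return (average, worst-case) rotational delay in milliseconds."""
    revolution_ms = 60_000 / rpm  # duration of one full revolution
    # Assumption: uniformly random target position, so the average wait
    # is half a revolution and the worst case is a whole revolution.
    return revolution_ms / 2, revolution_ms


for rpm in (5_400, 7_200, 15_000):
    average_ms, worst_ms = rotational_latency_ms(rpm)
    print(f"{rpm} rpm: average {average_ms:.2f} ms, worst case {worst_ms:.2f} ms")
```

This covers only the rotational component; a real disk access also includes the time for the head to move to the correct track and the time to transfer the data.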
- M. Brian Blake, "Coordinating Multiple Agents for Workflow-Oriented Process Orchestration", Information Systems and e-Business Management Journal, Springer-Verlag, December 2003.