Inter-Process Communication Affects Application Response Time

From WikiContent

(Difference between revisions)
Jump to: navigation, search
Current revision (07:52, 2 February 2009) (edit) (undo)
 
(5 intermediate revisions not shown.)
Line 1: Line 1:
-
Response time is critical to software usability. Few things are as frustrating as waiting for some software system to respond, especially when our interaction with the software involves repeated cycles of stimulus and response. We feel as if the software is wasting our time and affecting our productivity. However the causes of poor response time are less appreciated, especially in modern applications. Application Performance Management literature still discusses data structures and sorting algorithms, issues that were important decades ago, but that are no longer dominant in the era of multi-core, multi-GHz processors.
+
Response time is critical to software usability. Few things are as frustrating as waiting for some software system to respond, especially when our interaction with the software involves repeated cycles of stimulus and response. We feel as if the software is wasting our time and affecting our productivity. However, the causes of poor response time are less well appreciated, especially in modern applications. Much performance management literature still focuses on data structures and algorithms, issues that can make a difference in some cases but are far less likely to dominate performance in modern multi-tier enterprise applications.
-
When performance is a problem, my experience has been that improvements in data structures and algorithms aren't the right place to look. In modern applications, response time depends most strongly on the number of inter-process communications (IPCs) needed to process some stimulus. While there can be other local bottlenecks, the number of inter-process communications usually dominates. Each inter-process communication contributes some non-negligible latency to the overall response time, and these individual contributions add up, especially when they are incurred sequentially.
+
When performance is a problem in such applications, my experience has been that examining data structures and algorithms isn't the right place to look for improvements. Response time depends most strongly on the number of remote inter-process communications (IPCs) conducted in response to a stimulus. While there can be other local bottlenecks, the number of remote inter-process communications usually dominates. Each remote inter-process communication contributes some non-negligible latency to the overall response time, and these individual contributions add up, especially when they are incurred in sequence.
-
A prime example is "ripple loading" in an application using object-relational mapping. Ripple loading refers to the sequential execution of many database calls to select the data needed for building a graph of objects (see Lazy Load in Martin Fowler's ''Patterns of Enterprise Application Architecture''). When the database client is a middle-tier application server rendering a web page, these database calls are usually executed sequentially in a single thread, and their individual latencies contribute to the overall response time. Even if each database call takes only 10ms, a page requiring 1000 calls (which is not uncommon) will exhibit at least a 10-second response time. Other examples include web service invocation, HTTP requests from a web browser, distributed object invocation, request-reply messaging, and data grid interaction over custom network protocols. The more IPCs are needed to respond to a stimulus, the greater the response time will be.
+
A prime example is ''ripple loading'' in an application using object–relational mapping. Ripple loading describes the sequential execution of many database calls to select the data needed for building a graph of objects (see [http://martinfowler.com/eaaCatalog/lazyLoad.html Lazy Load] in Martin Fowler's ''Patterns of Enterprise Application Architecture''). When the database client is a middle-tier application server rendering a web page, these database calls are usually executed sequentially in a single thread. Their individual latencies accumulate, contributing to the overall response time. Even if each database call takes only 10ms, a page requiring 1000 calls (which is not uncommon) will exhibit at least a 10-second response time. Other examples include web-service invocation, HTTP requests from a web browser, distributed object invocation, request–reply messaging, and data-grid interaction over custom network protocols. The more remote IPCs needed to respond to a stimulus, the greater the response time will be.
-
The strategies for reducing the ratio of inter-process communications per stimulus are relatively few, and relatively obvious or well known. One is to apply the principle of parsimony, by optimizing the interface between processes so that exactly the right data for the purpose at hand is exchanged with the minimum amount of interaction. Another is to parallelize the inter-process communications where possible, so that the overall response time becomes driven mainly by the longest-latency IPC. And a third is to cache the results of previous IPCs, so that future IPCs may be avoided by hitting local cache instead.
+
There are a few relatively obvious and well-known strategies for reducing the number of remote inter-process communications per stimulus. One strategy is to apply the principle of parsimony, optimizing the interface between processes so that exactly the right data for the purpose at hand is exchanged with the minimum amount of interaction. Another strategy is to parallelize the inter-process communications where possible, so that the overall response time becomes driven mainly by the longest-latency IPC. A third strategy is to cache the results of previous IPCs, so that future IPCs may be avoided by hitting local cache instead.
-
When you're designing an application, be mindful of the ratio of inter-process communications per stimulus. When analyzing applications that suffer from poor performance, I have often found IPC-to-stimulus ratios of thousands-to-one. Reducing this ratio, whether by caching or parallelizing or some other technique, will pay off much more than implementing the best search algorithm.
+
When you're designing an application, be mindful of the number of inter-process communications in response to each stimulus. When analyzing applications that suffer from poor performance, I have often found IPC-to-stimulus ratios of thousands-to-one. Reducing this ratio, whether by caching or parallelizing or some other technique, will pay off much more than changing data structure choice or tweaking a sorting algorithm.
By [[Randy Stafford]]
By [[Randy Stafford]]
-
This work is licensed under a Creative Commons Attribution 3
+
This work is licensed under a [http://creativecommons.org/licenses/by/3.0/us/ Creative Commons Attribution 3]
 +
 
 +
 
 +
 
 +
Back to [[97 Things Every Programmer Should Know]] home page

Current revision

Response time is critical to software usability. Few things are as frustrating as waiting for some software system to respond, especially when our interaction with the software involves repeated cycles of stimulus and response. We feel as if the software is wasting our time and affecting our productivity. However, the causes of poor response time are less well appreciated, especially in modern applications. Much performance management literature still focuses on data structures and algorithms, issues that can make a difference in some cases but are far less likely to dominate performance in modern multi-tier enterprise applications.

When performance is a problem in such applications, my experience has been that examining data structures and algorithms isn't the right place to look for improvements. Response time depends most strongly on the number of remote inter-process communications (IPCs) conducted in response to a stimulus. While there can be other local bottlenecks, the number of remote inter-process communications usually dominates. Each remote inter-process communication contributes some non-negligible latency to the overall response time, and these individual contributions add up, especially when they are incurred in sequence.

A prime example is ripple loading in an application using object–relational mapping. Ripple loading describes the sequential execution of many database calls to select the data needed for building a graph of objects (see Lazy Load in Martin Fowler's Patterns of Enterprise Application Architecture). When the database client is a middle-tier application server rendering a web page, these database calls are usually executed sequentially in a single thread. Their individual latencies accumulate, contributing to the overall response time. Even if each database call takes only 10ms, a page requiring 1000 calls (which is not uncommon) will exhibit at least a 10-second response time. Other examples include web-service invocation, HTTP requests from a web browser, distributed object invocation, request–reply messaging, and data-grid interaction over custom network protocols. The more remote IPCs needed to respond to a stimulus, the greater the response time will be.

There are a few relatively obvious and well-known strategies for reducing the number of remote inter-process communications per stimulus. One strategy is to apply the principle of parsimony, optimizing the interface between processes so that exactly the right data for the purpose at hand is exchanged with the minimum amount of interaction. Another strategy is to parallelize the inter-process communications where possible, so that the overall response time becomes driven mainly by the longest-latency IPC. A third strategy is to cache the results of previous IPCs, so that future IPCs may be avoided by hitting local cache instead.

When you're designing an application, be mindful of the number of inter-process communications in response to each stimulus. When analyzing applications that suffer from poor performance, I have often found IPC-to-stimulus ratios of thousands-to-one. Reducing this ratio, whether by caching or parallelizing or some other technique, will pay off much more than changing data structure choice or tweaking a sorting algorithm.

By Randy Stafford

This work is licensed under a Creative Commons Attribution 3


Back to 97 Things Every Programmer Should Know home page

Personal tools