Posted by editor
on April 9, 2012 at 4:19 PM PDT
With the advent of multicore processors on everything from desktop computers to tablets, pads, and even mobile phones, parallel processing is gaining increasing attention. This is at least in part what's behind the release of the Fork/Join Framework in Java 7, and the inclusion of Lambda Expressions (aka "closures," JSR 335 ) in the upcoming Java 8.
The "do-it-yourself" approach to developing an application, or a component of one, that performs processing in parallel is fraught with complexity (thread management, etc.) and potential for error (memory overwrites, threads that die, threads that never return...). But if you parallelize your code using the Fork/Join Framework or Java 8 Lambda Expressions, much of the overhead associated with thread management is taken care of for you.
Still, parallelizing your application isn't always as beneficial as you might think. For starters, there's Amdahl's Law, which defines the theoretical limit on how much you can speed up your application or component, based on the fraction of the processing that can be parallelized and the degree of parallelization that can be applied. Here's one form of the Amdahl's Law equation:
MaximumSpeedUp = 1 / [(1 - P) + P/N]
where P is the fraction of the total processing that can be parallelized, and N is the parallelization factor. In practice, you can think of N as being the number of processor cores you'll be utilizing for running the parallelized portion of your application.
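The equation above is easy to experiment with directly. Here's a minimal sketch (my own example, not from the article) that evaluates Amdahl's Law for a 95% parallelizable workload across the same range of core counts shown in the Wikipedia plot discussed below:

```java
// Illustrative helper: computes the Amdahl's Law speedup for a
// parallelizable fraction p and a parallelization factor (core count) n.
public class Amdahl {
    static double speedUp(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // 95% parallelizable code on 1, 4, 64, and 65536 cores
        for (int n : new int[] {1, 4, 64, 65536}) {
            System.out.printf("N=%5d -> speedup %.2fx%n", n, speedUp(0.95, n));
        }
    }
}
```

Running this shows the diminishing returns immediately: even with 65536 cores, the 95%-parallelizable workload never reaches a 20x speedup.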
So, let's look at what this implies...
The plot (from Wikipedia) shows the theoretical maximum realizable speed-up for four different applications (where 50%, 75%, 90%, and 95% of the processing can be parallelized) if the portion of the code that can run in parallel utilizes from 1 to 65536 processor cores.
What you realize pretty quickly is that in all cases there's a limit after which throwing more processors at the problem gains almost nothing that would be noticed by a user of an application. The limit in possible speed-up is readily calculable: set N to infinity, and Amdahl's Equation reduces to:
MaximumSpeedUp = 1 / [1 - P] = 1/S
Here we're defining S as the fraction of the processing that must run serially (i.e., the part that cannot be parallelized, or 1 - P). From this, you see that the maximum possible speedup is entirely determined by the portion of the application that must run in serial mode, on a single processor.
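You can watch the speedup converge to that 1/S ceiling numerically. The following sketch (my own example; names are mine) drives N upward for a 95% parallelizable workload, where the limit is 1/0.05 = 20:

```java
// Illustrative check: as N grows, the Amdahl speedup approaches 1/S,
// where S = 1 - P is the serial fraction of the processing.
public class AmdahlLimit {
    static double speedUp(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        double p = 0.95;                  // 95% of the work parallelizes
        double limit = 1.0 / (1.0 - p);   // 1/S = 20
        for (int n = 1; n <= 1_048_576; n *= 32) {
            System.out.printf("N=%7d speedup=%.3f (limit %.1f)%n",
                              n, speedUp(p, n), limit);
        }
    }
}
```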
But Amdahl's Law is actually a simplification of reality. Parallelizing an application component involves processing overhead that does not exist in serial processing. It takes time and processing to divide the task into multiple chunks that can be divvied out to the available processors, time and processing to manage the threads/processes as they are running, and time and processing to collate the subresults produced by each of the parallel processes into the data unit that is the final product of the processing.
In A Look at Java Thread Overhead I showed that, using standard Java threads, there is indeed a task size below which thread overhead comes to occupy more time than the task itself. My point was to show that creating threads and joining them back does consume processing time.
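The Fork/Join Framework handles this divide/manage/join cycle for you, and makes the "task size" trade-off explicit through a threshold. Here's a minimal sketch (my own example, not taken from the referenced post) that sums an array by recursively splitting it until chunks fall below a threshold, then running those chunks serially:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Minimal Fork/Join sketch: below THRESHOLD the task runs serially,
// because forking smaller chunks would cost more than it saves.
public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] data;
    private final int from, to;

    SumTask(long[] data, int from, int to) {
        this.data = data; this.from = from; this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {          // small enough: run serially
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) / 2;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                           // run left half asynchronously
        return right.compute() + left.join();  // compute right, then join left
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        long sum = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
        System.out.println(sum);
    }
}
```

Tuning THRESHOLD is exactly the thread-overhead question: set it too low and the fork/join bookkeeping swamps the work being done in each chunk.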
Bringing the additional overhead into Amdahl's Equation (treating each overhead term as a fraction of the original serial running time, so the units are consistent), we get something like this:
ActualSpeedUp = 1 / [S + P/N + Ttm + Tdiv + Tjoin]
where:
- S is the fraction of the processing that runs serially;
- P is the fraction of the processing that can be parallelized;
- N is the number of parallel processes/threads utilized;
- Ttm is the time spent on thread management;
- Tdiv is the time spent on dividing the task into N chunks; and
- Tjoin is the time spent on joining the subresults into the final product unit.
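A quick calculation with this extended equation makes the point concrete. In this sketch (my own example; variable names mirror the equation above, with the overhead terms expressed as fractions of the serial running time), modest overhead erodes the ideal speedup, and enough overhead pushes it below 1.0, meaning the parallel version is slower than the serial one:

```java
// Illustrative: Amdahl's Law extended with thread-management, task-division,
// and result-joining overhead, each as a fraction of serial running time.
public class OverheadSpeedUp {
    static double actualSpeedUp(double s, double p, int n,
                                double ttm, double tdiv, double tjoin) {
        return 1.0 / (s + p / n + ttm + tdiv + tjoin);
    }

    public static void main(String[] args) {
        // 10% serial work, 90% parallelizable, 4 cores:
        System.out.printf("no overhead:    %.2fx%n",
                actualSpeedUp(0.10, 0.90, 4, 0, 0, 0));          // ideal case
        System.out.printf("5%% overhead:    %.2fx%n",
                actualSpeedUp(0.10, 0.90, 4, 0.02, 0.01, 0.02)); // eroded
        System.out.printf("heavy overhead: %.2fx%n",
                actualSpeedUp(0.10, 0.90, 4, 0.40, 0.20, 0.20)); // below 1.0
    }
}
```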
From this you can see that there are many possibilities whereby an application component that has been parallelized might actually take longer to execute than a serial version of the same component.
I'll be writing much more about the topic of parallel processing in the future. This is just a start!
Since my last blog post, Java.net Community Manager Sonya Barry posted a new java.net blog:
Our current Java.net poll asks "How often do you attend Java conferences?" Voting will be open until Friday, April 20.
Here are the stories we've recently featured in our Java news section:
Our latest Java.net Spotlight is Patrick Curran's Merging the Executive Committees:
As I explained in this blog last year, we use the Process to change the Process. The first of three planned JSRs to modify the way the JCP operates (JSR 348: Towards a new version of the Java Community Process) completed in October 2011. That JSR focused on changes to make our process more transparent and to enable broader participation...
Subscriptions and Archives: You can subscribe to this blog using the java.net Editor's Blog Feed. You can also subscribe to the Java Today RSS feed and the java.net blogs feed. You can find historical archives of what has appeared on the front page of Java.net in the java.net home page archive.
-- Kevin Farnham