

It is often easy, too easy, to convert for loops into parfor loops. In many cases, all we need to do is to add the “par” prefix to the for keyword and we’re done (assuming we have no incompatibly-used variables that should be converted into sliced variables etc.). This transformation was intentionally made simple by MathWorks (which is great!). On the other hand, it also hides a lot under the hood. One of the things that is often overlooked in such simple loop transformations is that a large part of the data used within the loop needs to be copied (broadcast) to each of the workers separately. This means that each of the data items needs to be serialized (i.e., copied in memory), packaged, communicated to and accepted by each of the workers. This can mean a lot of memory, networking bandwidth and time.
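To see this cost in practice, here is a small sketch of mine (not from the original post; all variable names are my own) contrasting a broadcast variable with a sliced one:

```matlab
% Illustration: broadcast vs. sliced variables in parfor
data = rand(5000, 200);       % large array used inside the loop
out  = zeros(1, 200);

% Broadcast: 'data' is also indexed with a constant (data(:,1)),
% so parfor cannot slice it; the whole array is serialized and
% sent to every worker.
parfor k = 1:200
    out(k) = sum(data(:, 1) .* data(:, k));
end

% Sliced: indexing 'data' only by the loop variable lets parfor
% send each worker just the columns its iterations actually use;
% only the small 'firstCol' vector is broadcast.
firstCol = data(:, 1);
parfor k = 1:200
    out(k) = sum(firstCol .* data(:, k));
end
```

Both loops compute the same result; the second simply moves far less data to the workers.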
In many situations, hyper-threading does not improve the performance of a program and may even degrade it (I deliberately wish to avoid the heated debate over this: you can find endless discussions about it online and decide for yourself). Coupled with the non-negligible overhead of starting, coordinating and communicating with twice as many Matlab instances (workers are headless Matlab processes, after all), we reach the conclusion that in many cases it may actually be better to use only as many workers as there are physical (not logical) cores. I know the documentation and configuration panel seem to imply that parpool uses the number of physical cores by default, but in my tests I have seen otherwise (namely, logical cores). Maybe this is system-dependent, and maybe there is a switch somewhere that controls this; I don’t know. I just know that in many cases I found it beneficial to reduce the number of workers to the actual number of physical cores:

MATLAB was assigned: 4 logical cores by the OS.
MATLAB is not using all logical cores because hyper-threading is enabled.

Naturally, this specific tip is equally valid for both parfor loops and spmd blocks, since both of them use the pool of workers started by parpool.

The conventional wisdom is that parfor loops (and loops in general) can only run a single code segment over all of their iterations. Of course, we can always use conditional constructs (such as if or switch) based on the data. But what if we wanted some workers to run a different code path than the other workers? In spmd blocks we could use a conditional based on the labindex value, but unfortunately labindex is always set to the same value (1) within parfor loops. So how can we let worker A run a different code path than worker B? An obvious answer is to create a parfor loop having as many elements as there are separate code paths, and to use a switch-case mechanism to run each of the separate paths.
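The code sample that followed this point in the original page did not survive extraction; the following is my own minimal reconstruction of the idea, with hypothetical data and code paths:

```matlab
% Hypothetical sketch: run two distinct code paths on (up to) two workers
dataList = {rand(1, 100), rand(1, 100)};   % one data item per code path
results  = cell(1, numel(dataList));

parfor idx = 1:numel(dataList)
    switch idx
        case 1
            results{idx} = max(dataList{idx});   % code path A
        case 2
            results{idx} = min(dataList{idx});   % code path B
    end
end
```

Each iteration runs on some worker in the pool, so the two branches can execute concurrently even though each iteration itself is serial.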
In today’s post I will try not to reiterate the official tips, but rather those that I have not found mentioned elsewhere, and/or are not well-known (my apologies in advance if I missed an official mention of one or more of the following). Furthermore, I limit myself only to parfor in this post: much can be said about spmd, GPU and other parallel constructs, but not today.

The first tip is to not use the default number of workers created by parpool (or matlabpool in R2013a or earlier). By default, Matlab creates as many workers as there are logical CPU cores. On Intel CPUs, the OS reports two logical cores per each physical core due to hyper-threading, for a total of 4 workers on a dual-core machine.
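A minimal sketch of sizing the pool to the physical core count follows; note that feature('numcores') is undocumented, so treat its use here as an assumption that may change between Matlab releases:

```matlab
% Shut down any existing pool (gcp('nocreate') returns [] if none is open,
% and delete of an empty handle array is a harmless no-op)
delete(gcp('nocreate'));

% feature('numcores') returns the number of physical cores; called without
% a semicolon it also prints the OS core-assignment diagnostics
numPhysicalCores = feature('numcores');

% Open a pool with one worker per physical core
pool = parpool(numPhysicalCores);
```

The same pool is then used by all subsequent parfor loops and spmd blocks until it is deleted or times out.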


Naturally, to use any of today’s tips, you need to have MathWorks’ Parallel Computing Toolbox (PCT). Before diving into the technical details, let me say that MathWorks has extensive documentation on PCT.
In today’s post I plan to expand on these tips, as well as provide a few others that for lack of space and time I did not mention in the presentation. The overall effect can be dramatic: the performance (speed) difference between sub-optimal and optimized parfor‘ed code can be up to a full order of magnitude, depending on the specific situation.

Matlab Expo 2016 keynote presentation

A few days ago, MathWorks uploaded a video recording of my recent keynote presentation at the Matlab Expo 2016 in Munich, Germany. During the presentation, I skimmed over a few tips for improving performance of parallel-processing (parfor) loops.
