Shared-Memory Parallel COMSOL
Modern computer hardware supports the shared-memory model, in which all CPU cores can access the same data in memory. When running on a cluster, COMSOL Multiphysics uses shared-memory parallelism within each node and distributed parallelism across the cluster nodes. The solvers, assembly, and meshing in COMSOL Multiphysics benefit from shared-memory parallelism. By default, the COMSOL software uses all cores available on the machine for shared-memory parallelism.
Benefits of Running COMSOL in Shared-Memory Parallel Mode
All iterative solvers and smoothers except Incomplete LU are parallelized. Some smoothers have blocked versions, which usually benefit more from running in parallel than the nonblocked versions. The finite element assembly also runs in parallel. The speedup generally depends on the problem size: problems that use a lot of memory usually show better speedup.
The PARDISO and SPOOLES sparse direct linear solvers and the MUMPS direct solver all run in parallel.
The orthonormal null-space function runs in parallel.
The tet mesher in 3D runs in parallel over the faces and domains of the geometry object being meshed. For this reason, the speedup when running on several processors depends strongly on the domain partitioning of the corresponding geometry. Meshing a geometry with only one domain, such as an imported CAD part, gives almost no speedup at all. However, meshing a geometry with several domains, such as an imported CAD assembly with many parts, can give significant speedup, especially if the number of elements in the mesh is large.
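The dependence on domain partitioning can be illustrated with a simplified model. The sketch below is not how COMSOL schedules meshing work. It assumes, purely for illustration, that each domain is meshed by a single core and that the per-domain cost is known. It ignores the parallelism over faces, so it slightly understates the single-domain case. The function name ideal_domain_speedup and the cost values are hypothetical.

```python
def ideal_domain_speedup(domain_costs, cores):
    """Upper bound on speedup when each domain is handled by one core.

    Simplified model (an assumption, not COMSOL's scheduler): the parallel
    run can finish no sooner than the most expensive single domain, and no
    sooner than the total work divided evenly over the cores.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not domain_costs:
        raise ValueError("domain_costs must not be empty")
    total = sum(domain_costs)
    best_parallel_time = max(max(domain_costs), total / cores)
    return total / best_parallel_time


if __name__ == "__main__":
    # One domain (for example, a single imported CAD part): no gain.
    print(ideal_domain_speedup([100.0], cores=8))
    # Sixteen equally sized domains (an assembly with many parts).
    print(ideal_domain_speedup([10.0] * 16, cores=8))
    # Two domains, one of which dominates the work.
    print(ideal_domain_speedup([90.0, 10.0], cores=8))
```

In this model a single domain caps the speedup at 1 regardless of the core count, and one dominant domain has nearly the same effect, which matches the qualitative behavior described above.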
The evaluation part of all plot types runs in parallel. In addition, the computations of contours, isosurfaces, and streamlines run in parallel.
A significant part of the parallel speedup in computations comes from BLAS functions (Basic Linear Algebra Subprograms; see the next section). If you want to run the software in parallel, it is important that the BLAS library you use supports parallelism. The BLAS libraries shipped with COMSOL Multiphysics do.
Running in parallel usually requires extra memory. If you run out of memory, try reducing the number of cores used, as explained in the COMSOL Multiphysics Installation Guide. The speedup also depends on the processor load. For instance, if your system has m processors and n of them are used by other active programs, do not set the number of cores higher than m − n. Otherwise, the programs compete for the same resources, which slows all of them considerably.
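The m − n rule can be written as a small helper. This is a minimal sketch, not part of COMSOL: the function name recommended_core_count is hypothetical, the number of busy processors is supplied by hand, and os.cpu_count() reports logical processors, which can exceed the number of physical cores on systems with hyperthreading.

```python
import os


def recommended_core_count(total_cores, busy_cores):
    """Return m - n, the cores left after other active programs.

    The result is clamped to at least 1 so that a job can still run on a
    fully loaded machine.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    if busy_cores < 0:
        raise ValueError("busy_cores must not be negative")
    return max(1, total_cores - busy_cores)


if __name__ == "__main__":
    m = os.cpu_count() or 1  # logical processors; may exceed physical cores
    n = 2  # hypothetical: processors occupied by other active programs
    print(recommended_core_count(m, n))
```

The resulting value is the upper limit to use when setting the number of cores as described in the COMSOL Multiphysics Installation Guide.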