User's Guide

Optimizing Subprogram Calls

If a program makes many subprogram calls, you can use the -Q option to turn on inlining, which reduces the overhead of those calls. Consider using the -p option with prof, or the -pg option with gprof, to determine which subprograms are called most frequently, and then list their names on the command line with -Q+.

To make inlining apply to calls where the calling and called subprograms are in different source files, also include the -qipa option.

# Let the compiler decide (relatively cautiously) what to inline
xlf95 -O3 -Q inline.f
 
# Encourage the compiler to inline particular subprograms
xlf95 -O3 -Q -Q+called_100_times:called_1000_times inline.f
 
# Extend the inlining to calls across files
xlf95 -O3 -Q -Q+called_100_times:called_1000_times -qipa inline.f
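
Once profiling has identified the most frequently called subprograms, their names are joined with colons after -Q+, as in the commands above. The following is a minimal shell sketch of that step. The helper name inline_opt is hypothetical (it is not part of XL Fortran), the subprogram names are the same placeholders used above, and the script only prints the compile command so that you can review it before running it.

```shell
# Hypothetical helper (not an XL Fortran tool): join subprogram names
# with ':' to form a single -Q+ option, as in -Q+name1:name2.
# The function body runs in a subshell so the IFS change stays local.
inline_opt() (
  IFS=:
  printf '%s%s\n' '-Q+' "$*"
)

# Names you would take from the prof or gprof report (placeholders here).
hot="called_100_times called_1000_times"

# Print the command instead of running it, so it can be reviewed first.
echo xlf95 -O3 -Q "$(inline_opt $hot)" inline.f
```

Keeping the list of names in one variable makes it easy to add or remove a subprogram and regenerate the command after each profiling run.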

Related Information:

See -Q Option and -qipa Option.

Finding the Right Level of Inlining

Getting the right amount of inlining for a particular program may require some work on your part. The compiler has a number of safeguards and limits to avoid excessive inlining. Without them, it might perform less overall optimization because of storage constraints during compilation, or the resulting program might be much larger and run more slowly because of more frequent cache misses and page faults. However, these safeguards may also prevent the compiler from inlining subprograms that you do want inlined. If this happens, you will need to do some analysis, some rework, or both to get the performance benefit.

As a general rule, consider identifying a few subprograms that are called most often, and inline only those subprograms.

Some common conditions that prevent -Q from inlining particular subprograms are:

The calling and called procedures are in different files. If so, you can use the -qipa option to enable cross-file inlining.
A subprogram is not inlined by the basic -Q option unless it is quite small. In general, this means that it contains no more than several source statements (although the exact cutoff is difficult to determine). A subprogram named by -Q+ can be up to approximately 20 times larger and still be inlined.
After the compiler has expanded a subprogram by a certain amount as a result of inlining, it does not inline subsequent calls from that subprogram. Again, there are different limits, which depend on whether the subprogram being called is named by a -Q+ option.
Consider an example with three procedures: A is the caller, and B and C are both at the upper size limit for automatic inlining. They are all in the same file, which is compiled like this:
```
xlf -Q -Q+c file.f
```
The -Q option means that calls to B or C can be inlined. -Q+c means that calls to C are more likely to be inlined. If B and C were twice as large, calls to B would not be inlined at all, while some calls to C could still be inlined.
Although these limits might prevent some calls from A to B or A to C from being inlined, the process starts over after the compiler finishes processing A.
Any interface errors, such as different numbers, sizes, or types of arguments or return values, might prevent a call from being inlined. To locate such errors, compile with the -qextchk option or define Fortran 90/Fortran 95 interface blocks for the procedures being called.
Actual or potential aliasing of dummy arguments or automatic variables might prevent a procedure from being inlined. For example, inlining might not occur in the following cases:
- If you compile the file containing either the calling or called procedure with the option -qalias=nostd, and there are any arguments to the procedure being called.
- If there are more than approximately 31 arguments to the procedure being called.
- If any automatic variables in the called procedures are involved in an EQUIVALENCE statement.
- If the same variable argument is passed more than once in the same call: for example, CALL SUB(X,Y,X).
Some procedures that use computed GO TO statements, where any of the corresponding statement labels are also used in an ASSIGN statement, might not be inlined.

To change the size limits that control inlining, you can specify -qipa=limit=n, where n is 0 through 9. Larger values allow more inlining.
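
Because the best value of n depends on the program, one approach is to build it at several limits and compare the run times of the resulting executables. The following is a sketch under stated assumptions: the helper name limit_sweep is hypothetical, inline.f is the placeholder file name from the earlier examples, the -o output names are illustrative, and the commands are printed rather than executed.

```shell
# Hypothetical helper: print one compile command per requested limit.
# Values outside 0 through 9 are rejected, matching the documented range.
limit_sweep() {
  for n in "$@"; do
    case "$n" in
      [0-9])
        printf 'xlf95 -O3 -Q -qipa=limit=%s inline.f -o inline_limit%s\n' "$n" "$n"
        ;;
      *)
        printf 'skipping invalid limit: %s\n' "$n" >&2
        ;;
    esac
  done
}

# Generate commands for a few representative limits.
limit_sweep 0 3 6 9
```

Each printed command can then be run and the resulting executables timed on a representative workload, keeping in mind that more inlining also increases compile time and program size.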