The note documents the inlining strategy in CH4.

The goal of inlining is to avoid the function call overhead (especially
for functions with many arguments). It is the most beneficial for the
latency sensitive code path---communication related code path. For
other codes like Init/Finalize and Comm/Win creation/destroy function
inlining is less beneficial.

The guidelines for inlining are as follow.

1. PT2PT, Collective and RMA communication functions need to be inlined.
   This also means any function that is used in this code path need to
   be inlined as well. For example, request creation, AM fallback for
   communication.

2. Any function that does not fall in the first rule should not be
   inlined.

3. For a given component, some of its functions may need to be inlined
   while others do not. For example, in ch4r_buf.h/.c the creation and
   destroy of buf pool is only used in init/finalize code, therefore no
   inlining is needed. But the functions for obtaining and returning
   individual buffers are used in the communication code path,
   therefore, inlining is required.

Uninlining procedure for CH4 layer

1. Create a src/*.c file, include `mpidimpl.h`.
2. For functions need to be uninlined, move them to *.c and create
   create declarations in the *.h file. Note that the functions
   implemented the ADI do not need this as the prototype is already
   declared in `mpidch4.h`.
3. Remove `static inline` or `MPL_STATIC_INLINE_PREFIX` of the uninlined
   functions (both declarations `mpidch4.h`  and definitions)
4. For uninlined functions that is only used in the *.c file, make them
   “private”. 1) remove their namespace prefix; 2) create declarations
   in *.c file; 3) make them `static`.
5. Remove the *.h file if they are empty. `ch4r_*.h` files are kept for
   the decls
6. Keep CVAR to the place where it is used.

Uninlining procedure for netmod/shm layer

1. Create *.c file, include `mpidimpl.h`.
2. For functions need to be uninlined, move them to *.c. Rename from
   `MPIDI_NM_*` to `MPIDI_OFI_*`. Create decls and #define in
   `ofi_noinline.h`
3. Remove `static inline` or `MPL_STATIC_INLINE_PREFIX` of the uninlined
   functions (both declarations `netmod.h` and definitions). Move
   wrapper funcs in `netmod_impl.h` to `netmod/src/netmod_impl.c`.
4. Make internal funcs “private”.
5. Remove the *.h file if they are empty.
6. Keep CVAR to the place where it is used.

For netmod, why do we need to create the #define in `*_noinline.h`?

  The problem is sort of backward from the case when NETMOD_INLINE
  disabled---both netmod_impl.c/h and the ofi_init.h/c will have their
  own MPIDI_NM_* (the one in netmod_impl.c/h is the wrapper). To solve
  this name collision, we have to change the name of the uninlined
  functions to MPIDI_OFI_*. Then we run into the problem when
  NETMOD_INLINE is enabled, the function name does not match the defined
  interface name anymore, hence the #define solution.

Why the MPIDI_NM_comm_get_gpid function is moved to ofi_proc.h?

  This function is used in MPIDIG RMA code and need to be inlined.
  Because it was the only function in ofi_init.h that should be inlined
  and it is ugly to keep ofi_init.h just for this function.

  It cannot be put in ofi_impl.h because ofi_impl.h is included in many
  .c file which in turn will create their own copy of
  MPIDI_NM_comm_get_gpid and conflict with each other. Therefore, I put
  it in ofi_proc.h.

Tips for resolving conflict when rebasing.

  If you run into conflict in rebasing the file that is uninlined, try
  doing a diff between the original .h file and the new .c file. I tried
  to maintain the relative location of the functions when moving them to
  the .c file so that it is easier to do a diff in editor. I would
  usually do something like ```vimdiff ofi_win.h ofi_win.c``` to figure
  out what was changed during conflict resolving.
