Name Mangling

Reference 1

Name mangling (also called name decoration) is a technique used in many modern programming languages to give programming entities unique names.

It encodes additional information in the name of a function, structure, class, or other data type so that compilers can pass more semantic information to linkers.

Linker: To link different pieces of object code together, the linker needs a great deal of information about each program entity. For example, to link a function correctly, it needs the function's name, the number of arguments and their types, and so on.

C: distinguishes pieces of code by function name only, ignoring any other information such as parameter types or return types.

C++: distinguishes functions by name, parameter types, return type, and calling convention. To implement this, the additional information is encoded in the symbol name.

C

C has no function overloading, but a function can still be compiled with one of several calling conventions, and the symbol name is mangled to record which one applies.

Compilers targeted at Microsoft Windows platforms support a variety of calling conventions.

A consistent mangling scheme across different languages allows subroutines written in those languages to call, or be called by, one another correctly.

Example of name mangling for C in Windows:

int _cdecl f (int x) {return 0;}
int _stdcall g (int y) {return 0;}
int _fastcall h (int z) {return 0;}

After compilation with a 32-bit compiler (the number after @ is the size of the arguments in bytes):

_f
_g@4
@h@4
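
The three decorations above follow a simple pattern, which the sketch below reproduces. It is only an illustration: the names CallConv and decorate_c_name are hypothetical, only the three conventions from the example are modeled, and the caller is assumed to supply the argument size in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical names for illustration; only the three 32-bit x86
// conventions from the example above are modeled.
enum class CallConv { Cdecl, Stdcall, Fastcall };

// Builds the decorated symbol for a C function `name` whose arguments
// occupy `arg_bytes` bytes in total.
std::string decorate_c_name(const std::string& name, CallConv cc,
                            std::size_t arg_bytes) {
    assert(!name.empty());
    switch (cc) {
        case CallConv::Cdecl:     // leading underscore only
            return "_" + name;
        case CallConv::Stdcall:   // leading underscore, then @bytes
            return "_" + name + "@" + std::to_string(arg_bytes);
        case CallConv::Fastcall:  // leading @, then @bytes
            return "@" + name + "@" + std::to_string(arg_bytes);
    }
    return name;  // not reached for valid enum values
}
```

With a single 4-byte int argument, decorate_c_name("g", CallConv::Stdcall, 4) yields _g@4, matching the output listed above.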

C++

Example 1. Before mangling:

int  f (void) { return 1; }
int  f (int)  { return 0; }
void g (void) { int i = f(), j = f(0); }

After mangling (a simplified, illustrative scheme; real compilers use different encodings):

int  __f_v (void) { return 1; }
int  __f_i (int)  { return 0; } 
void __g_v (void) { int i = __f_v(), j = __f_i(0); }

Example 2. Source code with mangled names in comments:

namespace wikipedia 
{
   class article 
   {
   public:
      std::string format (void); 
         /* = _ZN9wikipedia7article6formatEv */

      bool print_to (std::ostream&); 
         /* = _ZN9wikipedia7article8print_toERSo */

      class wikilink 
      {
      public:
         wikilink (std::string const& name);
            /* = _ZN9wikipedia7article8wikilinkC1ERKSs */
      };
   };
}

Mangled symbols begin with _Z, followed by N for nested names, then a series of <length, id> pairs, where length is the length of the identifier that follows, then E, and finally the type information. For example, the member functions of wikipedia::article become:

format: _ZN9wikipedia7article6formatEv
print_to: _ZN9wikipedia7article8print_toERSo

  • v: function parameter type is void
  • So: function parameter type is the standard type std::ostream
  • RSo: function parameter is a reference to So.
  • i: int
  • c: char
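
The nested-name rule described above can be sketched in a few lines. This is a minimal illustration, not a full mangler: mangle_nested is a hypothetical helper, the parameter codes (v, i, RSo, ...) are assumed to be supplied already encoded, and constructors, templates, and substitutions are not handled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: encodes a nested name such as
// wikipedia::article::format, followed by an already-encoded
// parameter list such as "v" or "RSo".
std::string mangle_nested(const std::vector<std::string>& components,
                          const std::string& param_codes) {
    assert(!components.empty() && !param_codes.empty());
    std::string out = "_ZN";               // _Z prefix, N opens a nested name
    for (const auto& id : components) {
        out += std::to_string(id.size());  // length of the identifier...
        out += id;                         // ...then the identifier itself
    }
    out += "E";                            // E closes the nested name
    out += param_codes;                    // type information comes last
    return out;
}
```

Calling mangle_nested({"wikipedia", "article", "format"}, "v") gives _ZN9wikipedia7article6formatEv, the same symbol shown in Example 2.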

C/C++ compatibility

The common C++ idiom:

#ifdef __cplusplus
extern "C" {
#endif

#ifdef __cplusplus
}
#endif

is used to prevent C++ compilers from applying name mangling to C code.
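
As a rough sketch of how the idiom is typically filled in, the block below wraps one declaration in the guard and leaves two overloads outside it. The names scale_by, twice, and self_check are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

#ifdef __cplusplus
extern "C" {
#endif

/* Declared with C linkage: the symbol keeps its plain C name.
   scale_by is a hypothetical name used only for this sketch. */
int scale_by(int value, int factor);

#ifdef __cplusplus
}
#endif

// The definition inherits C linkage from the declaration above.
int scale_by(int value, int factor) { return value * factor; }

// Ordinary C++ functions can still be overloaded; their symbols are
// mangled so the linker can tell the two versions apart.
int twice(int x) { return scale_by(x, 2); }
double twice(double x) { return x * 2.0; }

// Hypothetical self-check that exercises both kinds of linkage.
inline void self_check() {
    assert(scale_by(3, 4) == 12);
    assert(twice(21) == 42);
}
```

Because scale_by has C linkage it cannot be overloaded, which is the price of keeping its symbol name callable from C.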

Created Apr 23, 2020 // Last Updated Apr 23, 2020
