[LINUX] How malloc parameters are set from environment variables

Introduction

malloc-related parameters can be changed with mallopt(3). For example, the following call sets the threshold at which malloc switches to mmap to 1 MB:

mallopt(M_MMAP_THRESHOLD, 1024*1024);
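
As a minimal sketch of how that call might sit in a program: mallopt returns 1 on success and 0 on error, so the result is worth checking. The helper name set_mmap_threshold_1mib is made up for illustration, and the code assumes glibc's <malloc.h>.

```c
#include <malloc.h>   /* mallopt, M_MMAP_THRESHOLD (glibc) */
#include <stdio.h>

/* Hypothetical helper: ask malloc to use mmap for requests of 1 MiB or more.
   Returns 1 on success and 0 on failure, mirroring mallopt's return value. */
static int set_mmap_threshold_1mib(void)
{
    int ok = mallopt(M_MMAP_THRESHOLD, 1024 * 1024);
    if (!ok)
        fprintf(stderr, "mallopt(M_MMAP_THRESHOLD) was rejected\n");
    return ok;
}
```

A program would typically call such a helper early in main, before its first large allocation.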

Some of the parameters that can be set with mallopt can also be changed with environment variables. For example, the M_MMAP_THRESHOLD mentioned above can also be set with the MALLOC_MMAP_THRESHOLD_ environment variable.

$ export MALLOC_MMAP_THRESHOLD_=1048576
$ ./a.out

Environment variables take effect without modifying the program, which is handy when you want to change the defaults system-wide. Setting malloc parameters from environment variables like this is implemented with glibc's tunables mechanism.

This article notes how glibc tunables get set, using the MALLOC_MMAP_THRESHOLD_ environment variable as an example. If you spot any mistakes, I would appreciate it if you could point them out.

The environment is as follows.

$ arch
x86_64
$ uname -r
5.4.0-42-generic
$ lsb_release -d
Description:	Ubuntu 18.04.4 LTS
$ dpkg -l | grep libc-bin
ii  libc-bin             2.27-3ubuntu1.2 amd64           GNU C Library: Binaries

How the MALLOC_MMAP_THRESHOLD_ environment variable is applied

1. The loader (ld-linux.so) checks the auxiliary vector

The auxiliary vector, which can be read with getauxval(3), is auxiliary information about the executable that the kernel's ELF loader passes to the process.

The loader reads AT_SECURE from the auxiliary vector, and if AT_SECURE is true, the global variable __libc_enable_secure is set.

The name AT_SECURE is a little misleading: it means that the current process needs to be run securely. More specifically, AT_SECURE is set in either of the following cases.

- Programs with set-user-ID or set-group-ID enabled
- Programs with capabilities
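
The same flag can be read from user code. The following is a small sketch using getauxval(3). The helper name running_in_secure_mode is my own and not something glibc provides.

```c
#include <sys/auxv.h>  /* getauxval, AT_SECURE */

/* Hypothetical helper: returns 1 if the kernel marked this process as
   needing secure execution (AT_SECURE is nonzero), otherwise 0. */
static int running_in_secure_mode(void)
{
    return getauxval(AT_SECURE) != 0;
}
```

For an ordinary binary started from a shell I would expect this to return 0. A set-user-ID binary run by a different user should return 1.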

glibc-2.27/elf/dl-sysdep.c


ElfW(Addr)
_dl_sysdep_start (void **start_argptr,
		  void (*dl_main) (const ElfW(Phdr) *phdr, ElfW(Word) phnum,
				   ElfW(Addr) *user_entry, ElfW(auxv_t) *auxv))
{
...

  for (av = GLRO(dl_auxv); av->a_type != AT_NULL; set_seen (av++))
    switch (av->a_type)
      {
...
      case AT_SECURE:
        __libc_enable_secure = av->a_un.a_val;
        break;
...
      }

  __tunables_init (_environ);

...
  (*dl_main) (phdr, phnum, &user_entry, GLRO(dl_auxv));
  return user_entry;
}

2. The loader updates tunable_list[] from the environment variable values

In __tunables_init, if an environment variable matches an entry in tunable_list[], its value is applied by tunable_initialize().

However, if __libc_enable_secure is set (that is, AT_SECURE is set), handling depends on the entry's security_level, which takes one of the following three values.

- SXID_ERASE: if AT_SECURE is set, ignore the variable and also delete it from the environment so that child processes cannot read it.
- SXID_IGNORE: if AT_SECURE is set, ignore the variable (it remains in the environment).
- NONE: always apply the variable.
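
The three cases can be condensed into a small decision function. This is a simplified model, not glibc code: the enum and function names are stand-ins I invented for illustration.

```c
#include <stdbool.h>

/* Simplified stand-ins for glibc's TUNABLE_SECLEVEL_* values. */
enum seclevel { SECLEVEL_SXID_ERASE, SECLEVEL_SXID_IGNORE, SECLEVEL_NONE };

/* What the loader does with a matching environment variable. */
enum action { ACTION_APPLY, ACTION_IGNORE, ACTION_IGNORE_AND_ERASE };

static enum action decide_action(bool secure, enum seclevel level)
{
    if (!secure)
        return ACTION_APPLY;            /* normal process: always applied */

    switch (level) {
    case SECLEVEL_SXID_ERASE:
        return ACTION_IGNORE_AND_ERASE; /* skipped and removed from environ */
    case SECLEVEL_SXID_IGNORE:
        return ACTION_IGNORE;           /* skipped, but left in environ */
    default:
        return ACTION_APPLY;            /* SECLEVEL_NONE */
    }
}
```

Since mmap_threshold is SXID_IGNORE, a set-user-ID program would skip MALLOC_MMAP_THRESHOLD_ but still pass it on to its children.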

glibc-2.27/elf/dl-tunables.c


void
__tunables_init (char **envp)
{
...
  while ((envp = get_next_env (envp, &envname, &len, &envval,
                               &prev_envp)) != NULL)
    {
...
      for (int i = 0; i < sizeof (tunable_list) / sizeof (tunable_t); i++)
        {
          tunable_t *cur = &tunable_list[i];
...
              if (__libc_enable_secure)
                {
                  if (cur->security_level == TUNABLE_SECLEVEL_SXID_ERASE)
                    {
                      /* Erase the environment variable.  */
                      ...
                    }

                  if (cur->security_level != TUNABLE_SECLEVEL_NONE)
                    continue;
                }

              tunable_initialize (cur, envval);
              break;
            }
        }
    }
}

The security_level of each tunable is defined in the .list file, along with its environment variable name and type.

glibc-2.27/elf/dl-tunables.list


glibc {
  malloc {
    ...
    mmap_threshold {
      type: SIZE_T
      env_alias: MALLOC_MMAP_THRESHOLD_
      security_level: SXID_IGNORE
    }
    ...
  }
}

dl-tunable-list.h is generated automatically from dl-tunables.list by scripts/gen-tunables.awk, and tunable_list[] is defined there.

dl-tunable-list.h


static tunable_t tunable_list[] attribute_relro = {
  ...
  {TUNABLE_NAME_S(glibc, malloc, mmap_threshold), {TUNABLE_TYPE_SIZE_T, 0, SIZE_MAX}, {}, NULL, TUNABLE_SECLEVEL_SXID_IGNORE, "MALLOC_MMAP_THRESHOLD_"},
  ...
};
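
To make the table lookup concrete, here is a toy model of matching "NAME=value" strings against a list of environment aliases. It is only a sketch: the toy_ names are invented, strtoull stands in for glibc's own number parser, and range checks and security levels are left out.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy version of tunable_t: just an alias, a value and a "was set" flag. */
struct toy_tunable {
    const char *env_alias;
    size_t value;
    int initialized;
};

static struct toy_tunable toy_list[] = {
    { "MALLOC_MMAP_THRESHOLD_", 0, 0 },
    { "MALLOC_TOP_PAD_", 0, 0 },
};

/* Scan a NULL-terminated array of "NAME=value" strings and copy matching
   values into toy_list. Returns the number of entries that were applied. */
static int toy_tunables_init(char **envp)
{
    int applied = 0;

    for (; *envp != NULL; envp++) {
        const char *eq = strchr(*envp, '=');
        if (eq == NULL)
            continue;
        size_t len = (size_t)(eq - *envp);

        for (size_t i = 0; i < sizeof toy_list / sizeof toy_list[0]; i++) {
            struct toy_tunable *cur = &toy_list[i];
            if (strlen(cur->env_alias) == len
                && strncmp(*envp, cur->env_alias, len) == 0) {
                cur->value = (size_t)strtoull(eq + 1, NULL, 0);
                cur->initialized = 1;
                applied++;
                break;
            }
        }
    }
    return applied;
}
```

Passing the process's own environ to toy_tunables_init would mimic what the loader does with _environ in step 1.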

3. The first malloc copies the value from tunable_list[] into mp_

From here on, the subject is the malloc parameters rather than tunables. The timing also differs: this code runs not in the loader but the first time malloc is called, after control has reached the normal main function.

The first time the program calls a memory allocation function, malloc_hook_ini is called, which in turn calls ptmalloc_init.

glibc-2.27/malloc/hooks.c


static void *
malloc_hook_ini (size_t sz, const void *caller)
{
  __malloc_hook = NULL;
  ptmalloc_init ();
  return __libc_malloc (sz);
}
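
The hook trick is a general lazy-initialization pattern: a function pointer starts out pointing at a one-shot initializer, which unhooks itself and then retries the call. A self-contained sketch follows. All toy_ names are invented, and plain malloc stands in for the real allocation path.

```c
#include <stddef.h>
#include <stdlib.h>

static int toy_init_count = 0;

static void *toy_alloc_first(size_t sz);

/* Starts out pointing at the one-shot initializer, like __malloc_hook. */
static void *(*toy_alloc_hook)(size_t) = toy_alloc_first;

/* Stands in for ptmalloc_init. */
static void toy_init(void)
{
    toy_init_count++;
}

/* Stands in for __libc_malloc: divert to the hook while it is set. */
static void *toy_alloc(size_t sz)
{
    if (toy_alloc_hook != NULL)
        return toy_alloc_hook(sz);   /* only taken on the first call */
    return malloc(sz);
}

/* Stands in for malloc_hook_ini. */
static void *toy_alloc_first(size_t sz)
{
    toy_alloc_hook = NULL;           /* unhook before re-entering */
    toy_init();
    return toy_alloc(sz);
}
```

However many times toy_alloc is called, toy_init runs exactly once, on the first call.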

ptmalloc_init fetches mmap_threshold from the tunable_list populated in step 2 and stores it in mp_, the structure that holds the malloc parameters.

glibc-2.27/malloc/arena.c


static void
ptmalloc_init (void)
{
...
  TUNABLE_GET (mmap_threshold, size_t, TUNABLE_CALLBACK (set_mmap_threshold));
...
}

glibc-2.27/elf/dl-tunables.c


void
__tunable_get_val (tunable_id_t id, void *valp, tunable_callback_t callback)
{
  tunable_t *cur = &tunable_list[id];

  switch (cur->type.type_code)
    {
...
    case TUNABLE_TYPE_SIZE_T:
	{
	  *((size_t *) valp) = (size_t) cur->val.numval;
	  break;
	}
...

  if (cur->initialized && callback != NULL)
    callback (&cur->val);
}
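
One detail worth noticing is that the callback only fires when the tunable was actually initialized from the environment. Otherwise malloc's built-in default is left alone. Here is a toy model of that behavior. The toy_ names are invented, and 128 KiB is the documented default for M_MMAP_THRESHOLD.

```c
#include <stddef.h>

typedef void (*toy_callback_t)(const size_t *valp);

/* Toy tunable entry: a value plus a flag saying whether the environment set it. */
struct toy_entry {
    size_t numval;
    int initialized;
};

/* Stands in for mp_.mmap_threshold with its default of 128 KiB. */
static size_t toy_mmap_threshold = 128 * 1024;

/* Stands in for the set_mmap_threshold callback. */
static void toy_set_mmap_threshold(const size_t *valp)
{
    toy_mmap_threshold = *valp;
}

/* Stands in for __tunable_get_val: always copy the value out, but only
   invoke the callback if the entry was initialized. */
static void toy_get_val(const struct toy_entry *cur, size_t *valp,
                        toy_callback_t callback)
{
    *valp = cur->numval;
    if (cur->initialized && callback != NULL)
        callback(&cur->numval);
}
```

With an uninitialized entry, toy_mmap_threshold keeps its default. With an initialized one, it takes the value from the entry.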

The value is actually stored by the do_set_mmap_threshold function. Looking at it shows that the value has an upper limit, and that setting mmap_threshold disables dynamic threshold adjustment.

glibc-2.27/malloc/malloc.c


static inline int
__always_inline
do_set_mmap_threshold (size_t value)
{
  /* Forbid setting the threshold too high.  */
  if (value <= HEAP_MAX_SIZE / 2)
    {
      LIBC_PROBE (memory_mallopt_mmap_threshold, 3, value, mp_.mmap_threshold,
		  mp_.no_dyn_threshold);
      mp_.mmap_threshold = value;
      mp_.no_dyn_threshold = 1;
      return 1;
    }
  return 0;
}
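
A practical consequence is that a value above the limit is simply not stored. Since the excerpts above show no error path for the callback, I would expect an oversized MALLOC_MMAP_THRESHOLD_ to be silently ignored. The sketch below models this. It assumes HEAP_MAX_SIZE / 2 is 32 MiB, which I believe holds for 64-bit glibc 2.27, and the toy_ names are invented.

```c
#include <stddef.h>

/* Assumed limit for 64-bit glibc 2.27: HEAP_MAX_SIZE / 2 = 32 MiB. */
#define TOY_THRESHOLD_LIMIT ((size_t)32 * 1024 * 1024)

/* Toy subset of mp_. */
struct toy_params {
    size_t mmap_threshold;
    int no_dyn_threshold;
};

/* Mirrors the shape of do_set_mmap_threshold: accept values up to the
   limit, pin the threshold, and report success or failure. */
static int toy_do_set_mmap_threshold(struct toy_params *p, size_t value)
{
    if (value <= TOY_THRESHOLD_LIMIT) {
        p->mmap_threshold = value;
        p->no_dyn_threshold = 1;   /* dynamic adjustment is switched off */
        return 1;
    }
    return 0;                      /* too large: parameters left untouched */
}
```

In this model 1 MiB is accepted and pins the threshold, while 64 MiB is rejected and leaves both fields as they were.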

With that, the value of the MALLOC_MMAP_THRESHOLD_ environment variable is finally used in sysmalloc to decide whether to mmap.

glibc-2.27/malloc/malloc.c


static void *
sysmalloc (INTERNAL_SIZE_T nb, mstate av)
{
...
  if (av == NULL
      || ((unsigned long) (nb) >= (unsigned long) (mp_.mmap_threshold)
	  && (mp_.n_mmaps < mp_.n_mmaps_max)))
    {
...
          mm = (char *) (MMAP (0, size, PROT_READ | PROT_WRITE, 0));
    }
}
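
This decision can also be observed from user code without a debugger. The sketch below relies on the glibc-specific mallinfo(), whose hblks field counts blocks currently allocated with mmap. The helper name served_by_mmap is invented. Because mallinfo is deprecated in newer glibc, the deprecation warning is suppressed here.

```c
#include <malloc.h>   /* mallinfo (glibc-specific) */
#include <stdlib.h>

/* mallinfo is deprecated in newer glibc; silence the warning for this sketch. */
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"

/* Hypothetical helper: returns 1 if a request of `size` bytes was served
   by mmap, 0 if it was not, and -1 if the allocation failed. */
static int served_by_mmap(size_t size)
{
    int before = mallinfo().hblks;
    volatile char *p = malloc(size);
    if (p == NULL)
        return -1;
    p[0] = 1;   /* touch the block so the allocation is not optimized away */
    int after = mallinfo().hblks;
    free((void *)p);
    return after > before;
}
```

With the threshold at 1 MiB (set either by mallopt or by exporting MALLOC_MMAP_THRESHOLD_=1048576), I would expect a 512 KiB request to report 0 and a 2 MiB request to report 1.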

Conclusion

When setting a malloc parameter through an environment variable, it is a good idea to verify with a debugger that the setting has actually taken effect.

References

- Following the operation of malloc (Environment Variables) - Qiita
- mallopt(3) - Linux manual page
- The GNU C Library
- getauxval(3) - Linux manual page
