[LINUX] Page fragments

This text originally comes from the Linux kernel source tree, so I treat it as GPLv2 (that is my understanding of how it should be handled).

https://www.kernel.org/doc/html/latest/index.html

Licensing documentation

The following describes the license of the Linux kernel source code (GPLv2), how to properly mark the license of individual files in the source tree, as well as links to the full license text.

https://www.kernel.org/doc/html/latest/process/license-rules.html#kernel-licensing

https://www.kernel.org/doc/html/latest/vm/page_frags.html

Page fragments

A page fragment is an arbitrary-length arbitrary-offset area of memory which resides within a 0 or higher order compound page. Multiple fragments within that page are individually refcounted, in the page’s reference counter.

In other words, a page fragment is a region of memory of any length, at any offset, that lives inside a compound page of order 0 or higher. Each fragment is reference-counted individually, and all of those references are held in the page's reference counter.
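To make the shared counter concrete, here is a minimal user-space sketch. It is not kernel code: `model_page`, `model_frag`, `model_frag_get` and `model_frag_put` are hypothetical names invented for this illustration, and a plain integer stands in for the page's atomic reference counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for a page: one buffer and one shared counter. */
struct model_page {
    unsigned int refcount;               /* one reference per live fragment */
    unsigned char data[MODEL_PAGE_SIZE];
};

/* A fragment is only (page, offset, length); it owns one page reference. */
struct model_frag {
    struct model_page *page;
    size_t offset;
    size_t len;
};

/* Allocate a zeroed page with no references; returns NULL on failure. */
static struct model_page *model_page_new(void)
{
    return calloc(1, sizeof(struct model_page));
}

/* Describe a fragment inside the page and take a reference for it. */
static struct model_frag model_frag_get(struct model_page *page,
                                        size_t offset, size_t len)
{
    assert(offset <= MODEL_PAGE_SIZE && len <= MODEL_PAGE_SIZE - offset);
    page->refcount++;
    return (struct model_frag){ .page = page, .offset = offset, .len = len };
}

/* Drop the fragment's reference; returns true if the page itself was freed. */
static bool model_frag_put(struct model_frag *frag)
{
    struct model_page *page = frag->page;

    frag->page = NULL;
    assert(page && page->refcount > 0);
    if (--page->refcount == 0) {
        free(page);
        return true;
    }
    return false;
}
```

In this model the page is released only when the last fragment is put, regardless of the order in which the fragments are dropped.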

The page_frag functions, page_frag_alloc and page_frag_free, provide a simple allocation framework for page fragments. This is used by the network stack and network device drivers to provide a backing region of memory for use as either an sk_buff->head, or to be used in the “frags” portion of skb_shared_info.

The page_frag functions, page_frag_alloc and page_frag_free, provide a simple allocation framework for page fragments. The network stack and network device drivers use it to obtain the backing memory for sk_buff->head, and also for the "frags" portion of skb_shared_info.

In order to make use of the page fragment APIs a backing page fragment cache is needed. This provides a central point for the fragment allocation and allows multiple calls to make use of a cached page. The advantage to doing this is that multiple calls to get_page can be avoided which can be expensive at allocation time. However due to the nature of this caching it is required that any calls to the cache be protected by either a per-cpu limitation, or a per-cpu limitation and forcing interrupts to be disabled when executing the fragment allocation.

A backing page fragment cache is required in order to use the page fragment API. It serves as the central point for fragment allocation and lets multiple calls share one cached page. The benefit is that repeated calls to get_page, which are expensive at allocation time, can be avoided. Because of the way this caching works, however, every call into the cache must be protected, either by restricting the cache to a single CPU, or by restricting it to a single CPU and also disabling interrupts while the fragment allocation runs.
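The idea of avoiding get_page on every allocation can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel implementation: `model_page`, `model_frag_cache`, `model_frag_alloc`, `MODEL_BIAS` and the `model_refcount_updates` counter are all hypothetical names, and the model always takes a fresh page on refill instead of trying to reuse the old one. The cache takes a large batch of references when it acquires a page, then hands them out by decrementing a private "bias" value, so the shared counter is written once per page rather than once per fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u
/* More references than fragments can ever be carved from one page. */
#define MODEL_BIAS      (MODEL_PAGE_SIZE + 1u)

struct model_page {
    unsigned int refcount;
    unsigned char data[MODEL_PAGE_SIZE];
};

struct model_frag_cache {
    struct model_page *page;  /* current backing page, or NULL */
    size_t offset;            /* bytes still free; carved from the top down */
    unsigned int bias;        /* references taken up front, not yet handed out */
};

/* Counts writes to a page's shared refcount, to show how rare they are. */
static unsigned int model_refcount_updates;

/* Carve `size` bytes from the cached page, refilling the cache if needed. */
static void *model_frag_alloc(struct model_frag_cache *nc, size_t size)
{
    if (size == 0 || size > MODEL_PAGE_SIZE)
        return NULL;

    if (!nc->page || nc->offset < size) {
        if (nc->page) {
            /* Give back the references that were never handed out. */
            assert(nc->bias <= nc->page->refcount);
            nc->page->refcount -= nc->bias;
            model_refcount_updates++;
            if (nc->page->refcount == 0)
                free(nc->page);
        }
        nc->page = calloc(1, sizeof(*nc->page));
        if (!nc->page) {
            nc->offset = 0;
            nc->bias = 0;
            return NULL;
        }
        /* One write to the shared counter covers every future fragment. */
        nc->page->refcount = MODEL_BIAS;
        model_refcount_updates++;
        nc->bias = MODEL_BIAS;
        nc->offset = MODEL_PAGE_SIZE;
    }

    nc->offset -= size;
    nc->bias--;               /* private to the cache: no shared-counter write */
    return nc->page->data + nc->offset;
}
```

At any moment `page->refcount - bias` equals the number of fragments handed out from the current page. Because `bias` and `offset` are plain, unprotected fields, two contexts updating the same cache at once would corrupt it, which is why the text requires per-CPU use and, where needed, disabled interrupts.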

The network stack uses two separate caches per CPU to handle fragment allocation. The netdev_alloc_cache is used by callers making use of the __netdev_alloc_frag and __netdev_alloc_skb calls. The napi_alloc_cache is used by callers of the __napi_alloc_frag and __napi_alloc_skb calls. The main difference between these two calls is the context in which they may be called. The “netdev” prefixed functions are usable in any context as these functions will disable interrupts, while the “napi” prefixed functions are only usable within the softirq context.

The network stack uses two separate per-CPU caches to handle fragment allocation. netdev_alloc_cache is used by callers of __netdev_alloc_frag and __netdev_alloc_skb. napi_alloc_cache is used by callers of __napi_alloc_frag and __napi_alloc_skb. The main difference between the two sets of calls is the context in which they may be called. The "netdev" prefixed functions disable interrupts and can therefore be used in any context, whereas the "napi" prefixed functions can only be used in softirq context.
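The context rule can be pictured with a toy user-space model. Everything here is hypothetical (`model_cpu`, `model_netdev_alloc`, `model_napi_alloc`); it only mimics the calling contract described above, with boolean flags standing in for the real interrupt state, and it does not reproduce the kernel's actual locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical execution state and caches of one CPU. */
struct model_cpu {
    bool irqs_enabled;
    bool in_softirq;
    size_t netdev_cache_used;  /* stands in for netdev_alloc_cache */
    size_t napi_cache_used;    /* stands in for napi_alloc_cache */
};

/* Usable in any context: masks interrupts around the cache update. */
static size_t model_netdev_alloc(struct model_cpu *cpu, size_t size)
{
    bool saved = cpu->irqs_enabled;       /* in the spirit of local_irq_save() */
    size_t offset;

    cpu->irqs_enabled = false;
    offset = cpu->netdev_cache_used;
    cpu->netdev_cache_used += size;
    cpu->irqs_enabled = saved;            /* in the spirit of local_irq_restore() */
    return offset;
}

/* softirq only: the caller's context is the only protection for this cache. */
static size_t model_napi_alloc(struct model_cpu *cpu, size_t size)
{
    size_t offset;

    assert(cpu->in_softirq);              /* calling from elsewhere is a bug */
    offset = cpu->napi_cache_used;
    cpu->napi_cache_used += size;
    return offset;
}
```

The "napi" path does less work per call because it relies on the caller's context, which is the trade-off the two caches exist to express.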

Many network device drivers use a similar methodology for allocating page fragments, but the page fragments are cached at the ring or descriptor level. In order to enable these cases it is necessary to provide a generic way of tearing down a page cache. For this reason __page_frag_cache_drain was implemented. It allows for freeing multiple references from a single page via a single call. The advantage to doing this is that it allows for cleaning up the multiple references that were added to a page in order to avoid calling get_page per allocation.

Many network device drivers allocate page fragments in a similar way, except that the page fragments are cached at the ring or descriptor level. To support these cases, a generic way of tearing down a page cache has to be provided.

For this reason, __page_frag_cache_drain was implemented. It frees multiple references to a single page with a single call. The benefit is that the extra references that were added to the page up front, in order to avoid calling get_page on every allocation, can all be cleaned up at once.
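A user-space sketch of what draining means for a driver that caches a page per ring slot is shown below. The names `model_page`, `model_frag_cache_drain`, `model_rx_slot` and `model_rx_slot_teardown` are hypothetical. As an assumption made purely for testability, the model's drain function returns whether the page was freed, which the kernel's __page_frag_cache_drain is not claimed to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page; only the counter matters here. */
struct model_page {
    unsigned int refcount;
};

/* Drop `count` references in one step; free the page if none remain. */
static bool model_frag_cache_drain(struct model_page *page, unsigned int count)
{
    assert(page && count <= page->refcount);
    page->refcount -= count;
    if (page->refcount == 0) {
        free(page);
        return true;
    }
    return false;
}

/* Hypothetical driver ring slot that caches one page. */
struct model_rx_slot {
    struct model_page *page;
    unsigned int bias;   /* references taken up front but never handed out */
};

/* Tear down a slot: return all unused references with a single call. */
static bool model_rx_slot_teardown(struct model_rx_slot *slot)
{
    bool freed = model_frag_cache_drain(slot->page, slot->bias);

    slot->page = NULL;
    slot->bias = 0;
    return freed;
}
```

If fragments from the page are still in flight when the slot is torn down, the page survives the drain and is freed later when the last fragment drops its reference.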

Alexander Duyck, Nov 29, 2016.