HEAP

 

 

Memory allocator for Ace

 

 

Introduction

Dynamic memory allocation is the allocation of memory storage for use in a program during the runtime of that program. Heap is a subsystem/library which maintains the dynamically allocated memory. Typically it provides malloc and free operations to the program to allocate and free dynamic memory.

 

An OS can be coded without heap. On such a system, all memory is allocated from the VM subsystem. VM subsystem can deal with memory only in PAGE_SIZE granularity. (PAGE_SIZE granularity is architecture dependent and usually multiple of 4KB). So whoever asks for any amount of memory gets memory rounded off to a higher value, in multiples of page sizes. So even if someone wants 10 bytes of memory, we end up giving 1 page (4 KB) of memory. It is waste of virtual memory. So it’s for sure that memory will soon be dried up.

 

The solution is an intermediate layer between VM and other subsystems. This intermediate layer is called “Heap” in Ace OS. All memory allocation requests goes to Heap instead of VM. Heap allocates memory pages from VM and gives memory in byte granularity to the caller. The other dynamic memory allocation algorithms are “Buddy memory allocation” and “Slab allocation”.

 

 “Dynamic memory allocation” algorithms allocates memory pages from VM and splits it into small pieces called chunks/slab and gives user a chunk/slab when requested. When the user frees the chunk it is put it into a list called free list and when the whole VM memory page is free it is returned to VM.

 

Allocation

To satisfy the memory allocation, the following algorithms are used:

 

First fit – search for the first free block that is greater or equal to the requested size. After finding the first big free block, split it. This is fast compared to best fit but it creates external fragmentation.

 

Best fit – search for the smallest free block that is greater or equal to the requested size. Slower than first fit but memory is less fragmented.

 

Eg: (same free list for all sizes)

The free list is linked list of free chunks of memory. The linked list is formed by using the first byte in the free chunk as next pointer to the next free chunk.

 

If the free list is implemented as unsorted linked list then it will take O(n) for both “first fit” allocation and “best fit” allocation”. For example if 35 is the requested size then “first fit” will split 50 to allocate 35. And in case of best fit it will allocate the 3rd element which is 40 after searching the whole list. A sorted list or tree can be used to improve performance.

 

The above two algorithms are used only if only one free pool is used, which is not very common. Instead, an array of free pool/list is used in most of the implementations to make searching faster and the algorithm is called as “Quick fit”. In quick fit, each size or range of size has its own free block so allocating is just taking the first element from that free block.

 

Eg: (different free list for different size)

 

Free

Freeing an allocated buffer is done by just linking the given address with the free list. However it is a bit complicated if it should return memory pages to the VM subsystem. To return memory pages to VM, all the chunks/slabs in that page should be in free stage. Finding all the free chunks in a given page is difficult if it is maintained just as linked list. So here comes buddy allocator and slab allocator.

 

Buddy Allocator

The buddy memory allocation technique divides memory into partitions to try to satisfy a memory request as suitably as possible. This system makes use of splitting memory into halves to try to give a best-fit. When a memory chunk/partition is freed and the buddy partition is also free then it is joined and forms a big partition, this operation is repeated until one allocated partition is found.

 

Slab Allocator

Buddy allocation minimizes the external fragmentation[1] and creates internal fragmentation[2]. Slab allocation solves this problem. Slab allocator is a concept introduced by Sun Microsystems.

 

Ace Heap is based on slab allocator. Slab based design is chosen because of the following reasons.

 

Buddy

Slab

Internal fragmentation

More

Less

External fragmentation

Less

Unlikely

Allocation

Medium

Faster

Freeing

Medium

Faster

Initialization

Slow

Faster


Heap

A heap consists of many individual caches which can be used to allocate and free memory objects.

 

A cache consists of free_slab list, allocated slab tree and free page list.

 

A slab consists of buffers of equal size. A slab consists of 1 or more vm pages. Each group of vm_pages are represented as slab and meta data is maintained in this slab. This meta data is stored at the end of last page of the group of vm_pages in a slab. So slab doesn’t need to be allocated any memory explicitly. Meta data consists of: Reference count of buffers, next and previous pointers in the slab list.

 

A Buffer is the smallest unit of memory that is allocated to memory requests within the subsystem.

 

For cache size > 1/8th of a PAGE_SIZE, we maintain a Allocated Slab tree. This tree is needed for the free operation to associate a buffer to it’s slab. For a group of vm_pages, metadata is maintained only once at the end of the vm_page of the group. This metadata is the node of this tree.

For cache size < 1/8th of a PAGE_SIZE, we don’t have multiple vm_pages inside a slab and hence we don’t need a tree to search the associated slab. We maintain the buffers as a linked list only in the free list.


 

 

Metadata

 

Unused part

 

Partially free slab list

It is ordered list of slabs which has at least one free buffer and at least one in use buffer. It is used during buffer allocation to find a free buffer in the slab.

 

Inuse slab tree

It is tree of slabs which has at least one in-use buffer. It is used during buffer free to find the slab to which the buffer belongs to.

 

Completely Free slab list

It is a list of completely free slabs (all the buffers are free). It is used to release the memory pages back to the vm or to re-use when memory pages are wanted.

 

States

Buffer

State

Partially free list

Completely free list

In use tree

All used

USED

No

No

Yes

All free

FREE

No

Yes

No

Some free

MIXED

Yes

No

Yes

 

Finding State

if ( slab.used_buffer_count == 0 )

          State = FREE;

else if ( slab.used_buffer_count == cache. slab_buffer_count )

          State = USED;

else

          State = MIXED;

 

State Transition

Free to Mixed

1)     Add the slab to partially free list

2)     Remove the slab from completely free list

3)     Add the slab to in use tree

 

Mixed to Free

1)     Remove the slab from partially free list

2)     Add the slab to completely free list

3)     Remove the slab from in use tree

 

Mixed to Used

1)     Remove the slab from partially free list

 

Used to Mixed

1)     Add the slab to partially free list

 

 

Slab Meta data

It contains following

1) bitmap of free/use buffers(if a bit is set the corresponding buffer is used)

2) List for partially free slab list

3) Tree for inuse slab tree.


Functions

 

1.      HeapCreate()
Creates a heap and returns the heap pointer.

 

2.      HeapDestroy (heap_ptr )
Destroys the given heap.

 

3.      HeapAlloc (size, heap_ptr)
Allocates the specified size of memory from the given heap.

 

4.      HeapFree (heap_ptr)
Deletes the heap blindly.

 

5.      InitCache(new_cache_ptr, size, free_slabs_threshold, min_buffers, max_slabs, constructor, destructor)

Initializes an empty cache of specified buffer size.

 

6.      GetVAFromCache (cache_ptr, flag)

Tries to get a free buffer from partially free slab list.

If found, return that buffer.

If not found, tries to get a free slab from completely free slab list.

If found, returns a free buffer from that slab.

If not found, tries to get memory pages from VM.

 

 

7.      InitSlabAllocator(page_size, v_alloc, v_free, v_protect)

Initializes a slab allocator. This is an 1 time operation

 

8.      FreeBuffer(buffer, cache_ptr)

Free A buffer in it's slab. If all buffers in the slab are free, move the slab to completely free slab list

 

 


Data structures

Cache

Field

Description

Size

Size of the slab entries in the cache

Minimum buffers

Minimum no of buffers to be present always

Should be multiple of (PAGE_SIZE/Size)

MaximumSlabs

Maximum number of slabs that can be present in a cache at any time

Should be greater than MinimumSlabs

Should be multiple of (PAGE_SIZE/Size)

FreeSlabsThreshold

Threshold to start VM free operation.

Should be between MinimumSlabs and MaximumSlabs.

Should be multiple of (PAGE_SIZE/Size)

Constructor

Function pointer to initialize a newly allocated slab

Destructor

Function pointer to reuse a slab

CompletelyFreeSlabList

Linked list of completely free slabs.

InUseTree

AVL tree is used to maintain a good balance. This can be null for a cache which is of size < 1/8. For a cache with size > 1/8 it contains used page tree with at least one slab allocated. Virtual address is the key for the tree.

PartiallyFreeSlabList

List of partially free slabs which have some buffers free

 

Slab

Field

Description

List

Points to the next and prev slab

 

 

 

InUseSlabTree Page Header

Field

Description

ReferenceCount

Number of buffers allocated (in use).

AVLTree

Points to left/right nodes

 


Algorithm

Allocation

1.      If the “in-use slab list” is not empty, go to step 5

2.      If the “free slab list” is not empty, go to step 4

3.      Allocate one more memory pages from VM and fill slab metadata and put it in “free slab list”

4.      Remove one slab from the “free slab list” and add it to “in-use slab list” and “in-use slab tree”

5.      Get one free buffer from the head of the “in-use slab list”

6.      If the head of the “in-use slab list” has no free element, remove the slab from the “in-use slab list”.

 

Free

1.      Get the slab from the buffer address by searching the “in-use slab tree”

2.      If the slab is completely free, remove the slab from the “in-use slab tree” and put it in the “free slab list”

3.      If the slab is has only one free buffer, put the slab in the “in-use slab list”.

 


Usage

A subsystem to dynamically allocate memory can use either Heap or Cache. Cache can be used if a allocation of particular object (same size) is going happen frequently in the life cycle of the subsystem. Heap is better for the allocation where the size is variable.

Any subsystem that wants to use HEAP should link the HEAP library to itself. Initially a subsystem should call HeapCreate() to create a heap. After creation of heap it should call HeapAlloc() to allocate memory from that heap. To free memory back to the Heap call HeapFree()

Similarly a program should call CacheCreate() to create a cache and CacheAlloc() to allocate memory from the cache. At the end of the program call CacheFree() for each cache to release all allocated slabs in the cache. The call CacheDestroy() to destroy the cache

 

Implementation

Heap

HEAP consists of caches. An array of cache is created. These cache entries are in the increasing order of power of 2 sized.

HeapCreate will create 10 cache entries of sizes:

8, 16, 32, 128, 256, 512, 1024, 2048, 4096, 8192..

 

Cache


Reference

 

1) Buddy allocation - http://en.wikipedia.org/wiki/Buddy_memory_allocation

 

2) Slab Allocation - http://citeseer.ist.psu.edu/cache/papers/cs/8756/http:zSzzSzluthien.nuclecu.unam.mxzSz~miguelzSzbonwick.pdf/bonwick94slab.pdf

 

3) Hoard allocator - http://www.cs.umass.edu/%7Eemery/hoard/asplos2000.pdf



[1] External fragmentation is the phenomenon in which free storage becomes divided into many small pieces over time. The result is that, although free storage is available, it is effectively unusable because it is divided into pieces that are too small to satisfy the demands of the application.

 

[2] Internal fragmentation is the phenomenon in which free storage is allocated in large pieces and over time the free memory becomes unavailable. The result is that, although .free storage is available, it is effectively unusable because it exists within allocated memory which is unknown and unwanted to the user.