Memory allocator for Ace
Dynamic memory allocation is the allocation of
memory storage for use in a program during the runtime of that program. Heap is
a subsystem/library which maintains the dynamically allocated memory. Typically
it provides malloc and free operations to the program to allocate and free
dynamic memory.
An OS can be coded without heap. On such a system,
all memory is allocated from the VM subsystem. VM subsystem can deal with
memory only in PAGE_SIZE granularity. (PAGE_SIZE granularity is architecture
dependent and usually multiple of 4KB). So whoever asks for any amount of memory
gets memory rounded off to a higher value, in multiples of page sizes. So even
if someone wants 10 bytes of memory, we end up giving 1 page (4 KB) of memory.
It is waste of virtual memory. So it’s for sure that memory will soon be dried
up.
The solution is an intermediate layer between VM
and other subsystems. This intermediate layer is called “Heap” in Ace OS. All
memory allocation requests goes to Heap instead of VM. Heap allocates memory
pages from VM and gives memory in byte granularity to the caller. The other
dynamic memory allocation algorithms are “Buddy memory allocation” and “Slab
allocation”.
“Dynamic memory
allocation” algorithms allocates memory pages from VM and splits it into small
pieces called chunks/slab and gives user a chunk/slab when requested. When the
user frees the chunk it is put it into a list called free list and when the
whole VM memory page is free it is returned to VM.
To satisfy the memory
allocation, the following algorithms are used:
First
fit – search for the first
free block that is greater or equal to the requested size. After finding the
first big free block, split it. This is fast compared to best fit but it creates
external fragmentation.
Best
fit – search for the smallest
free block that is greater or equal to the requested size. Slower than first
fit but memory is less fragmented.
Eg: (same free list for all sizes)

The free list is linked list of free chunks of
memory. The linked list is formed by using the first byte in the free chunk as
next pointer to the next free chunk.
If the free list is implemented as unsorted linked
list then it will take O(n) for both “first fit” allocation and “best fit”
allocation”. For example if 35 is the requested size then “first fit” will split
50 to allocate 35. And in case of best fit it will allocate the 3rd
element which is 40 after searching the whole list. A sorted list or tree can
be used to improve performance.
The above two algorithms are used only if only one
free pool is used, which is not very common. Instead, an array of free
pool/list is used in most of the implementations to make searching faster and
the algorithm is called as “Quick fit”.
In quick fit, each size or range of size has its own free block so allocating
is just taking the first element from that free block.
Eg: (different free list for different size)

Freeing an allocated buffer is done by just linking
the given address with the free list. However it is a bit complicated if it
should return memory pages to the VM subsystem. To return memory pages to VM,
all the chunks/slabs in that page should be in free stage. Finding all the free
chunks in a given page is difficult if it is maintained just as linked list. So
here comes buddy allocator and slab allocator.
Buddy
Allocator
The buddy memory allocation technique divides
memory into partitions to try to satisfy a memory request as suitably as
possible. This system makes use of splitting memory into halves to try to give
a best-fit. When a memory chunk/partition is freed and the buddy partition is
also free then it is joined and forms a big partition, this operation is repeated
until one allocated partition is found.
Slab
Allocator
Buddy allocation minimizes the external
fragmentation[1]
and creates internal fragmentation[2].
Slab allocation solves this problem. Slab allocator is a concept introduced by
Sun Microsystems.
Ace Heap is based on slab allocator. Slab based
design is chosen because of the following reasons.
|
|
Buddy |
Slab |
|
Internal fragmentation |
More |
Less |
|
External fragmentation |
Less |
Unlikely |
|
Allocation |
Medium |
Faster |
|
Freeing |
Medium |
Faster |
|
Initialization |
Slow |
Faster |
A heap consists of many individual caches which can
be used to allocate and free memory objects.
A cache consists of free_slab list, allocated slab tree
and free page list.
A slab consists of buffers of equal size. A slab
consists of 1 or more vm pages. Each group of vm_pages are represented as slab
and meta data is maintained in this slab. This meta data is stored at the end
of last page of the group of vm_pages in a slab. So slab doesn’t need to be
allocated any memory explicitly.
A Buffer is the smallest unit of memory that is
allocated to memory requests within the subsystem.
For cache size > 1/8th of a
PAGE_SIZE, we maintain a Allocated Slab tree. This tree is needed for the free
operation to associate a buffer to it’s slab. For a group of vm_pages, metadata
is maintained only once at the end of the vm_page of the group. This metadata
is the node of this tree.
For cache size < 1/8th of a PAGE_SIZE,
we don’t have multiple vm_pages inside a slab and hence we don’t need a tree to
search the associated slab. We maintain the buffers as a linked list only in
the free list.


Metadata Unused
part

Partially free slab list
It is ordered list of slabs
which has at least one free buffer and at least one in use buffer. It is used
during buffer allocation to find a free buffer in the slab.
Inuse slab tree
It is tree of slabs which
has at least one in-use buffer. It is used during buffer free to find the slab
to which the buffer belongs to.
Completely Free slab list
It is a list of completely
free slabs (all the buffers are free). It is used to release the memory pages
back to the vm or to re-use when memory pages are wanted.
States
|
Buffer |
State |
Partially free list |
Completely free list |
In use tree |
|
All used |
USED |
No |
No |
Yes |
|
All free |
FREE |
No |
Yes |
No |
|
Some free |
MIXED |
Yes |
No |
Yes |
if ( slab.used_buffer_count
== 0 )
State = FREE;
else if (
slab.used_buffer_count == cache. slab_buffer_count )
State = USED;
else
State = MIXED;
State Transition

Free to Mixed
1)
Add the
slab to partially free list
2)
Remove
the slab from completely free list
3)
Add the
slab to in use tree
Mixed to Free
1)
Remove
the slab from partially free list
2)
Add the
slab to completely free list
3)
Remove
the slab from in use tree
Mixed to Used
1)
Remove
the slab from partially free list
Used to Mixed
1)
Add the
slab to partially free list
Slab
It contains following
1) bitmap of free/use
buffers(if a bit is set the corresponding buffer is used)
2) List for partially free
slab list
3) Tree for inuse slab
tree.
1.
HeapCreate()
Creates a heap and returns the heap pointer.
2.
HeapDestroy
(heap_ptr )
Destroys the given heap.
3.
HeapAlloc
(size, heap_ptr)
Allocates the specified size of memory from the given heap.
4.
HeapFree
(heap_ptr)
Deletes the heap blindly.
5.
InitCache(new_cache_ptr,
size, free_slabs_threshold, min_buffers, max_slabs, constructor, destructor)
Initializes an empty cache of specified buffer
size.
6.
GetVAFromCache
(cache_ptr, flag)
Tries to get a free buffer from partially free slab
list.
If found, return that buffer.
If not found, tries to get a free slab from
completely free slab list.
If found, returns a free buffer from that
slab.
If not found, tries to get memory pages from
VM.
7.
InitSlabAllocator(page_size,
v_alloc, v_free, v_protect)
Initializes a slab allocator. This is an 1 time
operation
8.
FreeBuffer(buffer,
cache_ptr)
Free A buffer in it's slab. If all buffers in the
slab are free, move the slab to completely free slab list
Cache
|
Field |
Description |
|
Size |
Size of the slab
entries in the cache |
|
Minimum buffers |
Minimum no of
buffers to be present always Should be
multiple of (PAGE_SIZE/Size) |
|
MaximumSlabs |
Maximum number of
slabs that can be present in a cache at any time Should be greater
than MinimumSlabs Should be
multiple of (PAGE_SIZE/Size) |
|
FreeSlabsThreshold |
Threshold to
start VM free operation. Should be between
MinimumSlabs and MaximumSlabs. Should be
multiple of (PAGE_SIZE/Size) |
|
Constructor |
Function pointer
to initialize a newly allocated slab |
|
Destructor |
Function pointer
to reuse a slab |
|
CompletelyFreeSlabList |
Linked list of completely
free slabs. |
|
InUseTree |
AVL tree is used
to maintain a good balance. This can be null for a cache which is of size
< 1/8. For a cache with size > 1/8 it contains used page tree with at
least one slab allocated. Virtual address is the key for the tree. |
|
PartiallyFreeSlabList |
List of partially
free slabs which have some buffers free |
Slab
|
Field |
Description |
|
List |
Points to the
next and prev slab |
|
|
|
InUseSlabTree Page
Header
|
Field |
Description |
|
ReferenceCount |
Number of buffers
allocated (in use). |
|
AVLTree |
Points to left/right
nodes |
1.
If the “in-use slab list” is not empty, go to step 5
2.
If the “free slab list” is not empty, go to step 4
3.
Allocate one more memory pages from VM and fill
slab metadata and put it in “free slab list”
4.
Remove one slab from the “free slab list” and add
it to “in-use slab list” and “in-use slab tree”
5.
Get one free buffer from the head of the “in-use
slab list”
6.
If the head of the “in-use slab list” has no free
element, remove the slab from the “in-use slab list”.
1.
Get the slab from the buffer address by searching
the “in-use slab tree”
2.
If the slab is completely free, remove the slab
from the “in-use slab tree” and put it in the “free slab list”
3.
If the slab is has only one free buffer, put the
slab in the “in-use slab list”.
1) Buddy allocation - http://en.wikipedia.org/wiki/Buddy_memory_allocation
2) Slab Allocation - http://citeseer.ist.psu.edu/cache/papers/cs/8756/http:zSzzSzluthien.nuclecu.unam.mxzSz~miguelzSzbonwick.pdf/bonwick94slab.pdf
3) Hoard allocator -
http://www.cs.umass.edu/%7Eemery/hoard/asplos2000.pdf
[1]
External fragmentation is the phenomenon in which free storage becomes
divided into many small pieces over time. The result is that, although free
storage is available, it is effectively unusable because it is divided into
pieces that are too small to satisfy the demands of the application.
[2] Internal fragmentation is the phenomenon in which free storage is allocated in large pieces and over time the free memory becomes unavailable. The result is that, although .free storage is available, it is effectively unusable because it exists within allocated memory which is unknown and unwanted to the user.