[eBPF] Notes on a '\0' truncation problem when passing strings to user space with the BCC framework

枫MapleLCG posted on 2024-3-17 14:11

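## String copying under the BCC framework

eBPF lets users load programs into the kernel to collect data. For safety, Linux requires that strings be copied only through BPF helper functions, and the interfaces available in BCC for this are bpf_probe_read_kernel and bpf_probe_read_kernel_str. Here we use bpf_probe_read_kernel to copy "wget\0baidu.com" into a char array inside a struct: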

```c_cpp
struct data_t {
    char a[64];
};
```

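I found that "wget\0baidu.com" can be copied into char a[64] in full. But when the struct is passed to Python via perf_submit, only wget arrives, and everything after the '\0' is dropped. That is not the result we want.

By that point the data has also been converted to Python's bytes type.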

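## Locating the problem

Let's go to the BCC GitHub repository and look through the source for the cause.

Searching for perf_submit first leads to line 982 of src/cc/frontends/clang/b_frontend_action.cc: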

```c_cpp
   } else if (memb_name == "perf_submit") {
      string name = string(Ref->getDecl()->getName());
      string arg0 = rewriter_.getRewrittenText(expansionRange(Call->getArg(0)->getSourceRange()));
      string args_other = rewriter_.getRewrittenText(expansionRange(SourceRange(GET_BEGINLOC(Call->getArg(1)),
                                                      GET_ENDLOC(Call->getArg(2)))));
      txt = "bpf_perf_event_output(" + arg0 + ", (void *)bpf_pseudo_fd(1, " + fd + ")";
      txt += ", CUR_CPU_IDENTIFIER, " + args_other + ")";

      // e.g.
      // struct data_t { u32 pid; }; data_t data;
      // events.perf_submit(ctx, &data, sizeof(data));
      // ...
      //                   &data -> data ->typeof(data)    -> data_t
      auto type_arg1 = Call->getArg(1)->IgnoreCasts()->getType().getTypePtr()->getPointeeType().getTypePtrOrNull();
      if (type_arg1 && type_arg1->isStructureType()) {
         auto event_type = type_arg1->getAsTagDecl();
         const auto *r = dyn_cast<RecordDecl>(event_type);
         std::vector<std::string> perf_event;

         for (auto it = r->field_begin(); it != r->field_end(); ++it) {
         // After LLVM commit aee49255074f
         // (https://github.com/llvm/llvm-project/commit/aee49255074fd4ef38d97e6e70cbfbf2f9fd0fa7)
         // array type change from `comm#char [16]` to `comm#char[16]`
         perf_event.push_back(it->getNameAsString() + "#" + it->getType().getAsString()); //"pid#u32"
         }
         fe_.perf_events_ = perf_event;
      }
```

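The interface BCC provides essentially assembles a call to the native BPF helper from the arguments the user supplies. It also prepares for the data exchange between C and Python by turning "char a[64]" into "a#char[64]".

Next we look at the places involved in data exchange and check them one by one to find the source of the problem. Submitting data from the kernel to user space goes through a BPF_TABLE. BCC wraps table operations in Python, so we only need to look at the table-related code. In the repository that is src/python/bcc/table.py, where BCC wraps tables, and at line 214 there is this function: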

```python
import ctypes as ct
import re
import sys

# `lib` is BCC's ctypes handle to libbcc; table.py imports it from .libbcc.
from bcc.libbcc import lib


def _get_event_class(event_map):
    ct_mapping = {
        'char'              : ct.c_char,
        's8'                : ct.c_char,
        'unsigned char'     : ct.c_ubyte,
        'u8'                : ct.c_ubyte,
        'u8 *'              : ct.c_char_p,
        'char *'            : ct.c_char_p,
        'short'             : ct.c_short,
        's16'               : ct.c_short,
        'unsigned short'    : ct.c_ushort,
        'u16'               : ct.c_ushort,
        'int'               : ct.c_int,
        's32'               : ct.c_int,
        'enum'              : ct.c_int,
        'unsigned int'      : ct.c_uint,
        'u32'               : ct.c_uint,
        'long'              : ct.c_long,
        'unsigned long'     : ct.c_ulong,
        'long long'         : ct.c_longlong,
        's64'               : ct.c_longlong,
        'unsigned long long': ct.c_ulonglong,
        'u64'               : ct.c_ulonglong,
        '__int128'          : (ct.c_longlong * 2),
        'unsigned __int128' : (ct.c_ulonglong * 2),
        'void *'            : ct.c_void_p,
    }

    # handle array types e.g. "int [16]", "char[16]" or "unsigned char[16]"
    array_type = re.compile(r"(\S+(?: \S+)*) ?\[([0-9]+)\]$")

    fields = []
    num_fields = lib.bpf_perf_event_fields(event_map.bpf.module, event_map._name)
    i = 0
    while i < num_fields:
        # Each field descriptor looks like "name#type", e.g. "a#char[64]".
        field = lib.bpf_perf_event_field(event_map.bpf.module, event_map._name, i).decode()
        m = re.match(r"(.*)#(.*)", field)
        field_name = m.group(1)
        field_type = m.group(2)

        if re.match(r"enum .*", field_type):
            field_type = "enum"

        m = array_type.match(field_type)
        try:
            if m:
                # Array: element type from the mapping, times the array length.
                fields.append((field_name, ct_mapping[m.group(1)] * int(m.group(2))))
            else:
                fields.append((field_name, ct_mapping[field_type]))
        except KeyError:
            # Using print+sys.exit instead of raising exceptions,
            # because exceptions are caught by the caller.
            print("Type: '%s' not recognized. Please define the data with ctypes manually."
                  % field_type, file=sys.stderr)
            sys.exit(1)
        i += 1
    return type('', (ct.Structure,), {'_fields_': fields})
```

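This function does one thing: it regex-matches strings such as "name#int" and "name#char" and splits each into two groups, a name and a type. It then uses ct_mapping to map the type to a ctypes type that Python accepts, and adds the name and the ctypes type to the fields of a ctypes-defined structure.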
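To make the mapping step concrete, here is a minimal sketch that runs without BCC. It uses a reduced type table and two hypothetical helpers, field_from_descriptor and event_class, which are my own names for illustration and not part of BCC. The descriptor strings are written by hand instead of being fetched from libbcc.

```python
import ctypes as ct
import re

# Reduced mapping for illustration only; BCC's table has many more entries.
CT_MAPPING = {
    "char": ct.c_char,
    "unsigned char": ct.c_ubyte,
    "int": ct.c_int,
    "u32": ct.c_uint,
}

# Same array pattern as in the excerpt: base type, optional space, "[N]".
ARRAY_TYPE = re.compile(r"(\S+(?: \S+)*) ?\[([0-9]+)\]$")


def field_from_descriptor(descriptor):
    """Turn a 'name#type' string into a (name, ctypes type) pair (hypothetical helper)."""
    name, type_name = re.match(r"(.*)#(.*)", descriptor).groups()
    m = ARRAY_TYPE.match(type_name)
    if m:
        return name, CT_MAPPING[m.group(1)] * int(m.group(2))
    return name, CT_MAPPING[type_name]


def event_class(descriptors):
    """Build a ctypes.Structure subclass from descriptors (hypothetical helper)."""
    fields = [field_from_descriptor(d) for d in descriptors]
    return type("Event", (ct.Structure,), {"_fields_": fields})


# Hand-written descriptors standing in for what libbcc would report.
Event = event_class(["pid#u32", "a#char[64]"])
print(Event._fields_)
```

With these descriptors the field a becomes a c_char array of length 64, which is the type that matters for the truncation discussed in this post.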

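From these two source files we learn that BCC uses the ctypes library to exchange and convert data between Python and C. Experimenting showed that when ctypes hands a char array in a struct to Python, it maps the array to Python's bytes class. Possibly to avoid redundancy, since the actual data is often far shorter than the declared char array, ctypes discards everything after the "\0" during the conversion.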
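The behaviour can be reproduced with ctypes alone, without BCC or a kernel. The sketch below assumes a hand-built byte string in place of a real perf buffer event; the class name Data and the constant PAYLOAD are illustrative.

```python
import ctypes as ct

PAYLOAD = b"wget\x00baidu.com"


class Data(ct.Structure):
    # Mirrors struct data_t { char a[64]; } from the post.
    _fields_ = [("a", ct.c_char * 64)]


# Stand-in for the raw event bytes; a real run would get these from BCC.
raw = PAYLOAD.ljust(ct.sizeof(Data), b"\x00")
event = Data.from_buffer_copy(raw)

# Reading the c_char array field stops at the first NUL byte.
print(event.a)  # b'wget'

# The underlying memory still holds the full payload.
print(bytes(event)[:len(PAYLOAD)])
```

So the data does reach user space intact; it is only the field access on the c_char array that cuts it short.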

## Solution

The fix is simple: use c_ubyte, so that the field is mapped to a c_ubyte_Array. In other words, change char a[64] to unsigned char a[64].

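The result cannot be printed directly, though; a loop is needed. After switching to unsigned char, ctypes receives exactly as much space as you declared. When printed, an "S + number" is appended at the end to indicate how much space the array occupies.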
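A minimal sketch of reading the c_ubyte array back, again with plain ctypes and a hand-built event (Data and PAYLOAD are illustrative names). It shows the loop described above and, as one possible alternative, converting the whole array with bytes().

```python
import ctypes as ct

PAYLOAD = b"wget\x00baidu.com"


class Data(ct.Structure):
    # Mirrors struct data_t { unsigned char a[64]; }.
    _fields_ = [("a", ct.c_ubyte * 64)]


# Stand-in for the raw event bytes; a real run would get these from BCC.
event = Data.from_buffer_copy(PAYLOAD.ljust(ct.sizeof(Data), b"\x00"))

# The field is a c_ubyte array; iterating it yields one int per byte.
from_loop = bytes(b for b in event.a)

# bytes() copies the same 64 bytes in one step via the buffer protocol.
full = bytes(event.a)

# Strip the trailing padding; the embedded NUL in the middle survives.
text = full.rstrip(b"\x00")
print(text)
```

Note that rstrip would also remove NUL bytes that genuinely belong to the end of the payload, so sending an explicit length field alongside the array may be the safer design when that matters.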

## Why this works

c_char corresponds to Python's one-character bytes object, so "\0" is treated as the null character that ends the string.

c_ubyte and c_byte, by contrast, correspond to Python's int, so the null character is read as the number 0.
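The difference in element types can be seen directly on standalone arrays. This is a small sketch using only the standard ctypes module; the variable names are illustrative.

```python
import ctypes as ct

# An 8-byte c_char array holding "ab", a NUL, "cd", then zero padding.
chars = ct.create_string_buffer(b"ab\x00cd", 8)

print(chars.value)  # C-string view: stops at the first NUL
print(chars.raw)    # all 8 bytes, NULs included
print(chars[2])     # each element is a one-character bytes object

# The same memory viewed as unsigned bytes.
ubytes = (ct.c_ubyte * 8).from_buffer_copy(chars.raw)
print(ubytes[0], ubytes[2])  # each element is an int
```

A standalone c_char array offers .raw as a way around the truncation, but a c_char array field read from a Structure comes back as already-truncated bytes, which is why changing the field type is the practical route here.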

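mmSmm posted on 2024-3-17 14:40

Thanks, OP.

ayahoostar posted on 2024-3-17 14:52

Thanks, showing my support!

soglog posted on 2024-3-17 15:07

Thanks, learned something. Thank you for sharing such a good approach.

stu posted on 2024-3-20 21:26

Thanks, OP. Going to study this.

debug_cat posted on 2024-4-29 10:43

The whole troubleshooting process is analysed in great detail, thanks.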


吾爱破解 - 52pojie.cn's Archiver
