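Notes on the Android 8.1 dex loading flow, part 3: the OatFile::Open flow and how the OatDexFile is obtained
Last edited by L剑仙 on 2020-3-18 15:38. I have been busy lately, but I finally stumbled my way through the overall flow of oat_file.cc, so I am recording my notes on the forum for future reference.
This function is mainly about how the oat file is opened and how the oat_dex_file data structures are then built by parsing it. An oat_dex_file holds the complete information about one dex file.
If the dex_file is obtained through the oat_file path, then everything from OpenDexFilesFromOat down to DexFile::Open essentially works by parsing the oat_dex_file structures that OatFile::Open produces.
As before, here is the call flow first; some unimportant functions are omitted.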
OatFile::Open
{
GetVdexFilename
OatFileBase::OpenOatFile<DlOpenOatFile>
OatFileBase::OpenOatFile<ElfOatFile>
{
PreLoad
LoadVdex
{
VdexFile::Open
}
Load
{
Dlopen
}
ComputeFields
{
FindDynamicSymbolAddress
}
PreSetup
{
dl_iterate_phdr(dl_iterate_context::callback,&context)
}
Setup
{
GetOatHeader
GetInstructionSetPointerSize
GetOatDexFilesOffset // reaches the start of the OatDexFile records
GetDexFileCount
ReadOatDexFileData&dex_file_location_size
ResolveRelativeEncodedDexLocation
ReadOatDexFileData&dex_file_checksum
ReadOatDexFileData&dex_file_offset
ReadOatDexFileData&class_offsets_offset
ReadOatDexFileData&lookup_table_offset // speeds up class lookup
ReadOatDexFileData&dex_layout_sections_offset
ReadOatDexFileData&method_bss_mapping_offset
FindDexFileMapItem&call_sites_item // call site identifiers
new OatDexFile
}
}
}
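1. Whether the type is DlOpenOatFile or ElfOatFile, control enters OpenOatFile, which calls PreLoad, LoadVdex, Load, ComputeFields, PreSetup and Setup in turn. Their results are used later to obtain the dex files and to make calls. The flow is fairly clear.
template <typename kOatFileBaseSubType>
OatFileBase* OatFileBase::OpenOatFile(const std::string& vdex_filename,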
const std::string& elf_filename,
const std::string& location,
uint8_t* requested_base,
uint8_t* oat_file_begin,
bool writable,
bool executable,
bool low_4gb,
const char* abs_dex_location,
std::string* error_msg) {
std::unique_ptr<OatFileBase> ret(new kOatFileBaseSubType(location, executable));  // DlOpenOatFile or ElfOatFile, held through an OatFileBase pointer either way
ret->PreLoad();
if (kIsVdexEnabled && !ret->LoadVdex(vdex_filename, writable, low_4gb, error_msg)) {
return nullptr;
}
if (!ret->Load(elf_filename,
oat_file_begin,
writable,
executable,
low_4gb,
error_msg)) {
return nullptr;
}
if (!ret->ComputeFields(requested_base, elf_filename, error_msg)) {
return nullptr;
}
ret->PreSetup(elf_filename);
if (!ret->Setup(abs_dex_location, error_msg)) {
return nullptr;
}
return ret.release();
}
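The staged "construct, run each step, bail out on the first failure" shape of OpenOatFile can be sketched in isolation. This is only a minimal sketch: LoaderBase, GoodLoader and BadLoader are hypothetical names, not ART classes, and the LoadVdex step is left out for brevity.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for OatFileBase. Each stage records its name so the
// call order can be inspected afterwards.
class LoaderBase {
 public:
  virtual ~LoaderBase() = default;

  // Same shape as OpenOatFile: build the subtype, run the stages in order,
  // return nullptr at the first failing stage.
  template <typename SubType>
  static LoaderBase* Open(std::string* error_msg) {
    assert(error_msg != nullptr);
    // Owning pointer: an early return destroys the half-built object.
    std::unique_ptr<LoaderBase> ret(new SubType());
    ret->PreLoad();
    if (!ret->Load(error_msg)) return nullptr;
    if (!ret->ComputeFields(error_msg)) return nullptr;
    ret->PreSetup();
    if (!ret->Setup(error_msg)) return nullptr;
    return ret.release();  // Success: ownership passes to the caller.
  }

  const std::vector<std::string>& stages() const { return stages_; }

 protected:
  void Record(const char* stage) { stages_.push_back(stage); }
  virtual void PreLoad() { Record("PreLoad"); }
  virtual bool Load(std::string* error_msg) = 0;  // subtype-specific
  bool ComputeFields(std::string*) { Record("ComputeFields"); return true; }
  virtual void PreSetup() { Record("PreSetup"); }
  bool Setup(std::string*) { Record("Setup"); return true; }

 private:
  std::vector<std::string> stages_;
};

class GoodLoader : public LoaderBase {
 protected:
  bool Load(std::string*) override { Record("Load"); return true; }
};

class BadLoader : public LoaderBase {
 protected:
  bool Load(std::string* error_msg) override {
    *error_msg = "load failed";
    return false;
  }
};
```

Calling LoaderBase::Open<GoodLoader>(&err) returns an object whose stages() lists the five steps in order, while Open<BadLoader>(&err) returns nullptr and leaves the reason in err.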
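2. First look at PreLoad. It uses dl_iterate_phdr to walk every loaded ELF object; the callback receives each object's dl_phdr_info and increments count once per object. The final count is stored in shared_objects_before_, which PreSetup uses later.
The structure to pay attention to here is dl_phdr_info, which carries several important fields: the object's base address, its name, and a pointer to its array of ELF program headers. The struct dl_iterate_context here has only a count field, used to count the ELF objects visited, and its callback is simple. PreSetup below defines another dl_iterate_context whose callback is more involved: it walks the oat file's program segments and records a mapping for each one.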
void DlOpenOatFile::PreLoad() {
#ifdef __APPLE__
UNUSED(shared_objects_before_);
LOG(FATAL) << "Should not reach here.";
UNREACHABLE();
#else
// Count the entries in dl_iterate_phdr we get at this point in time. (Every loaded ELF object is visited.)
struct dl_iterate_context {
static int callback(struct dl_phdr_info *info ATTRIBUTE_UNUSED,
size_t size ATTRIBUTE_UNUSED,
void *data) {
// struct dl_phdr_info {
// ElfW(Addr) dlpi_addr;/* Base address of object */
// const char *dlpi_name;/* (Null-terminated) name of
// object */
// const ElfW(Phdr) *dlpi_phdr;/* Pointer to array of
// ELF program headers
// for this object */
// ElfW(Half) dlpi_phnum; /* # of items in dlpi_phdr */
// }
reinterpret_cast<dl_iterate_context*>(data)->count++;  // count is incremented once per callback
return 0;// Continue iteration.
}
size_t count = 0;
} context;
dl_iterate_phdr(dl_iterate_context::callback, &context);  // calls callback once per loaded ELF object with its dl_phdr_info; here the callback only increments count
shared_objects_before_ = context.count;  // store the final count in shared_objects_before_
#endif
}
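The counting trick in PreLoad can be tried on its own on a Linux or Android system. The sketch below is a minimal version under that assumption; CountCallback and CountLoadedObjects are names made up for illustration, while dl_iterate_phdr and struct dl_phdr_info come from <link.h>.

```cpp
#include <link.h>   // dl_iterate_phdr, struct dl_phdr_info (glibc / bionic)
#include <cassert>
#include <cstddef>

// Called once per loaded ELF object. The object's details are ignored here;
// only the number of calls matters.
static int CountCallback(struct dl_phdr_info* /*info*/, size_t /*size*/, void* data) {
  assert(data != nullptr);
  ++*static_cast<size_t*>(data);
  return 0;  // 0 tells dl_iterate_phdr to continue with the next object
}

// Returns how many ELF objects are loaded in this process right now.
size_t CountLoadedObjects() {
  size_t count = 0;
  dl_iterate_phdr(CountCallback, &count);
  return count;
}
```

The result is at least 1 because the main executable itself is always reported. Taking the count before a dlopen call gives the same kind of "objects already present" baseline that shared_objects_before_ holds.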
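3. Next is LoadVdex, which ends up calling VdexFile::Open. The vdex file is new since Android 8.0: the dex files that used to be stored inside the oat file now appear to be stored in the vdex file, possibly quickened, so the oat file and vdex_ have to be combined to obtain a complete oat_dex_file.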
bool OatFileBase::LoadVdex(const std::string& vdex_filename,
bool writable,
bool low_4gb,
std::string* error_msg) {
vdex_ = VdexFile::Open(vdex_filename, writable, low_4gb, /* unquicken*/ false, error_msg);  // open the vdex file and keep the result in vdex_
if (vdex_.get() == nullptr) {
*error_msg = StringPrintf("Failed to load vdex file '%s' %s",
vdex_filename.c_str(),
error_msg->c_str());
return false;
}
return true;
}
A simplified view of the vdex layout, taken from vdex_file.h; it contains the dex files and the QuickeningInfo:
// File format:
// VdexFile::Header fixed-length header
//
// DEX array of the input DEX files
// DEX the bytecode may have been quickened
// ...
// DEX
// QuickeningInfo
// uint8[] quickening data
// unaligned_uint32_t[] table of offsets pair:
// uint32_t contains code_item_offset
// uint32_t contains quickening data offset from the start
// of QuickeningInfo
// unalgined_uint32_t start offsets (from the start of QuickeningInfo) in previous
// table for each dex file
4. Next is Load, which ends up calling Dlopen to load the oat file and obtain dlopen_handle_.
bool DlOpenOatFile::Load(const std::string& elf_filename,
uint8_t* oat_file_begin,
bool writable,
bool executable,
bool low_4gb,
std::string* error_msg) {
// Use dlopen only when flagged to do so, and when it's OK to load things executable.
// TODO: Also try when not executable? The issue here could be re-mapping as writable (as
// !executable is a sign that we may want to patch), which may not be allowed for
// various reasons.
if (!kUseDlopen) {
*error_msg = "DlOpen is disabled.";
return false;
}
if (low_4gb) {
*error_msg = "DlOpen does not support low 4gb loading.";
return false;
}
if (writable) {
*error_msg = "DlOpen does not support writable loading.";
return false;
}
if (!executable) {
*error_msg = "DlOpen does not support non-executable loading.";
return false;
}
// dlopen always returns the same library if it is already opened on the host. For this reason
// we only use dlopen if we are the target or we do not already have the dex file opened. Having
// the same library loaded multiple times at different addresses is required for class unloading
// and for having dex caches arrays in the .bss section.
if (!kIsTargetBuild) {
if (!kUseDlopenOnHost) {
*error_msg = "DlOpen disabled for host.";
return false;
}
}
bool success = Dlopen(elf_filename, oat_file_begin, error_msg);  // call Dlopen to load the oat file and obtain dlopen_handle_
DCHECK(dlopen_handle_ != nullptr || !success);
return success;
}
Now look at Dlopen, which ends up calling either android_dlopen_ext or dlopen.
bool DlOpenOatFile::Dlopen(const std::string& elf_filename,
uint8_t* oat_file_begin,
std::string* error_msg) {
#ifdef __APPLE__
// The dl_iterate_phdr syscall is missing. There is similar API on OSX,
// but let's fallback to the custom loading code for the time being.
UNUSED(elf_filename, oat_file_begin);
*error_msg = "Dlopen unsupported on Mac.";
return false;
#else
{
UniqueCPtr<char> absolute_path(realpath(elf_filename.c_str(), nullptr));
if (absolute_path == nullptr) {
*error_msg = StringPrintf("Failed to find absolute path for '%s'", elf_filename.c_str());
return false;
}
#ifdef ART_TARGET_ANDROID
android_dlextinfo extinfo = {};
// typedef struct {
// uint64_t flags;
// void* reserved_addr;
// size_t reserved_size;
// int relro_fd;
// int library_fd;
// } android_dlextinfo;
extinfo.flags = ANDROID_DLEXT_FORCE_LOAD | // Force-load, don't reuse handle
// (open oat files multiple
// times).
ANDROID_DLEXT_FORCE_FIXED_VADDR; // Take a non-zero vaddr as absolute
// (non-pic boot image).
if (oat_file_begin != nullptr) { //
extinfo.flags |= ANDROID_DLEXT_LOAD_AT_FIXED_ADDRESS; // Use the requested addr if
extinfo.reserved_addr = oat_file_begin; // vaddr = 0.
} // (pic boot image).
dlopen_handle_ = android_dlopen_ext(absolute_path.get(), RTLD_NOW, &extinfo);  // Android target: android_dlopen_ext returns dlopen_handle_, loading at the fixed address when oat_file_begin is non-null; see /bionic/libdl/libdl.c
#else
UNUSED(oat_file_begin);
static_assert(!kIsTargetBuild || kIsTargetLinux, "host_dlopen_handles_ will leak handles");
MutexLock mu(Thread::Current(), *Locks::host_dlopen_handles_lock_);
dlopen_handle_ = dlopen(absolute_path.get(), RTLD_NOW);  // host build: oat_file_begin is ignored and plain dlopen loads the file by path
if (dlopen_handle_ != nullptr) {
if (!host_dlopen_handles_.insert(dlopen_handle_).second) {  // insert dlopen_handle_ into host_dlopen_handles_; a failed insert means the same handle came back again
dlclose(dlopen_handle_);
dlopen_handle_ = nullptr;
*error_msg = StringPrintf("host dlopen re-opened '%s'", elf_filename.c_str());
return false;
}
}
#endif// ART_TARGET_ANDROID
}
if (dlopen_handle_ == nullptr) {
*error_msg = StringPrintf("Failed to dlopen '%s': %s", elf_filename.c_str(), dlerror());
return false;
}
return true;
#endif
}
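The first step of Dlopen, resolving the file name to an absolute path and letting a smart pointer free the buffer, can be sketched separately. This is a minimal sketch assuming a POSIX system; FreeDeleter and ResolveAbsolutePath are illustrative names, and as far as I can tell ART's UniqueCPtr plays the same role as the unique_ptr used here.

```cpp
#include <stdlib.h>  // realpath, free (POSIX)
#include <cassert>
#include <memory>
#include <string>

// Deleter for buffers that were allocated with malloc, as realpath does
// when its second argument is nullptr.
struct FreeDeleter {
  void operator()(void* p) const { free(p); }
};

// Returns the canonical absolute path, or an empty string if the path
// cannot be resolved (for example because the file does not exist).
std::string ResolveAbsolutePath(const std::string& path) {
  std::unique_ptr<char, FreeDeleter> resolved(realpath(path.c_str(), nullptr));
  if (resolved == nullptr) {
    return std::string();
  }
  assert(resolved.get()[0] == '/');  // realpath always yields an absolute path
  return std::string(resolved.get());
}
```

Dlopen reports "Failed to find absolute path" in the case where this sketch returns an empty string, and only passes a successfully resolved path on to android_dlopen_ext or dlopen.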
5. Next is ComputeFields. It calls FindDynamicSymbolAddress to locate the symbols oatdata, oatlastword, oatbss, oatbsslastword, oatbssmethods and oatbssroots. Of these, oatdata and oatlastword determine begin_ and end_.
bool OatFileBase::ComputeFields(uint8_t* requested_base,
const std::string& file_path,
std::string* error_msg) {  // locates the symbols oatdata, oatlastword, oatbss, oatbsslastword, oatbssmethods and oatbssroots
std::string symbol_error_msg;
begin_ = FindDynamicSymbolAddress("oatdata", &symbol_error_msg);
if (begin_ == nullptr) {
*error_msg = StringPrintf("Failed to find oatdata symbol in '%s' %s",
file_path.c_str(),
symbol_error_msg.c_str());
return false;
}
if (requested_base != nullptr && begin_ != requested_base) {
// Host can fail this check. Do not dump there to avoid polluting the output.
if (kIsTargetBuild && (kIsDebugBuild || VLOG_IS_ON(oat))) {
PrintFileToLog("/proc/self/maps", LogSeverity::WARNING);
}
*error_msg = StringPrintf("Failed to find oatdata symbol at expected address: "
"oatdata=%p != expected=%p. See process maps in the log.",
begin_, requested_base);
return false;
}
end_ = FindDynamicSymbolAddress("oatlastword", &symbol_error_msg);
if (end_ == nullptr) {
*error_msg = StringPrintf("Failed to find oatlastword symbol in '%s' %s",
file_path.c_str(),
symbol_error_msg.c_str());
return false;
}
// Readjust to be non-inclusive upper bound.
end_ += sizeof(uint32_t);
bss_begin_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbss", &symbol_error_msg));
if (bss_begin_ == nullptr) {
// No .bss section.
bss_end_ = nullptr;
} else {
bss_end_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbsslastword", &symbol_error_msg));
if (bss_end_ == nullptr) {
*error_msg = StringPrintf("Failed to find oatbasslastword symbol in '%s'", file_path.c_str());
return false;
}
// Readjust to be non-inclusive upper bound.
bss_end_ += sizeof(uint32_t);
// Find bss methods if present.
bss_methods_ =
const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssmethods", &symbol_error_msg));
// Find bss roots if present.
bss_roots_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssroots", &symbol_error_msg));  // the roots are related to GC
}
return true;
}
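The "last word" convention above is easy to get wrong: oatlastword points at the last 4-byte word inside the data, so the exclusive end is that address plus sizeof(uint32_t). Here is a small sketch of the computation, where SymbolTable and FindSymbol are hypothetical stand-ins for the real dynamic symbol lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical replacement for FindDynamicSymbolAddress: a name -> address map.
using SymbolTable = std::map<std::string, const uint8_t*>;

const uint8_t* FindSymbol(const SymbolTable& table, const std::string& name) {
  auto it = table.find(name);
  return it == table.end() ? nullptr : it->second;
}

struct Range {
  const uint8_t* begin = nullptr;
  const uint8_t* end = nullptr;  // exclusive upper bound
};

// Looks up a "first byte" symbol and a "last word" symbol and turns them
// into a half-open range [begin, end).
bool ComputeRange(const SymbolTable& table,
                  const std::string& begin_symbol,
                  const std::string& last_word_symbol,
                  Range* out) {
  assert(out != nullptr);
  const uint8_t* begin = FindSymbol(table, begin_symbol);
  const uint8_t* last_word = FindSymbol(table, last_word_symbol);
  if (begin == nullptr || last_word == nullptr) {
    return false;
  }
  out->begin = begin;
  // The symbol addresses the last 32-bit word, so step past it.
  out->end = last_word + sizeof(uint32_t);
  return true;
}
```

For a 16-byte image whose last word starts at offset 12, the resulting range covers exactly 16 bytes. ComputeFields applies the same adjustment to both end_ and bss_end_.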
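6. Next is PreSetup. The main thing here is to sort out how the two structs dl_iterate_context and dl_phdr_info relate to the dl_iterate_phdr call.
dl_iterate_phdr walks all currently loaded ELF objects and calls callback once per object, passing that object's dl_phdr_info.
Compared with the struct in PreLoad above, this dl_iterate_context has several more fields:
begin_ comes from Begin(), i.e. it is the oat file's begin_;
shared_objects_before is the number of loaded ELF objects that PreLoad counted with dl_iterate_phdr;
shared_objects_seen counts the ELF objects visited so far during this dl_iterate_phdr pass;
dlopen_mmaps_ is a vector holding the MemMap pointers that MemMap::MapDummy creates for each loadable segment of the oat file.
So PreSetup roughly works as follows:
It declares a dl_iterate_context and walks the loaded ELF objects with dl_iterate_phdr, incrementing shared_objects_seen on every callback.
While shared_objects_seen is less than shared_objects_before, the object was already loaded before our dlopen, so the callback returns 0 and moves on. Only the objects after that point run the logic below.
For such an object, the callback walks its dlpi_phnum program headers. For each header with p_type == PT_LOAD it computes the segment's address in memory as dlpi_addr + dlpi_phdr[i].p_vaddr and takes its size from dlpi_phdr[i].p_memsz. If vaddr <= begin_ < vaddr + memsz, it sets contains_begin = true and breaks out of the loop, because this object is the oat file that was just loaded. It then walks dlpi_phdr again and, for each PT_LOAD header, calls MemMap::MapDummy with the segment's vaddr and memsz and stores the result in dlopen_mmaps_. Judging from the source comment on PreSetup, the linker has already mapped these segments, and this step only notifies ART's mmap wrapper of the regions.
To be honest I have not fully understood this function either. Corrections are welcome, and I will study it more carefully when I have time.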
void DlOpenOatFile::PreSetup(const std::string& elf_filename) {//Ask the linker where it mmaped the file and notify our mmap wrapper of the regions
#ifdef __APPLE__
UNUSED(elf_filename);
LOG(FATAL) << "Should not reach here.";
UNREACHABLE();
#else
struct dl_iterate_context {
static int callback(struct dl_phdr_info *info, size_t /* size */, void *data) {
/*
struct dl_phdr_info {
ElfW(Addr) dlpi_addr;
const char* dlpi_name;
const ElfW(Phdr)* dlpi_phdr;
ElfW(Half) dlpi_phnum;}
*/
auto* context = reinterpret_cast<dl_iterate_context*>(data);
context->shared_objects_seen++;  // shared_objects_seen is incremented here and compared with shared_objects_before below
if (context->shared_objects_seen < context->shared_objects_before) {  // still among the objects that were loaded before our dlopen; this can go wrong if another thread unloaded a library in the meantime
// We haven't been called yet for anything we haven't seen before. Just continue.
// Note: this is aggressively optimistic. If another thread was unloading a library,
// we may miss out here. However, this does not happen often in practice.
return 0;
}
// See whether this callback corresponds to the file which we have just loaded.
bool contains_begin = false;  // becomes true when a PT_LOAD segment of this object contains begin_ (the oat file's begin_, obtained via Begin())
for (int i = 0; i < info->dlpi_phnum; i++) {
if (info->dlpi_phdr[i].p_type == PT_LOAD) {
uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
info->dlpi_phdr[i].p_vaddr);
size_t memsz = info->dlpi_phdr[i].p_memsz;
if (vaddr <= context->begin_ && context->begin_ < vaddr + memsz) {
contains_begin = true;
break;
}
}
}
// Add dummy mmaps for this file.
if (contains_begin) {  // this object is the oat file: record a MemMap (via MemMap::MapDummy) for each PT_LOAD segment, using its vaddr and memsz
for (int i = 0; i < info->dlpi_phnum; i++) {
if (info->dlpi_phdr[i].p_type == PT_LOAD) {
uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
info->dlpi_phdr[i].p_vaddr);
size_t memsz = info->dlpi_phdr[i].p_memsz;
MemMap* mmap = MemMap::MapDummy(info->dlpi_name, vaddr, memsz);
context->dlopen_mmaps_->push_back(std::unique_ptr<MemMap>(mmap));  // add the new MemMap to dlopen_mmaps_
}
}
return 1;  // Stop iteration and return 1 from dl_iterate_phdr.
}
return 0;// Continue iteration and return 0 from dl_iterate_phdr when finished.
}
const uint8_t* const begin_;  // the oat file's begin_, obtained via Begin()
std::vector<std::unique_ptr<MemMap>>* const dlopen_mmaps_;
const size_t shared_objects_before;  // number of loaded ELF objects counted by PreLoad
size_t shared_objects_seen;  // number of ELF objects visited so far in this pass
};  // end of struct dl_iterate_context
dl_iterate_context context = { Begin(), &dlopen_mmaps_, shared_objects_before_, 0};  // the context for the first pass
if (dl_iterate_phdr(dl_iterate_context::callback, &context) == 0) {  // the callback records a MemMap for each PT_LOAD segment of the oat file
// Hm. Maybe our optimization went wrong. Try another time with shared_objects_before == 0
// before giving up. This should be unusual.
VLOG(oat) << "Need a second run in PreSetup, didn't find with shared_objects_before="
<< shared_objects_before_;
dl_iterate_context context0 = { Begin(), &dlopen_mmaps_, 0, 0};
if (dl_iterate_phdr(dl_iterate_context::callback, &context0) == 0) {
// OK, give up and print an error.
PrintFileToLog("/proc/self/maps", LogSeverity::WARNING);
LOG(ERROR) << "File " << elf_filename << " loaded with dlopen but cannot find its mmaps.";
}
}
#endif
}
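To make the "which loaded object contains this address" test more concrete, here is a minimal sketch for Linux or Android that finds the ELF object containing a given address and lists its PT_LOAD segments. FindContext, FindCallback and FindObjectContaining are illustrative names, and a plain vector of address ranges stands in for the MemMap objects that ART records.

```cpp
#include <link.h>   // dl_iterate_phdr, struct dl_phdr_info, ElfW (glibc / bionic)
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Segment {
  uintptr_t start;
  size_t size;
};

struct FindContext {
  uintptr_t target = 0;           // address we are looking for
  bool found = false;
  std::string object_name;        // dlpi_name of the matching object
  std::vector<Segment> segments;  // PT_LOAD segments of that object
};

static int FindCallback(struct dl_phdr_info* info, size_t /*size*/, void* data) {
  auto* ctx = static_cast<FindContext*>(data);
  // First pass: does any PT_LOAD segment of this object contain the target?
  bool contains_target = false;
  for (int i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr)& phdr = info->dlpi_phdr[i];
    if (phdr.p_type != PT_LOAD) continue;
    uintptr_t start = info->dlpi_addr + phdr.p_vaddr;
    if (start <= ctx->target && ctx->target < start + phdr.p_memsz) {
      contains_target = true;
      break;
    }
  }
  if (!contains_target) return 0;  // not this object, keep iterating
  // Second pass: record every PT_LOAD segment of the matching object.
  for (int i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr)& phdr = info->dlpi_phdr[i];
    if (phdr.p_type != PT_LOAD) continue;
    ctx->segments.push_back(
        Segment{static_cast<uintptr_t>(info->dlpi_addr + phdr.p_vaddr),
                static_cast<size_t>(phdr.p_memsz)});
  }
  ctx->found = true;
  ctx->object_name = (info->dlpi_name != nullptr) ? info->dlpi_name : "";
  return 1;  // non-zero stops the iteration
}

FindContext FindObjectContaining(const void* address) {
  assert(address != nullptr);
  FindContext ctx;
  ctx.target = reinterpret_cast<uintptr_t>(address);
  dl_iterate_phdr(FindCallback, &ctx);
  return ctx;
}
```

Passing the address of any static variable of the program should report the main executable. PreSetup does the same with begin_, plus the shared_objects_before shortcut that skips objects loaded before the dlopen call.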
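7. Further down is OatFileBase::Setup. It mainly uses ReadOatDexFileData on the oat file loaded above to build the oat_dex_file objects from which a dex_file is obtained later. The data structures of the oat_file as a whole combine information from both the oat file and the vdex file.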
Setup
{
GetOatHeader
GetInstructionSetPointerSize
GetOatDexFilesOffset // reaches the offset of the OatDexFile records
GetDexFileCount
ReadOatDexFileData&dex_file_location_size
ResolveRelativeEncodedDexLocation
ReadOatDexFileData&dex_file_checksum
ReadOatDexFileData&dex_file_offset
ReadOatDexFileData&class_offsets_offset
ReadOatDexFileData&lookup_table_offset // speeds up class lookup
ReadOatDexFileData&dex_layout_sections_offset
ReadOatDexFileData&method_bss_mapping_offset
FindDexFileMapItem&call_sites_item // call site identifiers
new OatDexFile // build the OatDexFile from the information above so that GetBestOatFile can obtain it
}
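The source flow is fairly clear. The key is to follow ReadOatDexFileData and how the oat pointer advances; the creation of the oat_dex_file at the end is the most important part.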
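The cursor movement can be illustrated with a small bounds-checked reader. This is a sketch, not the ART function: ReadValue takes an explicit end pointer instead of an OatFile, DexRecord and ParseRecord are made-up names, and only the first three fields of a record are parsed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Reads one T at *cursor and advances the cursor, or returns false without
// moving it when fewer than sizeof(T) bytes remain before end.
template <typename T>
bool ReadValue(const uint8_t* end, const uint8_t** cursor, T* value) {
  static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
  assert(cursor != nullptr && value != nullptr && *cursor <= end);
  if (static_cast<size_t>(end - *cursor) < sizeof(T)) {
    return false;
  }
  std::memcpy(value, *cursor, sizeof(T));  // memcpy copes with unaligned data
  *cursor += sizeof(T);
  return true;
}

struct DexRecord {
  std::string location;
  uint32_t checksum = 0;
  uint32_t dex_offset = 0;
};

// Parses: uint32 location size, location bytes, uint32 checksum, uint32 offset.
bool ParseRecord(const uint8_t* end, const uint8_t** cursor, DexRecord* out) {
  uint32_t location_size = 0;
  if (!ReadValue(end, cursor, &location_size)) return false;
  if (location_size == 0u) return false;  // empty location name
  if (static_cast<size_t>(end - *cursor) < location_size) return false;
  out->location.assign(reinterpret_cast<const char*>(*cursor), location_size);
  *cursor += location_size;
  return ReadValue(end, cursor, &out->checksum) &&
         ReadValue(end, cursor, &out->dex_offset);
}
```

Every successful read moves the cursor forward by exactly the size of the field, so the next read starts at the next field, and a truncated buffer is reported as a failure rather than read past its end. Setup follows the same pattern with one specific error message per field.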
bool OatFileBase::Setup(const char* abs_dex_location, std::string* error_msg) {
if (!GetOatHeader().IsValid()) {
std::string cause = GetOatHeader().GetValidationErrorMessage();
*error_msg = StringPrintf("Invalid oat header for '%s': %s",
GetLocation().c_str(),
cause.c_str());
return false;
}
PointerSize pointer_size = GetInstructionSetPointerSize(GetOatHeader().GetInstructionSet());
size_t key_value_store_size =
(Size() >= sizeof(OatHeader)) ? GetOatHeader().GetKeyValueStoreSize() : 0u;
if (Size() < sizeof(OatHeader) + key_value_store_size) {
*error_msg = StringPrintf("In oat file '%s' found truncated OatHeader, "
"size = %zu < %zu + %zu",
GetLocation().c_str(),
Size(),
sizeof(OatHeader),
key_value_store_size);
return false;
}
size_t oat_dex_files_offset = GetOatHeader().GetOatDexFilesOffset();
if (oat_dex_files_offset < GetOatHeader().GetHeaderSize() || oat_dex_files_offset > Size()) {
*error_msg = StringPrintf("In oat file '%s' found invalid oat dex files offset: "
"%zu is not in [%zu, %zu]",
GetLocation().c_str(),
oat_dex_files_offset,
GetOatHeader().GetHeaderSize(),
Size());
return false;
}
const uint8_t* oat = Begin() + oat_dex_files_offset;  // Jump to the OatDexFile records.
DCHECK_GE(static_cast<size_t>(pointer_size), alignof(GcRoot<mirror::Object>));
if (!IsAligned<kPageSize>(bss_begin_) ||
!IsAlignedParam(bss_methods_, static_cast<size_t>(pointer_size)) ||
!IsAlignedParam(bss_roots_, static_cast<size_t>(pointer_size)) ||
!IsAligned<alignof(GcRoot<mirror::Object>)>(bss_end_)) {
*error_msg = StringPrintf("In oat file '%s' found unaligned bss symbol(s): "
"begin = %p, methods_ = %p, roots = %p, end = %p",
GetLocation().c_str(),
bss_begin_,
bss_methods_,
bss_roots_,
bss_end_);
return false;
}
if ((bss_methods_ != nullptr && (bss_methods_ < bss_begin_ || bss_methods_ > bss_end_)) ||
(bss_roots_ != nullptr && (bss_roots_ < bss_begin_ || bss_roots_ > bss_end_)) ||
(bss_methods_ != nullptr && bss_roots_ != nullptr && bss_methods_ > bss_roots_)) {
*error_msg = StringPrintf("In oat file '%s' found bss symbol(s) outside .bss or unordered: "
"begin = %p, methods_ = %p, roots = %p, end = %p",
GetLocation().c_str(),
bss_begin_,
bss_methods_,
bss_roots_,
bss_end_);
return false;
}
uint8_t* after_arrays = (bss_methods_ != nullptr) ? bss_methods_ : bss_roots_;// May be null.
uint8_t* dex_cache_arrays = (bss_begin_ == after_arrays) ? nullptr : bss_begin_;
uint8_t* dex_cache_arrays_end =
(bss_begin_ == after_arrays) ? nullptr : (after_arrays != nullptr) ? after_arrays : bss_end_;
DCHECK_EQ(dex_cache_arrays != nullptr, dex_cache_arrays_end != nullptr);
uint32_t dex_file_count = GetOatHeader().GetDexFileCount();  // number of dex files in this oat file
oat_dex_files_storage_.reserve(dex_file_count);
for (size_t i = 0; i < dex_file_count; i++) {
uint32_t dex_file_location_size;
if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_location_size))) {  // each iteration reads dex_file_location_size with ReadOatDexFileData, which also advances the oat pointer
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu truncated after dex file "
"location size",
GetLocation().c_str(),
i);
return false;
}
if (UNLIKELY(dex_file_location_size == 0U)) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with empty location name",
GetLocation().c_str(),
i);
return false;
}
if (UNLIKELY(static_cast<size_t>(End() - oat) < dex_file_location_size)) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with truncated dex file "
"location",
GetLocation().c_str(),
i);
return false;
}
const char* dex_file_location_data = reinterpret_cast<const char*>(oat);
oat += dex_file_location_size;
std::string dex_file_location = ResolveRelativeEncodedDexLocation(
abs_dex_location,
std::string(dex_file_location_data, dex_file_location_size));
uint32_t dex_file_checksum;
if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_checksum))) {  // read dex_file_checksum and advance the oat pointer
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated after "
"dex file checksum",
GetLocation().c_str(),
i,
dex_file_location.c_str());
return false;
}
uint32_t dex_file_offset;
if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_offset))) {  // read dex_file_offset and advance the oat pointer
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
"after dex file offsets",
GetLocation().c_str(),
i,
dex_file_location.c_str());
return false;
}
if (UNLIKELY(dex_file_offset == 0U)) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with zero dex "
"file offset",
GetLocation().c_str(),
i,
dex_file_location.c_str());
return false;
}
if (UNLIKELY(dex_file_offset > DexSize())) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
"offset %u > %zu",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
dex_file_offset,
DexSize());
return false;
}
if (UNLIKELY(DexSize() - dex_file_offset < sizeof(DexFile::Header))) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
"offset %u of %zu but the size of dex file header is %zu",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
dex_file_offset,
DexSize(),
sizeof(DexFile::Header));
return false;
}
const uint8_t* dex_file_pointer = DexBegin() + dex_file_offset;
if (UNLIKELY(!DexFile::IsMagicValid(dex_file_pointer))) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
"dex file magic '%s'",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
dex_file_pointer);
return false;
}
if (UNLIKELY(!DexFile::IsVersionValid(dex_file_pointer))) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
"dex file version '%s'",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
dex_file_pointer);
return false;
}
const DexFile::Header* header = reinterpret_cast<const DexFile::Header*>(dex_file_pointer);
if (DexSize() - dex_file_offset < header->file_size_) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
"offset %u and size %u truncated at %zu",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
dex_file_offset,
header->file_size_,
DexSize());
return false;
}
uint32_t class_offsets_offset;
if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &class_offsets_offset))) {  // read class_offsets_offset and advance the oat pointer
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
"after class offsets offset",
GetLocation().c_str(),
i,
dex_file_location.c_str());
return false;
}
if (UNLIKELY(class_offsets_offset > Size()) ||
UNLIKELY((Size() - class_offsets_offset) / sizeof(uint32_t) < header->class_defs_size_)) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
"class offsets, offset %u of %zu, class defs %u",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
class_offsets_offset,
Size(),
header->class_defs_size_);
return false;
}
if (UNLIKELY(!IsAligned<alignof(uint32_t)>(class_offsets_offset))) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned "
"class offsets, offset %u",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
class_offsets_offset);
return false;
}
const uint32_t* class_offsets_pointer =
reinterpret_cast<const uint32_t*>(Begin() + class_offsets_offset);
uint32_t lookup_table_offset;
if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &lookup_table_offset))) {  // read lookup_table_offset and advance the oat pointer; the lookup table speeds up class lookup
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
"after lookup table offset",
GetLocation().c_str(),
i,
dex_file_location.c_str());
return false;
}
const uint8_t* lookup_table_data = lookup_table_offset != 0u
? Begin() + lookup_table_offset
: nullptr;
if (lookup_table_offset != 0u &&
(UNLIKELY(lookup_table_offset > Size()) ||
UNLIKELY(Size() - lookup_table_offset <
TypeLookupTable::RawDataLength(header->class_defs_size_)))) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
"type lookup table, offset %u of %zu, class defs %u",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
lookup_table_offset,
Size(),
header->class_defs_size_);
return false;
}
uint32_t dex_layout_sections_offset;
if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_layout_sections_offset))) {  // read dex_layout_sections_offset and advance the oat pointer
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
"after dex layout sections offset",
GetLocation().c_str(),
i,
dex_file_location.c_str());
return false;
}
const DexLayoutSections* const dex_layout_sections = dex_layout_sections_offset != 0
? reinterpret_cast<const DexLayoutSections*>(Begin() + dex_layout_sections_offset)
: nullptr;
uint32_t method_bss_mapping_offset;
if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &method_bss_mapping_offset))) {  // read method_bss_mapping_offset and advance the oat pointer
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
"after method bss mapping offset",
GetLocation().c_str(),
i,
dex_file_location.c_str());
return false;
}
const bool readable_method_bss_mapping_size =
method_bss_mapping_offset != 0u &&
method_bss_mapping_offset <= Size() &&
IsAligned<alignof(MethodBssMapping)>(method_bss_mapping_offset) &&
Size() - method_bss_mapping_offset >= MethodBssMapping::ComputeSize(0);
const MethodBssMapping* method_bss_mapping = readable_method_bss_mapping_size
? reinterpret_cast<const MethodBssMapping*>(Begin() + method_bss_mapping_offset)
: nullptr;
if (method_bss_mapping_offset != 0u &&
(UNLIKELY(method_bss_mapping == nullptr) ||
UNLIKELY(method_bss_mapping->size() == 0u) ||
UNLIKELY(Size() - method_bss_mapping_offset <
MethodBssMapping::ComputeSize(method_bss_mapping->size())))) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned or "
" truncated method bss mapping, offset %u of %zu, length %zu",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
method_bss_mapping_offset,
Size(),
method_bss_mapping != nullptr ? method_bss_mapping->size() : 0u);
return false;
}
if (kIsDebugBuild && method_bss_mapping != nullptr) {
const MethodBssMappingEntry* prev_entry = nullptr;
for (const MethodBssMappingEntry& entry : *method_bss_mapping) {
CHECK_ALIGNED_PARAM(entry.bss_offset, static_cast<size_t>(pointer_size));
CHECK_LT(entry.bss_offset, BssSize());
CHECK_LE(POPCOUNT(entry.index_mask) * static_cast<size_t>(pointer_size),entry.bss_offset);
size_t index_mask_span = (entry.index_mask != 0u) ? 16u - CTZ(entry.index_mask) : 0u;
CHECK_LE(index_mask_span, entry.method_index);
if (prev_entry != nullptr) {
CHECK_LT(prev_entry->method_index, entry.method_index - index_mask_span);
}
prev_entry = &entry;
}
CHECK_LT(prev_entry->method_index,
reinterpret_cast<const DexFile::Header*>(dex_file_pointer)->method_ids_size_);
}
uint8_t* current_dex_cache_arrays = nullptr;
if (dex_cache_arrays != nullptr) {
// All DexCache types except for CallSite have their instance counts in the
// DexFile header. For CallSites, we need to read the info from the MapList.
const DexFile::MapItem* call_sites_item = nullptr;
if (!FindDexFileMapItem(DexBegin(),  // look up the call_sites_item entry in the dex map list
DexEnd(),
DexFile::MapItemType::kDexTypeCallSiteIdItem,
&call_sites_item)) {
*error_msg = StringPrintf("In oat file '%s' could not read data from truncated DexFile map",
GetLocation().c_str());
return false;
}
size_t num_call_sites = call_sites_item == nullptr ? 0 : call_sites_item->size_;
DexCacheArraysLayout layout(pointer_size, *header, num_call_sites);
if (layout.Size() != 0u) {
if (static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays) < layout.Size()) {
*error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with "
"truncated dex cache arrays, %zu < %zu.",
GetLocation().c_str(),
i,
dex_file_location.c_str(),
static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays),
layout.Size());
return false;
}
current_dex_cache_arrays = dex_cache_arrays;
dex_cache_arrays += layout.Size();
}
}
std::string canonical_location = DexFile::GetDexCanonicalLocation(dex_file_location.c_str());
// Create the OatDexFile and add it to the owning container.
OatDexFile* oat_dex_file = new OatDexFile(this,  // build the oat_dex_file from the values obtained above via ReadOatDexFileData and FindDexFileMapItem
dex_file_location,
canonical_location,
dex_file_checksum,
dex_file_pointer,
lookup_table_data,
method_bss_mapping,
class_offsets_pointer,
current_dex_cache_arrays,
dex_layout_sections);
oat_dex_files_storage_.push_back(oat_dex_file);
// Add the location and canonical location (if different) to the oat_dex_files_ table.
StringPiece key(oat_dex_file->GetDexFileLocation());
oat_dex_files_.Put(key, oat_dex_file);
if (canonical_location != dex_file_location) {
StringPiece canonical_key(oat_dex_file->GetCanonicalDexFileLocation());
oat_dex_files_.Put(canonical_key, oat_dex_file);
}
}
if (dex_cache_arrays != dex_cache_arrays_end) {
// We expect the bss section to be either empty (dex_cache_arrays and bss_end_
// both null) or contain just the dex cache arrays and optionally some GC roots.
*error_msg = StringPrintf("In oat file '%s' found unexpected bss size bigger by %zu bytes.",
GetLocation().c_str(),
static_cast<size_t>(bss_end_ - dex_cache_arrays));
return false;
}
return true;
}
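The tail of the loop stores each OatDexFile once but registers it under up to two keys, the dex location and the canonical location. A minimal sketch of that bookkeeping follows. DexEntry and DexTable are hypothetical names, and std::string keys in a std::map are used for simplicity where ART uses StringPiece keys in its own table.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct DexEntry {
  std::string location;
  std::string canonical_location;
};

class DexTable {
 public:
  // Stores one entry and indexes it by location and, if it differs,
  // by canonical location as well.
  const DexEntry* Add(const std::string& location, const std::string& canonical) {
    assert(!location.empty());
    storage_.push_back(std::unique_ptr<DexEntry>(new DexEntry{location, canonical}));
    const DexEntry* entry = storage_.back().get();
    index_[entry->location] = entry;
    if (canonical != location) {
      index_[entry->canonical_location] = entry;
    }
    return entry;
  }

  // Returns the entry registered under key, or nullptr if there is none.
  const DexEntry* Find(const std::string& key) const {
    auto it = index_.find(key);
    return it == index_.end() ? nullptr : it->second;
  }

  size_t size() const { return storage_.size(); }

 private:
  std::vector<std::unique_ptr<DexEntry>> storage_;  // owns the entries
  std::map<std::string, const DexEntry*> index_;    // non-owning lookup
};
```

A lookup by either spelling of the path then returns the same entry, which is presumably why Setup registers both keys for each OatDexFile.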
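There is another way to open the file, ElfOatFile, which as far as I can tell uses ART's own ELF loader instead of dlopen. The overall flow should be similar, and I will analyse it when I have time.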
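Finally, to recap the flow, it is roughly as follows:
PreLoad: walks all loaded ELF objects through their dl_phdr_info and stores the number of objects in shared_objects_before_.
LoadVdex: loads the vdex file through VdexFile::Open; the vdex file also stores dex file information.
Load: calls Dlopen to load the oat file and obtain dlopen_handle_.
ComputeFields: uses FindDynamicSymbolAddress to locate the symbols, starting with oatdata for begin_, which delimits the oat file's range in memory.
PreSetup: walks the loaded ELF objects again, finds the one whose PT_LOAD segment contains begin_, and records a MemMap for each of the oat file's loadable segments.
Setup: parses the oat file with ReadOatDexFileData and related functions and assembles the oat_dex_file objects.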
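With these steps the oat_dex_file is finally obtained from the oat_file. There are parts I have not fully understood myself, so some mistakes are unavoidable and some statements may not be phrased well; I am an outsider to this area. The overall flow should be right, though, and I hope readers will point out my mistakes so that I can correct them.
PS: could someone tell me whether my understanding of dl_iterate_phdr in PreSetup is off?
I could not find where MemMap::MapDummy is defined; my guess is that it is a mapping operation similar to mmap. I also have not figured out what the newly added call_sites_item in the dex structure is for. Any explanation would be much appreciated.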
Reference: 老罗's Android journey series https://www.kancloud.cn/alex_wsc/androids/473622
Original post: https://bbs.pediy.com/thread-258125.htm
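Reply: The images are blocked by hotlink protection.
Reply (quoting lu_, posted 2020-3-18 13:07: "The images are blocked by hotlink protection."): Thanks for the reminder.
Reply: I can't follow this at all.
Reply (quoting Alluretoo, posted 2020-3-18 17:00: "I can't follow this at all."): I only half understand it myself; I was hoping to draw out an expert who could guide me.
Reply: This is too complicated, I can't follow it.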