[Theory and Practice] Exploiting a Mono.Cecil Bug to Anti .NET Decompiler by Wwh / NCK

wwh1004 posted on 2018-8-6 17:17

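My previous post only described a symptom and never got to the root of it, which is how the CLR loads metadata.
So this time let's look at how the CLR handles compressed and uncompressed metadata.

Theory first, so that the practice that follows makes sense.
To begin with, we need a copy of the CoreCLR source. The low-level architecture of the CLR does not change, so for studying metadata CoreCLR is a better choice than SSCLI20.
If you don't have the source, that's fine; I'll paste all the key parts in this post.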

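How do we find the spot we need in several hundred thousand lines of code?
Think about it for a moment: the CoreCLR source almost never passes literal constants directly as arguments; it uses macros (#define) instead.
We can take advantage of that. Searching the solution for "#Strings" and "#US" leads to the file mdcommon.h.

Next, find all references. This turns up 2 places that read the metadata streams and 1 place that determines the metadata format.

Let's first analyze how the CLR determines the metadata format.

Here is the code, simplified a bit: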
*pFormat = MDFormat_Invalid;
pStream = MDFormat::GetFirstStream_Verify(&sHdr, pData, &cbStreamBuffer);
// Loop through each stream and pick off the ones we need.
for (i = 0; i < sHdr.GetiStreams(); i++)
{
// Get next stream.
PSTORAGESTREAM pNext = pStream->NextStream_Verify();

if (strcmp(pStream->GetName(), "#~") == 0)
{
   // Validate that only one of compressed/uncompressed is present.
   if (*pFormat != MDFormat_Invalid)
         // Already found a good stream.
         goto ErrExit;
   // Found the compressed meta data stream.
   *pFormat = MDFormat_ReadOnly;
}
else if (strcmp(pStream->GetName(), "#-") == 0)
{
   // Validate that only one of compressed/uncompressed is present.
   if (*pFormat != MDFormat_Invalid)
         // Already found a good stream.
         goto ErrExit;
   // Found the ENC meta data stream.
   *pFormat = MDFormat_ReadWrite;
}
else if (strcmp(pStream->GetName(), "#Schema") == 0)
{
   // Found the uncompressed format
   *pFormat = MDFormat_ICR;
}

// Pick off the next stream if there is one.
pStream = pNext;
}
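What does this code mean?
The CLR walks through all the metadata stream headers:
if it sees "#~", it sets the metadata format to MDFormat_ReadOnly;
if it sees "#-", it sets the metadata format to MDFormat_ReadWrite;
if it sees "#Schema", it sets the metadata format to MDFormat_ICR.
Along the way it also checks that "#~" and "#-" do not show up once a format has already been chosen, so a second table stream is rejected.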
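
To make the rule concrete, here is a minimal standalone sketch of the same detection logic. It is not CoreCLR code: MdFormat and DetectFormat are hypothetical names of mine, and the stream list is reduced to plain names.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the MDFormat_* values in the excerpt above.
enum class MdFormat { Invalid, ReadOnly, ReadWrite, ICR };

// Walks the stream names in header order.
// Returns false where the simplified CLR code would take `goto ErrExit`.
bool DetectFormat(const std::vector<std::string>& streamNames, MdFormat& format) {
    format = MdFormat::Invalid;
    for (const std::string& name : streamNames) {
        if (name == "#~" || name == "#-") {
            // A table stream is only accepted while no format has been chosen yet.
            if (format != MdFormat::Invalid) {
                return false;
            }
            format = (name == "#~") ? MdFormat::ReadOnly : MdFormat::ReadWrite;
        } else if (name == "#Schema") {
            // No check here: "#Schema" overrides whatever was chosen before.
            format = MdFormat::ICR;
        }
        // Any other name ("#Strings", "#Wwh", ...) is ignored at this stage.
    }
    return true;
}
```

Note the asymmetry: "#~" followed by "#Schema" ends up as ICR, while "#Schema" followed by "#~" is rejected, so in this sketch the order of the headers matters.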

Next, the CLR branches on the metadata format.

My simplified code:
if ( format == MDFormat_ReadOnly )
{
// Found a fully-compressed, read-only format.
pInternalRO = new (nothrow) MDInternalRO;
IfFailGo( pInternalRO->Init(const_cast<void*>(pData), cbData) );
}
else
{
// Found a not-fully-compressed, ENC format.
IfFailGo( GetInternalWithRWFormat( pData, cbData, flags, riid, ppIUnk ) );
}
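In other words, if only "#~" is present, the metadata is treated as fully compressed;
if "#-" or "#Schema" is present, it is treated as not fully compressed.

Next, let's see how the CLR loads compressed metadata.

My simplified code: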
pStream = MDFormat::GetFirstStream_Verify(&sHdr, pData, &cbStreamBuffer);

// Loop through each stream and pick off the ones we need.
for (i = 0; i < sHdr.GetiStreams(); i++)
{
void *pvCurrentData = (void *)((BYTE *)pData + pStream->GetOffset());
ULONG cbCurrentData = pStream->GetSize();

// Get next stream.
PSTORAGESTREAM pNext = pStream->NextStream_Verify();

// String pool.
if (strcmp(pStream->GetName(), "#Strings") == 0)
{
   // Initialize string heap with null-terminated block of data
   IfFailGo(m_MiniMd.m_StringHeap.Initialize(
         MetaData::DataBlob((BYTE *)pvCurrentData, cbCurrentData),
         FALSE));    // fCopyData
}

// Literal String Blob pool.
else if (strcmp(pStream->GetName(), "#US") == 0)
{
   METADATATRACKER_ONLY(MetaDataTracker::NoteSection(TBL_COUNT + MDPoolUSBlobs, pvCurrentData, cbCurrentData, 1));
   // Initialize user string heap with block of data
   IfFailGo(m_MiniMd.m_UserStringHeap.Initialize(
         MetaData::DataBlob((BYTE *)pvCurrentData, cbCurrentData),
         FALSE));    // fCopyData
}

// GUID pool.
else if (strcmp(pStream->GetName(), "#GUID") == 0)
{
   METADATATRACKER_ONLY(MetaDataTracker::NoteSection(TBL_COUNT + MDPoolGuids, pvCurrentData, cbCurrentData, 1));
   // Initialize guid heap with block of data
   IfFailGo(m_MiniMd.m_GuidHeap.Initialize(
         MetaData::DataBlob((BYTE *)pvCurrentData, cbCurrentData),
         FALSE));    // fCopyData
}

// Blob pool.
else if (strcmp(pStream->GetName(), "#Blob") == 0)
{
   METADATATRACKER_ONLY(MetaDataTracker::NoteSection(TBL_COUNT + MDPoolBlobs, pvCurrentData, cbCurrentData, 1));
   // Initialize blob heap with block of data
   IfFailGo(m_MiniMd.m_BlobHeap.Initialize(
         MetaData::DataBlob((BYTE *)pvCurrentData, cbCurrentData),
         FALSE));    // fCopyData
}

// Found the compressed meta data stream.
else if (strcmp(pStream->GetName(), "#~") == 0)
{
   IfFailGo( m_MiniMd.InitOnMem(pvCurrentData, cbCurrentData) );
   bFoundMd = true;
}

// Pick off the next stream if there is one.
pStream = pNext;
cbStreamBuffer = (ULONG)((LPBYTE)pData + cbData - (LPBYTE)pNext);
}
In other words, every match re-initializes the table stream or heap, so whichever one is found last wins.
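
A small sketch of that "last one wins" behaviour, following my reading of the loop above. StreamHeader and PickStreamsLastWins are hypothetical names for illustration, and the offsets in any example data are made up.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, reduced stream header: just a name and where the data lives.
struct StreamHeader {
    std::string name;
    std::size_t offset;
    std::size_t size;
};

// Mimics the compressed-format loop: names are compared case-sensitively
// (strcmp in the excerpt), and each match overwrites the earlier choice.
std::map<std::string, StreamHeader> PickStreamsLastWins(
        const std::vector<StreamHeader>& headers) {
    std::map<std::string, StreamHeader> picked;
    for (const StreamHeader& h : headers) {
        if (h.name == "#Strings" || h.name == "#US" || h.name == "#GUID" ||
            h.name == "#Blob" || h.name == "#~") {
            picked[h.name] = h;  // a later stream with the same name replaces this one
        }
        // Unknown names are skipped, exactly like the if/else chain above.
    }
    return picked;
}
```

With two "#Strings" headers in the list, the map ends up holding the second one; a stream such as "#Wwh" is never picked at all.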

Then let's see how the CLR loads uncompressed metadata.

My simplified code:
// Load the string pool.
if (SUCCEEDED(hr = pStorage->OpenStream(L"#Strings", &cbData, &pvData)))
{
IfFailGo(m_MiniMd.InitPoolOnMem(MDPoolStrings, pvData, cbData, bReadOnly));
}
else
{
if (hr != STG_E_FILENOTFOUND)
{
   IfFailGo(hr);
}
IfFailGo(m_MiniMd.InitPoolOnMem(MDPoolStrings, NULL, 0, bReadOnly));
}

// Load the user string blob pool.
if (SUCCEEDED(hr = pStorage->OpenStream(L"#US", &cbData, &pvData)))
{
IfFailGo(m_MiniMd.InitPoolOnMem(MDPoolUSBlobs, pvData, cbData, bReadOnly));
}
else
{
if (hr != STG_E_FILENOTFOUND)
{
   IfFailGo(hr);
}
IfFailGo(m_MiniMd.InitPoolOnMem(MDPoolUSBlobs, NULL, 0, bReadOnly));
}

// Load the guid pool.
if (SUCCEEDED(hr = pStorage->OpenStream(L"#GUID", &cbData, &pvData)))
{
IfFailGo(m_MiniMd.InitPoolOnMem(MDPoolGuids, pvData, cbData, bReadOnly));
}
else
{
if (hr != STG_E_FILENOTFOUND)
{
   IfFailGo(hr);
}
IfFailGo(m_MiniMd.InitPoolOnMem(MDPoolGuids, NULL, 0, bReadOnly));
}

// Load the blob pool.
if (SUCCEEDED(hr = pStorage->OpenStream(L"#Blob", &cbData, &pvData)))
{
IfFailGo(m_MiniMd.InitPoolOnMem(MDPoolBlobs, pvData, cbData, bReadOnly));
}
else
{
if (hr != STG_E_FILENOTFOUND)
{
   IfFailGo(hr);
}
IfFailGo(m_MiniMd.InitPoolOnMem(MDPoolBlobs, NULL, 0, bReadOnly));
}

// Open the metadata.
hr = pStorage->OpenStream(L"#~", &cbData, &pvData);
if (hr == STG_E_FILENOTFOUND)
{
IfFailGo(pStorage->OpenStream(ENC_MODEL_STREAM, &cbData, &pvData));
}
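In other words, the lookup scans from the beginning to the end, and the first occurrence of each table stream and heap wins.
The OpenStream used here compares names case-insensitively; see the screenshot I posted above.
It uses stricmp.
dnlib does exactly the same: case-sensitive where it should be and case-insensitive where it should be. You have to admire 0xd4d for that.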
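
For contrast with the compressed path, here is a sketch of a first-match, case-insensitive lookup in the spirit of OpenStream. The names StreamHeader, EqualsIgnoreCase and OpenStreamFirstWins are my own, and returning nullptr stands in for STG_E_FILENOTFOUND.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, reduced stream header.
struct StreamHeader {
    std::string name;
    std::size_t offset;
    std::size_t size;
};

// ASCII case-insensitive comparison, playing the role of stricmp.
bool EqualsIgnoreCase(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) {
            return false;
        }
    }
    return true;
}

// Scans the headers in order and returns the first match, or nullptr if none.
const StreamHeader* OpenStreamFirstWins(const std::vector<StreamHeader>& headers,
                                        const std::string& name) {
    for (const StreamHeader& h : headers) {
        if (EqualsIgnoreCase(h.name, name)) {
            return &h;
        }
    }
    return nullptr;
}
```

Here a later "#strings" or "#Strings" header can never shadow the first one, which is the opposite of the compressed path.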

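Now that we understand the CLR's internal flow, let's see how to add invalid metadata to a .NET assembly ourselves so that Mono.Cecil, which is not strict about any of this, trips over it.
We need dnlib for this, because dnlib tells the two cases apart correctly.

This program (UnpackMe.Net.exe, loaded by the code below) will be the victim for this demo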

using dnlib.DotNet;
using dnlib.DotNet.Writer;

namespace ConsoleApp1 {
internal class Program {
   private static void Main(string[] args) {
         using (ModuleDef moduleDef = ModuleDefMD.Load(@"E:\Projects\UnpackMe.Net\UnpackMe.Net\bin\Release\UnpackMe.Net.exe")) {
            ModuleWriter writer;

            writer = new ModuleWriter(moduleDef, new ModuleWriterOptions(moduleDef));
            writer.TheOptions.WriterEvent += TheOptions_WriterEvent;
            writer.Write("x.exe");
         }
   }

   private static void TheOptions_WriterEvent(object sender, ModuleWriterEventArgs e) {
         ModuleWriterBase writer = (ModuleWriterBase)sender;
         if (e.Event != ModuleWriterEvent.MDEndCreateTables)
            return;
         writer.TheOptions.MetadataOptions.CustomHeaps.Add(new InvalidHeap("#Strings"));
         writer.TheOptions.MetadataOptions.CustomHeaps.Add(new InvalidHeap("#US"));
         writer.TheOptions.MetadataOptions.CustomHeaps.Add(new InvalidHeap("#GUID"));
         writer.TheOptions.MetadataOptions.CustomHeaps.Add(new InvalidHeap("#Blob"));
         writer.TheOptions.MetadataOptions.CustomHeaps.Add(new InvalidHeap("#Schema"));
   }

   private sealed class InvalidHeap : HeapBase {
         private readonly string _name;

         public override string Name => _name;

         public InvalidHeap(string name) => _name = name;

         public override uint GetRawLength() => 1;

         protected override void WriteToImpl(DataWriter writer) => writer.WriteByte(0);
   }
}
}

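dnlib's design is a little odd here: we have to subscribe to the module writer event and, inside the handler, check whether ModuleWriterEvent.MDEndCreateTables has just completed.
If it has, we add the heaps used for obfuscation. They must not be table streams (the CLR checks for that, as explained above). The heap names can be anything, not necessarily #Strings or #US; #Wwh or #NCK work as well. Then remember to add #Schema among these heaps, so that the CLR treats the file as uncompressed metadata.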
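
To see why the extra heaps are harmless to the CLR but poisonous to a careless reader, here is a toy model of the resulting stream list. Everything in it is an assumption for illustration: the header order (original streams first, custom heaps appended), the sizes of the real heaps, and the "careless reader" itself, which is a hypothetical parser that ignores #Schema and lets later heaps override earlier ones. It is not a description of Mono.Cecil's actual code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, reduced stream header: a name and the size of its data.
struct StreamHeader {
    std::string name;
    std::size_t size;
};

// Assumed layout after the dnlib program above has run; real heap sizes are invented.
std::vector<StreamHeader> ObfuscatedLayout() {
    return {
        {"#~", 0x200}, {"#Strings", 0x40}, {"#US", 0x20}, {"#GUID", 0x10}, {"#Blob", 0x30},
        // The five InvalidHeap instances, one zero byte each.
        {"#Strings", 1}, {"#US", 1}, {"#GUID", 1}, {"#Blob", 1}, {"#Schema", 1},
    };
}

// True if any header is named "#Schema", which switches the CLR to the uncompressed path.
bool HasSchema(const std::vector<StreamHeader>& headers) {
    for (const StreamHeader& h : headers) {
        if (h.name == "#Schema") return true;
    }
    return false;
}

// Uncompressed path as described in the post: the first "#Strings" wins.
std::size_t StringsSizeFirstWins(const std::vector<StreamHeader>& headers) {
    for (const StreamHeader& h : headers) {
        if (h.name == "#Strings") return h.size;
    }
    return 0;
}

// Hypothetical careless reader: the last "#Strings" wins.
std::size_t StringsSizeLastWins(const std::vector<StreamHeader>& headers) {
    std::size_t size = 0;
    for (const StreamHeader& h : headers) {
        if (h.name == "#Strings") size = h.size;
    }
    return size;
}
```

In this model the first-wins path still sees the real 0x40-byte string heap, while the careless reader ends up with the 1-byte bogus one and would fail as soon as it resolved a name.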
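Let's decompile the finished anti-Mono.Cecil assembly with dnSpy and take a look

It decompiles normally and still runs.
ILSpy, .NET Reflector and JustDecompiler are a different story: they cannot decompile it at all.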

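slaiwl posted on 2018-8-6 17:28

No wonder: I wanted to localize .NET Reflector into Chinese, and after decompiling it it would not open. Luckily I found a way to localize it, and that is in progress now.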

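wwh1004 posted on 2019-1-2 22:23

wgh1256 posted on 2019-1-1 22:48
dnlib is not as magical as you might think. I have modified dnlib's code, and calling it can end in a stack overflow. Also, when dnlib validates a module ...

I looked into it specifically and that does seem to be the case, but for both .NET Framework and .NET Core, handling this kind of obfuscation with Cecil is really too hard.
Of the decompilers currently known to work well, dnSpy uses dnlib and all the others use Cecil; add a little obfuscation and those tools are useless.
dnSpy is still on the ILSpy v2 core, which has some control-flow analysis bugs that can make the ILSpy v2 core throw and fail to decompile.
ILSpy v4 does not have that bug, but because it uses Cecil it simply cannot open assemblies with obfuscated metadata.
As for the dnlib stack overflow, I don't think I have run into it yet.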

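cyhcuichao posted on 2018-8-6 17:33

Nice write-up, learned something.

蓝晴 posted on 2018-8-6 20:07

Nice write-up, learned something.

liucq posted on 2018-8-6 23:54

Thanks for sharing, great work.

丑到变态 posted on 2018-8-7 08:58

Thanks for sharing, learned something.

一帆_ posted on 2018-8-7 09:43

Thanks for sharing, learned something.

gongyong728125 posted on 2018-8-7 11:04

Learning from this, thanks for sharing!

ymb123 posted on 2018-8-7 11:15

Many thanks for sharing your findings!

nmjk1234 posted on 2018-8-7 13:22

Thanks for sharing, learned something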


吾爱破解 - 52pojie.cn's Archiver
