# Reverse Engineering in Galgame Localization (7): Dynamic Localization Analysis 2, Using the AZsystem Engine as an Example
Last edited by 小木曾雪菜 on 2023-1-21 10:44
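It has been a while since my last post. Before I knew it, New Year's Eve has come around again, so happy New Year, everyone~
by (https://github.com/YuriSizuku/GalgameReverse); this post is published simultaneously on the forum and on [my blog](https://blog.schnee.moe/posts/GalgameReverse7).
The code for this post is open source; see my GitHub: (https://github.com/YuriSizuku/GalgameReverse), (https://github.com/YuriSizuku/ReverseUtil).
Previous part: (https://www.52pojie.cn/thread-1478048-1-1.html)
## 0x0 Preface
In the previous part, we introduced dynamic localization. It needs no analysis of the archive structure or of the `opcode`s, which looks convenient, but keeping everything in sync becomes troublesome: for example, after the text is replaced the backlog still shows Japanese, or the text is unchanged after returning to the title screen and loading a save. Dynamic localization can also cause inexplicable crashes, and these bugs are hard to debug.
To address these drawbacks, this part introduces a `semi-dynamic localization` scheme. Unlike the previous method, it replaces whole files rather than individual strings: we `hook` the relevant functions and swap the decrypted buffer for our localized file at run time. This suits games whose **archive format and encryption are especially troublesome or complex**.
Taking `azsystem` as the example, this post analyzes:
* how the engine loads game scripts, and how to locate the key points for extracting them
* how the engine loads images, how it decompresses each channel, and how it sends the image data to the frame buffer for rendering
* how the localization patch uses `inline hook`s to replace the loaded content
![](https://p.sda1.dev/9/e70aeefc8fa5b631ee37fad8e5a3d15c)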
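## 0x1 Analyzing and extracting the script files
### (1) Analysis of asb files
As in the previous part, the first step is to analyze the files. Whether you analyze the algorithm statically or dump the buffer dynamically, extract the files first.
The method is much the same as before, so I will not go into detail here.
This game's archives are `.arc` files that use a hash of the file length as the encryption key, and they contain a number of `.asb` script files. Searching for the `.asb` string in IDA leads straight to the relevant function. The script-loading function is as follows: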
```c
int __thiscall sub_43112A(_DWORD *this, char *script_name)
{
char *raw_data; // edi
int v4; // eax
unsigned int v5; // ecx
_DWORD *v7; // BYREF
int v8; // BYREF
unsigned int compressed_size; //
unsigned int raw_size; //
int v11; //
int (__thiscall **v12)(void *, char); //
char *compressed_data; //
int v14; //
v7 = off_460A6C;
sub_40BD95(v7);
v14 = 1;
v12 = &off_462CDC;
v11 = 0;
sub_430FC9((int)this);
if ( fopen_40C102(v7, script_name, 0x80000000) != 1 )
{
logprintf_407C41("CScript::Create", byte_4679CC, script_name);
goto LABEL_13;
}
readfile_40C03E(v7, (char *)&v8, 0xC);
if ( v8 == 0x1A425341 ) // asb\x1a
{
compressed_data = (char *)operator new(compressed_size);
raw_data = (char *)operator new(raw_size);
readfile_40C03E(v7, compressed_data, compressed_size);
if ( sub_430F6A(compressed_data, compressed_size, raw_size) )
{
v4 = decompress_40AB65(compressed_data, compressed_size, raw_data, raw_size);// decompress
v5 = raw_size;
if ( v4 == raw_size )
{
this = 0;
this = raw_data;
this = v5;
this = raw_data;
this = raw_data;
v11 = 1;
LABEL_10:
if ( compressed_data )
j__free(compressed_data);
goto LABEL_13;
}
logprintf_407C41("CScript::Create", byte_467A38, script_name);
}
else
{
logprintf_407C41("CScript::Create", byte_467A0C, script_name);// error
}
if ( raw_data )
j__free(raw_data);
goto LABEL_10;
}
LABEL_13:
v14 = -1;
v12 = &off_462CDC;
v7 = off_460A6C;
sub_40BFDD(v7);
return v11;
}
```
![](https://p.sda1.dev/9/a9a3febc9ba880497854bd1a5db2b62a)
After a brief analysis, we get the `asb` file header structure, the validity-check function, and the decompression function, as follows:
```c
typedef struct {
    s8 magic[4]; /* "ASB" */
    u32 comprlen;
    u32 uncomprlen;
    u32 unknown;
} asb_header_t;
typedef struct {
    s8 magic[4]; /* "ASB\x1a", used to locate the header */
    u32 comprlen;
    u32 uncomprlen;
} asb1a_header_t;
// CScript constructor; we do not construct the object ourselves, we record the this pointer when the game calls it
void *__thiscall sub_43277F(_DWORD *this);
// check_valid
BOOL __stdcall sub_430F6A(char *compressed_data, int compressed_size, int raw_size);
// decompress
int sub_40AB65(char *compressed_data, int compressed_len, char *raw_data, int raw_len);
/*
0043112A | B8 9EE54500 | mov eax,lamune.45E59E       | load_script(char* name)
004311D4 | FF75 E4     | push dword ptr ss:[ebp-1C]  | raw_len
004311D7 | 8D4D EC     | lea ecx,dword ptr ss:[ebp-14]
004311DA | 57          | push edi                    | raw_data
004311DB | FF75 E0     | push dword ptr ss:[ebp-20]  | compressed_len
004311DE | FF75 F0     | push dword ptr ss:[ebp-10]  | compressed_data
004311E1 | E8 7F99FDFF | call lamune.40AB65          | decompress
*/
```
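To make the header layout concrete, here is a minimal sketch that parses the 12-byte `ASB\x1a` header from a byte buffer. The names `asb1a_info_t`, `read_le32` and `parse_asb1a_header` are made up for illustration and do not come from the game or from the original tools; the fields are assumed to be little-endian, which matches the `0x1A425341` comparison in the decompiled code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the two size fields of the header. */
typedef struct {
    uint32_t comprlen;   /* size of the compressed payload */
    uint32_t uncomprlen; /* size after decompression */
} asb1a_info_t;

/* Read a little-endian u32 without relying on alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out when buf starts with an "ASB\x1a" header,
   0 when the buffer is too short or the magic does not match. */
int parse_asb1a_header(const uint8_t *buf, size_t len, asb1a_info_t *out)
{
    if (len < 12 || memcmp(buf, "ASB\x1a", 4) != 0)
        return 0;
    out->comprlen = read_le32(buf + 4);
    out->uncomprlen = read_le32(buf + 8);
    return 1;
}
```

The 12 bytes correspond to the `readfile_40C03E(v7, (char *)&v8, 0xC)` call in the decompiled function; a dumper could use `uncomprlen` to size its output buffer.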
### (2) Decrypting and extracting asb files
Extraction only requires hooking `sub_40AB65`. The frida code is as follows:
```js
/*
for lamune.exe v1.0
open the game to title, then
frida -l lamune_hook.js -n lamune.exe
next go to the prologue to dump all asbs
*/
function install_decompress_hook(outdir='./dump')
{
// hook decompress function to dump
const addr_decompress = ptr(0x40AB65);
var raw_asbname = "";
var raw_asbdata = ptr(0);
var raw_asbsize = 0;
Interceptor.attach(addr_decompress, {
onEnter: function(args)
{
raw_asbdata = ptr(args[2]);
raw_asbsize = args[3].toUInt32();
raw_asbname = ptr(this.context.ebp).add(8).
readPointer().readAnsiString();
},
onLeave: function(retval)
{
//var asbname = asbname_buf.readAnsiString();
var asbname = raw_asbname;
console.log(asbname,
", raw_asbdata addr at", raw_asbdata,
", raw_asbsize ", raw_asbsize)
try{
var fp = new File(outdir+"/"+asbname, 'wb');
fp.write(raw_asbdata.readByteArray(raw_asbsize));
fp.close();
}
catch(e)
{
console.log("file error!", e);
}
}
})
}
function dump_asbs(names, outdir="./dump")
{
const addr_loadscript = ptr(0x43112A);
const load_script = new NativeFunction(addr_loadscript,
'void', ['pointer', "pointer"], 'thiscall');
console.log("load_script at:", load_script)
// use this to store c++ context
var pthis = ptr(0)
Interceptor.attach(addr_loadscript, {
onEnter: function(args)
{
pthis = ptr(this.context.ecx)
}
})
install_decompress_hook(outdir)
// wait for c++ context
while(!pthis.toInt32())
{
Thread.sleep(0.2);
}
// dump all scripts
var name_buf = Memory.alloc(0x100);
for(var i=0;i<names.length;i++)
{
console.log("try to dump", names[i], ", this=",pthis);
name_buf.writeAnsiString(names[i]);
load_script(pthis, name_buf);
}
console.log("dump asbs finished!\n");
}
function dump_scenario()
{
var names_v103 = ["00suzuk.asb"]
dump_asbs(names_v103)
}
```
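Other tools such as `arc unpack` can list the file names inside the `arc` archive. Enter those names into the frida script to dump all the `asb` scripts.
![](https://p.sda1.dev/9/d601ecdc995c7355030d0de9ba224dc2)
## 0x2 Dynamically replacing the script files
### (1) Replacing the decrypted asb buffer
Based on the file analysis above, we can place an `inlinehook` at `004311E1| E8 7F99FDFF| call lamune.40AB65| decompress` and load our already decrypted and localized `asb` file there directly. The decompressed buffer is allocated with `new` earlier in the function, so we also need to change the buffer size. In addition, the function that checks the buffer's `crc` has to be `nop`ped out.
Last time we used `detours`; this time we do the `inlinehook` by hand. The steps are:
1. At the location to `hook`, write a 5-byte relative `call (E8)` or `jmp (E9)` that transfers control to our function.
The machine code is `E8 XXXXXXXX` or `E9 XXXXXXXX`,
where `XXXXXXXX` is the offset relative to the next instruction, i.e. `targetva - (va + 5)`.
2. When the `hook` function finishes, manually re-execute the original code destroyed by the 5 overwritten bytes, then jump to the next intact instruction.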
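The offset arithmetic in step 1 can be sketched as a small helper. `make_rel32_patch` and the `OP_*` constants are hypothetical names chosen for this example; the function only builds the 5 patch bytes and leaves writing them into the process (for example with `winhook_patchmemory`) to the caller.

```c
#include <stdint.h>

enum { OP_CALL_REL32 = 0xE8, OP_JMP_REL32 = 0xE9 };

/* Build the 5 patch bytes for a relative call/jmp placed at virtual
   address `va` that transfers control to `target_va` (32-bit x86). */
void make_rel32_patch(uint8_t opcode, uint32_t va, uint32_t target_va,
                      uint8_t out[5])
{
    /* The offset is relative to the instruction after the 5-byte patch;
       unsigned arithmetic wraps modulo 2^32 for backward targets. */
    uint32_t rel = target_va - (va + 5);
    out[0] = opcode;
    out[1] = (uint8_t)(rel & 0xFF);
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)(rel >> 24);
}
```

Feeding it the original call site reproduces the bytes from the listing: `make_rel32_patch(OP_CALL_REL32, 0x4311E1, 0x40AB65, buf)` yields `E8 7F 99 FD FF`.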
The code for dynamically replacing the decompressed script buffer is as follows:
```c
/* for hook new decompressed buffer
0043119A | FF75 E0     | push dword ptr ss:[ebp-20]
0043119D | E8 A1510000 | call lamune.436343            | new
004311A2 | FF75 E4     | push dword ptr ss:[ebp-1C]    | raw_size
004311A5 | 8945 F0     | mov dword ptr ss:[ebp-10],eax
004311A8 | E8 96510000 | call lamune.436343            | new raw_buf
*/
const DWORD g_newrawbufi_4311A2 = 0x4311A2;
const DWORD g_newrawbufo_4311A8 = 0x4311A8;
/* for hook decompress asb
.text:004311D4 FF 75 E4 push dword ptr [ebp-1Ch] ; raw_len
.text:004311D7 8D 4D EC lea ecx, [ebp-14h]
.text:004311DA 57 push edi ; raw_data
.text:004311DB FF 75 E0 push dword ptr [ebp-20h] ; compressed_len
.text:004311DE FF 75 F0 push dword ptr [ebp-10h] ; compressed_data
.text:004311E1 E8 7F 99 FD FF call decompress_40AB65
*/
const DWORD g_decompressasbi_4311E1 = 0x4311E1;
const DWORD g_decompressasbo_40AB65 = 0x40AB65;
// inlinehook stubs
void __declspec(naked) newrawbuf_hook_4311A2()
{
__asm{
pushad;
xor eax, eax;
// size_t __stdcall load_rawasb(char *name, PBYTE buf)
push eax; // buf = NULL, only query the size
push dword ptr [ebp+8]; // script_name
call load_rawasb;
test eax, eax;
je newrawbuf_hook_end;
mov dword ptr [ebp-0x1C], eax; // change raw buf size
newrawbuf_hook_end:
popad;
// fix origin code
push dword ptr [ebp-0x1C];
mov dword ptr [ebp-0x10], eax;
jmp dword ptr ds:[g_newrawbufo_4311A8];
}
}
void __declspec(naked) decompressasb_hook_4311E1()
{
//sub_40AB65(char *compressed_data, int compressed_len, char *raw_data, int raw_len)
__asm {
push dword ptr [esp+0xC]; // raw_buf; the return address is at [esp], so the third argument is at [esp+0xC]
push dword ptr [ebp+8]; // asbname
call load_rawasb;
test eax, eax;
je decompress_origin;
ret 0x10;
decompress_origin:
mov eax, 0x99E15CB4; // this is the original correct crc value
mov dword ptr ds:, eax; // this does not work...
jmp dword ptr ds:[g_decompressasbo_40AB65];
}
}
// hook install functions
void install_asbhook()
{
/* inlinehook check_valid
.text:0040AB8A 6A 00 push 0
.text:0040AB8C 8D 43 FC lea eax, [ebx-4]
.text:0040AB8F 50 push eax
.text:0040AB90 8D 77 04 lea esi, [edi+4]
.text:0040AB93 56 push esi
.text:0040AB94 E8 27 D9 FF FF call makecrc_4084C0
.text:0040AB99 83 C4 0C add esp, 0Ch
.text:0040AB9C 39 07 cmp [edi], eax
.text:0040AB9E 75 64 jnz short loc_40AC04
*/
BYTE nop2[2] = {0x90, 0x90};
winhook_patchmemory((LPVOID)0x4311d2,
nop2, sizeof(nop2));
winhook_patchmemory((LPVOID)0x40AB9E,
nop2, sizeof(nop2));
// inlinehook newrawdata
BYTE jmpE9buf[5] = {0xE9}; // jmp relative
*(DWORD*)(jmpE9buf+1) = (DWORD)newrawbuf_hook_4311A2-
((DWORD)g_newrawbufi_4311A2 + sizeof(jmpE9buf));
winhook_patchmemory((LPVOID)g_newrawbufi_4311A2,
jmpE9buf, sizeof(jmpE9buf));
// inlinehook decompress
BYTE callE8buf[5] = {0xE8}; // call relative
*(DWORD*)(callE8buf+1) = (DWORD)decompressasb_hook_4311E1-
((DWORD)g_decompressasbi_4311E1 + sizeof(callE8buf));
winhook_patchmemory((LPVOID)g_decompressasbi_4311E1,
callE8buf, sizeof(callE8buf));
}
```
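In the code above, `load_rawasb` is our routine that reads the corresponding decrypted file. To cut down on loose files, I read them from a `zip` file.
I will not go into it here; see my (https://github.com/YuriSizuku/GalgameReverse) for details.
### (2) Patching the sjis byte check to support gbk encoding
After importing the Chinese text, testing showed a pile of half-width garbage characters.
This is because the engine checks whether the first byte falls in the `sjis` lead-byte range; bytes outside that range are parsed as single-byte text.
![](https://p.sda1.dev/9/2aa7896b718ff70342efef7321d5a6b1)
Unlike other games, this one does not use instructions such as `cmp ax, 0x81` to detect `sjis` characters, and the checks are numerous and scattered, so they are tedious to patch.
To locate them, set a breakpoint on `TextOutA` and work upward step by step until you reach the position shown below:
![](https://p.sda1.dev/9/30c611ff8a9e5a6142283c9b747e0405)
The trick here is clever: the single test `(c ^ 0x20) + 0x5f > 0x3B`, evaluated in 8 bits, decides whether `c` is an sjis lead byte. XOR with `0x20` maps the two lead-byte ranges `0x81-0x9F` and `0xE0-0xFC` onto the contiguous range `0xA1-0xDC`, and adding `0x5F` (that is, subtracting `0xA1` modulo 256) shifts it to `0x00-0x3B`, so a result above `0x3B` means the byte is not a lead byte. The code is: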
```asm
.text:004340F6 loc_4340F6:
.text:004340F6 mov ecx,
.text:004340F9 mov cl,
.text:004340FB mov dl, cl
.text:004340FD xor dl, 20h
.text:00434100 add dl, 5Fh ; '_'
.text:00434103 cmp dl, 3Bh ; ';'
.text:00434106 ja loc_434215
```
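The fix is simple: patch the `xor` and `add` above with `nop`s and change the encoding check to `cmp dl, 0x80`, so that bytes above `0x80` count as lead bytes.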
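The range trick can be checked exhaustively in a few lines. This is only a sketch with hypothetical function names: `is_sjis_lead_engine` mirrors the `xor`/`add`/`cmp` sequence, `is_sjis_lead_ranges` is the conventional two-range test, and `is_dbcs_lead_relaxed` models the intent of the patched check (any byte above `0x80` starts a double-byte character, which covers the GBK lead bytes `0x81`-`0xFE`). None of it is the engine's own code.

```c
#include <stdint.h>

/* The engine's test: ((c ^ 0x20) + 0x5F) truncated to 8 bits, compared with 0x3B. */
int is_sjis_lead_engine(uint8_t c)
{
    uint8_t dl = (uint8_t)((c ^ 0x20) + 0x5F);
    return dl <= 0x3B;
}

/* The conventional two-range test for Shift-JIS lead bytes. */
int is_sjis_lead_ranges(uint8_t c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Relaxed test modelling the patch: every byte above 0x80 is a lead byte. */
int is_dbcs_lead_relaxed(uint8_t c)
{
    return c > 0x80;
}

/* Count the byte values on which the two sjis tests disagree. */
int count_mismatches(void)
{
    int n = 0;
    for (int c = 0; c < 256; c++)
        if (is_sjis_lead_engine((uint8_t)c) != is_sjis_lead_ranges((uint8_t)c))
            n++;
    return n;
}
```

A GBK lead byte such as `0xB0` fails the original test but passes the relaxed one, which is why unpatched Chinese text is split into half-width characters.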
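After this change the text box renders correctly, but the text in the `backlog` is still garbled.
At this point we have to search for the other character checks. Try searching for `cmp al|bl|cl|dl, 0x3b`, set a breakpoint on each hit, and open the `backlog` to see which one triggers.
![](https://p.sda1.dev/9/0339f57bd6ff6008d81ce04b34e6f855)
![](https://p.sda1.dev/9/0dfc51c5af5213d8c9813d7de81baa41)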
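### (3) asb opcode analysis
Take `0nana.asb` as an example. Its opcodes are aligned and very regular, as shown below:
![](https://p.sda1.dev/9/81d3bf91111abfc5a48ca9f01bee24c8)
In summary, each entry has the structure `optype 4, oplength 4, payload n`. For over-length text we only need to fix up `oplength` and the `jmp`-related instructions, as follows: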
```c
// from the file start, there are several opcodes entries
optype 4, oplength 4, payload n
, oplength 4, *0x10, optext // 26 music, 27 text
, oplength 4, *4, option_num 4, * 8, text1, 00, text2 ... // option
, , addr 4, *4, unknown1 4, unknown2 4 // jmp
, , addr 4, *4, unknown1 4, unknown2 4 // option jmp
00 00 00 00 FF FF FF FF FF FF FF FF 00 00 00 00
00 00 00 00 00 00 00 00 // end with that
```
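Before `oplength` or jump addresses can be fixed up, a tool has to walk the entries. The following is a rough sketch of such a walker, not the author's tool. It rests on assumptions that should be verified against a real `.asb` dump: both fields are little-endian, and whether `oplength` counts the 8-byte `optype`/`oplength` header is left as the parameter `len_includes_header`. The names `walk_ops` and `op_visitor_t` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Called once per entry with its type, file offset and total size. */
typedef void (*op_visitor_t)(uint32_t optype, size_t offset, size_t size,
                             void *user);

/* Walk `optype 4, oplength 4, payload n` entries in buf.
   len_includes_header: nonzero if oplength already counts the 8 header
   bytes (an assumption to check against real data).
   Returns the number of entries visited; stops at the first entry that
   does not fit, such as an end marker with length 0xFFFFFFFF. */
size_t walk_ops(const uint8_t *buf, size_t len, int len_includes_header,
                op_visitor_t visit, void *user)
{
    size_t off = 0, count = 0;
    while (len - off >= 8) {
        uint32_t optype = rd_le32(buf + off);
        uint32_t oplen = rd_le32(buf + off + 4);
        size_t size = len_includes_header ? (size_t)oplen : (size_t)oplen + 8;
        if (size < 8 || size > len - off)
            break; /* malformed entry or end marker */
        if (visit)
            visit(optype, off, size, user);
        off += size;
        count++;
    }
    return count;
}
```

A visitor that records each entry's offset gives the table needed to translate old jump targets into new ones after text entries have grown.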
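After importing the test text, the over-length text test passes~
![](https://p.sda1.dev/9/f5e65b38070226b31129db7e5bd08f8d)
## 0x3 Analyzing image loading and rendering
### (1) Locating the image display buffer
This game draws through a `Windows compatible DC`. We can set a breakpoint on `CreateDIBSection`, trace upward layer by layer to find the function that fills the buffer with pixels, and then follow the `bitblt` to the frame buffer. One annoyance is that the game dispatches many virtual functions through vtables, for example `v3=(*(**v7+12))(*v7, v5, v10,a3)`. These are laborious to follow statically, so try inspecting the vtables dynamically instead. The tracing is too tedious to show in full, so I omit the details; the call stack and the concrete call flow are as follows: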
```c
0019FD80 0040EF75          50 return to lamune.sub_40EE6F+106 from ??? User // CreateDIBSection
0019FDD0 00401E77 0040EE6F 34 return to lamune.sub_401D0F+168 from lamune.sub_40 User
0019FE04 0040955D          24 return to lamune.sub_40951B+42 from ??? User
0019FE28 0040686C 0040951B 24 return to lamune.sub_406813+59 from lamune.sub_409 User
0019FE4C 004383EA 00406813 A4 return to lamune.EntryPoint+184 from lamune.sub_40 User
0019FEF0 0043827E 0043F210 84 return to lamune.EntryPoint+18 from lamune.sub_43F User
DWORD __thiscall sub_42A199(int *this) // neko_logo.cpb
| loadimg_419E03(off_473088, "neko_logo.cpb", this + 0x214);
| readcpb_40C03E((_DWORD **)this + 1, cpb_header, 0x10);
| (*(_DWORD *)*v7+4))(*v7, cpb_header) //check magic cpb\x1a
| (*(v8 + 0x3C))(v10, v10) // 0041D3FB, read full
| v3=(*(**v7+12))(*v7, v5, v10,a3);// 0041D453, check depth
|return (*(*v6 + 0x10))(v6, a2, a4);// 0041E36F, 0041ddb8,decompress
|decompress2_40AA38(char *compressed_buf, size_t compressed_size, char *raw_buf, size_t raw_len) // lzss?
| sub_40C9C1(DWORD *this, int a2, int a3, int *a4, DWORD *a5)
|sub_4101EB(v9 + 2, a2, a3, a4, *a5, a5, a5, a5, 0);// bltalpha
|(*(this + 0x48))(this + 2, a2, a3, a4, *a5, a5, a5, a5, 0xCC0020);// 004123E1, to bitblt
```
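### (2) Loading cpb images
Above we covered how to locate the code and the overall loading flow. This section analyzes how a `cpb` file is read, loaded, and rendered to the screen.
#### .1 cpb structure
In a `cpb` file the pixels are stored per channel. The data structure is as follows: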
```C
00000000 cpb1a_header_t struc ; (sizeof=0x20, mappedto_128)
00000000                      ; XREF: decompresscpb_41E36F/r
00000000 magic        db 4 dup(?) ; string(C)
00000004 unknow1      db ?
00000005 color_depth  db ?
00000006 unknow2      db ?
00000007 version      db ?
00000008 width        dw ?        ; XREF: decompresscpb_41E36F+39/r
0000000A height       dw ?        ; XREF: decompresscpb_41E36F+3E/r
0000000C max_comprlen dd ?        ; XREF: decompresscpb_41E36F+56/r
00000010 comprlen     dd 4 dup(?) ; XREF: decompresscpb_41E36F+93/r
00000010                          ; decompresscpb_41E36F+B7/r ...
00000020 cpb1a_header_t ends
```
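For tools written outside IDA, the same layout can be expressed as a C struct. This is a sketch based only on the listing above: the field names are taken from it, the fields are assumed to be little-endian, and `parse_cpb1a_header` and `cpb_plane_size` are hypothetical helpers. The magic is copied but not validated here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    char     magic[4];
    uint8_t  unknow1;
    uint8_t  color_depth;
    uint8_t  unknow2;
    uint8_t  version;
    uint16_t width;
    uint16_t height;
    uint32_t max_comprlen; /* size of the scratch buffer for compressed data */
    uint32_t comprlen[4];  /* compressed size of each channel */
} cpb1a_header_t;

_Static_assert(sizeof(cpb1a_header_t) == 0x20, "header must be 0x20 bytes");

static uint32_t cpb_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill *h from the first 0x20 bytes of buf; returns 0 if buf is too short. */
int parse_cpb1a_header(const uint8_t *buf, size_t len, cpb1a_header_t *h)
{
    if (len < 0x20)
        return 0;
    memcpy(h->magic, buf, 4);
    h->unknow1 = buf[4];
    h->color_depth = buf[5];
    h->unknow2 = buf[6];
    h->version = buf[7];
    h->width = (uint16_t)(buf[8] | (buf[9] << 8));
    h->height = (uint16_t)(buf[10] | (buf[11] << 8));
    h->max_comprlen = cpb_le32(buf + 0x0C);
    for (int i = 0; i < 4; i++)
        h->comprlen[i] = cpb_le32(buf + 0x10 + 4 * i);
    return 1;
}

/* Bytes in one decompressed channel plane: one byte per pixel. */
size_t cpb_plane_size(const cpb1a_header_t *h)
{
    return (size_t)h->width * h->height;
}
```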
#### .2 prepare DC
Before rendering an image, the game engine first initializes the `DC`.
```c
void *__thiscall sub_40FDC2(void **this, LONG width, int height)
{
void *result; // eax
HBITMAP v5; // eax
HDC dc; // eax
int (__thiscall **v7)(void **, _DWORD); // eax
BITMAPINFO pbmi; // BYREF
if ( (void *)width == this && (void *)height == this )
return (void *)(*((int (__thiscall **)(void **, _DWORD))*this + 26))(this, 0);
(*((void (__thiscall **)(void **))*this + 13))(this);
if ( width > 0 && height > 0 )
{
memset(&pbmi, 0, sizeof(pbmi));
pbmi.bmiHeader.biHeight = -height;
pbmi.bmiHeader.biSize = 0x28;// struct size
pbmi.bmiHeader.biWidth = width;
pbmi.bmiHeader.biPlanes = 1;// must be 1
pbmi.bmiHeader.biBitCount = 32; // rgba
pbmi.bmiHeader.biCompression = 0;
v5 = CreateDIBSection(0, &pbmi, 0, this + 0x28, 0, 0); // this+0x28
this = v5;
if ( v5 )
{
dc = CreateCompatibleDC(0);
this = dc;
if ( dc )
{
this = SelectObject(dc, this);
this = (void *)width;
this = (void *)height;
this = (void *)(height - 1);
v7 = (int (__thiscall **)(void **, _DWORD))*this;
this = 0;
this = 0;
this = (void *)(width - 1);
result = (void *)v7(this, 0); // 0041295A, FillRect
this = result;
return result;
}
}
(*((void (__thiscall **)(void **))*this + 0xD))(this);
}
return 0;
}
```
#### .3 load cpb
This part reads the `cpb` into memory and validates the file header and related information.
```c
int __thiscall loadimg_419E03(_DWORD *this, char *filename, int *a3)
{
int v3; // ebx
_DWORD **v5; // edi
_DWORD *v7; // esi
int v8; // eax
int v9; // eax
int v10; // BYREF
char cpb_header; // BYREF
int i; //
v3 = 0;
if ( !filename || !a3 )
return 0;
v5 = (this + 1);
if ( fopen_40C102(this + 1, filename, 0x80000000) != 1 )
{
logprintf_407C41("CGraphicLoader::GDILoad", "指定されたファイルが見つかりません [%s]", filename);
return 0;
}
readcpb_40C03E(this + 1, cpb_header, 0x10); // this+1 fp
i = 0;
v7 = this + 5;// for test magic?
do
{
if ( *v7 )
{
if ( (*(**v7 + 4))(*v7, cpb_header) == 1 )// 0041D0E8, 0041D3E9
// check magic cpb\x1a,
{
sub_40C0A0(v5, 0, 0);
memset(v10, 0, sizeof(v10));
v3 = (*(**v7 + 8))(*v7, v5, v10); // 0041D3FB, read full header
if ( v3 == 1 )
{
v8 = *a3; // 0041D3FB, read full header
v9 = v10 == 1 ? (*(v8 + 0x3C))(v10, v10) : (*(v8 + 0x38))(v10, v10);
v3 = v9;
if ( v9 == 1 )
{
sub_40C0A0(v5, 0, 0);
v3 = (*(**v7 + 12))(*v7, v5, v10, a3);// 0041D453, check depth and decompress
if ( v3 == 1 )
break;
}
}
}
}
++i;
++v7;
}
while ( i < 4 );
sub_40BFDD(v5);
return v3;
}
```
After loading, a different decompression function is called depending on the number of channels.
```C
int __thiscall sub_41D453(_DWORD *this, int a2, int a3, int a4)
{
int v6; // esi
int v8; // ebx
int v9; //
v9 = 0;
if ( (*(*a4 + 0x2C))(a4) != 1 ) // 00401291, mov
return 0;
v6 = this[*(a3 + 0x18) + 1];
if ( !v6 )
return 0;
if ( (*(*a4 + 0x1C))(a4) == 8 )// 00401278, mov
{
if ( *(a3 + 4) == 8 ) // 8bit with color panel
return (*(*v6 + 4))(v6, a2, a4);
}
else if ( (*(*a4 + 0x1C))(a4) == 32 ) // 32bit rgba
{
v8 = *(a3 + 4);
if ( v8 == 8 )
return (*(*v6 + 8))(v6, a2, a4);
if ( v8 == 24 )
return (*(*v6 + 0xC))(v6, a2, a4);// 0041e1c8 decompresscpb24
if ( v8 != 32 || (*(*a4 + 0x30))(a4) != 1 ) // 00401298, mov
return v9;
return (*(*v6 + 0x10))(v6, a2, a4); // 0041E36F,decompresscpb32
}
return v9;
}
```
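#### .4 decompress cpb
This game has several `cpb` decompression functions, one per channel count. Here we analyze the 32-bit case.
Note that `vv1` in `vv1 = (*(*obja + 0xC))(obja)` is the DIB buffer created by the statement `v5 = CreateDIBSection(0, &pbmi, 0, this + 0x28, 0, 0)` in `prepare DC`.
We can replace the buffer produced by `decompress_channel_40AA38` with the localized image and let the game engine copy it into the `DIB` buffer for us.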
```c
int __thiscall decompresscpb32_41E36F(void *this, int *obj)
{
int v3; // eax
size_t pixels; // esi
char *raw_buf; // eax MAPDST
char *pchanel1; // ebx
int vv2; // eax
char *pchannel0; // edi
_BYTE *v11; // esi
_BYTE *v12; // eax
int v13; // edx
cpb1a_header_t cpb_header; // BYREF
int v16; //
int v17; //
char *v18; //
int width; // MAPDST
char *compressed_buf; // MAPDST
int pcurvv2; //
int i; //
char *pchanel3; //
_BYTE *vv1; //
int j; //
int v27; //
char *obja; // MAPDST
char *pchanel2; //
v27 = 0;
j = 0;
if ( readcpb_40C03E(obj, cpb_header.magic, 0x20) )
{
v3 = *obja;
width = cpb_header.width;
i = cpb_header.height;
pixels = cpb_header.width * cpb_header.height;
v17 = (*(v3 + 0x24))(obja); // 00401283, mov
compressed_buf = operator new(cpb_header.max_comprlen);
raw_buf = operator new(4 * pixels);
pchanel1 = &raw_buf;
pchanel2 = &raw_buf;
pchanel3 = &pchanel2;
vv1 = (*(*obja + 0xC))(obja); // 0040125C, mov this+0x28, get hdc buffer
//CreateDIBSection(0, &pbmi, 0, this + 0x28, 0, 0);
vv2 = (*(*obja + 0x20))(obja);
pcurvv2 = vv2;
if ( readcpb_40C03E(obj, compressed_buf, cpb_header.comprlen)
&& decompress_channel_40AA38(compressed_buf, cpb_header.comprlen, raw_buf, pixels) != -1
&& readcpb_40C03E(obj, compressed_buf, cpb_header.comprlen)
&& decompress_channel_40AA38(compressed_buf, cpb_header.comprlen, pchanel1, pixels) != -1
&& readcpb_40C03E(obj, compressed_buf, cpb_header.comprlen)
&& decompress_channel_40AA38(compressed_buf, cpb_header.comprlen, pchanel2, pixels) != -1
&& readcpb_40C03E(obj, compressed_buf, cpb_header.comprlen)
&& decompress_channel_40AA38(compressed_buf, cpb_header.comprlen, pchanel3, pixels) != -1 )
{
if ( i > 0 )
{
pchannel0 = &pchanel1[-pixels];
++vv1;
j = i;
do // copy data to dc buf
{
if ( width > 0 )
{
v11 = vv1;
v16 = pchanel2 - pchanel1;
v18 = (pchanel3 - pchanel1);
v12 = pchanel1;
v13 = pcurvv2 - pchanel1;
i = width;
do
{
v11 = v12;
*v11 = *v12;
*(v11 - 1) = v12;
v12 = v12;
++v12;
v11 += 4;
--i;
}
while ( i );
}
pchanel2 += width;
pchanel3 += width;
vv1 += 4 * width;
pcurvv2 += v17;
pchanel1 += width;
pchannel0 += width;
--j;
}
while ( j );
}
j = 1;
}
if ( raw_buf )
j__free(raw_buf);
if ( compressed_buf )
j__free(compressed_buf);
}
return j;
}
```
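The copy loop at the end of the function turns four separate channel planes into 4-byte pixels. Since the decompiled indices are hard to read, here is a minimal sketch of that planar-to-interleaved step. Which plane lands in which byte of the pixel is passed in as `order` rather than hard-coded, because the listing does not make it certain; row stride handling is left out, and `planes_to_pixels32` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* planes: 4 consecutive planes of `pixels` bytes each (plane k starts at
           planes + k * pixels), as produced by the 4 decompress calls.
   order:  order[k] is the byte index (0..3) inside each 4-byte pixel
           that receives plane k.
   dst:    output buffer of 4 * pixels bytes. */
void planes_to_pixels32(const uint8_t *planes, size_t pixels,
                        const int order[4], uint8_t *dst)
{
    for (size_t i = 0; i < pixels; i++)
        for (int k = 0; k < 4; k++)
            dst[4 * i + order[k]] = planes[(size_t)k * pixels + i];
}
```

The replacement image we inject has to be laid out as planes in the same order the engine expects, because the engine still runs this interleaving step on our buffer.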
The per-channel decompression algorithm looks somewhat like a modified `lzss`?
```c
int __stdcall decompress_channel_40AA38(char *compressed_buf, size_t compressed_size, char *raw_buf, size_t raw_len)
{
char *v5; // ebx
char *v6; // edx
char *v7; // esi
char *v8; // edi
unsigned int v9; // ecx
signed int v10; // eax
unsigned int v11; // ecx
char *v12; // esi
char v13; // cf
bool v14; // cc
unsigned int v15; //
signed int dstsizea; //
if ( *(compressed_buf + 4) > raw_len )
return -1;
v5 = compressed_buf + 20;
v6 = &compressed_buf[*(compressed_buf + 1) + 20];
v7 = &v6[*(compressed_buf + 2)];
dstsizea = *(compressed_buf + 4);
v8 = raw_buf;
v15 = 0x80808080;
do
{
if ( (v15 & *v5) != 0 )
{
v9 = *v6;
v6 += 2;
v10 = (v9 >> 13) + 3;
qmemcpy(v8, &v8[-(v9 & 0x1FFF) - 1], v10);
v8 += v10;
}
else
{
v11 = *v7 + 1;
v12 = v7 + 1;
v10 = v11;
qmemcpy(v8, v12, v11);
v7 = &v12;
v8 += v11;
}
v13 = v15 & 1;
v15 = __ROR4__(v15, 1);
if ( v13 )
++v5;
v14 = dstsizea <= v10;
dstsizea -= v10;
}
while ( !v14 );
return v8 - raw_buf;
}
```
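For readers who want to experiment outside the game, the decompiled routine can be restated as a bounds-checked sketch. It is an interpretation, not a verified reimplementation: it assumes a 20-byte header of little-endian u32 fields in which field 1 is the size of the flag stream, field 2 the size of the back-reference stream, and field 4 the decompressed size, and it assumes flag bits are consumed most significant bit first. Unlike the original, it clamps a back-reference that would run past the declared size. `channel_decompress` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the number of bytes written to dst, or -1 on malformed input. */
long channel_decompress(const uint8_t *src, size_t src_len,
                        uint8_t *dst, size_t dst_cap)
{
    if (src_len < 20)
        return -1;
    uint32_t flag_len = le32(src + 4);  /* assumed: size of the flag stream */
    uint32_t ref_len = le32(src + 8);   /* assumed: size of the back-reference stream */
    uint32_t raw_len = le32(src + 16);  /* assumed: decompressed size */
    size_t body = src_len - 20;
    if (raw_len > dst_cap || flag_len > body || ref_len > body - flag_len)
        return -1;

    const uint8_t *flag = src + 20;
    const uint8_t *ref = flag + flag_len;
    const uint8_t *lit = ref + ref_len;
    const uint8_t *flag_end = ref, *ref_end = lit, *lit_end = src + src_len;
    uint8_t mask = 0x80;                /* flags are consumed MSB first */
    size_t out = 0;

    while (out < raw_len) {
        if (flag >= flag_end)
            return -1;
        if (*flag & mask) {             /* back-reference: 3-bit length, 13-bit distance */
            if (ref_end - ref < 2)
                return -1;
            unsigned v = (unsigned)ref[0] | ((unsigned)ref[1] << 8);
            ref += 2;
            size_t n = (v >> 13) + 3;
            size_t dist = (v & 0x1FFF) + 1;
            if (dist > out)
                return -1;
            if (n > raw_len - out)
                n = raw_len - out;      /* clamp; the original does not */
            for (size_t i = 0; i < n; i++, out++)
                dst[out] = dst[out - dist]; /* byte-wise, so overlaps repeat */
        } else {                        /* literal run: count byte, then count + 1 bytes */
            if (lit >= lit_end)
                return -1;
            size_t n = (size_t)*lit++ + 1;
            if (n > (size_t)(lit_end - lit) || n > raw_len - out)
                return -1;
            for (size_t i = 0; i < n; i++)
                dst[out++] = *lit++;
        }
        mask >>= 1;
        if (mask == 0) {                /* 8 flags used, move to the next flag byte */
            mask = 0x80;
            flag++;
        }
    }
    return (long)out;
}
```

Keeping the flags, back-references, and literals in three separate streams is the main difference from textbook `lzss`, where they are mixed in a single stream.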
#### .5 bitblt screen DC
Finally the image is `bitblt`ed into the screen frame buffer. This completes the analysis of the game's image rendering.
```c
// for bitblt
BOOL __thiscall sub_4123E1(void *this, int x, int y, int a4, int x1, int a6, int a7, int a8, DWORD rop)
{
int v10; // edi
int v11; // ebx
int v12; // eax
int y1; // edi
int v14; // eax
int x_c; // ebx
HDC hdc; // eax
HDC srchdc; //
v10 = a7 - x1 + 1;
v11 = a8 - a6 + 1;
if ( x >= 0 )
{
if ( x + v10 > (*(*this + 16))(this) )// 00401263, mov
{
if ( (*(*this + 16))(this) - x <= 0 )
return 0;
a7 = (*(*this + 16))(this) + x1 - x - 1;
}
}
else
{
if ( v10 + x <= 0 )
return 0;
x1 -= x;
v12 = (*(*this + 16))(this) + x1 - 1;
if ( v12 < a7 )
a7 = v12;
x = 0;
}
if ( y >= 0 )
{
if ( y + v11 > (*(*this + 20))(this) )// 0040126A, mov
{
if ( (*(*this + 20))(this) - y <= 0 )
return 0;
a8 = (*(*this + 20))(this) + a6 - y - 1;
}
y1 = a6;
}
else
{
if ( v11 + y <= 0 )
return 0;
y1 = a6 - y;
v14 = (*(*this + 20))(this) + a6 - 1 - y;
if ( v14 < a8 )
a8 = v14;
y = 0;
}
x_c = a7 - x1 + 1;
if ( x_c > 0 && a8 - y1 + 1 > 0 )
{
srchdc = (*(*a4 + 4))(a4);// 0040124E, mov
hdc = (*(*this + 4))(this);// 0040124E, mov
return BitBlt(hdc, x, y, x_c, a8 - y1 + 1, srchdc, x1, y1, rop);
}
return 0;
}
```
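## 0x4 Dynamically replacing the image files
To understand how the engine renders images, I traced a lot of the virtual functions.
In fact, localizing images only requires reversing as far as the `cpb` decompression. The troublesome part of this game is that each channel count has its own handler, so each one has to be `hook`ed in turn to replace its buffer. In addition, the file name has to be recorded when the file is read, so that the buffer can be replaced with the matching localized image.
Taking the 24-bit image replacement as an example, the code is as follows: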
```C
/* for hook decompressed cpb24 buffer
0041E2DB | 8B55 0C | mov edx,dword ptr ss:[ebp+C]
0041E2DE | 8BC7    | mov eax,edi
0041E2E0 | 2BC6    | sub eax,esi
0041E2E2 | 42      | inc edx
0041E2E3 | 8955 0C | mov dword ptr ss:[ebp+C],edx
0041E2E6 | 894D EC | mov dword ptr ss:[ebp-14],ecx
0041E2E9 | 85DB | test ebx,ebx
0041E2EB | 7E 35 | jle lamune_chs.41E322
*/
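const char* g_curcpbname = NULL;
// entry of loadimg_419E03; the relocated prologue below is 6 bytes long,
// so the resume address is assumed to be 0x419E03 + 6 = 0x419E09
const DWORD g_loadcpbi_419E03 = 0x419E03;
const DWORD g_loadcpbo_419E09 = 0x419E09;
const DWORD g_copycpb24i_41E2DB = 0x41E2DB;
const DWORD g_copycpb24o_41E2E0 = 0x41E2E0;
void __declspec(naked) loadcpb_hook_419E03()
{
__asm {
push eax;
mov eax, dword ptr [esp+8]; // filename: +4 for the return address, +4 after push eax
mov g_curcpbname, eax;
pop eax;
// fix origin code
push ebp;
mov ebp, esp;
sub esp, 0x2c;
jmp dword ptr ds:[g_loadcpbo_419E09];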
}
}
void __declspec(naked) copycpb24_hook_41E2DB()
{
__asm {
pushad;
push ;
push g_curcpbname;
// size_t __stdcall load_rawcpb(char *name, PBYTE buf)
call load_rawcpb;
popad;
// fix origin code
mov edx,dword ptr [ebp+0xC];
mov eax,edi;
jmp dword ptr ds:[g_copycpb24o_41E2E0];
}
}
void install_cpbhook()
{
// inlinehook loadcpb
BYTE jmpE9buf[5] = {0xE9}; // jmp relative
*(DWORD*)(jmpE9buf+1) = (DWORD)loadcpb_hook_419E03-
((DWORD)g_loadcpbi_419E03 + sizeof(jmpE9buf));
winhook_patchmemory((LPVOID)g_loadcpbi_419E03,
jmpE9buf, sizeof(jmpE9buf));
// inlinehook copycpb24
*(DWORD*)(jmpE9buf+1) = (DWORD)copycpb24_hook_41E2DB-
((DWORD)g_copycpb24i_41E2DB + sizeof(jmpE9buf));
winhook_patchmemory((LPVOID)g_copycpb24i_41E2DB,
jmpE9buf, sizeof(jmpE9buf));
}
```
The localized images are stored in `png` format and, for convenience, loaded with stb (https://github.com/nothings/stb).
```c
size_t __stdcall load_rawcpb(char *name, PBYTE buf)
{
char path[MAX_PATH] = {SYSGRAPH_DIR "/"};
strcat(path, name);
strcpy(path + strlen(path)-
strlen(SYSGRAPH_EXT),SYSGRAPH_EXT);
int width, height, channel;
printf("load_rawcpb(%s, %p)", path, buf);
size_t entry_size = load_arc_entry(path, NULL);
const BYTE *tmpbuf = (BYTE*)malloc(entry_size);
load_arc_entry(path, (PBYTE)tmpbuf);
char* img = (char*)stbi_load_from_memory(tmpbuf,
entry_size, &width, &height, &channel, 0);
free((void*)tmpbuf);
if(!img)
{
printf(" not found!\n");
return 0;
}
printf(" width=%d, heigth=%d, channel=%d\n",
width, height, channel);
for(int y=0;y<height;y++)
{
for(int x=0;x<width;x++)
{
char r = *(img + channel * (width*y + x) + 0);
char g = *(img + channel * (width*y + x) + 1);
char b = *(img + channel * (width*y + x) + 2);
*(buf + 0*height*width + width*y+x) = r;
*(buf + 1*height*width + width*y+x) = g;
*(buf + 2*height*width + width*y+x) = b;
if(channel==4)
{
char a = *(img + channel * (width*y + x) + 3);
*(buf + 3*height*width + width*y+x) = a;
}
}
}
stbi_image_free(img);
return width*height*channel;
}
```
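After loading we hit a rendering bug, so we dumped the corresponding buffer and viewed it in ct2 to find the cause.
![](https://p.sda1.dev/9/3a7959612007d1b3ee9259797459978d)
![](https://p.sda1.dev/9/c4a730e8bd299446095bef9ae94fa7cd)
It turned out that `stbi_load_from_memory` has some problems with the `tga` format. Switching to `png` and passing 0 as the last argument solved it.
![](https://p.sda1.dev/9/30391bdfc277149ff891a9e31d68eaa5)
With that, all the image localization problems are solved.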
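## 0x5 Afterword
It took me a little over a week of reversing to understand how the engine loads its resources, followed by about a month of on-and-off testing, importing the translation, and fixing bugs, after which the localization was essentially complete. There was one pitfall: after clearing the game, the `gallery` could not be opened. This is an official bug that the update patch fixes. However, the files I had been given were the first release, and my analysis was based on that version, so the work had to be ported from the old version to the new one, which was very tedious. The lesson is to check for update patches first thing when starting a localization project.
Overall, this game has three main difficulties. First, the archives: the encryption and checksums are very troublesome, so we replaced the decrypted buffer dynamically. Second, the image buffer is hard to find: there are many virtual functions that have to be traced one by one. Third, the `sjis` character checks are too scattered and have to be patched by hand one at a time, and they use an unconventional test. For these reasons I think this game is well suited to `semi-dynamic localization`. This file-based replacement avoids dealing with the complex archive format and, compared with fully dynamic text-level localization, is easier to debug and causes fewer problems such as text synchronization.
I also ran into a problem loading images with stb (https://github.com/nothings/stb): it crashed on XP.
![](https://p.sda1.dev/9/a563233a6f8a90b549654931e787f437)
Debugging located the crash at `mov eax, large fs:2Ch`. The library uses `__declspec(thread)`, and on `win xp` a module loaded through `LoadLibrary` crashes when it touches `tls`. Defining the macro `#define STBI_NO_THREAD_LOCALS` solves it.
After a number of tests, this localization patch turned out to be quite compatible~ I tested `win xp`, `win7`, `win8`, `win10`, and even `linux wine` and `exagear`, so it can be called compatible with every platform~ That's a wrap~
![](https://p.sda1.dev/9/91a7f31e3126b904ba472392fdfd0266)
![](https://p.sda1.dev/9/be2988b90e65b2222afa66f20e20b776)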
![](https://p.sda1.dev/9/b519c36a720352c2e347c37dcf0f0efb)