Extracting the story text of the Unity mobile game Blue Archive (蔚蓝档案)

k96e posted on 2024-11-20 22:07

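I had never seen anyone do this, so I assumed it was a hard job. It turns out not to be.
I went for the global server build (because it yields the scripts in five languages in one go). The story can be updated dynamically without shipping a new package, so the text has to live somewhere under the data directory.
Browsing that directory turned up two suspicious files, ExcelDB.db and Excel.zip, under Android/data/com.nexon.bluearchive/files/PUB/Resource/Preload/TableBundles.
I copied both out first. A quick look at Excel.zip suggests it holds character data, stage data and the like, nothing to do with the story text we are after.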
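For the "quick look" at the archive, a minimal sketch like the following lists what a zip contains without extracting anything. The helper name list_zip_entries and the demo entry names are made up for illustration; they are not the real contents of Excel.zip.

```python
import io
import zipfile


def list_zip_entries(source):
    """Return (name, uncompressed size, is_encrypted) for every entry in a zip."""
    with zipfile.ZipFile(source) as zf:
        # Bit 0 of the general purpose flags marks an encrypted entry.
        return [(info.filename, info.file_size, bool(info.flag_bits & 0x1))
                for info in zf.infolist()]


# Stand-in archive built in memory so the sketch runs anywhere.
# The entry names below are hypothetical.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("characterexcel.bytes", b"\x00" * 16)
    zf.writestr("stageexcel.bytes", b"\x00" * 32)

for name, size, encrypted in list_zip_entries(demo):
    print(f"{name}\t{size} bytes\tencrypted={encrypted}")
```

To inspect the real file, pass its path instead of the in-memory buffer, for example list_zip_entries("Excel.zip").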
On to ExcelDB, then: open it in SQLiteStudio.

Hey, ScenarioScriptDBSchema. That table name looks right at first sight. Now a look at the data:

Even more promising.
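The same inspection can be done without a GUI. The sketch below lists the tables of a SQLite database and the columns of one table. It runs against an in-memory stand-in; the two-column layout (GroupId, Bytes) and the second table name are assumptions made for the demo, not a confirmed description of ExcelDB.db.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [name for (name,) in rows]


def describe_columns(conn, table):
    """Return (column name, declared type) pairs for one table."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2])
            for row in conn.execute(f'PRAGMA table_info("{table}")')]


# Stand-in database; the schema here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ScenarioScriptDBSchema (GroupId INTEGER, Bytes BLOB)")
conn.execute("CREATE TABLE CharacterDBSchema (Id INTEGER, Bytes BLOB)")

for table in list_tables(conn):
    if "Scenario" in table:
        print(table, describe_columns(conn, table))
```

Against the real file, replace ":memory:" with the path to ExcelDB.db and drop the two CREATE TABLE lines.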
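Next comes the question of how to parse this Bytes field.
If you can already tell that this is flatbuffers, the thread ends here for you. Your dimwitted OP, however, completely failed to see it at the time and assumed it was some home-grown serialization scheme.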
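For anyone who, like the OP, cannot spot it by eye: a FlatBuffer starts with a little-endian 32-bit offset to the root table, the table starts with a signed 32-bit offset back to its vtable, and the vtable starts with two 16-bit sizes. The sketch below turns that layout into a rough plausibility check. The function name looks_like_flatbuffer is made up, and this is only a heuristic: it can give false positives and says nothing about which schema the data uses.

```python
import struct


def looks_like_flatbuffer(buf):
    """Heuristic: does buf start like a FlatBuffer with a root table?"""
    if len(buf) < 8:
        return False
    (root,) = struct.unpack_from("<I", buf, 0)       # offset of the root table
    if root < 4 or root + 4 > len(buf):
        return False
    (soffset,) = struct.unpack_from("<i", buf, root)  # table start minus vtable start
    vtable = root - soffset
    if vtable < 4 or vtable + 4 > len(buf):
        return False
    vsize, tsize = struct.unpack_from("<HH", buf, vtable)
    return (vsize >= 4 and vsize % 2 == 0 and vtable + vsize <= len(buf)
            and tsize >= 4 and root + tsize <= len(buf))


# Hand-built 20-byte buffer: root offset, one-field vtable, padding, table.
minimal = (struct.pack("<I", 12)          # root table lives at byte 12
           + struct.pack("<HHH", 6, 8, 4)  # vtable: size 6, table size 8, field 0 at +4
           + b"\x00\x00"                   # padding so the table is 4-byte aligned
           + struct.pack("<i", 8)          # table: vtable is 8 bytes before it
           + struct.pack("<i", 42))        # field 0 value

print(looks_like_flatbuffer(minimal))
print(looks_like_flatbuffer(b"plain text, not a flatbuffer"))
```

Running it over a few blobs from the Bytes field is a cheap way to confirm the guess before reaching for a disassembler.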
So, as with any U3D il2cpp game, I followed the standard recipe: pull out libil2cpp.so and global-metadata.dat, then dump the type definitions with Il2CppDumper.
..\Il2CppDumper.exe libil2cpp.so global-metadata.dat ..\output
Once it finishes, don't rush to open IDA. Open dump.cs and search for ScenarioScriptExcel; the struct definition turns up right away.

(IFlatbufferObject is kindly reminding everyone that this is flatbuffers.)
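Searching dump.cs by hand works fine, but the lookup can also be scripted. The sketch below pulls one type's block out of the dump text and lists its read-only properties. SAMPLE_DUMP is a hand-written imitation of Il2CppDumper output, trimmed to a few members; the real dump.cs may be formatted differently, and the helper names are made up.

```python
import re

# Hand-written imitation of a dump.cs fragment, for illustration only.
SAMPLE_DUMP = """\
// Namespace: MX.Data.Excel
public struct ScenarioScriptExcel : IFlatbufferObject // TypeDefIndex: 0
{
	// Fields
	private Table __p; // 0x0

	// Properties
	public ByteBuffer ByteBuffer { get; }
	public long GroupId { get; }
	public string TextEn { get; }
	public bool TeenMode { get; }

	// Methods
	public static ScenarioScriptExcel GetRootAsScenarioScriptExcel(ByteBuffer _bb) { }
}
"""

PROPERTY = re.compile(r"^\s*public\s+([\w.<>\[\]]+)\s+(\w+)\s*\{\s*get;", re.MULTILINE)


def extract_type_block(dump_text, type_name):
    """Return the text of `struct/class type_name { ... }`, or None if absent."""
    header = re.search(
        r"^.*\b(?:struct|class)\s+" + re.escape(type_name) + r"\b.*$",
        dump_text, re.MULTILINE)
    if header is None:
        return None
    start = dump_text.find("{", header.end())
    if start == -1:
        return None
    depth = 0
    for pos in range(start, len(dump_text)):  # walk to the matching brace
        if dump_text[pos] == "{":
            depth += 1
        elif dump_text[pos] == "}":
            depth -= 1
            if depth == 0:
                return dump_text[header.start():pos + 1]
    return None


def list_properties(block):
    """Return (property name, C# type) pairs in declaration order."""
    return [(name, cs_type) for cs_type, name in PROPERTY.findall(block)]


block = extract_type_block(SAMPLE_DUMP, "ScenarioScriptExcel")
for name, cs_type in list_properties(block):
    print(f"{cs_type:12} {name}")
```

The ByteBuffer property belongs to the flatbuffers runtime rather than to the data, so it would be skipped when writing the schema.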
Based on the definition, write the schema file (an fbs file):
namespace MX.Data.Excel;

table ScenarioScriptExcel {
  GroupId:long;
  SelectionGroup:long;
  BGMId:long;
  Sound:string;
  Transition:uint;
  BGName:uint;
  BGEffect:uint;
  PopupFileName:string;
  ScriptKr:string;
  TextJp:string;
  TextTh:string;
  TextTw:string;
  TextEn:string;
  VoiceId:uint;
  TeenMode:bool;
}

root_type ScenarioScriptExcel;
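Writing the schema by hand is mostly a matter of mapping C# types to fbs types while keeping the fields in their declared order, since without explicit id attributes flatbuffers assigns field slots by position in the table. A minimal sketch of that mapping follows. It assumes you already have (name, C# type) pairs read off the struct in dump.cs, handles only scalars and strings, and render_table is a made-up helper name.

```python
# C# primitive type -> fbs scalar type. Note the byte naming swap:
# C# sbyte is fbs byte, C# byte is fbs ubyte.
CS_TO_FBS = {
    "bool": "bool",
    "sbyte": "byte", "byte": "ubyte",
    "short": "short", "ushort": "ushort",
    "int": "int", "uint": "uint",
    "long": "long", "ulong": "ulong",
    "float": "float", "double": "double",
    "string": "string",
}


def render_table(namespace, table_name, properties):
    """Render a single-table fbs schema from (name, C# type) pairs."""
    lines = [f"namespace {namespace};", "", f"table {table_name} {{"]
    for name, cs_type in properties:
        if cs_type not in CS_TO_FBS:
            # Enums, vectors and nested tables need manual handling.
            raise ValueError(f"no fbs mapping for {cs_type!r} ({name})")
        lines.append(f"  {name}:{CS_TO_FBS[cs_type]};")
    lines += ["}", "", f"root_type {table_name};"]
    return "\n".join(lines) + "\n"


# A three-field excerpt of the table above, in declaration order.
excerpt = [("GroupId", "long"), ("TextEn", "string"), ("TeenMode", "bool")]
print(render_table("MX.Data.Excel", "ScenarioScriptExcel", excerpt))
```

Getting a type or the order wrong does not raise an error at parse time; the fields simply decode to garbage, so it is worth checking a few rows against the game.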
Install flatc, the flatbuffers schema compiler, and run
flatc --python ScenarioScriptExcel.fbs
This produces the Python code used for deserialization. Then write a script:
import json
import sqlite3

# Module generated by `flatc --python ScenarioScriptExcel.fbs`
import MX.Data.Excel.ScenarioScriptExcel as ScenarioScriptExcel


def parse_scenario_script(buffer):
    # Read the blob as the root table; no flatbuffers.Builder is needed for reading.
    buf = bytearray(buffer)
    return ScenarioScriptExcel.ScenarioScriptExcel.GetRootAsScenarioScriptExcel(buf, 0)


def text(value):
    # String accessors return bytes, or None when the field is absent.
    return value.decode('utf-8') if value is not None else None


def to_dict(scenario):
    # Accessor casing (Bgmid, Bgname, ...) is chosen by flatc; check the
    # generated module if your flatc version names them differently.
    return {
        "GroupId": scenario.GroupId(),
        "SelectionGroup": scenario.SelectionGroup(),
        "BGMId": scenario.Bgmid(),
        "Transition": scenario.Transition(),
        "BGName": scenario.Bgname(),
        "BGEffect": scenario.Bgeffect(),
        "ScriptKr": text(scenario.ScriptKr()),
        "TextJp": text(scenario.TextJp()),
        "TextTh": text(scenario.TextTh()),
        "TextTw": text(scenario.TextTw()),
        "TextEn": text(scenario.TextEn()),
        "VoiceId": scenario.VoiceId(),
        "TeenMode": scenario.TeenMode(),
    }


conn = sqlite3.connect('../ExcelDB.db')
# Each row is a tuple, so select only the serialized Bytes field and unpack it.
rows = conn.execute("select Bytes from ScenarioScriptDBSchema")
datas = [to_dict(parse_scenario_script(blob)) for (blob,) in rows]
conn.close()

with open("scenario_script.json", "w", encoding="utf-8") as f:
    json.dump(datas, f, ensure_ascii=False, indent=4)
Done.
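The exported JSON is one flat list of rows. A possible next step, sketched below, is to regroup the lines by GroupId for one language so each scene reads as a block. The sample rows are invented for the demo, and treating GroupId as the scene key is an assumption based on the field name, not something verified here.

```python
from collections import defaultdict


def group_lines(rows, text_field="TextEn"):
    """Collect the non-empty text of one language, keyed by GroupId."""
    grouped = defaultdict(list)
    for row in rows:
        line = row.get(text_field) or ""
        if line.strip():
            grouped[row["GroupId"]].append(line)
    return dict(grouped)


# Invented rows in the same shape as the exported JSON entries.
sample_rows = [
    {"GroupId": 1001, "TextEn": "Good morning, Sensei.", "TextJp": "x"},
    {"GroupId": 1001, "TextEn": "", "TextJp": "y"},
    {"GroupId": 1001, "TextEn": "Shall we get started?", "TextJp": "z"},
    {"GroupId": 1002, "TextEn": None, "TextJp": "w"},
]

for group_id, lines in group_lines(sample_rows).items():
    print(group_id)
    for line in lines:
        print("   ", line)
```

With the real export, load the rows with json.load on scenario_script.json and pass them in place of sample_rows.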

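hhao7780 posted on 2024-11-21 10:32

I don't understand it, but I came to have a look anyway.

rnaapp posted on 2024-11-21 10:48

Feels very impressive. Wonderful.

Jiu9Hao posted on 2024-11-21 10:48

Learned from your experience, thanks for sharing.

hayatecn posted on 2024-11-21 11:05

Is there one for Genshin Impact?

xieyinghao posted on 2024-11-21 11:13

Very impressive work, 666. Thank you, my benefactor.

无奈的地刺王 posted on 2024-11-21 11:53

Thanks for sharing, learned something.

zyh5028 posted on 2024-11-21 12:10

Learned something, thanks.

xy1687 posted on 2024-11-21 13:21

Thanks for sharing, learned something.

LHHK posted on 2024-11-21 15:41

Thanks for sharing, learned something.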


吾爱破解 - 52pojie.cn's Archiver
