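While scraping data from 某信宝, I found that the values of several fields did not match the real values. On inspection, every tag in the page source styled with the class qxb-num contained scrambled data.
Looking at the qxb-num style showed that it uses a special font.
From this I concluded that the developers had tampered with the font file.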
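Font rendering and TTF
After consulting the relevant documentation, I summarized the process of rendering a character as follows:
- Find the glyph name from the character's Unicode code point
- Find the glyph from the glyph name
- Draw using the glyph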
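The three steps above can be modelled with two plain dictionaries. This is only a toy sketch: `CMAP`, `GLYF`, and `render` are hypothetical names, and the placeholder strings stand in for real glyph outline data.

```python
# Toy model of the lookup chain: code point -> glyph name -> glyph.
# CMAP, GLYF and render() are illustrative names, not fontTools APIs.

# cmap: Unicode code point -> glyph name
CMAP = {0x30: "zero", 0x31: "one", 0x41: "A"}

# glyf: glyph name -> glyph data (placeholder strings instead of contours)
GLYF = {"zero": "<outline of 0>", "one": "<outline of 1>", "A": "<outline of A>"}


def render(text):
    """Return the glyph data that would be drawn for each character."""
    drawn = []
    for ch in text:
        name = CMAP[ord(ch)]   # step 1: code point -> glyph name
        glyph = GLYF[name]     # step 2: glyph name -> glyph
        drawn.append(glyph)    # step 3: draw (here: just collect)
    return drawn


print(render("10A"))
```

Because the renderer only follows these two lookups, changing either dictionary changes what the reader sees without changing the characters in the page source.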
A TrueType font file consists of a sequence of concatenated tables. A table is a sequence of words. Each table must be long aligned and padded with zeroes if necessary.
A TrueType font file contains several tables. The two tables needed here are listed below (tag is the name of the table):
tag | table
cmap | character to glyph mapping
glyf | glyph data
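To make the idea of tagged tables concrete, here is a minimal sketch that reads the table directory of a raw sfnt (TTF) file with `struct`. It assumes the standard layout of a 12-byte header followed by 16-byte table records; `list_tables` is a hypothetical helper, and the input is a hand-built directory without table bodies. It does not apply directly to WOFF2 files, which wrap the tables in a different, compressed container.

```python
import struct


def list_tables(sfnt_bytes):
    """Return the table tags listed in an sfnt (TTF) table directory."""
    # Header: sfnt version (uint32), numTables (uint16), then three uint16
    # search hints that are not needed here.
    _version, num_tables = struct.unpack(">IH", sfnt_bytes[:6])
    tags = []
    for i in range(num_tables):
        # Each 16-byte record holds: tag, checksum, offset, length
        record = sfnt_bytes[12 + 16 * i: 12 + 16 * (i + 1)]
        tag, _checksum, _offset, _length = struct.unpack(">4sIII", record)
        tags.append(tag.decode("ascii"))
    return tags


# Hand-built directory with two records (no table bodies), for illustration.
header = struct.pack(">IHHHH", 0x00010000, 2, 32, 1, 0)
records = (struct.pack(">4sIII", b"cmap", 0, 44, 0)
           + struct.pack(">4sIII", b"glyf", 0, 44, 0))
print(list_tables(header + records))
```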
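Cracking process
Based on the rendering process, we can guess that there are two ways to implement font encryption:
- Shuffle the mapping between character codes and glyph names (the cmap table)
- Shuffle the mapping between glyph names and glyph data (the glyf table)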
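The two strategies can be compared on a toy font made of dictionaries. Everything here is hypothetical (`shown`, the glyph names, the `shape-*` placeholders); the point is only that both kinds of shuffling produce the same visible result.

```python
# Toy illustration of the two strategies; all names are hypothetical.
CMAP = {ord("0"): "zero", ord("1"): "one"}
GLYF = {"zero": "shape-0", "one": "shape-1"}


def shown(ch, cmap, glyf):
    """Shape a reader sees when the page source contains character ch."""
    return glyf[cmap[ord(ch)]]


# Strategy 1: shuffle cmap, so code points point at other glyph names.
cmap_shuffled = {ord("0"): "one", ord("1"): "zero"}

# Strategy 2: keep cmap, but swap the data stored under each glyph name.
glyf_shuffled = {"zero": "shape-1", "one": "shape-0"}

# Either way, a "0" in the source is displayed with the shape of "1".
print(shown("0", cmap_shuffled, GLYF))
print(shown("0", CMAP, glyf_shuffled))
```

In the first case the glyph names stay truthful, so reading cmap is enough to undo the shuffle. In the second case the names are misleading, and only the glyph data itself identifies the real character.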
Using fonttools, the following code converts the font to XML:
from io import BytesIO

import requests
from fontTools.ttLib import TTFont

# Download the font (reading WOFF2 with fontTools requires the brotli package)
font_content = requests.get('https://cache.qixin.com/pcweb/font-awesome-qxb-1bd55e43.woff2').content
font_file = BytesIO(font_content)
font = TTFont(font_file)
# Dump every table as XML for inspection
font.saveXML('font.xml')
In the generated XML file, part of the cmap node looks like this:
<map code="0x30" name="icon-number_9"/><!-- DIGIT ZERO -->
<map code="0x31" name="icon-number_3"/><!-- DIGIT ONE -->
<map code="0x32" name="icon-number_7"/><!-- DIGIT TWO -->
<map code="0x33" name="icon-number_2"/><!-- DIGIT THREE -->
<map code="0x34" name="icon-number_0"/><!-- DIGIT FOUR -->
<map code="0x35" name="icon-number_1"/><!-- DIGIT FIVE -->
<map code="0x36" name="icon-number_5"/><!-- DIGIT SIX -->
<map code="0x37" name="icon-number_8"/><!-- DIGIT SEVEN -->
<map code="0x38" name="icon-number_6"/><!-- DIGIT EIGHT -->
<map code="0x39" name="icon-number_4"/><!-- DIGIT NINE -->
<map code="0x41" name="icon-upper_S"/><!-- LATIN CAPITAL LETTER A -->
<map code="0x42" name="icon-upper_Q"/><!-- LATIN CAPITAL LETTER B -->
<map code="0x43" name="icon-upper_K"/><!-- LATIN CAPITAL LETTER C -->
From this we can conclude that this font implements its encryption by shuffling the cmap table.
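As a quick check of that conclusion, the cmap entries can be turned into a translation table using only the standard library. This is a sketch under two assumptions: the last character of each glyph name is the real character, and the `<map>` fragment is wrapped in a root element so that it parses on its own. `CMAP_XML` and `build_table` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# A few <map> entries copied from the dump, wrapped in a root element
# so that the fragment parses on its own.
CMAP_XML = """
<cmap>
  <map code="0x30" name="icon-number_9"/>
  <map code="0x31" name="icon-number_3"/>
  <map code="0x32" name="icon-number_7"/>
  <map code="0x41" name="icon-upper_S"/>
</cmap>
"""


def build_table(xml_text):
    """Map each source character to the last character of its glyph name."""
    table = {}
    # iter() also finds <map> nodes nested deeper, as in a full font.xml
    for node in ET.fromstring(xml_text).iter("map"):
        source_char = chr(int(node.get("code"), 16))
        # Assumption: the suffix of the glyph name is the real character.
        table[source_char] = node.get("name")[-1]
    return table


table = build_table(CMAP_XML)
print(table)
```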
Cracking
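For a shuffled cmap, it is enough to find the glyph name that each character maps to.
"""
企信宝 font encryption: countermeasure for the cmap table
"""
import string

from fontTools.ttLib import TTFont

# Characters that may be obfuscated (assumed here: ASCII letters and digits)
PLAIN_CHARS = string.ascii_letters + string.digits


def decrypt(ss, font_file):
    """
    Decrypt a string, or a list of strings, using the font file
    :param ss: str or list of str
    :param font_file: file like object or file path
    :return: str or list of str
    """
    with TTFont(font_file) as font:
        cmap = font['cmap'].getBestCmap()

    def _decrypt(s):
        predict = ''
        for c in s:
            if c in PLAIN_CHARS:
                # Glyph names look like "icon-number_9" or "icon-upper_S";
                # the last character is the real character
                predict += cmap[ord(c)][-1]
            else:
                predict += c
        return predict

    if isinstance(ss, str):
        return _decrypt(ss)
    else:
        return [_decrypt(s) for s in ss]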
For a shuffled glyf table, first label the glyph data with the real characters, then find the real character by comparing glyph data.
"""
企信宝 font encryption: countermeasure for the glyf table
"""
import os.path
import string

from fontTools.ttLib import TTFont


def _get_glyph_name(c: str, font):
    # Glyph name that the font's cmap assigns to character c
    return font.getBestCmap()[ord(c)]


# Characters as they appear in the page source
PLAIN_BOOK = string.ascii_uppercase + string.ascii_lowercase + '1234567890'


def _load_refer_glyph_data():
    """
    Load the known reference font and build a dict from glyph data to real character
    :return: {glyph data (bytes): real character}
    """
    font_file = os.path.join(os.path.dirname(__file__), 'font-awesome-qxb-5ffe2d46.woff2')
    with TTFont(font_file) as font:
        # Despite its name, cipherbook holds the real character that the
        # reference font displays for each entry of PLAIN_BOOK
        cipherbook = 'XSQRTWFCZHDIN' \
                     'LAKEUGBMPOVJY' \
                     'rpfcbtdnajuhg' \
                     'zyikxovqleswm' \
                     '7658419203'
        glyphset = font['glyf'].glyphs
        return {glyphset[_get_glyph_name(p, font)].data: c for p, c in zip(PLAIN_BOOK, cipherbook)}


refer_glyph_data = _load_refer_glyph_data()


def decrypt(ss, font_file):
    """
    Decrypt a string, or a list of strings, using the font file
    :param ss: str or list of str
    :param font_file: file like object or file path
    :return: str or list of str
    """
    font = TTFont(font_file)
    glyphs = font['glyf'].glyphs

    def _decrypt(s):
        predict = ''
        for c in s:
            if c in PLAIN_BOOK:
                # Identify the character by its glyph data, not by its glyph name
                glyph_data = glyphs[_get_glyph_name(c, font)].data
                predict += refer_glyph_data[glyph_data]
            else:
                predict += c
        return predict

    if isinstance(ss, str):
        return _decrypt(ss)
    else:
        return [_decrypt(s) for s in ss]


def _test():
    cipher = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890'
    plain = 'XSQRTWFCZHDINLAKEUGBMPOVJYrpfcbtdnajuhgzyikxovqleswm7658419203'
    predict = decrypt(cipher, os.path.join(os.path.dirname(__file__), 'font-awesome-qxb-5ffe2d46.woff2'))
    print(predict)
    print(predict == plain)


if __name__ == '__main__':
    _test()
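The matching idea behind the glyf approach can also be shown without a font file. The sketch below models glyph data as short byte strings and assumes, as the code above does, that a glyph's data is byte-identical between the reference font and a newly downloaded one. `REFERENCE`, `new_cmap`, `new_glyf`, and `decode` are hypothetical stand-ins for data that would be read from real fonts.

```python
# Toy model of matching glyphs by their data instead of their names.
# All dictionaries are hypothetical stand-ins for data read from fonts.

# Reference font, labelled in advance: glyph data -> real character
REFERENCE = {b"\x01\x02": "7", b"\x03\x04": "6", b"\x05\x06": "5"}

# A newly downloaded font: cmap and glyph names changed, data unchanged
new_cmap = {ord("1"): "g_a", ord("2"): "g_b", ord("3"): "g_c"}
new_glyf = {"g_a": b"\x05\x06", "g_b": b"\x01\x02", "g_c": b"\x03\x04"}


def decode(text):
    """Translate source characters by looking up their glyph data."""
    out = []
    for ch in text:
        name = new_cmap.get(ord(ch))
        data = new_glyf.get(name)
        # Characters without a known glyph pass through unchanged.
        out.append(REFERENCE.get(data, ch))
    return "".join(out)


print(decode("123-1"))
```

Unlike the `decrypt` function above, this variant passes unknown characters through instead of raising `KeyError`. Whether that is preferable depends on whether silent mismatches are acceptable in the scraped data.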