吾爱破解 - 52pojie.cn

[Web Reverse Engineering] Using an AST to defeat a website's extraction-style JavaScript obfuscation

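漁滒 posted on 2021-1-19 21:27
The site is fairly sensitive, so some details are omitted; this post focuses on the logic.
Processing JS code in Python requires the slimit library, which can be installed with pip install slimit.
First, analyze the page source. The JS code we need sits inside a script tag and is stored in a variable whose name starts with flashvars.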
1.jpg
First, pull this JS code out of the page:
[Python]
import requests_html

# shareurl and headers must be supplied by the caller (omitted in this post)
session = requests_html.HTMLSession()
response = session.get(shareurl, headers=headers)
for element in response.html.xpath('//script[@type="text/javascript"]'):
    texts = element.xpath('//text()')
    if texts and 'flashvars' in texts[0]:
        # script now holds all of the JS code inside this tag
        script = texts[0]
        break
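For experimenting offline, the same filtering step can be sketched with only the standard library. This is a stand-in that uses html.parser instead of requests_html, and the sample HTML, the class name InlineScriptCollector and the function find_flashvars_script are all made up for illustration; the real page will look different.

```python
from html.parser import HTMLParser


class InlineScriptCollector(HTMLParser):
    """Collect the text of every <script type="text/javascript"> tag."""

    def __init__(self):
        super().__init__()
        self.scripts = []     # text of each matching script tag
        self._buffer = None   # text chunks while inside a matching tag

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "text/javascript":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.scripts.append("".join(self._buffer))
            self._buffer = None


def find_flashvars_script(html):
    """Return the first inline script mentioning 'flashvars', or None."""
    collector = InlineScriptCollector()
    collector.feed(html)
    collector.close()
    for text in collector.scripts:
        if "flashvars" in text:
            return text
    return None


# Hypothetical page fragment, invented for this example.
SAMPLE_HTML = """
<html><body>
<script type="text/javascript">var unrelated = 1;</script>
<script type="text/javascript">var flashvars_1 = {"mediaDefinitions": []};</script>
</body></html>
"""

script = find_flashvars_script(SAMPLE_HTML)
print(script)
```

Saving a copy of the page once and feeding it to a helper like this avoids hitting the site repeatedly while the visitors are being developed.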

This part is straightforward, so I won't go into detail. Once we have the JS code, format it and take a look:
2.jpg
Here you can see that all of the video URLs have been extracted out of the definitions. Reading further down:
3.jpg
After a round of string concatenation, the correct video URL is re-formed. The next step is to use the AST to restore this URL.
[Python]
from slimit.parser import Parser

# Convert the JS code into an AST
tree = Parser().parse(script)

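The Parser class converts JS code into an AST.
Once you have the tree, you traverse its nodes with a custom visitor that you write by subclassing ASTVisitor.
Here I first restore the mediaDefinitions list that was extracted.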
[Python]
from slimit import ast
from slimit.visitors.nodevisitor import ASTVisitor

flashvars = []

class VarStatement_Visitor(ASTVisitor):
    # Custom visitor that overrides the handling of VarStatement nodes
    def visit_VarStatement(self, node):
        Identifier, Object = node.children()[0].children()
        # Find the node that defines flashvars
        if 'flashvars' in Identifier.value:
            for each in Object.properties:
                left, right = each.children()
                # Find the mediaDefinitions array
                if left.value == '"mediaDefinitions"':
                    # Restore each dictionary
                    for item in right.items:
                        data = {}
                        for key in item.properties:
                            keyleft, keyright = key.children()
                            if isinstance(keyright, ast.Array):
                                datalist = [i.value for i in keyright.items]
                                data[keyleft.value[1:-1]] = datalist
                            else:
                                # defaultQuality is not a quoted string, so keep its raw value
                                if keyleft.value == '"defaultQuality"':
                                    data[keyleft.value[1:-1]] = keyright.value
                                else:
                                    # Strip the surrounding quotes from string values
                                    data[keyleft.value[1:-1]] = keyright.value[1:-1]
                        flashvars.append(data)
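A note on the [1:-1] slices in the visitor: the string nodes keep their surrounding quotes in .value (which is why the key is compared against '"mediaDefinitions"'), and slicing removes the quotes but leaves any escape sequences untouched. Whether the target page contains escapes is an assumption here. If it does, a small helper along these lines could decode them; js_string_value is a hypothetical name, and relying on json.loads only works for double-quoted literals that happen to be valid JSON strings.

```python
import json


def js_string_value(raw):
    """Turn the raw text of a JS string literal into a Python str.

    `raw` still includes its surrounding quotes, e.g. '"mp4"'.
    """
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        try:
            # Double-quoted JS literals are usually valid JSON strings,
            # so json.loads also decodes escapes such as \/ or \u0026.
            return json.loads(raw)
        except ValueError:
            pass
    # Fallback: same behaviour as the [1:-1] slice in the visitor.
    return raw[1:-1]


print(js_string_value('"mp4"'))
print(js_string_value(r'"https:\/\/example.com\/v.mp4"'))
print(js_string_value("'single quoted'"))
```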

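The class name is up to you, but the class must inherit from ASTVisitor. You then override a method named visit_ plus the node type to choose which type of node enters your function.
For example, I defined visit_VarStatement here, so it is called for the VarStatement nodes reached during traversal. Running the code yields the restored mediaDefinitions array: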
4.jpg
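To make the visit_ plus node type naming rule concrete, here is a minimal sketch of how such a dispatcher can work. It uses small stand-in classes rather than slimit itself, so Node, Program, ExprStatement, Visitor and CountVarStatements are illustrative names and not part of the library; the real node classes carry much more information.

```python
class Node:
    """Stand-in for an AST node; real slimit nodes are richer than this."""

    def __init__(self, *children):
        self._children = list(children)

    def children(self):
        return self._children


class Program(Node):
    pass


class VarStatement(Node):
    pass


class ExprStatement(Node):
    pass


class Visitor:
    """Dispatch on the node's class name, falling back to its children."""

    def visit(self, node):
        handler = getattr(self, "visit_" + type(node).__name__, self.generic_visit)
        return handler(node)

    def generic_visit(self, node):
        for child in node.children():
            self.visit(child)


class CountVarStatements(Visitor):
    def __init__(self):
        self.count = 0

    def visit_VarStatement(self, node):
        # Only VarStatement nodes land here; other nodes use generic_visit.
        self.count += 1


tree = Program(VarStatement(), ExprStatement(), VarStatement())
counter = CountVarStatements()
counter.visit(tree)
print(counter.count)  # 2
```

One consequence of this design is that a handler which does not call generic_visit itself stops the traversal at that node, so anything nested below a handled node is not visited.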
Next, write another visitor to restore the video URLs:
[Python]
class media_Visitor(ASTVisitor):

    def __init__(self, i, *args, **kwargs):
        # Index of the video in flashvars
        self.i = i
        # Maps each variable name to its string fragment
        self.identifier = {}
        # Fragments in the order they are collected
        self.identifiers = []
        super().__init__(*args, **kwargs)

    # Recursively collect the concatenation order
    def get_Identifier(self, node, identifierlist):
        left, right = node.children()
        identifierlist.append(self.identifier[right.value])
        if isinstance(left, ast.BinOp):
            self.get_Identifier(left, identifierlist)
        else:
            identifierlist.append(self.identifier[left.value])

    def visit_VarStatement(self, node):
        Identifier, BinOp = node.children()[0].children()
        # The concatenation expression for this video's URL
        if 'media_'+str(self.i) == Identifier.value:
            # Compute the real video URL (fragments are collected right to left)
            self.get_Identifier(BinOp, self.identifiers)
            # Fill in the video URL
            flashvars[self.i]['videoUrl'] = ''.join(self.identifiers[::-1])
        # Definition of a fragment: a string, or two strings joined together
        elif isinstance(BinOp, ast.String) or (len(BinOp.children()) == 2 and isinstance(BinOp.children()[0], ast.String) and isinstance(BinOp.children()[1], ast.String)):
            if isinstance(BinOp, ast.String):
                self.identifier[Identifier.value] = BinOp.value[1:-1]
            else:
                self.identifier[Identifier.value] = ''.join([i.value[1:-1] for i in BinOp.children()])

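The node type visited here is still VarStatement.
Because the video URL is built by concatenating several variables and we do not know in advance how many there will be, I defined the recursive method get_Identifier to collect the full concatenation expression.
After running it, all of the extracted video URLs are restored (I won't post the screenshot).
Here is the complete code: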
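The recursion is easier to see on a toy tree. JavaScript's + is left-associative, so a + b + c parses as (a + b) + c: the rightmost operand sits at the top of the tree and the chain continues down the left side. The sketch below mirrors get_Identifier on stand-in Identifier and BinOp classes (not slimit's), with a made-up fragment table and URL.

```python
class Identifier:
    def __init__(self, value):
        self.value = value


class BinOp:
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def children(self):
        return [self.left, self.right]


def collect_names(node, names):
    """Walk a left-nested chain of `+` and append names right to left."""
    left, right = node.children()
    names.append(right.value)
    if isinstance(left, BinOp):
        collect_names(left, names)
    else:
        names.append(left.value)


# Hypothetical fragments and expression: var media_0 = a + b + c;
fragments = {"a": "https://", "b": "example.com", "c": "/video.mp4"}
expression = BinOp(BinOp(Identifier("a"), Identifier("b")), Identifier("c"))

names = []
collect_names(expression, names)
url = "".join(fragments[name] for name in reversed(names))
print(names)  # ['c', 'b', 'a']
print(url)    # https://example.com/video.mp4
```

Because the names come out right to left, the list has to be reversed before joining, which is what the [::-1] slice does in media_Visitor.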
[Python]
import requests_html
# Converts JS code into an AST
from slimit.parser import Parser
# Base class for custom visitors
from slimit.visitors.nodevisitor import ASTVisitor
from slimit import ast

flashvars = []

class VarStatement_Visitor(ASTVisitor):
    # Custom visitor that overrides the handling of VarStatement nodes
    def visit_VarStatement(self, node):
        Identifier, Object = node.children()[0].children()
        # Find the node that defines flashvars
        if 'flashvars' in Identifier.value:
            for each in Object.properties:
                left, right = each.children()
                # Find the mediaDefinitions array
                if left.value == '"mediaDefinitions"':
                    # Restore each dictionary
                    for item in right.items:
                        data = {}
                        for key in item.properties:
                            keyleft, keyright = key.children()
                            if isinstance(keyright, ast.Array):
                                datalist = [i.value for i in keyright.items]
                                data[keyleft.value[1:-1]] = datalist
                            else:
                                # defaultQuality is not a quoted string, so keep its raw value
                                if keyleft.value == '"defaultQuality"':
                                    data[keyleft.value[1:-1]] = keyright.value
                                else:
                                    # Strip the surrounding quotes from string values
                                    data[keyleft.value[1:-1]] = keyright.value[1:-1]
                        flashvars.append(data)

class media_Visitor(ASTVisitor):

    def __init__(self, i, *args, **kwargs):
        # Index of the video in flashvars
        self.i = i
        # Maps each variable name to its string fragment
        self.identifier = {}
        # Fragments in the order they are collected
        self.identifiers = []
        super().__init__(*args, **kwargs)

    # Recursively collect the concatenation order
    def get_Identifier(self, node, identifierlist):
        left, right = node.children()
        identifierlist.append(self.identifier[right.value])
        if isinstance(left, ast.BinOp):
            self.get_Identifier(left, identifierlist)
        else:
            identifierlist.append(self.identifier[left.value])

    def visit_VarStatement(self, node):
        Identifier, BinOp = node.children()[0].children()
        # The concatenation expression for this video's URL
        if 'media_'+str(self.i) == Identifier.value:
            # Compute the real video URL (fragments are collected right to left)
            self.get_Identifier(BinOp, self.identifiers)
            # Fill in the video URL
            flashvars[self.i]['videoUrl'] = ''.join(self.identifiers[::-1])
        # Definition of a fragment: a string, or two strings joined together
        elif isinstance(BinOp, ast.String) or (len(BinOp.children()) == 2 and isinstance(BinOp.children()[0], ast.String) and isinstance(BinOp.children()[1], ast.String)):
            if isinstance(BinOp, ast.String):
                self.identifier[Identifier.value] = BinOp.value[1:-1]
            else:
                self.identifier[Identifier.value] = ''.join([i.value[1:-1] for i in BinOp.children()])

def geturl(shareurl, headers):
    session = requests_html.HTMLSession()
    response = session.get(shareurl, headers=headers)
    for element in response.html.xpath('//script[@type="text/javascript"]'):
        texts = element.xpath('//text()')
        if texts and 'flashvars' in texts[0]:
            # Convert the JS code into an AST
            tree = Parser().parse(texts[0])
            # Restore the mediaDefinitions list from the VarStatement nodes
            VarStatement_Visitor().visit(tree)
            # Restore the URL of each video in turn
            for i in range(len(flashvars)):
                media_Visitor(i).visit(tree)
            break
    print(flashvars)




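xixicoco posted on 2021-1-19 21:35
Awesome, bumping this
lipinghao posted on 2021-1-19 21:36
netspirit posted on 2021-1-20 14:48
gaofeng_pj posted on 2021-1-24 15:56
An expert is an expert. You need a fairly solid JS foundation to follow this
安安20152015 posted on 2021-5-30 07:03
Is there a decryption tool?
JPK posted on 2021-7-30 09:36
Very detailed write-up, I learned a lot.
libaiddufu posted on 2021-8-4 02:08
I keep seeing this AST stuff but I can't do it yet; I haven't even gotten started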