极验反爬虫防护分析之交互流程分析
本帖最后由 besterChen 于 2020-4-23 13:53 编辑### 一些废话
在论坛混了多年,跟着各位大神学到了许多有用的知识,当然也用了好多前辈们分享的软件。索取至此奈何自己能力平平没啥可拿出来分享的东西,又碍于面子怕分享的内容拾人牙慧惹人耻笑。
眼看着我的CB越来越少,总不能一直坐吃山空,所以拿出以前的一点笔记分享出来,试试运气看能否骗个精华,混点CB。如果分享的内容已经有人写过,或者没啥技术含量,还望各位坛友多多包涵~
本文要分享的内容是去年为了抢鞋而分析 极验(GeeTest)反爬虫防护的笔记,由于篇幅较长(为了多混点CB)我会按照我的分析顺序,分成如下四个主题与大家分享:
1. [极验反爬虫防护分析之交互流程分析](https://www.52pojie.cn/thread-1162853-1-1.html)
2. [极验反爬虫防护分析之接口交互的解密方法](https://www.52pojie.cn/thread-1162893-1-1.html)
3. [极验反爬虫防护分析之接口交互的解密方法补遗](https://www.52pojie.cn/thread-1162951-1-1.html)
4. [极验反爬虫防护分析之slide验证方式下图片的处理及滑动轨迹的生成思路](https://www.52pojie.cn/thread-1162979-1-1.html)
本篇是第一篇,下面进入正文~
---
极验(`http://www.geetest.com/`),是国内比较有名的身份验证,反爬虫的产品。其反爬虫的手段较多,大概分为: pencil、beeline、click、slide、voice 等多种验证方式,尤其Slide方式是基于大数据的智能行为验证,用户体验较好很得客户青睐,比如我要抢鞋的官网就是基于它来做用户登录的身份验证。
通过极验官网的文档,我们可以得知Geetest的大致交互流程如下图:
![通讯流程图](https://docs.geetest.com/sensebot/overview/imgs/geetest_netwoking_sequence.jpg)
结合我抢鞋的接口,整理如下:
#### 1. 获取验证ID及验证流水号
* 请求地址: https://www.nike.com.hk/geetest/doget.json?t=1563879949387&type=MEMBER_LOGIN
* 请求方式: POST
* 请求参数说明:
1. t: 当前时间戳(毫秒)
* 响应内容:
```json
{
"challenge": "decd09d6c9f1bb219f7550cea8e18c01",
"gt": "2328764cdf162e8e60cc0b04383fef81",
"success": 1
}
```
* 返回参数说明:
1. gt: 验证id
2. challenge: 验证流水号
#### 2. 获取验证素材资源
* 请求地址: https://api.geetest.com/gettype.php?gt=2328764cdf162e8e60cc0b04383fef81&callback=geetest_1563879954116
* 请求方式: GET
* 请求参数说明:
1. gt: 之前接口返回的验证ID
* 响应内容:
```json
{
"status": "success",
"data": {
"maze": "/static/js/maze.1.0.1.js",
"fullpage": "/static/js/fullpage.8.7.9.js",
"geetest": "/static/js/geetest.6.0.9.js",
"pencil": "/static/js/pencil.1.0.3.js",
"click": "/static/js/click.2.8.1.js",
"beeline": "/static/js/beeline.1.0.1.js",
"type": "fullpage",
"static_servers": ["static.geetest.com/", "dn-staticdown.qbox.me/"],
"aspect_radio": {
"beeline": 50,
"voice": 128,
"pencil": 128,
"click": 128,
"slide": 103
},
"voice": "/static/js/voice.1.2.0.js",
"slide": "/static/js/slide.7.6.0.js"
}
}
```
* 示例:
1. https://static.geetest.com/static/js/fullpage.8.7.9.js
2. https://static.geetest.com/static/js/geetest.6.0.9.js
#### 3. 获取验证的基本参数
* 请求地址: https://api.geetest.com/get.php?gt=2328764cdf162e8e60cc0b0438...
* 请求方式: GET
* 请求参数说明:
1. gt: 之前接口返回的验证ID
2. challenge: 之前接口返回的验证流水号
3. lang=zh-hk
4. pt=0
5. w=4A8S6ZFe9GO43hbI3exYdbebCvxWVtD38od3qc7tqtxxARm4HCI... (待分析)
6. callback=回调函数的方法名
* 响应内容:
```json
{
"status": "success",
"data": {
"i18n_labels": {
"fullpage": "驗證進行中, 請耐心等候",
"goto_homepage": "前往驗證服務 Geetest 官方網站",
"goto_confirm": "前往",
"success": "驗證成功",
"error_content": "請輕觸重試",
"copyright": "Geetest",
"refresh_page": "頁面出現錯誤!要繼續操作,請重新整理此頁面",
"loading_content": "智能驗證中",
"next_ready": "請完成驗證",
"error_title": "網絡逾時",
"next": "載入中",
"error": "網絡不佳",
"reset": "重試",
"goto_cancel": "取消",
"ready": "按此進行驗證 ",
"success_title": "驗證通過"
},
"s": "326b4e30",
"theme": "wind",
"static_servers": ["static.geetest.com", "dn-staticdown.qbox.me"],
"feedback": "",
"c": ,
"theme_version": "1.5.5",
"api_server": "api.geetest.com",
"logo": false
}
}
```
#### 4. 获取验证方式
* 请求地址: https://api.geetest.com/ajax.php?gt=2328764cdf162e8e60cc0b0438...
* 请求方式: GET
* 请求参数说明:
1. gt: 之前接口返回的验证ID
2. challenge: 之前接口返回的验证流水号
3. lang=zh-hk
4. pt=0
5. w=pxIvtqzNcdSbk0QJHeRBsozbYtm8519UAKjSyuCMk01J21VwEEWOlV... (待分析)
6. callback=回调函数的方法名
* 响应内容:
```json
{
"status": "success",
"data": {
"result": "slide"
}
}
```
#### 5. 获取验证素材信息
* 请求地址: https://api.geetest.com/get.php?is_next=true&type=slide3>=2328764...
* 请求方式: GET
* 请求参数说明:
1. gt: 之前接口返回的验证ID
2. challenge: 之前接口返回的验证流水号
3. lang=zh-hk
4. is_next=true
5. type=slide3
6. https=false
7. protocol=https://
8. offline=false
9. product=embed
10. api_server=api.geetest.com
11. isPC=true
12. width=100%
13. callback=回调函数的方法名
* 响应内容:
```json
{
"benchmark": false,
"xpos": 0,
"slice": "pictures/gt/0191efce3/slice/8eba80808.png",
"gt": "2328764cdf162e8e60cc0b04383fef81",
"ypos": 27,
"api_server": "https://api.geetest.com/",
"theme": "ant",
"link": "",
"template": "",
"id": "adecd09d6c9f1bb219f7550cea8e18c01",
"version": "6.0.9",
"product": "embed",
"mobile": true,
"width": "100%",
"hide_delay": 800,
"static_servers": ["static.geetest.com/", "dn-staticdown.qbox.me/"],
"so": 0,
"bg": "pictures/gt/0191efce3/bg/8eba80808.jpg",
"type": "multilink",
"height": 160,
"https": true,
"theme_version": "1.2.3",
"challenge": "decd09d6c9f1bb219f7550cea8e18c01eb",
"fullbg": "pictures/gt/0191efce3/0191efce3.jpg",
"clean": false,
"i18n_labels": {
"error": "出現錯誤,請關閉驗證重試",
"tip": "",
"voice": "視覺障礙",
"success": "sec 秒.超過 score% 的使用者",
"slide": "拉動左邊標示完成上方拼圖",
"forbidden": "哇~怪物吃了拼圖,請 3 秒後重試",
"close": "關閉驗證",
"fail": "拼圖位置不正確",
"cancel": "取消",
"feedback": "說明和反饋",
"loading": "載入中...",
"logo": "Geetest",
"refresh": "重新載入"
},
"s": "6e4d352f",
"c": ,
"feedback": "",
"fullpage": false,
"logo": false,
"show_delay": 250
}
```
* 响应参数说明:
1. https://static.geetest.com/pictures/gt/0191efce3/0191efce3.webp
2. https://static.geetest.com/pictures/gt/0191efce3/bg/8eba80808.webp
3. https://static.geetest.com/pictures/gt/0191efce3/slice/8eba80808.png
#### 6. 提交验证请求
* 请求地址: https://api.geetest.com/ajax.php?gt=2328764cdf162e8e60cc0b...
* 请求方式: GET
* 请求参数说明:
1. gt: 之前接口返回的验证ID
2. challenge: 第六步返回的验证流水号
3. lang=zh-hk
4. pt=0
5. w=cY71fua4Cuvs8Cyrf3rW(VzZHB)oSpARQMV65Dtg... (待分析)
* 响应内容:
```json
{
"validate": "545a86f1a2af6f09afc4d1abbb166679",
"success": 1,
"score": "5",
"message": "success"
}
```
#### 7. 提交登录认证
* 请求地址: https://www.nike.com.hk/member/login.json?loxiaflag=1563879969519
* 请求方式: POST
* 请求参数说明:
1. loginName: xxxxxxxx@qq.com
2. countryCode:
3. password=EJ2JjA3Rw32DrDMHI1Biu7+YKevYJhvU2EyllSQ... (加密后的密码待分析)
4. passwordAgain=EJ2JjA3Rw32DrDMHI1Biu7+YKevYJhvU2EyllSQ... (加密后的密码待分析)
5. rememberLoginName=checked
6. type
7. geetest_challenge=decd09d6c9f1bb219f7550cea8e18c01eb (第六步返回的验证流水号)
8. geetest_validate=545a86f1a2af6f09afc4d1abbb166679(第七步返回的validate)
9. geetest_seccode=545a86f1a2af6f09afc4d1abbb166679|jordan (第七步返回的validate|jordan)
* 响应内容:
```json
{
"email": "datochan@qq.com",
"firstName": "大头",
"loginRedirectPath": "/",
"memberCommand": {
"addEmail": null,
"address": null,
"behaviorScore": null,
"birthday": null,
"birthdayStr": null,
"businessPartnerName": null,
"captcha": null,
"cardNo": null,
"certNo": null,
"certType": null,
"childrenNum": null,
"city": null,
"copExt1": null,
"corpDepartment": null,
"corpPosition": null,
"corporation": null,
"corporationCode": null,
"corporationName": null,
"country": null,
"countryCode": "",
"createTime": null,
"createType": null,
"currPoint": null,
"district": null,
"educationLvl": null,
"email": null,
"familyNum": null,
"firstName": null,
"gender": null,
"height": null,
"homePage": null,
"id": null,
"invitecode": null,
"invitor": null,
"isBookEmail": null,
"isBookMobileMessage": null,
"isCorpSeed": null,
"isDefault": null,
"isFacebook": false,
"isInviteSeed": null,
"isMobileValidated": null,
"isPrivacyPolicy": null,
"isRecEmail": null,
"keyPassword": null,
"lastName": null,
"lifeCycleStatus": null,
"lifeCycleStatusDesc": null,
"loginName": "datochan@qq.com",
"marriage": null,
"memberAttention": null,
"memberCode": null,
"memberRank": null,
"memberType": null,
"mobile": null,
"msn": null,
"nickname": null,
"onboardDate": null,
"password": "A11111111a",
"passwordAgain": "EJ2JjA3Rw32DrDMHI1Biu7+YKevYJhvU2EyllSQCmm5fp+QRaVvlz8AfztnM2u0r8Etx3LoBnverg1myEh5lq9HNCEA5aMeHSNX6bR7oJfkkMRqR3XSZdM+0ineNUnHnVuSNnHLbCIxAoecRjDkUSC/umq8+QE9a3X0DOB6SUm8=",
"pet": null,
"province": null,
"pwdAnswer": null,
"pwdQuestion": null,
"qq": null,
"rankId": null,
"rankName": null,
"realName": null,
"regSource": null,
"registerIp": null,
"remark": null,
"salary": null,
"secondLang": null,
"securityCode": null,
"sendemail": null,
"smsCode": null,
"supervisor": null,
"sysUser": null,
"telephone": null,
"token": null,
"totalPoint": null,
"usedPoint": null,
"version": null,
"weight": null,
"yearsOfWork": null,
"zipcode": null
},
"mobileNumber": "+86 186xxxxyyyy",
"omnitureProp31": 3443350,
"org.springframework.validation.BindingResult.memberCommand": {
"errorCount": 0,
"fieldError": null,
"fieldErrorCount": 0,
"globalError": null,
"globalErrorCount": 0,
"nestedPath": "",
"objectName": "memberCommand"
}
}
```
#### 8. 验证流程整理
经测试不需要所有接口都模拟,比如直接指定验证方式为滑块验证,依次调用接口: 1 --> 3 --> 5 --> 6 --> 7 即可。
至此,整个交互流程分析完毕,产生的问题如下:
1. 所有用于验证的代码都是混淆之后的,如何进行代码还原或者调试分析。
2. 请求与返回的关键数据都是加密的后的字符串如上述3、4、6等接口中的W参数,如何解密。
限于篇幅,本文先写到这里,上述两个问题的解决方法请参考下篇文章《[极验反爬虫防护分析之接口交互的解密方法](https://www.52pojie.cn/thread-1162893-1-1.html)》
考虑到之前精华帖CB奖励太少,我改成一个精华帖奖励500CB了,你这个一个系列可以来篇精华了。 我叫小月亮 发表于 2020-4-22 16:28
来学一下,这验证真的神烦
+1 我也不喜欢验证 谢谢楼主 来学一下,这验证真的神烦
https://img.alicdn.com/.gif 谢谢楼主
https://img.alicdn.com/.gif Hmily 发表于 2020-4-22 16:33
考虑到之前精华帖CB奖励太少,我改成一个精华帖奖励500CB了,你这个一个系列可以来篇精华了。
哈哈,太感谢了~ 向大佬学习{:1_921:} 真正的大佬。已下载收藏。 不知道不久之后反爬手段会不会越来越成熟、反爬范围会不会越来越广{:1_925:}
过个验证也太难了吧