吾爱破解 - 52pojie.cn

[Help] How can I quickly tell whether a page really has content?

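⊙⌒⊙ posted on 2022-12-21 15:53
https://jhl.ke.seewo.com/live/plan/826510684487921664
has a video.
https://jhl.ke.seewo.com/live/plan/826514856624701440
also has a video.
https://jhl.ke.seewo.com/live/plan/826514856624701445
has no video and shows a 404.

How can I find all the URLs that actually have a video?
I've been at this for several days and keep hitting errors. Could someone take a look and tell me whether there is a faster way?
Thanks!!
I'm looping with for i in range(9999999999999999):
[Python]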
import base64
import json
from multiprocessing import Pool

import requests

url2 = 'https://jhl.ke.seewo.com/live/plan/826510684487921664'  # the actual page URL
url = 'https://jhl.ke.seewo.com/live/fetch?actionName=GET_PLAN_DETAIL&ts=1670989434716'  # URL used to check for a video


def checkurl(num):
    apiUrl = "/live/v1/plan/8265" + num + "/open/detail"
    # json.dumps quotes apiUrl, so the payload is valid JSON.
    strs = json.dumps({
        "method": "GET",
        "apiUrl": apiUrl,
        "headers": {"userName": "", "userType": "", "userId": ""},
        "baseURL": "http://live.seewo.com/live-server",
    }, separators=(",", ":"))
    result = base64.b64encode(strs.encode('utf-8')).decode('ascii')

    headers = {
        'Accept': 'application/json, text/plain, */*',
        'Content-Type': 'application/json',
        "ApiExtend": result,
    }

    response = requests.post(url=url, headers=headers, timeout=10)
    result = json.loads(response.text)
    if result.get('success'):
        print('https://jhl.ke.seewo.com/live/plan/8265' + num)
        # The with block closes the file on exit; no explicit close() is needed.
        with open('url.txt', 'a+', encoding='utf-8') as f:
            f.write('https://jhl.ke.seewo.com/live/plan/8265' + num + "\n")


def main():
    # Create the pool and cap the number of worker processes.
    p = Pool(10)
    for i in range(1, 99999999999999):
        num = str(i).zfill(14)
        p.apply_async(checkurl, args=(num,))
        print(i, end=" ")
    p.close()
    p.join()


if __name__ == '__main__':
    main()
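
For a sense of scale, the loop above covers just under 10^14 candidate IDs. The sketch below estimates how long a full scan would take at a constant request rate. The rates are hypothetical figures chosen for illustration, not measurements of this site, and years_to_scan is a made-up helper name.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_scan(total_ids: int, requests_per_second: float) -> float:
    """Return how many years a full scan takes at a constant request rate."""
    return total_ids / requests_per_second / SECONDS_PER_YEAR


# 14 free digits after the fixed "8265" prefix, as in range(1, 99999999999999).
TOTAL_IDS = 99_999_999_999_999 - 1

for rate in (100, 1_000, 10_000):  # hypothetical sustained request rates
    print(f"{rate:>6} req/s -> about {years_to_scan(TOTAL_IDS, rate):,.0f} years")
```

At 1,000 requests per second the full range takes a little over 3,000 years, so narrowing the candidate range matters far more than speeding up individual requests.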


choujie1689 posted on 2022-12-21 15:57
[Python]
    response = requests.post(url=url, headers=headers)
    # result = json.loads(response.text)
    if response.status_code == 200:
        print('https://jhl.ke.seewo.com/live/plan/8265' + num)
        # The with block closes the file on exit; no explicit close() is needed.
        with open('url.txt', 'a+', encoding='utf-8') as f:
            f.write('https://jhl.ke.seewo.com/live/plan/8265' + num + "\n")

Check response.status_code: 200 means the video exists, 404 means it does not.
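
The two checks seen so far, the HTTP status code and the success field of the JSON body, can be combined. A minimal sketch, assuming the body is JSON with a success field as the original script expects; has_video is a hypothetical helper name, and the sample bodies are made up for the demonstration.

```python
import json


def has_video(status_code: int, body: str) -> bool:
    """Return True only for a 200 response whose JSON body reports success."""
    if status_code != 200:
        return False
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # An HTML error page or an empty body is treated as "no video".
        return False
    return isinstance(data, dict) and bool(data.get("success"))


print(has_video(200, '{"success": true}'))
print(has_video(200, '{"success": false}'))
print(has_video(404, '<html>Not Found</html>'))
```

With requests this would be called as has_video(response.status_code, response.text).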
qeq66 posted on 2022-12-21 16:10
Last edited by qeq66 on 2022-12-21 16:17
http://live.seewo.com/live-server/live/v1/plan/826514856624701445/open/detail

Use this endpoint for the check: extract the ID from the page URL and query it here.
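
Extracting the ID and building the endpoint URL can be sketched as follows. The URL shapes are taken from the links in this thread; extract_plan_id and api_url_for are hypothetical helper names.

```python
import re

API_TEMPLATE = "http://live.seewo.com/live-server/live/v1/plan/{plan_id}/open/detail"
PAGE_RE = re.compile(r"/live/plan/(\d+)")


def extract_plan_id(page_url: str) -> str:
    """Pull the numeric plan ID out of a page URL such as .../live/plan/<id>."""
    match = PAGE_RE.search(page_url)
    if match is None:
        raise ValueError(f"no plan ID found in {page_url!r}")
    return match.group(1)


def api_url_for(page_url: str) -> str:
    """Map a page URL to the API endpoint URL for the same plan."""
    return API_TEMPLATE.format(plan_id=extract_plan_id(page_url))


print(api_url_for("https://jhl.ke.seewo.com/live/plan/826514856624701445"))
```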
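OP | ⊙⌒⊙ posted on 2022-12-21 16:15
qeq66 posted on 2022-12-21 16:10
"https://jhl.ke.seewo.com/live/fetch?actionName=GET_PLAN_DETAIL&ts=1671 ...

Thanks. I can already generate this API request. What I need now is to work out which page URLs have real content :( so I have to loop over them :(
I'm checking with multithreaded requests calls, and after a while it keeps failing :(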
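
Long scans tend to hit transient network failures such as timeouts or dropped connections. The thread does not say which error occurs here, so this is only one possibility. A minimal sketch of retrying with exponential backoff; with_retries and its parameters are hypothetical, and flaky stands in for a real request.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], attempts: int = 3, base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run call(), retrying on OSError with exponentially growing pauses.

    requests.exceptions.RequestException and urllib.error.URLError both
    derive from OSError, so network errors from either library are covered.
    """
    for attempt in range(attempts):
        try:
            return call()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


calls = {"n": 0}


def flaky() -> str:
    """Fail twice with a simulated connection error, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated dropped connection")
    return "ok"


print(with_retries(flaky, sleep=lambda seconds: None))  # skip real sleeping in the demo
```

Passing sleep as a parameter keeps the demo instant; in a real scan the default time.sleep would apply the pauses.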
OP | ⊙⌒⊙ posted on 2022-12-21 16:17
result=base64.b64encode(strs.encode('utf-8')).decode('ascii')
The API payload is already being generated.
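
Building the JSON string by concatenation makes it easy to leave a value such as apiUrl unquoted, which yields invalid JSON. A sketch that builds the ApiExtend value with json.dumps and decodes it again as a check. The field names are copied from the script in the first post; build_api_extend is a hypothetical helper name, and whether the server requires strictly valid JSON is not known from the thread.

```python
import base64
import json


def build_api_extend(plan_id: str) -> str:
    """Return the base64-encoded JSON payload for the ApiExtend header."""
    payload = {
        "method": "GET",
        "apiUrl": f"/live/v1/plan/{plan_id}/open/detail",
        "headers": {"userName": "", "userType": "", "userId": ""},
        "baseURL": "http://live.seewo.com/live-server",
    }
    raw = json.dumps(payload, separators=(",", ":"))
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


token = build_api_extend("826510684487921664")
decoded = json.loads(base64.b64decode(token))  # round trip: must parse as JSON
print(decoded["apiUrl"])
```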
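qeq66 posted on 2022-12-21 16:17
⊙⌒⊙ posted on 2022-12-21 16:15
Thanks. I can already generate this API request. What I need now is to work out which page URLs have real content :( so I have to loop over them :(
I'm checking with multithreaded req ...

http://live.seewo.com/live-server/live/v1/plan/826514856624701445/open/detail

Query this endpoint.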
OP | ⊙⌒⊙ posted on 2022-12-21 16:23
qeq66 posted on 2022-12-21 16:17
http://live.seewo.com/live-server/live/v1/plan/826514856624701445/open/detai ...

I queried this endpoint and it seems much faster. Why is that?
OP | ⊙⌒⊙ posted on 2022-12-21 16:23
Exception in thread Thread-1 (_handle_workers):
Traceback (most recent call last):
  File "D:\Program Files\Python\Lib\threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "D:\Program Files\Python\Lib\threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "D:\Program Files\Python\Lib\multiprocessing\pool.py", line 524, in _handle_workers
    taskqueue.put(None)
MemoryError

It had only been running for a short while :( and it failed again.
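
The traceback ends in taskqueue.put, which is consistent with apply_async being called far faster than the workers can drain the queue: every pending task stays in memory until a worker picks it up. One way to keep memory flat is to submit IDs in fixed-size batches and wait for each batch before generating the next. A sketch using a thread pool, since the work is I/O-bound; scan_in_batches, the batch size, and the stand-in fake_check function are all assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator


def scan_in_batches(ids: Iterable[int], check: Callable[[int], bool],
                    batch_size: int = 1000, workers: int = 10) -> Iterator[int]:
    """Yield every id for which check(id) is true, one batch in memory at a time."""
    iterator = iter(ids)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            batch = list(islice(iterator, batch_size))
            if not batch:
                break
            # map() yields results in input order; this loop blocks until
            # the whole batch is finished, so at most batch_size tasks exist.
            for candidate, ok in zip(batch, pool.map(check, batch)):
                if ok:
                    yield candidate


def fake_check(n: int) -> bool:
    """Stand-in for the real HTTP check: pretend every 2500th ID exists."""
    return n % 2500 == 0


hits = list(scan_in_batches(range(1, 10_001), fake_check))
print(hits)
```

Because range is lazy and only one batch is materialized at a time, memory use stays roughly constant however large the ID range is.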
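qeq66 posted on 2022-12-21 16:28
⊙⌒⊙ posted on 2022-12-21 16:23
I queried this endpoint and it seems much faster. Why is that?

This one is the API endpoint, while the URLs you were using above are page URLs. The page itself goes on to request the API endpoint. As for your error, I'm not sure.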
OP | ⊙⌒⊙ posted on 2022-12-21 16:35
[Python]
import base64
import json
from multiprocessing import Pool

import requests

url2 = 'https://jhl.ke.seewo.com/live/plan/826510684487921665'  # the actual page URL
url = 'https://jhl.ke.seewo.com/live/fetch?actionName=GET_PLAN_DETAIL&ts=1670989434716'  # URL used to check for a video


def checkurl(num):
    apiUrl = "http://live.seewo.com/live-server/live/v1/plan/8265" + num + "/open/detail"
    # json.dumps quotes apiUrl, so the payload is valid JSON.
    strs = json.dumps({
        "method": "GET",
        "apiUrl": apiUrl,
        "headers": {"userName": "", "userType": "", "userId": ""},
        "baseURL": "http://live.seewo.com/live-server",
    }, separators=(",", ":"))
    result = base64.b64encode(strs.encode('utf-8')).decode('ascii')

    headers = {
        'Accept': 'application/json, text/plain, */*',
        'Content-Type': 'application/json',
        "ApiExtend": result,
    }

    # Python names are case-sensitive: the variable defined above is apiUrl.
    response = requests.post(url=apiUrl, headers=headers, timeout=10)
    if response.status_code == 200:
        print('https://jhl.ke.seewo.com/live/plan/8265' + num)
        # The with block closes the file on exit; no explicit close() is needed.
        with open('url.txt', 'a+', encoding='utf-8') as f:
            f.write('https://jhl.ke.seewo.com/live/plan/8265' + num + "\n")


def main():
    # Create the pool and cap the number of worker processes.
    p = Pool(30)
    for i in range(11111151600000, 99999999999999):
        num = str(i).zfill(14)
        p.apply_async(checkurl, args=(num,))
        # Record progress every 100000 IDs.
        if i % 100000 == 0:
            with open('url2.txt', 'a+', encoding='utf-8') as f:
                f.write(str(i) + ",")
    p.close()
    p.join()


if __name__ == '__main__':
    main()



It shuts down by itself after running for a while. What is wrong with the program?
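
The thread does not identify the cause, but one thing worth ruling out is silent failure: apply_async never raises a worker's exception in the parent unless the result is fetched with .get() or an error_callback is supplied, so errors inside checkurl can go unnoticed. A minimal sketch of surfacing them. It uses ThreadPool, which shares the apply_async interface with Pool, and task is a stand-in that fails on purpose.

```python
from multiprocessing.pool import ThreadPool

errors = []


def task(n: int) -> int:
    """Stand-in worker: double the input, but fail for n == 3."""
    if n == 3:
        raise RuntimeError(f"simulated failure for {n}")
    return n * 2


with ThreadPool(4) as pool:
    # error_callback receives the exception object raised inside the worker.
    results = [pool.apply_async(task, (n,), error_callback=errors.append)
               for n in range(5)]
    pool.close()
    pool.join()

print([r.get() for r in results if r.successful()])
print(errors)
```

Without the error_callback argument, and without calling .get() on the failed result, the RuntimeError would never be reported.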