Python requests module reports a code = 400 error when fetching a page's source.
This post was last edited by 老冉 on 2019-7-11 00:09
url = https://api.bilibili.com/playurl?callback=callbackfunction&aid=19956343&page=65&platform=html5&quality=1&vtype=mp4&type=jsonp&_=1562215557403
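The page opens normally in Chrome, and Ctrl + U shows its source as well, but neither Python's requests module nor selenium can retrieve the page source correctly; both report a code = 400 error. Could someone help me find the problem?
Thanks!!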
>>> import requests
>>> url = r'https://api.bilibili.com/playurl?callback=callbackfunction&aid=19956343&page=65&platform=html5&quality=1&vtype=mp4&type=jsonp&_=1562215557403'
>>>
>>> h = requests.get(url)
>>> h
<Response >
>>> h = h.text
>>> h
'callbackfunction({"code":40000,"message":"bad request"});'
>>>
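The body returned in the session above is JSONP: a JSON object wrapped in a call to callbackfunction. A minimal sketch of unwrapping it so the code and message fields can be read as a dict follows; the helper name unwrap_jsonp is hypothetical and not part of requests.

```python
import json
import re


def unwrap_jsonp(text):
    """Strip a JSONP wrapper such as name({...}); and return the parsed JSON payload."""
    # Capture everything between the first "(" after the callback name and the last ")".
    match = re.match(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', text, re.DOTALL)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


# Body copied from the REPL session above.
body = 'callbackfunction({"code":40000,"message":"bad request"});'
payload = unwrap_jsonp(body)
print(payload["code"], payload["message"])
```

Checking payload["code"] this way shows the error reported inside the body, which is separate from the HTTP status of the response.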
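Aren't you setting any headers?
Aren't you setting any headers? +1
Aren't you even using PyCharm?
Aren't you even saving your code?
Isn't 400 just the server rejecting the request? The headers are surely not set; most sites block non-browser clients.
At least tell bilibili what you are!! That means setting headers. Before the h = line, add:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36'}
h = requests.get(url, headers=headers)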
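One way to check what is actually being sent is to prepare the request first and inspect its headers before sending. The sketch below assumes the URL and User-Agent from this thread; build_request and fetch are hypothetical helper names, and whether the server accepts the request still depends on the server.

```python
import requests

URL = ("https://api.bilibili.com/playurl?callback=callbackfunction&aid=19956343"
       "&page=65&platform=html5&quality=1&vtype=mp4&type=jsonp&_=1562215557403")

# Browser-like User-Agent taken from the reply above.
HEADERS = {
    "User-Agent": ("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
                   "(KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36"),
}


def build_request(url, headers):
    """Prepare a GET request without sending it, so the outgoing headers can be inspected."""
    return requests.Request("GET", url, headers=headers).prepare()


def fetch(url, headers, timeout=10):
    """Send the prepared request and return the response (requires network access)."""
    with requests.Session() as session:
        return session.send(build_request(url, headers), timeout=timeout)


prepared = build_request(URL, HEADERS)
print(prepared.method, prepared.headers["User-Agent"])
```

Calling fetch(URL, HEADERS).text would then return the body, and h.status_code on the response gives the HTTP status separately from the code field inside the body.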
Take a look: emulate, emulate, emulate a browser!!!
#coding=utf-8
import requests
header={"Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
"Accept-Encoding":"gzip, deflate, br",
"Accept-Language":"zh-CN,zh;q=0.9",
"Cache-Control":"max-age=0",
"Connection":"keep-alive",
"Cookie":"CURRENT_FNVAL=16; buvid3=7D1AC0A0-88B0-453C-8F42-4AD5DD09000484586infoc; stardustvideo=1; rpdid=iwmkmwsoiidospqmilkxw; __guid=231148239.731019756462000900.1562223228121.9128; monitor_count=3",
"Host":"api.bilibili.com",
'If-None-Match':'"602555680aeba463c3cd8e598e653ce0"',
"Upgrade-Insecure-Requests":"1",
"User-Agent":"Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"
}
url='https://api.bilibili.com/playurl?callback=callbackfunction&aid=19956343&page=65&platform=html5&quality=1&vtype=mp4&type=jsonp&_=1562215557403'
req=requests.get(url,headers=header).text
print(req)
OP, you might as well try this code!!
lijiusong posted on 2019-7-4 15:05
#coding=utf-8
import requests
header={"Accept":"text/html,application/xhtml+xml,application/xml;q= ...
If there is a Cookie, it is best to redact it before posting.
lijiusong posted on 2019-7-4 15:05
#coding=utf-8
import requests
header={"Accept":"text/html,application/xhtml+xml,application/xml;q= ...
Tried it and it worked. Thank you very much for the help!
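On the earlier suggestion to redact the Cookie before sharing code: a minimal sketch of masking cookie values while keeping the cookie names readable. The function name redact_cookie and the sample cookie string are hypothetical, made up for illustration.

```python
def redact_cookie(cookie_header, keep=4):
    """Mask each cookie value; long values keep their first `keep` characters."""
    redacted = []
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip empty segments such as a trailing ";"
        name, sep, value = pair.partition("=")
        if not sep:
            redacted.append(name)  # flag-style entry with no value
            continue
        # Short values are hidden entirely so they cannot be read or guessed.
        shown = value[:keep] if len(value) > keep * 2 else ""
        redacted.append(f"{name}={shown}***")
    return "; ".join(redacted)


# Made-up cookie string, not taken from a real session.
print(redact_cookie("a=1; token=abcdefghijkl"))
```

The masked string is meant only for pasting into a post; the real request still needs the original Cookie value.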