My crawler's saved page is either garbled or the script raises an error, please help
Last edited by storm on 2020-9-24 10:52
# -*- coding: utf-8 -*-
# @Time : 2020/9/23 8:13
# @file : Xpath
from lxml import etree
import requests
# Request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36'
}
url = 'https://www.lmonkey.com/'
res = requests.get(url, headers=headers)
# Response status code
code = res.status_code
# Check the response result
if code == 200:
    print('Response OK')
    # Write to a file
    with open('./test.html', 'w') as fp:
        fp.write(res.text)
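Then it raises an error.
Yesterday it still produced output, but the output was garbled. What causes this? I searched on Baidu and could not find a suitable solution.
Clearly an encoding problem.
encode('utf-8')
woshijvm posted on 2020-9-23 08:47
Clearly an encoding problem.
I also suspect the encoding is the cause. How do I change the encoding and then write the result to the file?
Last edited by JackLove1234 on 2020-9-23 09:08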
from lxml import etree
import requests
# Request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36'
}
url = 'https://www.lmonkey.com/'
res = requests.get(url, headers=headers)
# Show the encoding that requests assumed for this response
print(res.encoding)
# Response status code
code = res.status_code
# Check the response result
if code == 200:
    print('Response OK')
    # Write to a file, stating the file encoding explicitly
    with open('./test.html', 'w', encoding='utf-8') as fp:
        fp.write(res.text)
# -*- coding: utf-8 -*-
# @Time : 2020/9/23 8:13
# @file : Xpath
from lxml import etree
import requests
# Request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36'
}
url = 'https://www.lmonkey.com/'
res = requests.get(url, headers=headers)
# Response status code
code = res.status_code
# Check the response result
if code == 200:
    print('Response OK')
    # Write to a file, stating the file encoding explicitly
    with open('./test.html', 'w', encoding='utf-8') as fp:
        fp.write(res.text)
culprit posted on 2020-9-23 08:53
encode('utf-8')
This guy is right.
from lxml import etree
import requests
# Request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36'
}
url = 'https://www.lmonkey.com/'
res = requests.get(url, headers=headers)
# res.encoding = res.apparent_encoding
# Show the encoding that requests assumed for this response
print(res.encoding)
# Response status code
code = res.status_code
# Check the response result
if code == 200:
    print('Response OK')
    # Write to a file, stating the file encoding explicitly
    with open('./test.html', 'w', encoding='utf-8') as fp:
        fp.write(res.text)
JackLove1234 posted on 2020-9-23 09:01
from lxml import etree
import requests
Thank you for your answer, thanks.
木子汐 posted on 2020-9-23 09:06
# -*- coding: utf-8 -*-
# @Time : 2020/9/23 8:13
# @file : Xpath
Thank you very much for your answer.
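For anyone landing here later, a minimal offline sketch of the two symptoms discussed in this thread may help. It assumes the page is served as UTF-8 and that the platform's default file encoding is a legacy codec such as GBK; neither is confirmed in the thread, since the traceback was not posted. The sample bytes in `raw` are made up and stand in for `res.content`. With requests, the matching fixes would be to set `res.encoding = 'utf-8'` (or `res.apparent_encoding`) before reading `res.text`, and to pass `encoding='utf-8'` to `open()`.

```python
import os
import tempfile

# Hypothetical stand-in for res.content: raw UTF-8 bytes from a server.
# The \xa0 is a non-breaking space, which is common in HTML pages.
raw = "<title>响应成功\xa0</title>".encode("utf-8")

# Symptom 1 (garbled text): decoding with the wrong charset never fails
# for ISO-8859-1, it just produces mojibake.
garbled = raw.decode("iso-8859-1")

# Decoding with the charset the bytes were really encoded in recovers the text.
text = raw.decode("utf-8")

path = os.path.join(tempfile.mkdtemp(), "test.html")

# Symptom 2 (error on write): a legacy codec such as GBK cannot represent
# every character, so writing raises UnicodeEncodeError.
try:
    with open(path, "w", encoding="gbk") as fp:
        fp.write(text)
    gbk_failed = False
except UnicodeEncodeError:
    gbk_failed = True

# Fix: state the file encoding explicitly instead of relying on the default.
with open(path, "w", encoding="utf-8") as fp:
    fp.write(text)

# Read the file back with the same encoding to confirm nothing was lost.
with open(path, encoding="utf-8") as fp:
    round_trip = fp.read()

print(gbk_failed, round_trip == text)
```

The key point is that decoding the response and encoding the file are two separate steps, and each needs the right charset: a wrong guess on the first gives garbled text, and a too-narrow codec on the second gives an exception.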