This post was last edited by fa00x on 2022-6-8 21:22
[Python]
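import os

import requests
from lxml import etree

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0'
}
save_dir = 'G:/pych2/pic33'  # output folder; change it to suit your machine
os.makedirs(save_dir, exist_ok=True)

for i in range(2800, 3482):  # full range: 1951, 5482
    for j in range(1, 40):
        url1 = f'https://www.2meinv.com/article-{i}-{j}.html'
        # timeout=10 is an arbitrary value; set your own
        res = requests.get(url=url1, headers=headers, timeout=10)
        tree = etree.HTML(res.text)
        img_urls = tree.xpath('/html/body/div[5]/a/img/@src')
        if not img_urls:
            break  # no image on this page; assume the article has no more pages
        img_url = img_urls[0]
        # print(img_url)
        url3 = requests.get(url=img_url, timeout=10)
        img_name = img_url.split("/")[-1]
        with open(os.path.join(save_dir, img_name), "wb") as f:
            f.write(url3.content)
        print(img_name)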
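Updated. The rest is up to you, guys.
Download notes: after a while the site blocks you and downloads stop working. With multiple threads the block kicks in even sooner. My guess is that the server has some monitoring on the backend. So stick to a single thread, and set the timeout yourself.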
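For anyone wondering what "single thread plus timeout" could look like in practice, here is a rough sketch and not part of the script above. The helper name fetch_slowly and its parameters (pause, retries, sleep) are made up for illustration, and the 2-second pause is only a guess, since the site's actual limits are unknown. It wraps one request at a time, waits after every attempt, and backs off a bit longer after each failure.

```python
import time


def fetch_slowly(fetch, pause=2.0, retries=2, sleep=time.sleep):
    """Call fetch() on the current thread, pausing after every attempt.

    fetch is any zero-argument callable that performs one request.
    Returns fetch()'s result, or None if every attempt raised OSError.
    (Hypothetical helper; name and defaults are assumptions.)
    """
    for attempt in range(retries + 1):
        try:
            result = fetch()
        except OSError:
            # requests.RequestException (timeouts, connection errors)
            # derives from OSError, so it is caught here as well.
            sleep(pause * (attempt + 1))  # wait a little longer each time
            continue
        sleep(pause)  # throttle before the caller's next request
        return result
    return None
```

With requests it could be used like this: res = fetch_slowly(lambda: requests.get(url1, headers=headers, timeout=10)), then skip the page if res is None. The timeout still goes on the requests.get call itself; the helper only adds the pauses and retries.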