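This post was last edited by 天轩科技 on 2023-8-19 02:54
Update: scraping all skin images for the Honor of Kings (王者荣耀) heroes [including paid skins]: https://www.52pojie.cn/forum.php?mod=viewthread&tid=1822926
Everyone is scraping Honor of Kings hero images, so I'm joining in too, hahaha. This is a review of things I learned earlier: scraping the Honor of Kings posters.
2023-08-18 04:25: I just noticed that I seem to have missed the posters for the paid skins. I'll deal with those tomorrow after some sleep. Hmm, it could be a fun challenge~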
Libraries needed: requests and BeautifulSoup4, plus lxml, which the code passes to BeautifulSoup as the parser
pip install requests
pip install bs4
pip install lxml
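The script in this post hands "lxml" to BeautifulSoup as the parser, and lxml is a separate package from bs4. Here is a minimal sketch, assuming you would rather fall back to Python's built-in "html.parser" than crash when lxml is missing. pick_parser is a hypothetical helper name, not something from the original script:

```python
import importlib.util


def pick_parser() -> str:
    """Return "lxml" if that package is installed, else the built-in "html.parser"."""
    # find_spec returns None when the top-level module cannot be found
    if importlib.util.find_spec("lxml") is not None:
        return "lxml"
    return "html.parser"


parser = pick_parser()
print(f"BeautifulSoup parser to use: {parser}")
```

The result can be passed straight through, e.g. BeautifulSoup(html, pick_parser()). The two parsers can treat badly broken HTML a little differently, so results may not be identical on every page.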
[Python]
import re

import requests
from bs4 import BeautifulSoup

url = "https://pvp.qq.com/web201605/herolist.shtml"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36 Edg/115.0.1901.203"
}

# Fetch the hero list page and collect the link to every hero detail page
html = requests.get(url, headers=headers).text
soup = BeautifulSoup(html, "lxml")
hero_urls = []
for i in soup.find_all(attrs={"class": "herolist clearfix"}):
    for o in i.find_all("a"):
        hero_urls.append("https://pvp.qq.com/web201605/" + o.get("href"))

count = 1
for ix in hero_urls:
    htm = requests.get(url=ix, headers=headers)
    htm.encoding = "GBK"  # decode the detail page as GBK
    soup1 = BeautifulSoup(htm.text, "lxml")
    for z in soup1.find_all(attrs={"class": "zk-con1"}):
        name = soup1.find(attrs={"class": "cover-name"}).get_text()
        # The poster URL sits inside the inline style attribute as //....jpg
        ZG = z.get("style")
        img_urls = re.search(r"//.*?\.jpg", str(ZG))
        if img_urls is None:
            continue  # no .jpg URL in this element, skip it
        imgs = "http:" + img_urls.group()
        pic = requests.get(imgs).content
        # Save the poster under the hero's name
        with open(f"{name}.jpg", "wb") as f:
            f.write(pic)
        print(f"Downloading === {name} == hero image, this is image #{count}")
        count += 1
print("All images downloaded!")
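The two fiddly steps in the script are building each hero page link and pulling the poster URL out of the style attribute. Here is a rough sketch of both as small functions you can try without touching the network. The names hero_page_url and extract_image_url, the sample href, and the example.com style string are all made up for illustration; they are not taken from the real pages:

```python
import re
from typing import Optional
from urllib.parse import urljoin

BASE_URL = "https://pvp.qq.com/web201605/herolist.shtml"

# Protocol-relative URL ending in .jpg; the dot is escaped so it only
# matches a literal "." and not any character
JPG_URL = re.compile(r"//[^\s)]+?\.jpg")


def hero_page_url(href: str) -> str:
    """Resolve a relative href from the hero list against the list page URL."""
    return urljoin(BASE_URL, href)


def extract_image_url(style: Optional[str]) -> Optional[str]:
    """Return the first .jpg URL found in a style string, or None if there is none."""
    if not style:
        return None
    match = JPG_URL.search(style)
    if match is None:
        return None
    # Same "http:" prefix as the script in this post
    return "http:" + match.group()


# Assumed sample inputs, only for demonstration
sample_href = "herodetail/105.shtml"
sample_style = "background:url('//example.com/images/105-bigskin-1.jpg') center 0"

print(hero_page_url(sample_href))
print(extract_image_url(sample_style))
print(extract_image_url("color: red"))
```

Returning None instead of calling .group() on a failed match means one odd page does not crash the whole run; the caller can just skip that element.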
Download link for the finished images: https://wwbq.lanzouj.com/im07915p5x8d