This post was last edited by Dlam万能的猫 on 2022-3-18 22:45
[Original source code] [Python] Web scraper: wallpapers
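You need to install two libraries, requests and bs4. The code also passes 'lxml' to BeautifulSoup as the parser, so lxml has to be installed as well.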
The images are saved wherever the .py file is, as long as you run it from that folder.
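Strictly speaking, a bare `open(pic_name, 'wb')` writes to the current working directory, which is the script's folder only when the script is launched from there. If you want the images next to the script no matter where it is started, one option is to build the path explicitly. This is only a minimal sketch; `save_path` is a hypothetical helper name that is not in the script.

```python
from pathlib import Path


def save_path(file_name, base_dir=None):
    """Return the full path for file_name inside base_dir.

    base_dir defaults to the current working directory, which is
    where a bare open(file_name, 'wb') would write.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return base / file_name
```

In the script this could be used as `open(save_path(pic_name, Path(__file__).resolve().parent), 'wb')`.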
Code:
import requests
from bs4 import BeautifulSoup
import re

for page in range(1, 1229):
    print('Downloading page ' + str(page) + '...')
    url = 'http://www.netbian.com/index_' + str(page) + '.htm'
    if page == 1:
        url = 'http://www.netbian.com/index.htm'  # the first page's URL differs from the rest, so swap it in
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36'}
    response = requests.get(url, headers=headers)
    bs = BeautifulSoup(response.content, 'lxml')
    li_list = bs.find('div', class_="list").ul.find_all('li')  # get the <li> tags
    for i in li_list:
        href = i.find('a')['href']
        if '/desk' in href:
            # raw string so that \d is not treated as an invalid string escape
            number = re.findall(r"\d+", href)[0]
            pic_url = 'http://www.netbian.com/desk/' + number + '-1920x1080.htm'  # detail page
            response2 = requests.get(pic_url, headers=headers)
            bs2 = BeautifulSoup(response2.content, 'lxml')
            final_url = bs2.find('td').a['href']  # image download URL
            pic_name = bs2.find('td').a['title'] + '.jpg'  # image file name
            response3 = requests.get(final_url, headers=headers)
            with open(pic_name, 'wb') as f:
                f.write(response3.content)
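Two details in the loop are easy to trip over: the first index page has a different URL from the rest, and a wallpaper title can contain characters that are not allowed in file names (such as `/` or `?`), which would make `open()` fail. Below is a rough sketch of those steps as small functions. The names `page_url`, `detail_url` and `safe_file_name` are hypothetical and not part of the script above. The `/desk/<number>.htm` href shape and the list of replaced characters (taken from Windows file-name rules) are assumptions.

```python
import re

BASE = 'http://www.netbian.com'


def page_url(page):
    """Build the index URL for a 1-based page number."""
    if page == 1:
        return BASE + '/index.htm'  # the first page has no number in its URL
    return BASE + '/index_' + str(page) + '.htm'


def detail_url(href):
    """Map an href such as '/desk/123.htm' (assumed shape) to the
    1920x1080 detail page URL, or return None for other links."""
    match = re.search(r'/desk/(\d+)', href)
    if match is None:
        return None
    return BASE + '/desk/' + match.group(1) + '-1920x1080.htm'


def safe_file_name(title, ext='.jpg'):
    """Replace characters that are invalid in Windows file names with '_'."""
    cleaned = re.sub(r'[\\/:*?"<>|]', '_', title).strip()
    return (cleaned or 'untitled') + ext
```

With these, the loop body could use `page_url(page)` for the index request and `safe_file_name(bs2.find('td').a['title'])` for the file name.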
Screenshot of a run: [screenshot]