吾爱破解 - 52pojie.cn

[Python, Repost] Scraping images from a wallpaper site, packaged as a desktop app so you can download without writing any code

微微星辰 posted on 2022-8-29 15:39
Last edited by 微微星辰 on 2022-8-30 08:42

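I wrote a crawler that downloads wallpapers in bulk. You can search directly by keyword, for example a celebrity's name. It ships with a built-in list of browser User-Agent strings, so nothing needs to be changed before running it.
Without further ado, here is the code:
[Python]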
import requests
import re
import os
import time
import random

USER_AGENTS = [
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0",
    "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; InfoPath.3; rv:11.0) like Gecko",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11",
    "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Maxthon 2.0)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; TencentTraveler 4.0)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; The World)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SE 2.X MetaSr 1.0; SE 2.X MetaSr 1.0; .NET CLR 2.0.50727; SE 2.X MetaSr 1.0)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; 360SE)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser)",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
    "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_3 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8J2 Safari/6533.18.5",
    "Mozilla/5.0 (iPod; U; CPU iPhone OS 4_3_3 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8J2 Safari/6533.18.5",
    "Mozilla/5.0 (iPad; U; CPU OS 4_3_3 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8J2 Safari/6533.18.5",
    "Mozilla/5.0 (Linux; U; Android 2.3.7; en-us; Nexus One Build/FRF91) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1",
    "MQQBrowser/26 Mozilla/5.0 (Linux; U; Android 2.3.7; zh-cn; MB200 Build/GRJ22; CyanogenMod-7) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1",
    "Opera/9.80 (Android 2.3.4; Linux; Opera Mobi/build-1107180945; U; en-GB) Presto/2.8.149 Version/11.10",
    "Mozilla/5.0 (Linux; U; Android 3.0; en-us; Xoom Build/HRI39) AppleWebKit/534.13 (KHTML, like Gecko) Version/4.0 Safari/534.13",
    "Mozilla/5.0 (BlackBerry; U; BlackBerry 9800; en) AppleWebKit/534.1+ (KHTML, like Gecko) Version/6.0.0.337 Mobile Safari/534.1+",
    "Mozilla/5.0 (hp-tablet; Linux; hpwOS/3.0.0; U; en-US) AppleWebKit/534.6 (KHTML, like Gecko) wOSBrowser/233.70 Safari/534.6 TouchPad/1.0",
    "Mozilla/5.0 (SymbianOS/9.4; Series60/5.0 NokiaN97-1/20.0.019; Profile/MIDP-2.1 Configuration/CLDC-1.1) AppleWebKit/525 (KHTML, like Gecko) BrowserNG/7.1.18124",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows Phone OS 7.5; Trident/5.0; IEMobile/9.0; HTC; Titan)",
    "UCWEB7.0.2.37/28/999",
    "NOKIA5700/ UCWEB7.0.2.37/28/999",
    "Openwave/ UCWEB7.0.2.37/28/999",
    "Mozilla/4.0 (compatible; MSIE 6.0; ) Opera/UCWEB7.0.2.37/28/999",
    # iPhone 6:
    "Mozilla/6.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/8.0 Mobile/10A5376e Safari/8536.25",
]

headers = {
    'User-Agent': random.choice(USER_AGENTS)
   # 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'
}
# Create the output folder, emptying it first if it already exists
def file_folder():
    # Target folder: a '肖战' subfolder of the current working directory
    # If it already exists, empty it, delete it, then recreate it
    pathd = os.getcwd() + '\\肖战'
    if os.path.exists(pathd):  # does the folder already exist?
        for root, dirs, files in os.walk(pathd, topdown=False):
            for name in files:
                os.remove(os.path.join(root, name))  # delete files
            for name in dirs:
                os.rmdir(os.path.join(root, name))  # delete subfolders
        os.rmdir(pathd)  # delete the folder itself
    os.mkdir(pathd)  # create it fresh


count_1 = 1
sename = ""

def data(url):
    print(url)
    # Pass headers by keyword: the second positional argument of requests.get is params
    response = requests.get(url, headers=headers)
    response.encoding = 'utf-8'
    html = response.text

    # Image titles sit in <p> tags; image URLs sit in inline background-image styles
    url_na = r'<p>(.*?)</p>'
    url_r = r'style="background-image: url\((.*?)\)">'
    url_name = re.findall(url_na, html)
    print(url_name)
    url_src = re.findall(url_r, html)
    print(url_src)

    # Save everything under E:\<keyword>; create the folder if it is missing
    path = "E:\\" + sename
    os.makedirs(path, exist_ok=True)

    # zip stops at the shorter list, so a missing title cannot raise IndexError
    for namel, i in zip(url_name, url_src):
        print(i)
        ress = requests.get(i, headers=headers)
        with open(path + "\\" + namel + ".jpg", "wb") as f:
            f.write(ress.content)
        time.sleep(0.5)  # short pause between downloads


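if __name__ == "__main__":
    # Read the search keyword; data() also uses it as the name of the save folder
    sename = input('Enter a search keyword: ')
    url = "https://www.tt98.com/search/index.php?key="+sename+"&page="
    pagestart = 1
    pagestop = 6  # exclusive, so pages 1 to 5 are crawled
    # file_folder()
    for i in range(int(pagestart), int(pagestop)):
        if i > 1:
            st = url + str(i) + '.html'
        else:
            st = url
        print(f'Starting page {i}')
        data(st)
        print('Page', i, 'done; if no images appear in the folder, the URL was entered incorrectly')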
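
The main block builds the search URL by plain string concatenation, so a keyword containing spaces or non-ASCII characters goes into the URL unescaped. Below is a minimal sketch of the same URL scheme with the keyword percent-encoded via `urllib.parse.quote`. The names `BASE` and `page_url` are hypothetical (they are not in the original script), and whether the site needs or accepts exactly this form has not been verified here.

```python
from urllib.parse import quote

# Search endpoint copied from the script; everything else here is illustrative.
BASE = "https://www.tt98.com/search/index.php"


def page_url(keyword, page):
    """Mirror the script's URL scheme, with the keyword percent-encoded."""
    url = BASE + "?key=" + quote(keyword) + "&page="
    # The script appends '<n>.html' only for pages after the first.
    return url if page <= 1 else url + str(page) + ".html"


if __name__ == "__main__":
    for n in range(1, 4):
        print(page_url("肖战", n))
```

In the loop, `st = page_url(sename, i)` would take the place of the `if i > 1` branch.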



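I also built an application from it. If you just want to use it, you can download it directly (crawler source code and exe executable):
Adding a Lanzou Cloud link: https://wwd.lanzouw.com/iYGlu0ajbg1i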
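
One thing to watch if you run the source: `data()` uses the scraped `<p>` titles directly as file names and saves under a fixed `E:\` path. Windows rejects file names containing characters such as `\ / : * ? " < > |`, and not every machine has an `E:` drive. Here is a minimal sketch of one way to guard against both, assuming two hypothetical helpers, `safe_filename` and `target_path`, that are not part of the original script. Defaulting to the home directory is my assumption, not the author's choice.

```python
import re
from pathlib import Path

# Characters Windows does not allow in file names, plus ASCII control characters.
_INVALID = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(title, fallback="untitled"):
    """Hypothetical helper: turn a scraped title into a usable file name."""
    cleaned = _INVALID.sub("_", title).strip(" .")
    return cleaned or fallback


def target_path(keyword, title, base=None):
    """Build <base>/<keyword>/<title>.jpg; base defaults to the home directory."""
    base = Path(base) if base is not None else Path.home()
    return base / safe_filename(keyword) / (safe_filename(title) + ".jpg")


if __name__ == "__main__":
    print(target_path("肖战", 'poster: "2022"?'))
```

Before writing a file, `target_path(...).parent.mkdir(parents=True, exist_ok=True)` would create the folder.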

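forever2006 posted on 2022-8-29 21:10
It crashes on launch under Windows 11.
红客联盟红哥 posted on 2022-8-30 08:16
ErXing posted on 2022-8-30 08:19
OP | 微微星辰 posted on 2022-8-30 08:33
forever2006 posted on 2022-8-29 21:10
It crashes on launch under Windows 11.

You can try running the source code in PyCharm instead.
skoy03 posted on 2022-9-6 11:26
What should I do if I don't know how to use Python?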