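This post was last edited by ORCW on 2020-5-31 13:37
Homemade Python comic downloader v1.0
The download link is at the very bottom; please read everything above it first.
Site scraped: https://manhua.dmzj.com/
Features: 1. Search for comics and download them (saved in jpg or png format)
2. You choose where downloads are saved (there is no default save location, so the location has to be chosen again each time the program is started)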
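The save-location feature might work roughly like the sketch below. This is not the tool's actual source: `choose_save_dir` and `target_path` are hypothetical names, and it assumes the GUI is built on tkinter and uses its standard `filedialog.askdirectory` folder picker.

```python
import os


def choose_save_dir(ask=None):
    """Ask the user for a folder; return None if they cancel.

    `ask` is a hypothetical hook so the dialog can be swapped out;
    by default it is tkinter's standard folder picker.
    """
    if ask is None:
        from tkinter import filedialog
        ask = filedialog.askdirectory
    path = ask()
    # The picker gives back an empty value when the dialog is cancelled
    return path or None


def target_path(save_dir, page_number, ext="jpg"):
    """Build the output path for one page, saved as jpg or png."""
    if ext not in ("jpg", "png"):
        raise ValueError("ext must be 'jpg' or 'png'")
    return os.path.join(save_dir, f"{page_number:03d}.{ext}")
```

Since there is no default location, a caller would simply refuse to start a download while `choose_save_dir()` returns `None`.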
Preview:
Workflow:
Download speed:
The original code (the current version only adds search and a GUI on top of this):
# Written against Selenium 4 (Service / By API)
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# URL of the comic's chapter-list page
url = input('URL: ')
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
header = {'user-agent': user_agent}
# Image requests send a chapter page on the same site as the Referer
header3 = {'user-agent': user_agent,
           'Referer': 'https://manhua.dmzj.com/wunengdenainai/64301.shtml'}

# Headless Chrome, so no browser window opens while scraping
chrome_options = Options()
chrome_options.add_argument('--headless')

# Collect the chapter links from the chapter-list page
r = requests.get(url, headers=header).text
b = BeautifulSoup(r, 'lxml').find('div', class_='cartoon_online_border').find_all('a')
for i in b:
    url2 = 'https://manhua.dmzj.com/' + i['href'] + '#@page=1'
    # Raw string, so the backslashes in the Windows path are taken literally
    driver = webdriver.Chrome(
        service=Service(r'C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe'),
        options=chrome_options)
    driver.get(url2)
    # Grab the rendered HTML after the page's JavaScript has run
    html = driver.find_element(By.XPATH, "//*").get_attribute("outerHTML")
    # quit() also shuts down the chromedriver process started for this chapter
    driver.quit()
    # Each <option> value in the rendered page is treated as one image URL
    b2 = BeautifulSoup(html, 'lxml').find_all('option')
    for e in b2:
        url3 = 'https:' + str(e['value'])
        print(url3)
        r3 = requests.get(url3, headers=header3).content
        # File name is the last six characters of the image URL
        filename = 'D:\\11MM\\' + url3[-6:]
        print(filename)
        with open(filename, 'wb') as f:
            f.write(r3)
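Because the script names each file by the last six characters of its URL and writes everything into one folder, pages from different chapters can end up with the same name and overwrite each other. One possible way around that, sketched below, is to give every chapter its own folder. `image_name` and `chapter_page_path` are hypothetical helpers, not part of the original script, and the URLs used with them here are made up for illustration.

```python
import os
from urllib.parse import unquote, urlsplit


def image_name(image_url):
    """Return the file name at the end of an image URL, without any query string."""
    return os.path.basename(unquote(urlsplit(image_url).path))


def chapter_page_path(save_dir, chapter_index, image_url):
    """Place each chapter in its own numbered folder under save_dir."""
    chapter_dir = os.path.join(save_dir, f"chapter_{chapter_index:03d}")
    return os.path.join(chapter_dir, image_name(image_url))
```

In the download loop, the chapter index could come from `enumerate(b)`, and the folder would need to exist before writing, for example via `os.makedirs(os.path.dirname(path), exist_ok=True)`.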
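Download links:
Lanzou Cloud direct link: https://ww.lanzouj.com/id6gs0d
Source code link: https://ww.lanzouj.com/id6gstc
Known issues:
1. Searching and downloading can be rather slow
2. Search against the dmzj (漫画之家) source sometimes fails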
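For the first issue, one common way to speed up the image downloads is to fetch several pages in parallel. The sketch below is only an assumption about how that might be done, not something the tool currently does: `download_all` and `fetch` are hypothetical names, and `fetch` stands in for whatever function performs the real HTTP request.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def download_all(jobs, fetch, max_workers=4):
    """Download (url, path) pairs in parallel and return the paths written.

    `fetch` is any callable that takes a URL and returns the file's bytes.
    """
    def save(job):
        url, path = job
        data = fetch(url)
        # Create the target folder if needed; safe when several threads race
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
        return path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map keeps the results in the same order as the input jobs
        return list(pool.map(save, jobs))
```

With the script above, `fetch` could be something like `lambda u: requests.get(u, headers=header3).content`. This only parallelizes the image requests; the time spent starting a browser for each chapter would be unaffected.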
If you find this useful, could you leave a free rating? Pretty please.