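Python pdfkit: printing a web-based Django 2.0 tutorial to a PDF document
I've been learning Django 2.0 recently, so I printed Du Sai's (杜赛) online Django 2.0 tutorial to a PDF document to make studying easier!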
https://www.dusaiphoto.com/article/detail/2/
Thanks to the author for sharing!
Attached: the Django 2.0 tutorial as a PDF document
Link: https://pan.baidu.com/s/1HM7_tEqcPk3h2ffZiYV17Q Extraction code: udpi (valid for seven days!)
Code:
# Scrape web pages and print them to a PDF document
# -*- coding: UTF-8 -*-
import pdfkit
import requests
from lxml import etree
import re
confg = pdfkit.configuration(wkhtmltopdf=r'C:\Users\Administrator\AppData\Local\Programs\Python\Python37\wkhtmltox\bin\wkhtmltopdf.exe')
# Collect the chapter links
def get_listurl():
    url = "https://www.dusaiphoto.com/article/detail/2/"
    list_url = [url]
    html = requests.get(url).content.decode('utf-8')
    # Narrow the page down to the table-of-contents card, then pull out each chapter href
    con = re.findall(r'<div class="card-text" style="overflow: hidden">(.+?)<div class="container-fluid">', html, re.S)[0]
    listurls = re.findall(r'<p class="mb-0">.+?<a href="(.+?)".+?style="color: #b8b8b8;"', con, re.S)
    for listurl in listurls:
        # The hrefs are site-relative, so prepend the domain
        listurl = f'https://www.dusaiphoto.com{listurl}'
        list_url.append(listurl)
    print(list_url)
    return list_url
# Fetch the article body
def get_content(url):
    # url = 'https://www.dusaiphoto.com/article/detail/4/'
    html = requests.get(url).content.decode('utf-8')
    # Everything between these two divs is the article text
    content = re.findall(r'<div class="mt-4">(.+?)<div class="mt-4 mb-4">', html, re.S)[0]
    return content
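# Save the HTML as a PDF document
def dypdf(contents):
    # Parsing with lxml wraps the fragments in a complete <html><body> tree
    contents = etree.HTML(contents)
    s = etree.tostring(contents).decode()
    print("Printing started!")
    pdfkit.from_string(s, r'out.pdf', configuration=confg)
    print("PDF saved successfully!")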
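if __name__ == '__main__':
    contents = ''
    urls = get_listurl()
    for url in urls:
        print(url)
        content = get_content(url)
        # Append each chapter, followed by a blank paragraph as a separator
        contents = '%s%s%s' % (contents, content, '<p><br><p>')
    # Print all chapters into a single PDF once the loop has finished
    dypdf(contents)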
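
A side note on the link-collecting step: the script glues the domain onto each href by hand and indexes `[0]` on the regex result, which raises an error if the page layout changes. Below is a rough sketch of a slightly more defensive variant, not the code I actually ran. The function name `extract_links` and the sample HTML are made up for illustration, and the looser regex simply grabs every `<a href="...">` in whatever HTML you hand it.

```python
import re
from urllib.parse import urljoin

# Assumption: the table-of-contents page used in the script above
BASE_URL = "https://www.dusaiphoto.com/article/detail/2/"


def extract_links(html, base_url=BASE_URL):
    """Return absolute, de-duplicated link targets found in html, in page order."""
    hrefs = re.findall(r'<a\s[^>]*href="([^"]+)"', html)
    seen = set()
    links = []
    for href in hrefs:
        # urljoin resolves site-relative paths such as /article/detail/4/
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


# Hypothetical snippet that only imitates the shape of the real page
sample = (
    '<p class="mb-0"><a href="/article/detail/4/" style="color: #b8b8b8;">one</a></p>'
    '<p class="mb-0"><a href="/article/detail/4/">duplicate</a></p>'
    '<p class="mb-0"><a href="/article/detail/5/">two</a></p>'
)
print(extract_links(sample))
```

If nothing matches, this returns an empty list instead of crashing, so the caller can decide what to do.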
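pdfkit is only a wrapper around the wkhtmltopdf tool, which has to be installed separately. Download it from: https://wkhtmltopdf.org/downloads.html
In the code, confg = pdfkit.configuration(wkhtmltopdf=r'C:\Users\Administrator\AppData\Local\Programs\Python\Python37\wkhtmltox\bin\wkhtmltopdf.exe')
Replace the wkhtmltopdf= path with the location of wkhtmltopdf.exe on your own machine!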
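
If you would rather not hard-code the path, something along these lines might work. It is only a sketch under my own assumptions: the helper names `find_wkhtmltopdf` and `make_config` are hypothetical, and it checks the system PATH first, then any fallback locations you pass in.

```python
import os
import shutil


def find_wkhtmltopdf(extra_candidates=()):
    """Return a path to the wkhtmltopdf executable, or None if none is found."""
    # First try the PATH, which works if the installer registered the tool
    found = shutil.which("wkhtmltopdf")
    if found:
        return found
    # Then try explicit fallback locations supplied by the caller
    for candidate in extra_candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def make_config(extra_candidates=()):
    """Build a pdfkit configuration from whatever find_wkhtmltopdf locates."""
    import pdfkit  # imported lazily so the lookup above works without pdfkit

    path = find_wkhtmltopdf(extra_candidates)
    if path is None:
        raise FileNotFoundError("wkhtmltopdf not found; install it or pass its path")
    return pdfkit.configuration(wkhtmltopdf=path)
```

You could then write `confg = make_config([r'C:\path\to\wkhtmltopdf.exe'])`, with your own path as the fallback.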
Reference tutorial:
Converting HTML pages to PDF files with Python https://www.cnblogs.com/xiaowenshu/p/9916719.html
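
For the HTML-to-PDF step itself, here is a minimal sketch of an alternative to routing the fragments through lxml: wrap them in a full HTML document that declares UTF-8, and pass wkhtmltopdf's `encoding` and `page-size` options through pdfkit. The names `build_document` and `save_pdf`, the default title and the `<hr>` separator are my own illustrative choices, not part of the script above.

```python
import html


def build_document(fragments, title="Django 2.0 tutorial"):
    """Join HTML fragments into one complete document that declares UTF-8."""
    body = "\n<hr>\n".join(fragments)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>"
    )


def save_pdf(document, out_path="out.pdf", configuration=None):
    """Render an HTML string to a PDF file with pdfkit."""
    import pdfkit  # imported lazily; also requires wkhtmltopdf to be installed

    # These keys map to the wkhtmltopdf flags --encoding and --page-size
    options = {"encoding": "UTF-8", "page-size": "A4"}
    pdfkit.from_string(document, out_path, options=options, configuration=configuration)


doc = build_document(["<p>Chapter 1</p>", "<p>Chapter 2</p>"])
print(doc)
```

Calling `save_pdf(doc, configuration=confg)` would then write out.pdf, assuming `confg` is set up as in the main script.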
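It's written rather roughly, so please bear with me!
Searching the forum, I found another member's version that is more complete. You can refer to it!
Scraping the C语言中文网 tutorials with Python to generate a PDF
https://www.52pojie.cn/thread-990598-1-1.html
@null119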
Posting takes effort, so if this helped you, please leave a rating and a like. Thanks!
Questions are welcome!
Anyone else learning Django is welcome to get in touch too!