This post was last edited by susheng on 2022-11-6 19:11
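Many people do not distinguish between topic URLs and template URLs. A topic URL looks like https://mp.weixin.qq.com/mp/appmsgalbum ; to download from a topic URL, use the tool at https://wwn.lanzouy.com/iMrNE0dw3ekd instead.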
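If you are not sure which kind of address you have, a quick check like the one below can help. This is only a minimal sketch: the function name `is_album_url` is made up for illustration, and the only thing it relies on is the `/mp/appmsgalbum` path from the example topic URL above. Anything that is not a topic URL is simply reported as "not a topic", since the post does not spell out what a template URL looks like.

```python
from urllib.parse import urlparse

# Path taken from the example topic URL in the post.
ALBUM_PATH = "/mp/appmsgalbum"


def is_album_url(address: str) -> bool:
    """Return True if the address points at a WeChat topic (album) page."""
    parsed = urlparse(address.strip())
    return (
        parsed.netloc == "mp.weixin.qq.com"
        and parsed.path.rstrip("/") == ALBUM_PATH
    )


if __name__ == "__main__":
    address = "https://mp.weixin.qq.com/mp/appmsgalbum"
    if is_album_url(address):
        print("Topic URL: use the topic downloader.")
    else:
        print("Not a topic URL: it may be a template URL.")
```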
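I previously posted a tool for batch downloading official account article content, audio, and video. This version adds batch downloading from page templates, for example the Alipay account's template page: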
Open the tool ( https://wwk.lanzoue.com/icAqd0fbyeni ) and enter the template URL to start the download:
On a second run, articles that were already downloaded are skipped:
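The skip check can be sketched roughly like this. It assumes the links of earlier downloads are read back from the article list file, that the file is a properly quoted CSV, and that the link sits in the fifth column (index 4). The helper names `load_downloaded_links` and `should_skip` are hypothetical and are not taken from the tool itself.

```python
import csv
import html
import io


def load_downloaded_links(list_file, link_column=4):
    """Collect the links already recorded in the article list.

    list_file is any open text file (or file-like object) in CSV format.
    link_column is an assumption about where the link is stored.
    """
    links = set()
    for row in csv.reader(list_file):
        if len(row) > link_column:
            links.add(html.unescape(row[link_column]))
    return links


def should_skip(link, downloaded):
    """Return True if the (possibly HTML-escaped) link was downloaded before."""
    return html.unescape(link) in downloaded


if __name__ == "__main__":
    sample = io.StringIO(
        "2022-11-06,Title,Author,Digest,"
        "https://example.com/a?x=1&y=2,https://example.com/c.jpg\n"
    )
    downloaded = load_downloaded_links(sample)
    print(should_skip("https://example.com/a?x=1&amp;y=2", downloaded))
```

Links are unescaped on both sides before comparing, because the same address can show up with `&amp;` in one place and `&` in another.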
Then use this batch HTML-to-PDF tool to convert the saved pages: https://wwk.lanzouf.com/iSpV90fbtpqh
The tool also generates an Excel article list containing each article's date, title, link, and cover image.
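One thing to watch with a comma-separated list is that titles or digests containing commas will shift the columns. A possible way around it, shown here only as a sketch, is to write rows with Python's `csv` module, which quotes such fields. The field names follow the keys used in the code excerpt in this post; the function name `append_article_row` is hypothetical.

```python
import csv
import io

# Column order follows the row written by the code excerpt in this post.
FIELDS = ["date", "title", "author", "digest", "link", "cover"]


def append_article_row(list_file, article):
    """Write one article as a CSV row; missing fields become empty strings."""
    writer = csv.writer(list_file)
    writer.writerow([article.get(name, "") for name in FIELDS])


if __name__ == "__main__":
    buffer = io.StringIO()
    append_article_row(
        buffer,
        {
            "date": "2022-11-06",
            "title": "Hello, world",  # the comma is quoted automatically
            "link": "https://example.com/a",
        },
    )
    print(buffer.getvalue())
```

When writing to a real file, open it with `newline=''` as the `csv` documentation recommends, so that line endings are not doubled on Windows.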
Part of the code is shown below:
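import html
import time
from random import randint

import requests

# url, headers, urls, fname, encoding and trimName() are defined in the
# part of the program that is not shown here.
def down(begin, count):
    # Build the list request from the template URL, without the '#wechat_redirect' fragment.
    url2 = url.replace('#wechat_redirect', '')
    url_home = f'{url2}&begin={begin}&count={count}&action=appmsg_list&f=json&r=0.26146868035616433&appmsg_token='
    res = requests.post(url_home, headers=headers, verify=False).json()
    for i in res['appmsg_list']:
        link = html.unescape(i['link'])
        # Skip articles that were saved by an earlier run.
        if link in urls:
            print('Article already downloaded: ' + link)
            continue
        data = requests.get(link, headers=headers, verify=False)
        # Images are referenced through data-src; rename it to src so they load in the saved file.
        content = data.text.replace('data-src', 'src')
        # Defaults, so the list row below can still be written if the try block fails.
        date = title = ''
        try:
            date = time.strftime('%Y-%m-%d', time.localtime(int(i['sendtime'])))
            title = i['title']
            print('Downloading article:', title, link)
            with open(date + '_' + trimName(title) + '.html', 'w', encoding='utf-8') as f:
                f.write(content)
        except Exception as e:
            # Fall back to a random file name if the date or title cannot be used.
            with open(str(randint(1, 10)) + '.html', 'w', encoding='utf-8') as f:
                f.write(content)
            print('Error:', e)
        # Append one row per article to the list file.
        with open(fname, 'a+', encoding=encoding) as f2:
            f2.write(date + ',' + title + ',' + i['author'] + ',' + i['digest'] + ',' + link + ',' + i['cover'] + '\n')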
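The list request in the excerpt is built by string replacement, which only works when the address ends with exactly `#wechat_redirect` and already has a query string. A slightly more forgiving variant could look like the sketch below. The function name `build_list_url` is made up for illustration; the query parameters, including the fixed `r` value and the empty `appmsg_token`, are copied from the excerpt as-is, and whether the server actually needs all of them has not been checked here.

```python
from urllib.parse import urldefrag, urlencode


def build_list_url(template_url: str, begin: int, count: int) -> str:
    """Build the article-list request URL for one page of a template."""
    # Drop '#wechat_redirect' or any other fragment.
    base, _fragment = urldefrag(template_url)
    # Parameters copied from the excerpt in this post.
    params = urlencode(
        {
            "begin": begin,
            "count": count,
            "action": "appmsg_list",
            "f": "json",
            "r": "0.26146868035616433",
            "appmsg_token": "",
        }
    )
    # Use '?' if the template URL has no query string yet.
    separator = "&" if "?" in base else "?"
    return f"{base}{separator}{params}"


if __name__ == "__main__":
    # Placeholder address, not a real template URL.
    print(build_list_url("https://example.com/page?id=1#wechat_redirect", 0, 10))
```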