Scraping Baidu Tieba emoticons with requests
Last edited by 三滑稽甲苯 on 2020-2-21 23:25
Result: [image]
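Approach:
1. Log in to the Tieba web version, open a thread, and look at the link of an emoticon image (for example https://tb2.bdstatic.com/tb/editor/images/client/image_emoticon23.png)
2. Try changing the number before .png; the modified links load normally
3. Write the code!
Some emoticon numbers have no image. If you have ideas on how to tell whether an emoticon exists, feel free to comment.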
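One possible way to tell whether a number has a real emoticon, as a minimal sketch: check that the request succeeded and that the body really is a PNG (every PNG file starts with the same 8-byte signature). The helper names `looks_like_png` and `emoticon_exists` are made up for this illustration. It assumes the server answers a missing number with an error status or a non-PNG body, which I have not verified. If it returns a placeholder PNG instead, this check will not catch it.

```python
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'  # first 8 bytes of every PNG file


def looks_like_png(data: bytes) -> bool:
    """Return True if the payload starts with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


def emoticon_exists(status_code: int, body: bytes) -> bool:
    """Hypothetical helper: accept a response only if the request
    succeeded and the body is actually a PNG image."""
    return status_code == 200 and looks_like_png(body)
```

With requests this would be called as `emoticon_exists(response.status_code, response.content)` before writing the file.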
Work in progress:
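from requests import get
from requests.exceptions import RequestException

path = 'E:/贴吧表情/'  # output folder; it must already exist
url = 'https://tb2.bdstatic.com/tb/editor/images/client/image_emoticon{}.png'
i = 1
while i <= 150:
    try:
        print(f'Downloading {i}.png...')
        response = get(url.format(i))
    except RequestException:
        # Stop at the first network error
        break
    else:
        # Note: this version saves whatever comes back, even for missing numbers
        print(f'Writing {i}.png...')
        with open(f'{path}{i}.png', 'wb') as f:
            f.write(response.content)
    i += 1

Reply: Thanks. Creating a batch task in Xunlei also works: https://tb2.bdstatic.com/tb/editor/images/client/image_emoticon(*).png. Do you have the Toutiao emoticons too?

Improved version: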
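It can now tell whether an emoticon exists.
As you can see, the nonexistent emoticons 51-60 were not saved:

from requests import get
from requests.exceptions import RequestException

path = r'E://python/script/data/贴吧表情/'  # output folder; it must already exist
url = 'https://tb2.bdstatic.com/tb/editor/images/client/image_emoticon{}.png'
i = 1
# Emoticon 51 does not exist, so its response body is the "not found" reference.
# Compare raw bytes: decoding image data as text is slow and pointless.
fail = get(url.format(51)).content
while i <= 150:
    try:
        print(f'Downloading {i}.png...')
        response = get(url.format(i))
    except RequestException:
        # Stop at the first network error
        break
    else:
        if response.content == fail:
            print(f'Image {i} not found.')
        else:
            print(f'Writing {i}.png...')
            with open(f'{path}{i}.png', 'wb') as f:
                f.write(response.content)
    i += 1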
Result: [image]
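As a variation on the improved version above, the same "compare against a known-missing emoticon" idea can be wrapped in a function, as a rough sketch. The names `fetch_with_requests` and `download_emoticons` and the 10-second timeout are my own choices, not part of the original script. The fetcher is passed in as a parameter so the loop can be tried offline with a fake one. Using 51 as the reference relies on the observation in this post that emoticons 51-60 do not exist.

```python
from pathlib import Path


def fetch_with_requests(url):
    """Default fetcher (assumed helper); returns (status_code, body_bytes)."""
    import requests  # imported here so the loop below can be tried offline
    response = requests.get(url, timeout=10)
    return response.status_code, response.content


def download_emoticons(url_template, numbers, out_dir,
                       fetch=fetch_with_requests, missing_number=51):
    """Save every emoticon whose body differs from a known-missing one.

    Returns the list of numbers that were written to disk.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    # The body returned for a missing emoticon is the "not found" reference.
    _, missing_body = fetch(url_template.format(missing_number))
    saved = []
    for i in numbers:
        status, body = fetch(url_template.format(i))
        if status != 200 or body == missing_body:
            continue  # skip failed requests and missing emoticons
        (out_dir / f'{i}.png').write_bytes(body)
        saved.append(i)
    return saved
```

For example, `download_emoticons(url, range(1, 151), 'E:/贴吧表情/')` would cover the same range as the script. Unlike the script, network errors are not caught here and will propagate to the caller.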