[Python] Scraping the data behind Bilibili's trending video chart (no downloading), with forwarding to a WeCom (企业微信) group bot
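## Introduction (try it yourself to see the results)
**Features:**
Scrape Bilibili's real-time trending video chart;
Fetch detailed data (comments, danmaku, and so on);
Filter videos by category;
Forward the results to a WeCom (企业微信) group bot.
**Concepts involved:** parsing the JSON returned by a web endpoint
**Requirements:**
Python 3
json (standard library; only needed for the WeCom forwarding code)
requests
faker (optional)
A WeCom group bot (optional)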
## Source code
```
from faker import Faker  # optional: only used to generate a random User-Agent
import requests

f = Faker()
headers = {
    'user-agent': f.user_agent()
}
link = "https://api.bilibili.com/x/web-interface/popular?ps=50&pn=1"
res_link = requests.get(link, headers=headers)
json_link = res_link.json()
list_video = json_link['data']['list']
video_number = 0
# Edit this list to choose which categories (tname) are shown.
filterList = ["热点", "综合", "日常", "社科人文", "环球", "科技"]
for video in list_video:
    # Read the nested fields from the current video, not from the whole list.
    owner_video = video['owner']
    rcmd_video = video['rcmd_reason']
    stat_video = video['stat']
    coin_video = stat_video['coin']
    danmaku_video = stat_video['danmaku']
    favorite_video = stat_video['favorite']
    like_video = stat_video['like']
    reply_video = stat_video['reply']
    share_video = stat_video['share']
    view_video = stat_video['view']
    id_video = video['bvid']
    url = "https://www.bilibili.com/video/" + id_video
    video_name = video['title']
    video_description = video['desc']
    video_owner = owner_video['name']
    rcmd_reason_video = rcmd_video['content']
    tname_video = video['tname']
    video_number = video_number + 1
    if tname_video in filterList:
        print('Uploader: ' + video_owner)
        print('Category: ' + tname_video)
        # The recommendation reason can be empty, so print it only when present.
        if rcmd_reason_video:
            print('Recommended because: ' + rcmd_reason_video)
        print('Video #' + str(video_number) + ': ' + '---- ' + video_name + ' ----' + '\n\n')
        print('Description: ' + video_description + '\n\n')
        print('Likes: ' + str(like_video))
        print('Views: ' + str(view_video))
        print('Coins: ' + str(coin_video))
        print('Favorites: ' + str(favorite_video))
        print('Shares: ' + str(share_video))
        print('Danmaku: ' + str(danmaku_video))
        print('Comments: ' + str(reply_video) + '\n')
        print('Link: ' + url)
        print('=' * 50)
```
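The script above indexes straight into the response, so a missing key raises `KeyError`. As a minimal sketch (not the author's code), the per-video extraction could be pulled into one function that uses `.get` with defaults. The helper name `extract_video_info` and the sample dictionary are hypothetical; the field names are the ones the script already reads.

```python
def extract_video_info(video):
    """Flatten one entry of data.list into a plain dict, tolerating missing keys."""
    stat = video.get("stat") or {}
    owner = video.get("owner") or {}
    rcmd = video.get("rcmd_reason") or {}
    return {
        "title": video.get("title", ""),
        "desc": video.get("desc", ""),
        "tname": video.get("tname", ""),
        "owner": owner.get("name", ""),
        "rcmd_reason": rcmd.get("content", ""),
        "url": "https://www.bilibili.com/video/" + video.get("bvid", ""),
        "like": stat.get("like", 0),
        "view": stat.get("view", 0),
        "coin": stat.get("coin", 0),
        "favorite": stat.get("favorite", 0),
        "share": stat.get("share", 0),
        "danmaku": stat.get("danmaku", 0),
        "reply": stat.get("reply", 0),
    }


# Hypothetical sample shaped like one item of json_link['data']['list'];
# the values are placeholders, not real data.
sample = {
    "bvid": "BV_PLACEHOLDER",
    "title": "Example title",
    "tname": "日常",
    "owner": {"name": "Example uploader"},
    "stat": {"like": 10, "view": 200},
    # "rcmd_reason" and "desc" are deliberately left out.
}

info = extract_video_info(sample)
print(info["url"])
print(info["like"], info["coin"], repr(info["rcmd_reason"]))
```

Inside the loop, `info = extract_video_info(video)` would then replace the block of individual assignments.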
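## Configuration guide
### Changing the category filter
Edit this line to add new list items; the scraper's output shows only the videos whose category matches one of them.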
```
filterList = ["热点", "综合", "日常", "社科人文", "环球", "科技"]
```
### Removing the category filter
Delete this line and adjust the indentation of the lines below it to show all videos.
```
if tname_video in filterList:
```
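Instead of deleting the line and re-indenting, one alternative (a sketch, not the author's approach) is to treat an empty filter list as "show everything". The function name `is_wanted` is hypothetical.

```python
def is_wanted(tname, filter_list):
    """Return True when the category passes; an empty filter allows everything."""
    return not filter_list or tname in filter_list


filterList = ["热点", "综合", "日常"]
print(is_wanted("日常", filterList))  # True: the category is in the list
print(is_wanted("游戏", filterList))  # False: the category is not in the list
print(is_wanted("游戏", []))          # True: an empty list disables filtering
```

With this helper, the condition in the loop would become `if is_wanted(tname_video, filterList):`, and setting `filterList = []` turns the filter off without touching the indentation.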
### I don't want to see that much data
Any of these lines can be added or removed freely.
```
print('Uploader: ' + video_owner)
print('Category: ' + tname_video)
if rcmd_reason_video:
    print('Recommended because: ' + rcmd_reason_video)
print('Video #' + str(video_number) + ': ' + '---- ' + video_name + ' ----' + '\n\n')
print('Description: ' + video_description + '\n\n')
print('Likes: ' + str(like_video))
print('Views: ' + str(view_video))
print('Coins: ' + str(coin_video))
print('Favorites: ' + str(favorite_video))
print('Shares: ' + str(share_video))
print('Danmaku: ' + str(danmaku_video))
print('Comments: ' + str(reply_video) + '\n')
print('Link: ' + url)
print('=' * 50)
```
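### I want to see more than the first 50 videos
Just change the `ps` parameter in the URL (`ps` is the number of videos per page; `pn` is the page number).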
```
link = "https://api.bilibili.com/x/web-interface/popular?ps=50&pn=1"
```
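To go through several pages, the `pn` value can be stepped in code instead of editing the URL by hand. The sketch below only builds the URLs, using the standard library so that it runs without network access. `build_popular_url` is a hypothetical helper, it assumes `pn` is the page number, and the original post does not say whether the API accepts `ps` values above 50.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.bilibili.com/x/web-interface/popular"


def build_popular_url(page, page_size=50):
    """Build the chart URL for one page (ps = items per page, pn = page number)."""
    return BASE_URL + "?" + urlencode({"ps": page_size, "pn": page})


# URLs for the first three pages.
urls = [build_popular_url(page) for page in range(1, 4)]
for page_url in urls:
    print(page_url)
```

Each URL could then be passed to `requests.get(page_url, headers=headers)` as in the script, with the `data.list` entries of every page appended to one combined list.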
### How do I forward the results to WeCom?
First, declare an empty list before the loop.
```
content_list = []
```
Add this statement as the last line of the loop body; adjust it as needed.
```
content_list.append("[" + "「" + tname_video + "」" + video_name + "](" + url + ")")
```
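The entries collected this way are later joined into one message body. If the list grows long, the body may exceed the size a single bot message allows. To the best of my knowledge, WeCom's documentation gives 4096 bytes (UTF-8) for Markdown content, but treat that number as an assumption and check the current docs. Below is a minimal sketch of splitting the entries into several messages; `chunk_lines` is a hypothetical helper and the sample entries use placeholder URLs.

```python
def chunk_lines(lines, max_bytes=4096):
    """Group lines into newline-joined messages of at most max_bytes (UTF-8) each.

    A single line that is longer than max_bytes still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for line in lines:
        line_size = len(line.encode("utf-8"))
        # One extra byte for the newline that joins this line to the previous one.
        extra = line_size + (1 if current else 0)
        if current and size + extra > max_bytes:
            chunks.append("\n".join(current))
            current, size = [], 0
            extra = line_size
        current.append(line)
        size += extra
    if current:
        chunks.append("\n".join(current))
    return chunks


# Placeholder entries in the same shape as the append statement above.
content_list = [
    "[「日常」Video %d](https://www.bilibili.com/video/BV%d)" % (i, i)
    for i in range(1, 6)
]
messages = chunk_lines(content_list, max_bytes=120)
for message in messages:
    print(len(message.encode("utf-8")), "bytes")
```

Each element of `messages` could then be sent as its own message.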
Finally, add the send function.
```
# Requires "import json" at the top of the script.
def send(content):
    URL = "Replace this with your own Webhook URL"
    HEADERS = {"Content-Type": "text/plain"}
    data = {
        "msgtype": "markdown",
        "markdown": {
            "content": content,
        }
    }
    requests_url = requests.post(URL, headers=HEADERS, data=json.dumps(data))
    # The bot answers with JSON; errcode 0 means the message was accepted.
    if requests_url.json().get("errcode") == 0:
        return "Sent successfully"
    else:
        return "Send failed: " + requests_url.text


# Join the collected entries, one Markdown link per line.
print(send('\n'.join(content_list)))
```

Some readers ran into bugs when adding this themselves, so here is the complete code with WeCom forwarding included.
```
from faker import Faker  # optional: only used to generate a random User-Agent
import requests
import json

f = Faker()
headers = {
    'user-agent': f.user_agent()
}
link = "https://api.bilibili.com/x/web-interface/popular?ps=50&pn=1"
res_link = requests.get(link, headers=headers)
json_link = res_link.json()
list_video = json_link['data']['list']
video_number = 0
# Edit this list to choose which categories (tname) are shown.
filterList = ["热点", "综合", "日常", "社科人文", "环球", "科技"]
content_list = []
for video in list_video:
    # Read the nested fields from the current video, not from the whole list.
    owner_video = video['owner']
    rcmd_video = video['rcmd_reason']
    stat_video = video['stat']
    coin_video = stat_video['coin']
    danmaku_video = stat_video['danmaku']
    favorite_video = stat_video['favorite']
    like_video = stat_video['like']
    reply_video = stat_video['reply']
    share_video = stat_video['share']
    view_video = stat_video['view']
    id_video = video['bvid']
    url = "https://www.bilibili.com/video/" + id_video
    video_name = video['title']
    video_description = video['desc']
    video_owner = owner_video['name']
    rcmd_reason_video = rcmd_video['content']
    tname_video = video['tname']
    video_number = video_number + 1
    if tname_video in filterList:
        print('Uploader: ' + video_owner)
        print('Category: ' + tname_video)
        # The recommendation reason can be empty, so print it only when present.
        if rcmd_reason_video:
            print('Recommended because: ' + rcmd_reason_video)
        print('Video #' + str(video_number) + ': ' + '---- ' + video_name + ' ----' + '\n\n')
        print('Description: ' + video_description + '\n\n')
        print('Likes: ' + str(like_video))
        print('Views: ' + str(view_video))
        print('Coins: ' + str(coin_video))
        print('Favorites: ' + str(favorite_video))
        print('Shares: ' + str(share_video))
        print('Danmaku: ' + str(danmaku_video))
        print('Comments: ' + str(reply_video) + '\n')
        print('Link: ' + url)
        print('=' * 50)
        # Must stay inside the loop. Placed under the filter here so that only
        # the displayed videos are forwarded; dedent one level to forward all.
        content_list.append("[" + "「" + tname_video + "」" + video_name + "](" + url + ")")


def send(content):
    URL = "Remember to replace this with your Webhook URL"
    HEADERS = {"Content-Type": "text/plain"}
    data = {
        "msgtype": "markdown",
        "markdown": {
            "content": content,
        }
    }
    requests_url = requests.post(URL, headers=HEADERS, data=json.dumps(data))
    # The bot answers with JSON; errcode 0 means the message was accepted.
    if requests_url.json().get("errcode") == 0:
        return "Sent successfully"
    else:
        return "Send failed: " + requests_url.text


# Join the collected entries, one Markdown link per line.
print(send('\n'.join(content_list)))
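```

**ppszxc** (2021-4-27 21:53): Thanks, OP, but only one entry gets sent to WeCom.

Reply: Check whether this line is in the right place:
`content_list.append("[" + "「" + tname_video + "」" + video_name + "](" + url + ")")`
My guess is that you put it outside the loop; it has to go inside the loop.

**有白云的日子。** (2021-4-27 22:01): This looks very advanced; I'm not sure how to use it.

**ARtcgb** (2021-4-27 22:02): Just run it directly in a Python interpreter.

Reply: OK, thanks for the reply, OP. I'll go and try it.

Other replies: "Thanks for sharing!" / "Thanks to the author for sharing." / "This looks really useful."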
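On the "only one entry" report earlier in the thread: the difference comes down to indentation. A minimal sketch with made-up data (the titles are placeholders) shows both placements of the `append` call.

```python
videos = [{"title": "A"}, {"title": "B"}, {"title": "C"}]

# Correct: append is indented inside the loop, so it runs once per video.
inside = []
for video in videos:
    title = video["title"]
    inside.append(title)

# Mistake: append sits after the loop, so it runs once, with the last video only.
outside = []
for video in videos:
    title = video["title"]
outside.append(title)

print(inside)   # ['A', 'B', 'C']
print(outside)  # ['C']
```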