Scraping Tencent Video with a Python crawler

l2430478 posted on 2022-1-1 11:36

Last edited by l2430478 on 2022-1-1 11:45

The code is as follows:
import requests
from lxml import etree
from selenium import webdriver
from fake_useragent import UserAgent


class tencent_movie(object):
    def __init__(self):
        # Pick one random User-Agent and reuse it for every request.
        ua = UserAgent()
        self.headers = {'User-Agent': ua.random}

    def get_html(self, url):
        # Download a page and return its body decoded as UTF-8.
        response = requests.get(url, headers=self.headers)
        html = response.content.decode('utf-8')
        return html

    def parse_html_tengxun(self, html):
        target = etree.HTML(html)
        links = target.xpath('//h2[@class="result_title"]/a/@href')
        if not links:
            print('No search results found.')
            return
        # xpath() returns a list; use the first search result.
        host = links[0]
        new_html = etree.HTML(self.get_html(host))
        first_select = int(input('1. TV series\n2. Movie\n'))
        if first_select == 1:
            titles = new_html.xpath('//div[@class="mod_episode"]/span/a/span/text()')
            new_links = new_html.xpath('//div[@class="mod_episode"]/span/a/@href')
            for title in titles:
                print('Episode %s' % title)
            select = int(input('Which episode do you want to watch? (enter a number) '))
            # Episodes are numbered from 1, list indexes start at 0.
            new_link = new_links[select - 1]
            last_host = 'https://api.akmov.net/?url=' + new_link
        else:
            last_host = 'https://api.akmov.net/?url=' + host
        # Open the resulting page in Chrome (needs a matching ChromeDriver).
        self.driver = webdriver.Chrome()
        self.driver.maximize_window()
        self.driver.get(last_host)

    def main(self):
        name = input('Enter the name of a TV series or movie: ')
        url = 'https://v.qq.com/x/search/?q={}&stag=0&smartbox_ab='.format(name)
        html = self.get_html(url)
        self.parse_html_tengxun(html)


if __name__ == '__main__':
    spider = tencent_movie()
    spider.main()
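
The URL handling in the script can be tried out on its own, without any network access or browser. The following is only a minimal sketch: the helper names `build_search_url`, `pick_episode` and `build_player_url` are hypothetical and do not appear in the original post, and percent-encoding the search term with `urllib.parse.quote` is an added assumption rather than something the script does.

```python
from urllib.parse import quote

# URL templates copied from the script in the post.
SEARCH_URL = 'https://v.qq.com/x/search/?q={}&stag=0&smartbox_ab='
PLAYER_PREFIX = 'https://api.akmov.net/?url='


def build_search_url(name):
    # Assumption: percent-encode the title so spaces and
    # non-ASCII characters are safe inside the query string.
    return SEARCH_URL.format(quote(name))


def pick_episode(links, choice):
    # Episodes are shown numbered from 1, list indexes start at 0.
    if not 1 <= choice <= len(links):
        raise ValueError('episode number out of range')
    return links[choice - 1]


def build_player_url(video_url):
    # Plain string concatenation, as in the script.
    return PLAYER_PREFIX + video_url
```

For example, `pick_episode(new_links, select)` would return the link for the episode the user typed, and raise `ValueError` instead of an `IndexError` when the number is outside the list.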

拣尽寒枝不肯栖 posted on 2022-1-1 11:38

Last edited by 拣尽寒枝不肯栖 on 2022-1-1 11:59

I'll go try it right away, thanks.

galaxy1127 posted on 2022-1-1 12:27

I'm a beginner and don't know how to use this.

LaLaLand posted on 2022-1-1 12:36

I'll give it a try first. Thanks, OP!

TokeyJs posted on 2022-1-1 12:36

So this searches for a video and then jumps to a web page to play it??

l2430478 posted on 2022-1-1 12:50

TokeyJs posted on 2022-1-1 12:36
So this searches for a video and then jumps to a web page to play it??

Does it not work for you? It works fine for me.

HelloWang posted on 2022-1-1 14:41

It keeps throwing errors.

江苏男孩 posted on 2022-1-1 17:05

I also want to learn how to scrape images. Thanks for sharing!

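ccb0429 posted on 2022-1-1 17:28

https://chromedriver.storage.googleapis.com/index.html If you get errors, you may be missing the environment or not have Google Chrome installed. Download the matching version from this page and it should work. The OP uses selenium, which needs this.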
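
One way to check this before running the script is to see whether the driver can be found on `PATH` at all. This is only a rough sketch using the standard library: the helper name `find_executable` and the list of browser executable names are assumptions, and on Windows Chrome is often installed without being on `PATH`, so a "not found" result for the browser is not conclusive there.

```python
import shutil


def find_executable(candidates):
    # Return the full path of the first candidate found on PATH, or None.
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == '__main__':
    driver = find_executable(['chromedriver'])
    # Assumed executable names; they differ between platforms.
    browser = find_executable(['google-chrome', 'chrome', 'chromium', 'chromium-browser'])
    print('chromedriver:', driver or 'not found on PATH')
    print('Chrome:', browser or 'not found on PATH')
```

If `chromedriver` is reported as missing, downloading the build that matches the installed Chrome version and putting it on `PATH` is the fix described above.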

chuanshuo2017 posted on 2022-1-1 18:38

Thanks for sharing, I learned something.


吾爱破解 - 52pojie.cn's Archiver
