Last edited by huguo002 on 2019-4-19 20:24
A super simple Python example for grabbing Baidu related-search keywords
It also writes the results to a file!
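# -*- coding=utf-8 -*-
import re
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36"
}
keyword = input('Enter a keyword: ')
url = 'https://www.baidu.com/s'
# Pass the keyword through params so requests URL-encodes it
html = requests.get(url, params={'wd': keyword}, headers=headers, timeout=10).text
# print(html)
# Step 1: cut out the "相关搜索" (related searches) table; this depends on Baidu's page markup
ze = r'<div id="rs"><div class="tt">相关搜索</div><table cellpadding="0">(.+?)</table></div>'
xgss = re.findall(ze, html, re.S)
# print(xgss)
# Step 2: pull every link out of the table; group 1 is the href, group 2 is the keyword text
xgze = r'<th><a href="(.+?)">(.+?)</a></th>'
sj = re.findall(xgze, ''.join(xgss), re.S)
# print(sj)
gjc = ''
for x in sj:
    print(x[1])
    # Plain '\n' is enough: text mode on Windows already writes it as '\r\n'
    gjc = gjc + x[1] + '\n'
print(gjc)
with open(r'C:\Users\Administrator\Desktop\gjc.txt', 'w', encoding='utf-8') as f:
    f.write(gjc)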
You have to type in the keyword to query by hand!!!
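If you want to try the two-step regex extraction without hitting Baidu, here is a minimal sketch. The helper name extract_related and the SAMPLE_HTML string are my own assumptions: the sample is hand-written to match the two patterns and is not real Baidu output, so the real page may well look different.

```python
import re

# Same two patterns as in the script; they assume the page markup the post was written against.
BLOCK_RE = re.compile(
    r'<div id="rs"><div class="tt">相关搜索</div><table cellpadding="0">(.+?)</table></div>',
    re.S,
)
LINK_RE = re.compile(r'<th><a href="(.+?)">(.+?)</a></th>', re.S)


def extract_related(html):
    """Return the related-search keywords found in html, in page order."""
    blocks = BLOCK_RE.findall(html)  # step 1: the related-search table(s)
    # step 2: every (href, text) pair inside those tables; keep only the text
    return [text for _href, text in LINK_RE.findall(''.join(blocks))]


# Hypothetical sample, written by hand to fit the patterns above.
SAMPLE_HTML = (
    '<div id="rs"><div class="tt">相关搜索</div><table cellpadding="0">'
    '<tr><th><a href="/s?wd=a">python 教程</a></th>'
    '<th><a href="/s?wd=b">python 下载</a></th></tr>'
    '</table></div>'
)

if __name__ == '__main__':
    print(extract_related(SAMPLE_HTML))
```

If the function returns an empty list on a real page, the first thing to check is probably whether the markup still matches the first pattern.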
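Updated code: keywords are now read from a txt file, one keyword per line!
Later on I'll look into how to turn it into a multithreaded version!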
# Scrape Baidu related-search keywords: read keywords from a txt file, write the results to a txt file
# -*- coding=utf-8 -*-
import re
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36"
}
url = 'https://www.baidu.com/s'
ze = r'<div id="rs"><div class="tt">相关搜索</div><table cellpadding="0">(.+?)</table></div>'
xgze = r'<th><a href="(.+?)">(.+?)</a></th>'
gjc = ''
# One keyword per line; strip the trailing newline and skip blank lines
with open(r'C:\Users\Administrator\Desktop\gjc.txt', 'r', encoding='utf-8') as f:
    data = [line.strip() for line in f if line.strip()]
# print(data)
for keyword in data:
    print(keyword)
    # Pass the keyword through params so requests URL-encodes it
    html = requests.get(url, params={'wd': keyword}, headers=headers, timeout=10).text
    # print(html)
    xgss = re.findall(ze, html, re.S)
    # print(xgss)
    sj = re.findall(xgze, ''.join(xgss), re.S)
    # print(sj)
    for x in sj:
        print(x[1])
        gjc = gjc + x[1] + '\n'
print(gjc)
# Write everything once, after all keywords have been processed
with open(r'C:\Users\Administrator\Desktop\gjcsj.txt', 'w', encoding='utf-8') as f:
    f.write(gjc)
Should be good enough for personal use!
PS: it's single-threaded, so it really is rather slow!!!
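For the slowness, one possible direction is a thread pool from the standard library. This is only a rough sketch, not something I have run against Baidu: related_for_all and fetch_one are hypothetical names, and fetch_one stands for "request the page for one keyword and return its related keywords", i.e. the body of the loop in the script. The stand-in fake_fetch below does no network access at all.

```python
from concurrent.futures import ThreadPoolExecutor


def related_for_all(keywords, fetch_one, max_workers=4):
    """Call fetch_one(keyword) for every keyword on a thread pool.

    Returns a dict mapping each keyword to whatever fetch_one returned.
    """
    keywords = list(keywords)  # we iterate twice, so materialise it first
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input keywords
        results = list(pool.map(fetch_one, keywords))
    return dict(zip(keywords, results))


def fake_fetch(keyword):
    """Stand-in for the real request + regex step (assumption, no network)."""
    return [keyword + ' tutorial', keyword + ' download']


if __name__ == '__main__':
    for kw, related in related_for_all(['python', 'requests'], fake_fetch).items():
        print(kw, related)
```

If fetch_one raises, pool.map re-raises that exception when the result is read, so a real version would probably want a try/except inside fetch_one. It may also be wise to keep max_workers small, since firing many requests at once could get them blocked.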