Fetching historical gaokao admission scores for universities in selected cities
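Last edited by Hatsune_miku on 2019-6-26 13:37
The gaokao is over and it is time to fill in university applications. Which city should you go to? Which university? And what were each university's admission scores in past years?
My younger brother (not me) ran into exactly these questions, so I wrote a script that fetches all the undergraduate institutions in the cities he is considering, so he can go through them at his leisure.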
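Software overview
The list of universities comes from daxue.eol.cn, and the admission scores are queried through a Baidu API. Only undergraduate institutions are queried.
Usage
Install the dependencies: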
pip install requests lxml xlrd xlwt xlutils
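The script currently fetches undergraduate institutions for four regions: Shanghai, Beijing, Zhejiang, and Guangdong.
They correspond to alis in the getSchool function:
alis = ['sh', 'bj', 'zj', 'gd']
for city in alis:
    if city == 'sh':
        num = 39
    if city == 'bj':
        num = 69
    if city == 'zj':
        num = 59
    if city == 'gd':
        num = 67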
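Each entry in the alis list is the pinyin abbreviation of a city or province: sh = Shanghai, bj = Beijing, and so on. Change these to the cities you want to look up.
num is the number of undergraduate institutions in that city. Open https://daxue.eol.cn/mingdan.shtml, go to the page of the city you want, and look at the sequence-number column: num is the sequence number of the last undergraduate institution in the list, and each city gets its own value.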
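The city-to-count pairing described above can also be kept in one place as a dictionary. The following is only a minimal sketch, not the script's actual code: CITY_ROW_COUNTS, city_page_url and iter_city_pages are hypothetical names, and the counts are simply the values from the snippet above.

```python
# Hypothetical helper names; the counts are copied from the snippet in the post.
CITY_ROW_COUNTS = {
    'sh': 39,  # Shanghai
    'bj': 69,  # Beijing
    'zj': 59,  # Zhejiang
    'gd': 67,  # Guangdong
}


def city_page_url(city):
    """Build the daxue.eol.cn list URL for a city abbreviation, as getSchool does."""
    return f'http://daxue.eol.cn/{city}.shtml'


def iter_city_pages(counts=CITY_ROW_COUNTS):
    """Yield (city, url, num) for every configured city."""
    for city, num in counts.items():
        yield city, city_page_url(city), num


if __name__ == '__main__':
    for city, url, num in iter_city_pages():
        print(city, url, num)
```

With this layout, adding a city means adding one dictionary entry instead of another if branch.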
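myCity = '江西'  # province of your household registration (where you sat the gaokao)
mySubject = '理科'  # one of 文科 (liberal arts), 理科 (science) or 综合 (combined); 综合 generally applies only under the new gaokao policy, e.g. for candidates registered in Zhejiang, Shanghai or some other regions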
To query 211 and 985 universities instead, change schools = getSchool() in the main function to schools = getProjectSchool().
When the run finishes, the program creates University.xls in the working directory; this file holds the data that was fetched.
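If the xlrd/xlwt/xlutils stack is not available, the same rows could be collected in a CSV file using only the standard library. This is a sketch of a swapped-in alternative, not what the script does: append_rows_csv and the file name University.csv are hypothetical, and the header is the one the script writes to the .xls file.

```python
import csv
import os

# Same header row that the script writes to University.xls.
HEADER = ["年份", "平均分", "最低分", "省控线", "批次", "学校"]


def append_rows_csv(path, rows, header=HEADER):
    """Append rows to a CSV file, writing the header first if the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    # Write a BOM only at the very start of the file so that Excel detects UTF-8.
    encoding = 'utf-8-sig' if is_new else 'utf-8'
    with open(path, 'a', newline='', encoding=encoding) as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(header)
        writer.writerows(rows)


if __name__ == '__main__':
    # Placeholder values for illustration only, not real admission data.
    append_rows_csv('University.csv', [["year", "avg", "low", "line", "batch", "school"]])
```

Each call appends to the end of the file, which mirrors what write_excel_xls_append does for the .xls workbook.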
Python version: Python 3
Finally, I hope every final-year student finds a university they are happy with.
Full code:
import requests
import xlrd
import xlwt
from xlutils.copy import copy
from lxml import etree
book_name_xls = 'University.xls'
sheet_name_xls = 'University'
def write_excel_xls(path, sheet_name, value):
    index = len(value)  # number of rows to write
    workbook = xlwt.Workbook()  # create a new workbook
    sheet = workbook.add_sheet(sheet_name)  # add a sheet to the workbook
    for i in range(0, index):
        for j in range(0, len(value[i])):
            sheet.write(i, j, value[i][j])  # write the cell at row i, column j
    workbook.save(path)  # save the workbook
    print("xls格式表格写入数据成功!")
def write_excel_xls_append(path, value):
    index = len(value)  # number of rows to append
    workbook = xlrd.open_workbook(path)  # open the existing workbook
    sheets = workbook.sheet_names()  # names of all sheets in the workbook
    worksheet = workbook.sheet_by_name(sheets[0])  # first sheet of the workbook
    rows_old = worksheet.nrows  # number of rows already in the sheet
    new_workbook = copy(workbook)  # convert the xlrd object into a writable xlwt copy
    new_worksheet = new_workbook.get_sheet(0)  # first sheet of the writable copy
    for i in range(0, index):
        for j in range(0, len(value[i])):
            new_worksheet.write(i + rows_old, j, value[i][j])  # append, starting at row i + rows_old
    new_workbook.save(path)  # save the workbook
    print("xls格式表格【追加】写入数据成功!")
def read_excel_xls(path):
    workbook = xlrd.open_workbook(path)  # open the workbook
    sheets = workbook.sheet_names()  # names of all sheets in the workbook
    worksheet = workbook.sheet_by_name(sheets[0])  # first sheet of the workbook
    for i in range(0, worksheet.nrows):
        for j in range(0, worksheet.ncols):
            print(worksheet.cell_value(i, j), "\t", end="")  # print the sheet row by row, cell by cell
        print()
def getSchool():
    alis = ['sh', 'bj', 'zj', 'gd']
    for city in alis:
        # num = number of undergraduate institutions listed for this city
        if city == 'sh':
            num = 39
        if city == 'bj':
            num = 69
        if city == 'zj':
            num = 59
        if city == 'gd':
            num = 67
        r = requests.get(f'http://daxue.eol.cn/{city}.shtml')
        r.encoding = 'utf-8'
        page = etree.HTML(r.text)
        for i in range(num):
            # cells of the i-th data row; data rows start at tr[3]
            cengci_a = page.xpath(f'/html/body/div/div/div/table/tbody/tr[{i+3}]/td')
            for cengci in cengci_a:
                schools.append(cengci.text)  # schools is the module-level list
    return schools
def getProjectSchool():  # 211 / 985
    alis = ['211', '985']
    for project in alis:
        if project == '211':
            num = 116  # 116 universities
            x = 3
        if project == '985':
            num = 39
            x = 2
        r = requests.get(f'https://daxue.eol.cn/{project}.shtml')
        r.encoding = 'utf-8'
        page = etree.HTML(r.text)
        for i in range(num):
            # td/a already matches the links in every cell of row i+1,
            # so one query per row is enough
            tag_a = page.xpath(f'/html/body/div/div/div[{x}]/table/tbody/tr[{i+1}]/td/a')
            for a in tag_a:
                schools.append(a.text)  # schools is the module-level list
    return schools
def Query(school):
    url = 'https://sp0.baidu.com/8aQDcjqpAAV3otqbppnN2DJv/api.php'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                      'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.90 Safari/537.36'
    }
    params = {
        'resource_id': '34559',
        'query': school,
        'co': 'tr|th',
        'format': 'json',
        'oe': 'utf-8',
        'ie': 'utf-8',
        '_': '1561384786050'
    }
    r = requests.get(url, params=params, headers=headers).json()
    # Assumption: the response nests lists under data, tr, col and info. The indices
    # used here (first element, columns 0-4 in header order) are a reconstruction
    # and may need adjusting against a real response.
    rows = r['data'][0]['tr']
    for i in range(len(rows)):
        year = rows[i]['col'][0]['info'][0]['text']
        average = rows[i]['col'][1]['info'][0]['text']
        low = rows[i]['col'][2]['info'][0]['text']
        shengkong = rows[i]['col'][3]['info'][0]['text']  # provincial control line
        batch = rows[i]['col'][4]['info'][0]['text']
        value1 = [[year, average, low, shengkong, batch, school],]  # same order as the header row
        write_excel_xls_append(book_name_xls, value1)
    read_excel_xls(book_name_xls)
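def main():
    # header row: year, average score, lowest score, provincial control line, batch, school
    value_title = [["年份", "平均分", "最低分", "省控线", "批次", "学校"],]
    write_excel_xls(book_name_xls, sheet_name_xls, value_title)
    schools = getSchool()
    for school in schools:
        try:
            Query(school)
        except Exception:
            pass  # skip any entry whose query or parsing fails

if __name__ == '__main__':
    schools = []  # module-level list that getSchool / getProjectSchool append to
    myCity = '江西'
    mySubject = '理科'
    main()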
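main() discards every exception silently, so a school whose query fails leaves no trace. One way to keep track of the failures is sketched here under stated assumptions: query_all is a hypothetical helper, and its query argument stands in for the Query function above.

```python
def query_all(schools, query):
    """Call query(school) for each school; return the (school, error) pairs that failed."""
    failures = []
    for school in schools:
        try:
            query(school)
        except Exception as exc:  # keep going, but remember what went wrong
            failures.append((school, repr(exc)))
    return failures


if __name__ == '__main__':
    # Stand-in query function for demonstration only; it fails on empty names.
    def fake_query(school):
        if not school:
            raise ValueError('empty school name')

    for school, error in query_all(['清华大学', ''], fake_query):
        print(f'failed: {school!r}: {error}')
```

In the script itself this would be called as query_all(schools, Query), and the returned list could be printed after the loop.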
===== RESTART: C:\Users\Administrator.USER-20200210NV\Desktop\查高考的分数.py =====
xls格式表格写入数据成功!
Traceback (most recent call last):
File "C:\Users\Administrator.USER-20200210NV\Desktop\查高考的分数.py", line 133, in <module>
main()
File "C:\Users\Administrator.USER-20200210NV\Desktop\查高考的分数.py", line 121, in main
schools = getSchool()
File "C:\Users\Administrator.USER-20200210NV\Desktop\查高考的分数.py", line 55, in getSchool
r = requests.get(f'http://daxue.eol.cn/{city}.shtml')
AttributeError: module 'requests' has no attribute 'get'
>>>
Hatsune_miku posted on 2019-6-26 15:15
It is probably a temporary glitch.
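An AttributeError like the one in the traceback above is often seen when a local file or folder named requests shadows the installed library, though that may not be the cause here. The sketch below shows one way to check where a module would be loaded from; module_origin and is_inside are hypothetical helper names.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file path a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_inside(path, directory):
    """True if path lies inside directory (both resolved to absolute paths)."""
    path = os.path.abspath(path)
    directory = os.path.abspath(directory)
    try:
        return os.path.commonpath([path, directory]) == directory
    except ValueError:  # e.g. different drives on Windows
        return False


if __name__ == '__main__':
    origin = module_origin('requests')
    print('requests would be loaded from:', origin)
    # Checks the current directory only; the script's own folder is worth checking too.
    if origin and is_inside(origin, os.getcwd()):
        print('A local file or folder named requests is shadowing the library.')
```

If the printed path points at the Desktop rather than at site-packages, renaming the local file is the usual remedy.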
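Please make a complete, ready-to-use version. I major in "rectangular solid relocation" (that is, I haul bricks) and cannot follow this.
Right after the gaokao, this is the best resource there is.
Are you planning to apply for a computer science major?
Thanks for sharing! Downloading it to try.
A real expert!
This is an angel on earth. Such a timely tool, nice.
Thanks for sharing.
Support, support.
Thanks for sharing.
This is very useful! Downloading it to use!
Thanks!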