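This post was last edited by Lengxiy on 2024-6-13 14:48
Preface
I spent an afternoon doing a quick analysis of part of the 智慧职教 (icve.com.cn) platform and then automated one small task: grading homework for an online course. Each round I have to grade somewhere between 500 and 900 papers (only the subjective questions at the end), and it all has to be finished within a day. Doing that by hand is exhausting, so I decided to write a bit of simple code and be done with it.
The code certainly has some rough edges. I only had one assignment open for grading, which made things a lot simpler, and what I wrote is just enough for my own use. In principle, other teachers with a similar need should be able to use it too; feel free to modify it as needed.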
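Modules used
selenium, requests, time, re
Analysis
Logging in
The teacher login URL for 职教云 is the same as the one students use:
[Python]
https://sso.icve.com.cn/sso/auth?mode=simple&source=2&redirect=https%3A%2F%2Fmooc.icve.com.cn
I'm not good at reverse-engineering encryption. My first idea was to log in with requests by sending the login packets and keeping a session, but once I saw that the payload was encrypted I gave up on that and switched to logging in through selenium.
The username and password are then filled in by locating the input boxes via XPath.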
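Course list page
After logging in, I go straight to the course page:
[Python]
driver.get('https://mooc.icve.com.cn/learning/u/teacher/teaching/mooc_index.action')
While there, I take the cookies and headers from the selenium session and reuse them in a POST request that returns the course data as JSON: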
[Python]
# Open the course page
driver.get('https://mooc.icve.com.cn/learning/u/teacher/teaching/mooc_index.action')
time.sleep(1)
cookies = driver.get_cookies()
# Collect the request headers we need
headers = {
    'User-Agent': driver.execute_script("return navigator.userAgent;"),
    'Referer': driver.current_url
}
token = str(input("Enter the token you copied: "))
post_class_url_token = "https://mooc.icve.com.cn/patch/zhzj/teacherMooc_selectMoocClassAndCourse.action?token="+token+"&siteCode=zhzj&curPage=1&pageSize=10"
text = api.do_post(cookies,post_class_url_token,headers)
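The token is actually encrypted, but as I said above, I can't work out the encryption. I asked an AI for help and got through one layer of base64, only to find another layer of encryption underneath, and that's where I got stuck. So I fall back on doing it by hand: open the browser dev tools with F12, force-refresh with Ctrl+F5, and copy the token from the captured request.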
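The request URL above is assembled by string concatenation, which breaks if the pasted token contains characters such as + or / or picks up stray whitespace. Below is a minimal sketch of an alternative, assuming the same endpoint and query parameters shown in the post; the helper name build_course_list_url is made up for illustration.

```python
from urllib.parse import urlencode

# Endpoint copied from the post; the helper name below is hypothetical.
COURSE_LIST_ENDPOINT = (
    "https://mooc.icve.com.cn/patch/zhzj/"
    "teacherMooc_selectMoocClassAndCourse.action"
)


def build_course_list_url(token, cur_page=1, page_size=10):
    """Return the course-list URL with a percent-encoded query string."""
    query = urlencode({
        "token": token.strip(),  # drop whitespace picked up while copying
        "siteCode": "zhzj",
        "curPage": cur_page,
        "pageSize": page_size,
    })
    return f"{COURSE_LIST_ENDPOINT}?{query}"


print(build_course_list_url(" abc+/= "))
```

One caveat: if the token was copied out of an already percent-encoded URL, encoding it again would corrupt it. In that case either decode it first with urllib.parse.unquote or keep the plain concatenation.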
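Take a rough look at how the returned data is structured.
Once the rough shape of the data is clear, a simple menu lets me pick the course whose homework will be graded: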
[Python]
token = str(input("Enter the token you copied: "))
post_class_url_token = "https://mooc.icve.com.cn/patch/zhzj/teacherMooc_selectMoocClassAndCourse.action?token="+token+"&siteCode=zhzj&curPage=1&pageSize=10"
text = api.do_post(cookies,post_class_url_token,headers)
class_s = text['data']
total_count = text['totalCount']
course_class_name = []  # class name for each course, parallel to courseId
for r in class_s:
    classId.append(r['classId'])
    classId_name.append(r['className'])
    r_1 = r['courseList']
    for r_1_1 in r_1:
        # Index 9 is used as the course id and index 4 as the course name
        courseId.append(r_1_1[9])
        courseId_name.append(r_1_1[4])
        course_class_name.append(r['className'])
print("************Courses found************")
# Number every course so that the choice maps directly onto courseId
for j, (class_name, course_name) in enumerate(zip(course_class_name, courseId_name), start=1):
    print(f"Course {j}: {class_name}{course_name}")
choose = int(input("Enter the number of the course to grade (digits only): "))
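The int(input(...)) call above raises ValueError on non-numeric input, and an out-of-range number only fails later when courseId[choose-1] is read. A minimal sketch of a validation step follows; parse_choice is a hypothetical helper, not part of the original script.

```python
def parse_choice(raw, option_count):
    """Convert menu input such as ' 2 ' into a zero-based index.

    Returns None when the input is not a whole number or is out of range,
    so the caller can ask again instead of crashing.
    """
    try:
        number = int(raw.strip())
    except ValueError:
        return None
    if 1 <= number <= option_count:
        return number - 1
    return None


# Demo with fixed inputs; in the script, raw would come from input().
for raw in ["2", " 1 ", "0", "abc"]:
    print(repr(raw), "->", parse_choice(raw, 3))
```

In the script this could sit in a small loop that keeps asking until parse_choice returns something other than None, and the result would be used directly as the index into courseId.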
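Analyzing the submitted papers
Once a course has been chosen,
the script simulates a click on "作业考试" (Homework & Exams) to jump to the review page: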
[Python]
# Open the page of the chosen course
driver.get('https://mooc.icve.com.cn/learning/u/teacher/teaching/mooc_guidance.action?courseId='+courseId[choose-1]+'&phase=2&flagCourse=newest&type=2')
time.sleep(1)
# Click the "作业考试" (Homework & Exams) button
copy_button = driver.find_element(By.XPATH, '//*[@id="homeworkExam"]')
copy_button.click()
time.sleep(1)
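On this page it is already possible to keep POSTing to get the detailed data on student papers, but in practice I found the URL below by clicking "未批" (Ungraded), which opens a new view, and inspecting the requests it makes.
After that, selenium keeps simulating clicks until it reaches the page where papers are actually graded.
In the line of code below, the final 800 is the number of records returned per request. The original value is 10; I didn't feel like writing another for loop to page through the results, so I simply raised the page size to fetch everything at once. Adjust the number to suit your own situation.
What I mainly need here is the recordId of each entry, because each paper is addressed by its recordId in the later requests.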
[Python]
get_s_url = "https://mooc.icve.com.cn/patch/zhzj/teacherMooc_getExamRecordInfo.action?token="+token+"&examCode=20240226090626944370&type=unchecked&keyName=&sortColumn=&sortType=&curPage=1&pageSize=800"
[Python]
# Fetch the records that still need grading
get_s_url = "https://mooc.icve.com.cn/patch/zhzj/teacherMooc_getExamRecordInfo.action?token="+token+"&examCode=20240226090626944370&type=unchecked&keyName=&sortColumn=&sortType=&curPage=1&pageSize=800"
json_data = driver.execute_async_script(api.do_script(get_s_url))
# Read the total count and the record list from the JSON
a = json_data['data']['totalCount']
recordIds = json_data['data']['list']
for r in recordIds:
    recordId.append(r['recordId'])
count = int(a)
# Click the "未批" (Ungraded) button
copy_button = driver.find_element(By.XPATH, '/html/body/div[1]/div[2]/div[2]/div/div/div[2]/div[23]/div[2]/div[1]/div[2]/div[3]/div/div[1]/div[2]')
driver.execute_script("arguments[0].scrollIntoView();", copy_button)
copy_button.click()
time.sleep(2)
# Click the "批阅" (Grade) link of the first record
copy_button = driver.find_element(By.XPATH, '//*[@id="ExamRecordList"]/tr[1]/td[7]/p/a')
copy_button.click()
time.sleep(2)
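Raising pageSize to 800 works as long as there are no more than 800 ungraded papers; beyond that, count would be larger than the number of collected recordId values. A minimal sketch of the paging loop the post skipped is shown below. It assumes the response shape used above (data.totalCount and data.list[].recordId); collect_record_ids and fetch_page are hypothetical names, and fake_fetch_page only stands in for the real request with made-up data.

```python
def collect_record_ids(fetch_page, page_size=50):
    """Gather every recordId by requesting pages until totalCount is reached.

    fetch_page(cur_page, page_size) is supplied by the caller and returns the
    decoded JSON for one page, shaped like the response in the post:
    {"data": {"totalCount": ..., "list": [{"recordId": ...}, ...]}}.
    """
    record_ids = []
    cur_page = 1
    while True:
        data = fetch_page(cur_page, page_size)["data"]
        rows = data["list"]
        record_ids.extend(row["recordId"] for row in rows)
        # Also stop on an empty page, in case totalCount is stale.
        if not rows or len(record_ids) >= int(data["totalCount"]):
            return record_ids
        cur_page += 1


def fake_fetch_page(cur_page, page_size):
    """Stand-in for the real request: serves 5 made-up records in pages."""
    all_ids = [f"rec{i}" for i in range(5)]
    start = (cur_page - 1) * page_size
    page = [{"recordId": rid} for rid in all_ids[start:start + page_size]]
    return {"data": {"totalCount": 5, "list": page}}


print(collect_record_ids(fake_fetch_page, page_size=2))
```

In the script, fetch_page could wrap driver.execute_async_script(api.do_script(url)) with curPage and pageSize substituted into the URL, and count could then be taken as len(recordId).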
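Analyzing the grading page
Next comes the actual grading page, which is where I spent the most time,
because I couldn't work out how to locate the score field by XPath with selenium and fill it in.
First, here is what the page looks like.
This is how a paper appears. The most frustrating part: I could find the XPath of the score input box, but whenever the program ran, selenium reported that the element could not be found.
I figured the element might be hidden in some way. I searched around a lot, but even after the whole page had finished loading, selenium still could not find that XPath.
I even waited for the page to load completely and dumped its source to a txt file. The input box was definitely there in the source, yet at run time selenium kept insisting it did not exist.
Since I couldn't fill in the score through selenium, I went back to analyzing the requests and submitted the score directly via POST.
The scoring request goes to the following URL: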
[Python]
https://spoc-exam.icve.com.cn/teacher/exampaper/papercheck_saveScoreAndFinishCheck.action?recordId="+recordId[m]+"&score=e3a347ebc66846098cf7d630b1386326%400&remark=e3a347ebc66846098cf7d630b1386326%40
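In the URL above, %40 is the percent-encoded "@", so the score and remark parameters appear to take the form id@value. The post does not say what the 32-character id refers to; the sketch below assumes it identifies the question being scored. build_score_url and its parameter names are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint copied from the post.
SCORE_ENDPOINT = (
    "https://spoc-exam.icve.com.cn/teacher/exampaper/"
    "papercheck_saveScoreAndFinishCheck.action"
)


def build_score_url(record_id, item_id, score, remark=""):
    """Build the scoring URL for one paper.

    item_id is the 32-character id seen in the post (assumed here to identify
    the question being scored). score and remark are joined to it with '@',
    which urlencode turns into %40.
    """
    query = urlencode({
        "recordId": record_id,
        "score": f"{item_id}@{score}",
        "remark": f"{item_id}@{remark}",
    })
    return f"{SCORE_ENDPOINT}?{query}"


print(build_score_url("rec123", "e3a347ebc66846098cf7d630b1386326", 100))
```

Keeping the id in one place also makes it easier to adapt the script to a different assignment, where that value will presumably differ.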
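Below is the code that keeps grading from this page.
The logic: submit the score with a POST request, then refresh the page, after which the button at the very bottom changes from "确认批阅" (Confirm grading) to "下一份" (Next paper).
Selenium then clicks "下一份" to move to the next paper, and the process repeats.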
[Python]
m = 0
while count > m:
    contains_docx = False
    total_chars = 0
    # Get the cookies of the current session
    cookies = driver.get_cookies()
    # Collect the request headers we need
    headers = {
        'User-Agent': driver.execute_script("return navigator.userAgent;"),
        'Referer': driver.current_url
    }
    answer_url = "https://spoc-exam.icve.com.cn/student/exam/examrecord_getRecordContentByPage.action?recordId="+recordId[m]+"&examBatchId=402883a98de16e97018de2f2e9c60503&contentIds=24eff13009cd483c8ee56d05e52307fd%2C1832baef66de4ecb8a22e61bf425fec5&checkScore=true"
    text = api.do_post(cookies,answer_url,headers)
    contentHtml = text['data']['1832baef66de4ecb8a22e61bf425fec5']['contentHtml']
    # Extract the content of <div class="exam_Answers_con">;
    # a missing block counts as an empty answer instead of crashing on None
    exam_answers_con_match = re.search(r'<div class="exam_Answers_con">(.*?)</div>', contentHtml, re.DOTALL)
    if exam_answers_con_match:
        exam_answers_con = exam_answers_con_match.group(1)
    else:
        exam_answers_con = ""
    # Strip HTML tags
    exam_answers_con_text = re.sub(r'<[^>]+>', '', exam_answers_con).strip()
    # Count the characters
    total_chars = len(exam_answers_con_text)
    # Check whether the answer mentions a .docx attachment
    contains_docx = '.docx' in exam_answers_con_text
    # Get the cookies of the current session
    cookies = driver.get_cookies()
    session_cookies = {cookie['name']: cookie['value'] for cookie in cookies}
    # Collect the request headers we need
    headers = {
        'User-Agent': driver.execute_script("return navigator.userAgent;"),
        'Referer': driver.current_url
    }
    if total_chars < 50 and not contains_docx:
        # Too short and no attachment: score 0
        scor_0_url="https://spoc-exam.icve.com.cn/teacher/exampaper/papercheck_saveScoreAndFinishCheck.action?recordId="+recordId[m]+"&score=e3a347ebc66846098cf7d630b1386326%400&remark=e3a347ebc66846098cf7d630b1386326%40"
        response = requests.post(scor_0_url, headers=headers, cookies=session_cookies)
    else:
        # Attachment present or at least 50 characters: score 100
        scor_100_url="https://spoc-exam.icve.com.cn/teacher/exampaper/papercheck_saveScoreAndFinishCheck.action?recordId="+recordId[m]+"&score=e3a347ebc66846098cf7d630b1386326%40100&remark=e3a347ebc66846098cf7d630b1386326%40"
        response = requests.post(scor_100_url, headers=headers, cookies=session_cookies)
    # Check the response status code
    if response.status_code == 200:
        print(f"Record {recordId[m]} scored successfully.")
    else:
        print(f"Failed to score record {recordId[m]}: {response.status_code}")
    driver.switch_to.window(driver.window_handles[1])
    driver.refresh()
    # Click the "确认批阅" (Confirm grading) button
    yes_button = driver.find_element(By.XPATH, '//*[@id="submitRecord"]')
    yes_button.click()
    time.sleep(0.5)
    # Click the "下一份" (Next paper) button
    again_button = driver.find_element(By.XPATH, '//*[@id="nextRecord"]')
    again_button.click()
    m = m + 1
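The loop relies on fixed time.sleep pauses, which are either too long or, on a slow connection, too short. A minimal sketch of a generic polling helper is shown below as one possible replacement; wait_until is a hypothetical name and the default timings are arbitrary.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.25):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have
    passed. Exceptions raised by condition() are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            pass  # e.g. the element has not been rendered yet
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Demo: the condition becomes true on the third call.
attempts = []
print(wait_until(lambda: attempts.append(1) or len(attempts) >= 3, interval=0.01))
```

In the grading loop this could look like wait_until(lambda: driver.find_element(By.XPATH, '//*[@id="nextRecord"]')).click(). Selenium also ships its own WebDriverWait class in selenium.webdriver.support.ui, which follows the same polling idea.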
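A side note on the grading criteria
There are simply too many papers to read one by one by hand,
so my grading logic uses regular expressions to analyze each student's answer: if the answer is too short it gets 0 by default, and if it is long enough I don't read the content either; it is usually close enough, so it gets 100.
I have seen slackers submit things like: "Teacher, you can't see this, please pass me, please pass me"; "I want to eat!!!"
From a quick tally, answers of that kind are almost always under 50 characters. If a student can't even be bothered to search Baidu and paste something in, I just give 0.
Some students upload an attachment instead, i.e. a Word document. With this many students I don't read those either; the script checks the answer for ".docx", and since uploaded documents generally contain reasonable content, they get 100 as well.
The code is as follows: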
[Python]
# Extract the content of <div class="exam_Answers_con">
exam_answers_con_match = re.search(r'<div class="exam_Answers_con">(.*?)</div>', contentHtml, re.DOTALL)
# Treat a missing answer block as an empty answer instead of crashing on None
exam_answers_con = exam_answers_con_match.group(1) if exam_answers_con_match else ""
# Extract the content of <p> tags and the text preceding <br> tags
p_tags_content = re.findall(r'<p[^>]*>(.*?)<\/p>', exam_answers_con, re.DOTALL)
br_tags_content = re.findall(r'>([^<]*?)<br\s*/?>', exam_answers_con, re.DOTALL)
# Merge everything that was extracted
all_content = p_tags_content + br_tags_content
# Strip HTML tags and count the characters
for content in all_content:
    # Strip HTML tags
    text = re.sub(r'<[^>]+>', '', content).strip()
    # Add to the running character count
    total_chars += len(text)
    if '.docx' in text:
        contains_docx = True
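One limitation of the regex above: the non-greedy (.*?)</div> stops at the first closing </div>, so an answer that itself contains a nested div is cut short. Below is a minimal sketch that applies the same rule (at least 50 characters or a ".docx" mention gives 100, otherwise 0) using the standard library's html.parser instead. AnswerTextExtractor and grade_answer are hypothetical names; only the class name exam_Answers_con and the thresholds come from the post.

```python
from html.parser import HTMLParser


class AnswerTextExtractor(HTMLParser):
    """Collect the text of the first <div class="exam_Answers_con">, nested divs included."""

    def __init__(self):
        super().__init__()
        self.depth = 0     # > 0 while inside the answer div
        self.done = False  # True once the first answer div has closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0 or "exam_Answers_con" in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def grade_answer(content_html, min_chars=50):
    """Return 100 or 0: an attachment mention or enough text passes."""
    parser = AnswerTextExtractor()
    parser.feed(content_html)
    parser.close()
    text = "".join(parser.parts).strip()
    if ".docx" in text or len(text) >= min_chars:
        return 100
    return 0


print(grade_answer('<div class="exam_Answers_con"><p>求过求过求过</p></div>'))
```

Wrapping the rule in a function like this also makes it easy to try it on a few saved answers before letting the script submit scores.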
That's the whole flow.
Source code
智慧职教自动批改.py
[Python]
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.edge.options import Options
from selenium.common.exceptions import TimeoutException
import time,requests,api
import re
user = str(input("Enter your 智慧职教 account: "))
pwd = str(input("Enter your 智慧职教 password: "))
tou = str(input("Headless mode? (a visible browser is the default; enter 1 for headless): "))
# Configure the Edge options
edge_options = Options()
edge_options.add_argument("--inprivate") # private browsing
edge_options.add_argument("--disable-gpu")
edge_options.add_argument("--no-sandbox")
if tou == "1":
    edge_options.add_argument("--headless") # headless mode, runs faster
recordId = [] # recordId of each paper to grade
classId = [] # course ids
classId_name = [] # course names
courseId = [] # id of each course term
courseId_name = [] # name of each course term
course_class_name = [] # course name for each term, parallel to courseId
# Initialize the webdriver
driver = webdriver.Edge(options=edge_options)
# Open the login page
login_url = "https://sso.icve.com.cn/sso/auth?mode=simple&source=2&redirect=https%3A%2F%2Fmooc.icve.com.cn"
driver.get(login_url)
try:
    # Locate the username and password input boxes
    username_input = driver.find_element(By.XPATH, '//*[@id="app"]/div[1]/div[2]/div/div[1]/div/div[2]/div[3]/form/div[1]/div/div/input')
    password_input = driver.find_element(By.XPATH, '//*[@id="app"]/div[1]/div[2]/div/div[1]/div/div[2]/div[3]/form/div[2]/div/div/input')
    # Type in the username and password
    username_input.send_keys(user)
    password_input.send_keys(pwd)
    # Find the privacy checkbox and tick it (via XPath)
    privacy_checkbox = driver.find_element(By.XPATH, '//*[@id="app"]/div[1]/div[2]/div/div[1]/div/div[2]/div[3]/form/div[4]/label/span/span')
    privacy_checkbox.click()
    # Click the login button (via XPath)
    login_button = driver.find_element(By.XPATH, '//*[@id="app"]/div[1]/div[2]/div/div[1]/div/div[2]/div[3]/form/div[5]')
    login_button.click()
    time.sleep(2)
    # Open the course page
    driver.get('https://mooc.icve.com.cn/learning/u/teacher/teaching/mooc_index.action')
    time.sleep(1)
    cookies = driver.get_cookies()
    # Collect the request headers we need
    headers = {
        'User-Agent': driver.execute_script("return navigator.userAgent;"),
        'Referer': driver.current_url
    }
    token = str(input("Enter the token you copied: "))
    post_class_url_token = "https://mooc.icve.com.cn/patch/zhzj/teacherMooc_selectMoocClassAndCourse.action?token="+token+"&siteCode=zhzj&curPage=1&pageSize=10"
    text = api.do_post(cookies,post_class_url_token,headers)
    class_s = text['data']
    total_count = text['totalCount']
    for r in class_s:
        classId.append(r['classId'])
        classId_name.append(r['className'])
        r_1 = r['courseList']
        for r_1_1 in r_1:
            # Index 9 is used as the course id and index 4 as the course name
            courseId.append(r_1_1[9])
            courseId_name.append(r_1_1[4])
            course_class_name.append(r['className'])
    print("************Courses found************")
    # Number every course so that the choice maps directly onto courseId
    for j, (class_name, course_name) in enumerate(zip(course_class_name, courseId_name), start=1):
        print(f"Course {j}: {class_name}{course_name}")
    choose = int(input("Enter the number of the course to grade (digits only): "))
    # Open the page of the chosen course
    driver.get('https://mooc.icve.com.cn/learning/u/teacher/teaching/mooc_guidance.action?courseId='+courseId[choose-1]+'&phase=2&flagCourse=newest&type=2')
    time.sleep(1)
    # Click the "作业考试" (Homework & Exams) button
    copy_button = driver.find_element(By.XPATH, '//*[@id="homeworkExam"]')
    copy_button.click()
    time.sleep(1)
    # Fetch the records that still need grading
    get_s_url = "https://mooc.icve.com.cn/patch/zhzj/teacherMooc_getExamRecordInfo.action?token="+token+"&examCode=20240226090626944370&type=unchecked&keyName=&sortColumn=&sortType=&curPage=1&pageSize=800"
    json_data = driver.execute_async_script(api.do_script(get_s_url))
    # Read the total count and the record list from the JSON
    a = json_data['data']['totalCount']
    recordIds = json_data['data']['list']
    for r in recordIds:
        recordId.append(r['recordId'])
    count = int(a)
    # Click the "未批" (Ungraded) button
    copy_button = driver.find_element(By.XPATH, '/html/body/div[1]/div[2]/div[2]/div/div/div[2]/div[23]/div[2]/div[1]/div[2]/div[3]/div/div[1]/div[2]')
    driver.execute_script("arguments[0].scrollIntoView();", copy_button)
    copy_button.click()
    time.sleep(2)
    # Click the "批阅" (Grade) link of the first record
    copy_button = driver.find_element(By.XPATH, '//*[@id="ExamRecordList"]/tr[1]/td[7]/p/a')
    copy_button.click()
    time.sleep(2)
    m = 0
    while count > m:
        contains_docx = False
        total_chars = 0
        # Get the cookies of the current session
        cookies = driver.get_cookies()
        # Collect the request headers we need
        headers = {
            'User-Agent': driver.execute_script("return navigator.userAgent;"),
            'Referer': driver.current_url
        }
        answer_url = "https://spoc-exam.icve.com.cn/student/exam/examrecord_getRecordContentByPage.action?recordId="+recordId[m]+"&examBatchId=402883a98de16e97018de2f2e9c60503&contentIds=24eff13009cd483c8ee56d05e52307fd%2C1832baef66de4ecb8a22e61bf425fec5&checkScore=true"
        text = api.do_post(cookies,answer_url,headers)
        contentHtml = text['data']['1832baef66de4ecb8a22e61bf425fec5']['contentHtml']
        # Extract the content of <div class="exam_Answers_con">;
        # a missing block counts as an empty answer instead of crashing on None
        exam_answers_con_match = re.search(r'<div class="exam_Answers_con">(.*?)</div>', contentHtml, re.DOTALL)
        if exam_answers_con_match:
            exam_answers_con = exam_answers_con_match.group(1)
        else:
            exam_answers_con = ""
        # Strip HTML tags
        exam_answers_con_text = re.sub(r'<[^>]+>', '', exam_answers_con).strip()
        # Count the characters
        total_chars = len(exam_answers_con_text)
        # Check whether the answer mentions a .docx attachment
        contains_docx = '.docx' in exam_answers_con_text
        # Get the cookies of the current session
        cookies = driver.get_cookies()
        session_cookies = {cookie['name']: cookie['value'] for cookie in cookies}
        # Collect the request headers we need
        headers = {
            'User-Agent': driver.execute_script("return navigator.userAgent;"),
            'Referer': driver.current_url
        }
        if total_chars < 50 and not contains_docx:
            # Too short and no attachment: score 0
            scor_0_url="https://spoc-exam.icve.com.cn/teacher/exampaper/papercheck_saveScoreAndFinishCheck.action?recordId="+recordId[m]+"&score=e3a347ebc66846098cf7d630b1386326%400&remark=e3a347ebc66846098cf7d630b1386326%40"
            response = requests.post(scor_0_url, headers=headers, cookies=session_cookies)
        else:
            # Attachment present or at least 50 characters: score 100
            scor_100_url="https://spoc-exam.icve.com.cn/teacher/exampaper/papercheck_saveScoreAndFinishCheck.action?recordId="+recordId[m]+"&score=e3a347ebc66846098cf7d630b1386326%40100&remark=e3a347ebc66846098cf7d630b1386326%40"
            response = requests.post(scor_100_url, headers=headers, cookies=session_cookies)
        # Check the response status code
        if response.status_code == 200:
            print(f"Record {recordId[m]} scored successfully.")
        else:
            print(f"Failed to score record {recordId[m]}: {response.status_code}")
        driver.switch_to.window(driver.window_handles[1])
        driver.refresh()
        # Click the "确认批阅" (Confirm grading) button
        yes_button = driver.find_element(By.XPATH, '//*[@id="submitRecord"]')
        yes_button.click()
        time.sleep(0.5)
        # Click the "下一份" (Next paper) button
        again_button = driver.find_element(By.XPATH, '//*[@id="nextRecord"]')
        again_button.click()
        m = m + 1
    print("Grading finished")
except TimeoutException:
    print("Login failed: timed out")
finally:
    # Close the browser
    driver.quit()
Contents of api.py
[Python]
import requests
import json
def do_script(api_url):
    # Build the JavaScript for execute_async_script: fetch api_url in the
    # browser and hand the parsed JSON (or an error) to Selenium's callback
    script = f"""
    var callback = arguments[arguments.length - 1];
    fetch("{api_url}")
    .then(response => response.json())
    .then(data => callback(data))
    .catch(error => callback({{"error": error.message}}));
    """
    return script
def do_post(cookies,url,headers):
    # Turn Selenium's cookie list into the name -> value dict that requests expects
    session_cookies = {cookie['name']: cookie['value'] for cookie in cookies}
    response = requests.post(url, headers=headers, cookies=session_cookies)
    text = json.loads(response.text)
    return text
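do_script pastes the URL straight into the JavaScript source, so a URL containing a double quote or a backslash would break the generated script. A minimal sketch of a variant that lets json.dumps produce the string literal is shown below; do_script_safe is a hypothetical name, and the rest of the script body is unchanged from api.py.

```python
import json


def do_script_safe(api_url):
    """Return JavaScript for execute_async_script that fetches api_url as JSON."""
    # json.dumps yields a quoted, escaped string literal that is also valid
    # JavaScript, so special characters in the URL cannot end the string early.
    url_literal = json.dumps(api_url)
    return f"""
    var callback = arguments[arguments.length - 1];
    fetch({url_literal})
        .then(response => response.json())
        .then(data => callback(data))
        .catch(error => callback({{"error": error.message}}));
    """


print(do_script_safe('https://example.com/a?x="1"'))
```

It is called the same way as the original: driver.execute_async_script(do_script_safe(get_s_url)).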
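Finally
Running 智慧职教自动批改.py directly should be all that's needed.
智慧职教自动批改.py and api.py must be in the same directory.
I taught myself both Python and this kind of analysis, so there are bound to be plenty of shortcomings; please bear with me. I'm simply sharing some code that makes a work task easier. If you need it, take it and adapt it as you see fit.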