随遇而安8 posted on 2021-12-5 16:11

What to do when a Python regex runs into Chinese characters?

Last edited by 随遇而安8 on 2021-12-5 16:14

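The code is below. If the regex only contains user_id":(.*?),"fullname":"(.*?)" it matches every record, but as soon as I append start_time": (.*?) the result is empty.
Question: is it because the Chinese characters "计算机课" sit in between that start_time cannot be matched?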
import re

response = '{"teacher_id":5114,"attendance_id":7727052,"user_id":16786743,"fullname":"张三","lesson_name":"计算机课","lesson_start_time":1633860985,"lesson_end_time":1633862410,"conversion_status":1},{"teacher_id":5114,"attendance_id":8062199,"user_id":12956827,"fullname":"李四","lesson_name":"计算机课","lesson_start_time":1638620201,"lesson_end_time":1638622728,"conversion_status":1},{"teacher_id":5114,"attendance_id":7961271,"user_id":12769816,"fullname":"王五","lesson_name":"计算机课","lesson_start_time":1637061349,"lesson_end_time":1637064881,"conversion_status":1}'
student = re.findall(
    'user_id":(.*?),"fullname":"(.*?)".*start_time": (.*?),"lesson_end_time', response)
print(student)

cszcszv163 posted on 2021-12-5 16:28

import re

response = '{"teacher_id":5114,"attendance_id":7727052,"user_id":16786743,"fullname":"张三","lesson_name":"计算机课","lesson_start_time":1633860985,"lesson_end_time":1633862410,"conversion_status":1},{"teacher_id":5114,"attendance_id":8062199,"user_id":12956827,"fullname":"李四","lesson_name":"计算机课","lesson_start_time":1638620201,"lesson_end_time":1638622728,"conversion_status":1},{"teacher_id":5114,"attendance_id":7961271,"user_id":12769816,"fullname":"王五","lesson_name":"计算机课","lesson_start_time":1637061349,"lesson_end_time":1637064881,"conversion_status":1}'
student = re.findall(
    'user_id":(.*?),"fullname":"(.*?)".*?start_time":(.*?),"lesson_end_time', response)
print(student)
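
As a side note, the response looks like a comma-separated run of JSON objects, so a parser can stand in for the regex. This is a different technique from the one in this post, sketched under the assumption that the real response has the same shape as the string above; the shortened `sample` below is made up for illustration.

```python
import json

# Shortened, hypothetical sample in the same shape as the response above.
sample = ('{"user_id":16786743,"fullname":"张三","lesson_name":"计算机课","lesson_start_time":1633860985},'
          '{"user_id":12956827,"fullname":"李四","lesson_name":"计算机课","lesson_start_time":1638620201}')

# Wrapping the text in [ ] turns the comma-separated objects into one JSON array.
records = json.loads('[' + sample + ']')

# Pick out the three fields the regex was trying to capture.
students = [(r["user_id"], r["fullname"], r["lesson_start_time"]) for r in records]
print(students)
```

Unlike the regex, this returns the numbers as int values rather than strings, and it does not depend on the order of the keys.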

SDU123 posted on 2021-12-5 16:25

There is a space between start_time": and (.*?) in your pattern.
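
A minimal sketch of the effect, using a shortened one-record sample (made up here, but in the same shape as the response in the first post). The literal space is what breaks the match, not the Chinese text; `\s*` is one way to tolerate an optional space.

```python
import re

# Shortened, hypothetical sample with the same key layout as the original response.
sample = ('"user_id":16786743,"fullname":"张三","lesson_name":"计算机课",'
          '"lesson_start_time":1633860985,"lesson_end_time":1633862410')

# Literal space after the colon: the data has none, so nothing matches.
with_space = re.findall(r'start_time": (.*?),"lesson_end_time', sample)

# \s* accepts zero or more whitespace characters, so both spellings match.
tolerant = re.findall(r'start_time":\s*(.*?),"lesson_end_time', sample)

print(with_space)  # []
print(tolerant)    # ['1633860985']
```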

lgsp_Jim posted on 2021-12-5 16:25

The *start_time" part is written wrong, isn't it?

cszcszv163 posted on 2021-12-5 16:25

You typed an extra space.

随遇而安8 posted on 2021-12-5 16:29

cszcszv163 posted on 2021-12-5 16:25
You typed an extra space.

With the space removed it only matches one record, as shown in the screenshot.

冥界3大法王 posted on 2021-12-5 16:31

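@随遇而安8

Character ranges for Chinese text; they may come in handy for regexes that match Chinese characters.
Double-byte character encoding ranges:
1. GBK (GB2312/GB18030), as byte values
\x00-\xff full byte range
\x20-\x7f ASCII
\xa1-\xff Chinese in GB2312
\x80-\xff Chinese in GBK

2. Unicode code points
\u4e00-\u9fa5 (Chinese)
\u3130-\u318F (Korean)
\uAC00-\uD7A3 (Korean)
\u0800-\u4e00 (Japanese, as commonly quoted; hiragana and katakana themselves are \u3040-\u30ff)
If you want to search for Chinese characters in Android Studio,
you can use a regex search with
[\u4e00-\u9fa5]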
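
The same character class also works in Python 3's re module, which understands \uXXXX escapes in str patterns. A small sketch on a made-up fragment of the response from the first post:

```python
import re

# Hypothetical fragment in the same shape as the response in the first post.
text = '"fullname":"张三","lesson_name":"计算机课","lesson_start_time":1633860985'

# One or more consecutive characters in the commonly quoted Chinese range.
chinese_runs = re.findall(r'[\u4e00-\u9fa5]+', text)

print(chinese_runs)  # ['张三', '计算机课']
```

Note that \u4e00-\u9fa5 is the commonly quoted range; the CJK Unified Ideographs block itself runs to \u9fff, and rarer characters live in extension blocks outside it.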

GiaoMan-wei posted on 2021-12-5 16:35

Please read through cszcszv163's code in full; there is nothing wrong with it.

cszcszv163 posted on 2021-12-5 16:35

You need lazy (non-greedy) matching here, i.e. .*? instead of .* in front of start_time.
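
To illustrate the difference, here is a sketch on a shortened two-record sample (the values are made up; only the key layout follows the original response). With the greedy .* the first match runs on to the last start_time and swallows the record in between, which is why only one result comes back.

```python
import re

# Shortened, hypothetical two-record sample.
records = ('{"user_id":1,"fullname":"张三","lesson_name":"计算机课","lesson_start_time":100,"lesson_end_time":200},'
           '{"user_id":2,"fullname":"李四","lesson_name":"计算机课","lesson_start_time":300,"lesson_end_time":400}')

# Greedy .* grabs as much as possible, then backtracks to the LAST start_time.
greedy = re.findall(
    r'user_id":(.*?),"fullname":"(.*?)".*start_time":(.*?),"lesson_end_time', records)

# Lazy .*? stops at the FIRST start_time after each fullname.
lazy = re.findall(
    r'user_id":(.*?),"fullname":"(.*?)".*?start_time":(.*?),"lesson_end_time', records)

print(greedy)  # [('1', '张三', '300')]
print(lazy)    # [('1', '张三', '100'), ('2', '李四', '300')]
```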

aspweb17 posted on 2021-12-5 16:55

So it can be matched after all... better go look it up.