JS: extracting all URLs with a regular expression

cqwcns posted on 2023-1-24 22:57

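My skills are pretty weak, and I have never managed to understand regular expressions.
Below is HTML generated by a rich text editor. It may contain some image URLs, and I would like to extract all of them into an array.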
<h1>电话</h1>的汉人同化它的好帖收到退货rtdsth等富含人体h <figure class="image"><img src="https://7765-kk-yy-1302668058.tcb.qcloud.la/notice/image/20230124/13553687299/1674571083349_OHR.GalileoMoons_ZH-CN0498325568_1920x1080.jpg"></figure>洞庭湖人太少hstrhsrth <figure class="image"><img src="https://7765-kk-yy-1302668058.tcb.qcloud.la/notice/image/20230124/13553687299/1674571113655_be6e7ee47196117d743e38015197ecdf924d2ec4bc0fbff5d69ec24b67b828c2.jpg"></figure>石头人和
The extraction rule is roughly: <img src="content to extract">
The result I am hoping for looks like this:
arr = [
"https://7765-kk-yy-1302668058.tcb.qcloud.la/notice/image/20230124/13553687299/1674571083349_OHR.GalileoMoons_ZH-CN0498325568_1920x1080.jpg",
"https://7765-kk-yy-1302668058.tcb.qcloud.la/notice/image/20230124/13553687299/1674571113655_be6e7ee47196117d743e38015197ecdf924d2ec4bc0fbff5d69ec24b67b828c2.jpg"
];

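I searched online and learned that I should probably use the match method, but none of the patterns I tried gave the result I wanted.

I don't know how to write it correctly. Any guidance would be appreciated, thanks.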

石昊荒天帝 posted on 2023-1-25 00:04

kesai posted on 2023-1-25 00:17

<img\s+src="([^"]+)">, then capture $1 and see whether that works.
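One way to try this pattern in JavaScript, as a minimal sketch: add the g flag and loop with RegExp.prototype.exec, reading capture group 1 from each match. The sample string and the names html, imgPattern and srcList are made up for illustration and are not from the thread.

```javascript
// Hypothetical sample input; the URLs are placeholders.
const html = '<p>a</p><img src="https://example.com/1.jpg"><p>b</p><img  src="https://example.com/2.jpg">';

// kesai's pattern, with the g flag so that exec() advances through the string.
const imgPattern = /<img\s+src="([^"]+)">/g;

const srcList = [];
let hit;
// exec() returns null once there are no more matches.
while ((hit = imgPattern.exec(html)) !== null) {
  // hit[0] is the whole tag, hit[1] is capture group 1 ($1).
  srcList.push(hit[1]);
}

console.log(srcList);
```

Note that this pattern only matches tags where src is the first and only attribute, because it expects "> right after the URL.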

ashley912 posted on 2023-1-25 01:25

Checking in, learning a little every day.

xifangczy posted on 2023-1-25 02:45

test.match(/(?<=img[^>]*src=")[^"]*/g)
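A short sketch of what the lookbehind version does. Because [^>]* sits between img and src, other attributes may come before src, and since a lookbehind is not part of the match, match() with the g flag returns the bare URLs. The sample string and variable names are hypothetical; lookbehind needs an engine with ES2018 support, and match() returns null when nothing matches, hence the fallback to an empty array.

```javascript
// Hypothetical sample: the first tag has an alt attribute before src.
const sample = '<img alt="moon" src="https://example.com/a.png"><figure><img src="https://example.com/b.png"></figure>';

// The lookbehind requires img ... src=" directly before the match,
// without crossing a ">" character.
const urls = sample.match(/(?<=img[^>]*src=")[^"]*/g) || [];

console.log(urls);
```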

weixiao222 posted on 2023-1-25 05:06

In Python, this is how I do it.
re.findall('<img src="(.*?)">', str)
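JavaScript has no built-in with the same name, but a rough counterpart for a pattern with one capture group might look like the sketch below. findAllGroup1 is a hypothetical helper name, and the sample string is invented for illustration.

```javascript
// Hypothetical helper: returns capture group 1 of every match,
// similar in spirit to Python's re.findall with a single group.
function findAllGroup1(pattern, text) {
  // matchAll() throws on a non-global regex, so add the g flag if it is missing.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const globalPattern = new RegExp(pattern.source, flags);
  return Array.from(text.matchAll(globalPattern), (match) => match[1]);
}

const page = '<img src="https://example.com/x.jpg"><p>text</p><img src="https://example.com/y.jpg">';
const found = findAllGroup1(/<img src="(.*?)">/, page);

console.log(found);
```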

笨笨家的唯一 posted on 2023-1-25 10:22

Try this:
// str is the source text you want to extract from
let str = `<h1>电话</h1>的汉人同化它的好帖收到退货rtdsth等富含人体h <figure class="image"><img src="https://7765-kk-yy-1302668058.tcb.qcloud.la/notice/image/20230124/13553687299/1674571083349_OHR.GalileoMoons_ZH-CN0498325568_1920x1080.jpg"></figure>洞庭湖人太少hstrhsrth <figure class="image"><img src="https://7765-kk-yy-1302668058.tcb.qcloud.la/notice/image/20230124/13553687299/1674571113655_be6e7ee47196117d743e38015197ecdf924d2ec4bc0fbff5d69ec24b67b828c2.jpg"></figure>石头人和`;
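// arr holds the extracted results
let arr = []
// The regular expression; the part you want to extract is inside the parentheses
let reg = /<img src="(.*?)"/ig
// Find every match in the string; this returns an iterator (you don't need to worry about what an iterator is)
let result = str.matchAll(reg);
// Loop over the iterator to get the data
for (const res of result) {
  // res is one element of the iterator: the match array for one <img> tag
  console.log(res)
  // res[1] is the content of the first pair of parentheses in the regex, which is what you want
  console.log(res[1])
  // Push it onto the array to save it
  arr.push(res[1])
}
// Print the array; at this point everything has been saved
console.log(arr)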

hollsovan posted on 2023-1-25 12:16

This is a use of zero-width assertions (lookarounds); the [^"]* in post #5 is a clever touch.

let x = '<h1>电话</h1>的汉人同化它的好帖收到退货rtdsth等富含人体h <figure class="image"><img src="https://7765-kk-yy-1302668058.tcb.qcloud.la/notice/image/20230124/13553687299/1674571083349_OHR.GalileoMoons_ZH-CN0498325568_1920x1080.jpg"></figure>洞庭湖人太少hstrhsrth <figure class="image"><img src="https://7765-kk-yy-1302668058.tcb.qcloud.la/notice/image/20230124/13553687299/1674571113655_be6e7ee47196117d743e38015197ecdf924d2ec4bc0fbff5d69ec24b67b828c2.jpg"></figure>石头人和'
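// match() returns null when nothing matches, so fall back to an empty array
const result = x.match(/(?<=img src=").*?(?=")/g) || [];
// Print each result
result.forEach(i => console.log(i))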

In short, to extract the content between a and b, use (?<=a).*?(?=b)
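The a/b rule can be wrapped in a small helper, sketched here under the assumption that both delimiters are literal strings. between and escapeRegExp are hypothetical names, and the sample text is invented for illustration.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so the delimiters are treated as literal text.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: returns every shortest run of text that sits
// between the literal delimiters a and b.
function between(text, a, b) {
  const pattern = new RegExp(`(?<=${escapeRegExp(a)}).*?(?=${escapeRegExp(b)})`, 'g');
  return text.match(pattern) || [];
}

const demo = 'x <img src="https://example.com/1.jpg"> y <img src="https://example.com/2.jpg"> z';
const srcs = between(demo, 'img src="', '"');

console.log(srcs);
```

Keep in mind that . does not match line breaks unless the s flag is added, so this sketch only finds content that sits on a single line.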


吾爱破解 - 52pojie.cn's Archiver
