This post was last edited by kover on 2022-12-15 15:39
Take a WeChat link like this: https://mp.weixin.qq.com/s?src=11&timestamp=1671077771&ver=4227&signature=EdUzpelUazkqkRVfv3HyP-E9ORc8ruu2R7Os6x3T3FDWNGDMyRVWCoe6aWfwCxre4zokjSqhvWCdjGaE7GTCGNdpBr*97VmwH3Jr0Zo4XbAvoqyqUJGIC4aq*VSWwlct&new=1
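When I view the page source, it doesn't look like ordinary HTML: the actual content is broken up by a lot of extra markup.
How should I write the code to extract the content from it?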
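import re
import requests
rsp = requests.get(today_url, headers=headers)  # today_url and headers are defined elsewhere
hot = rsp.content.decode('utf8')
news_list = re.findall(r'(?<=keyword).*?(?=keyword)', hot, re.S)[0]  # 'keyword' is a placeholder; look-behind must be fixed-width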
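The approaches other people have posted no longer work, because this isn't clean markup like ordinary HTML.
A lot of markup like this has been added: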
<section style="margin-top: 10px;margin-bottom: 10px;max-width: 100%;min-height: 1em;letter-spacing: 0.544px;line-height: 2em;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section data-role="outer" label="Powered by 135editor.com" style="margin-top: 10px;margin-bottom: 10px;white-space: normal;max-width: 100%;font-family: -apple-system-font, BlinkMacSystemFont, "Helvetica Neue", "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;letter-spacing: 0.544px;min-height: 1em;background-color: rgb(255, 255, 255);line-height: 2em;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section data-tools="135编辑器" data-id="89202" style="margin-top: 10px;margin-bottom: 10px;max-width: 100%;min-height: 1em;letter-spacing: 0.544px;line-height: 2em;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-top: 10px;margin-bottom: 10px;max-width: 100%;min-height: 1em;letter-spacing: 0.544px;line-height: 2em;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section data-role="outer" label="Powered by 135editor.com" style="margin-top: 10px;margin-bottom: 10px;white-space: normal;max-width: 100%;font-family: -apple-system-font, BlinkMacSystemFont, "Helvetica Neue", "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;letter-spacing: 0.544px;min-height: 1em;background-color: rgb(255, 255, 255);line-height: 2em;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section data-tools="135编辑器" data-id="89202" style="margin-top: 10px;margin-bottom: 10px;max-width: 100%;min-height: 1em;letter-spacing: 0.544px;line-height: 2em;box-sizing: border-box !important;overflow-wrap: break-word !important;"><section style="margin-top: 10px;margin-bottom: 10px;max-width: 100%;min-height: 1em;letter-spacing: 0.544px;line-height: 2em;box-sizing: border-box !important;overflow-wrap: break-word 
!important;"><p style="margin-top: 10px;margin-bottom: 10px;max-width: 100%;letter-spacing: 0.544px;line-height: 2em;box-sizing: border-box !important;overflow-wrap: break-word !important;"><strong style="max-width: 100%;box-sizing: border-box !important;overflow-wrap: break-word !important;"><span style="max-width: 100%;letter-spacing: 0.544px;color: rgb(0, 0, 0);box-sizing: border-box !important;overflow-wrap: break-word !important;">1、</span></strong><span style="max-width: 100%;letter-spacing: 0.544px;color: rgb(0, 0, 0);box-sizing: border-box !important;overflow-wrap: break-word !important;">
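One possible way to get at the text inside markup like the sample above is to let an HTML parser discard the tags and their inline `style` attributes instead of matching them with a regex. The following is only a minimal sketch using Python's standard `html.parser` module. The names `TextExtractor` and `extract_text` and the shortened `sample` string are made up for illustration, and the real page may need extra handling (for example, restricting extraction to the article body).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring tags and their inline style attributes."""

    SKIP = {"script", "style"}  # the contents of these tags are not article text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script>/<style>.
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the text fragments of an HTML string, in document order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical, shortened stand-in for the editor-generated markup.
sample = (
    '<section style="margin-top: 10px;"><p style="line-height: 2em;">'
    '<strong><span style="color: rgb(0, 0, 0);">1、</span></strong>'
    '<span style="color: rgb(0, 0, 0);">first headline</span></p></section>'
)
print(extract_text(sample))
```

In the original script, the decoded page (`hot`) could be passed to `extract_text` in place of `sample`. The returned list could then be joined or filtered as needed.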