zl2222 posted on 2024-4-2 15:44

Could someone take a look at why the second url assignment raises an error?

import requests
from lxml import etree

url = 'https://www.bond-y.com/zj/106288/38753182.html'

while True:
    headers = {
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36'
    }
    resp = requests.get(url, headers=headers)
    resp.encoding = 'utf-8'

    e = etree.HTML(resp.text)
    info = ''.join(e.xpath('//div[@id="booktxt"]/p/text()'))
    title = e.xpath('//h1/text()')[0]
    url = 'https://www.bond-y.com{e.xpath("//a[@rel='next']/@href")}'
    with open('1.txt', 'w', encoding='utf-8') as f:
      f.write(title + '\n\n' + info + '\n\n')

nody posted on 2024-4-2 16:52

url = 'https://www.bond-y.com{e.xpath("//a[@rel=\'next\']/@href")}'
or
url = '''https://www.bond-y.com{e.xpath("//a[@rel='next']/@href")}'''
The inner single quotes end your string early.
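
A small sketch of the quoting point above, using only plain string literals (no lxml needed). In the original line, Python closes the string at the quote just before `next`, which is why the rest of the line no longer parses. Note that this only addresses the quoting; it does not make the braces evaluate anything.

```python
# Option 1: escape the inner single quotes with a backslash.
escaped = 'https://www.bond-y.com{e.xpath("//a[@rel=\'next\']/@href")}'

# Option 2: wrap the whole literal in triple quotes so the inner quotes are harmless.
triple = '''https://www.bond-y.com{e.xpath("//a[@rel='next']/@href")}'''

# Both literals produce exactly the same text.
print(escaped == triple)  # True

# The inner quotes survive as ordinary characters in the string.
print("rel='next'" in escaped)  # True
```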

yushuai033X posted on 2024-4-2 16:54

url = "https://www.bond-y.com{e.xpath('//a[@rel=\"next\"]/@href')}"
url = 'https://www.bond-y.com{e.xpath("//a[@rel=\'next\']/@href")}'

WIN108711 posted on 2024-4-2 16:58

The e.xpath expression is wrong.
next_page_link = e.xpath("//a[@class='next']/@href")  # corrected XPath expression

TheWeiJun posted on 2024-4-2 17:33

You are missing the f"" prefix, as far as I can see.
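
To illustrate the f prefix point: without it, the braces are just characters. With it, the expression is evaluated, but `e.xpath(...)` in lxml returns a list, so the list itself should not be formatted into the URL. The sketch below uses a plain list as a stand-in for the XPath result; the href value is made up for illustration.

```python
# Stand-in for the list that e.xpath("//a[@rel='next']/@href") would return.
# The path is a hypothetical example value, not taken from the real site.
hrefs = ['/zj/106288/0000.html']

base = 'https://www.bond-y.com'

# No f prefix: the braces are kept literally and nothing is evaluated.
literal = '{base}{hrefs}'
print(literal)  # {base}{hrefs}

# f prefix, but formatting the whole list embeds its repr (brackets and quotes).
wrong = f'{base}{hrefs}'
print(wrong)  # https://www.bond-y.com['/zj/106288/0000.html']

# f prefix and the first match only: a usable URL.
right = f'{base}{hrefs[0]}'
print(right)  # https://www.bond-y.com/zj/106288/0000.html
```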

mo1230 posted on 2024-4-3 17:06

Your string is wrapped in single quotes, so any single quote inside it has to be written with the escape sequence \' instead.

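zl2222 posted on 2024-4-3 19:17

mo1230 posted on 2024-4-3 17:06
Your string is wrapped in single quotes, so any single quote inside it has to be written with the escape sequence \' instead.

It still raises an error after escaping.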
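
The follow-up does not include the new error message, so the cause is a guess. One known possibility: on Python versions before 3.12, a backslash inside the `{...}` part of an f-string is itself a SyntaxError, so combining the f prefix with the `\'` escapes on one line can still fail. A minimal sketch that sidesteps the quoting by running the XPath first and joining afterwards might look like this. `next_url`, `crawl`, `max_pages`, and the 10-second timeout are hypothetical additions, not from the original post, and the `rel='next'` selector is taken from the question without being checked against the live page.

```python
from urllib.parse import urljoin

BASE = 'https://www.bond-y.com'


def next_url(hrefs, base=BASE):
    """Turn the list returned by xpath(...) into an absolute URL, or None if empty."""
    if not hrefs:
        return None
    return urljoin(base, hrefs[0])


def crawl(start_url, out_path='1.txt', max_pages=50):
    # Third-party imports are kept local so next_url() works without them installed.
    import requests
    from lxml import etree

    # Shortened placeholder; the original post used a full Chrome User-Agent string.
    headers = {'User-Agent': 'Mozilla/5.0'}
    url = start_url
    # Open the file once: reopening with 'w' on every page would overwrite earlier chapters.
    with open(out_path, 'w', encoding='utf-8') as f:
        for _ in range(max_pages):  # assumed cap so the loop cannot run forever
            resp = requests.get(url, headers=headers, timeout=10)
            resp.encoding = 'utf-8'
            e = etree.HTML(resp.text)
            titles = e.xpath('//h1/text()')
            info = ''.join(e.xpath('//div[@id="booktxt"]/p/text()'))
            f.write((titles[0] if titles else '') + '\n\n' + info + '\n\n')
            # Run the XPath outside any string literal, then build the URL.
            url = next_url(e.xpath("//a[@rel='next']/@href"))
            if url is None:
                break
```

Under these assumptions, `crawl('https://www.bond-y.com/zj/106288/38753182.html')` would keep writing chapters to 1.txt until no next link is found or the page cap is reached.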