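Selenium: scraping dynamic content, error when extracting per-item fields relative to the current node

zhchl163 · posted on 2020-3-21 14:44

Last edited by zhchl163 on 2020-3-21 16:07

I need help. I managed to scrape the list of items, but when I then try to extract the fields item by item, I get an error. Is my XPath wrong, or is something else going on?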

[Python]

# -*- coding: utf-8 -*-
import re
from selenium import webdriver
import time
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
import pandas as pd
driver_path = r"D:\Spider\chromedriver\chromedriver.exe"


class taobao:

    def Chrome_option(self):
        chrome_options = Options()
        chrome_options.add_argument('--headless')
        chrome_options.add_argument('--disable-gpu')
        self.web = webdriver.Chrome(executable_path=driver_path, options=chrome_options)
        self.web.get("https://s.1688.com/selloffer/offer_search.htm?keywords=%B1%AD%B5%E6&n=y&netType=1%2C11&encode=utf-8&spm=a260k.dacugeneral.search.0")
        self.web.find_element_by_xpath("//*[@id='s-module-overlay']/div[2]/div/div[2]/em[4]").click()

        for x in range(6, 11, 1):
            height = float(x) / 10
            js = "document.documentElement.scrollTop = document.documentElement.scrollHeight * %f" % height
            self.web.execute_script(js)
            time.sleep(2)

    def get_content(self):
        # Each <li> under the result grid is one product card.
        item_list = self.web.find_elements_by_xpath("//div[@class='sw-layout-main sw-layout-grid-first sw-offer-220']//ul/li")
        for i in item_list:
            # print(i.text)
            item = {}  # fresh dict per product
            # The leading "." makes the XPath relative to the current <li>.
            item["title"] = i.find_element_by_xpath(".//div[2]/div[4]/a").get_attribute('title')
            item["company_name"] = i.find_element_by_xpath(".//div[2]/div[5]").get_attribute("title")
            print(item)
        print(len(item_list))
        self.web.quit()

    def run(self):
        self.Chrome_option()
        self.get_content()


if __name__ == '__main__':
    spider = taobao()
    spider.run()

Error output:

[Text]

.....
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":".//div[2]/div[4]/a"}
  (Session info: chrome=80.0.3987.149)
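
One possible reading of this traceback is that at least one `li` in the list does not contain the node `.//div[2]/div[4]/a` (for example an ad slot or a card that has not rendered yet), so `find_element_by_xpath` raises on that item. The following is a minimal sketch of a tolerant extraction, assuming the Selenium 3 method `find_elements_by_xpath`, which returns an empty list when nothing matches (Selenium 4 spells it `find_elements(By.XPATH, ...)`). The helpers `first_attr` and `extract_items` and the `FakeNode` class are hypothetical names introduced here; `FakeNode` only stands in for a WebElement so the sketch runs without a browser.

```python
def first_attr(node, xpath, attr):
    """Return attr of the first match under node, or None if nothing matches."""
    # The plural find_elements_by_xpath returns [] instead of raising.
    matches = node.find_elements_by_xpath(xpath)
    return matches[0].get_attribute(attr) if matches else None


def extract_items(item_list):
    """Build one dict per list item, skipping items without a title link."""
    rows = []
    for li in item_list:
        title = first_attr(li, ".//div[2]/div[4]/a", "title")
        if title is None:
            continue  # structure differs or the card is not rendered yet
        rows.append({
            "title": title,
            "company_name": first_attr(li, ".//div[2]/div[5]", "title"),
        })
    return rows


class FakeNode:
    """Hypothetical stand-in for a WebElement (not part of Selenium)."""

    def __init__(self, attrs=None, children=None):
        self.attrs = attrs or {}
        self.children = children or {}

    def find_elements_by_xpath(self, xpath):
        return self.children.get(xpath, [])

    def get_attribute(self, name):
        return self.attrs.get(name)


full = FakeNode(children={
    ".//div[2]/div[4]/a": [FakeNode({"title": "coaster"})],
    ".//div[2]/div[5]": [FakeNode({"title": "ACME Ltd"})],
})
empty = FakeNode()  # an item whose structure does not match
rows = extract_items([full, empty])
print(rows)
```

With a real driver, `extract_items(item_list)` would take the list returned by `self.web.find_elements_by_xpath(...)`. Comparing `len(rows)` with `len(item_list)` then shows how many items were skipped.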

xiaojie96528 · posted on 2020-3-21 15:08

Last edited by xiaojie96528 on 2020-3-21 15:46

item_list = self.web.find_elements_by_xpath("//html/body/div[3]/div[2]/div[1]/div/div/ul/li") is already the problem: that XPath does not match anything.

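zhchl163 · posted on 2020-3-21 15:47

Last edited by zhchl163 on 2020-3-21 16:12

find_elements_by_xpath does return a list, and every element I get when iterating over it is a WebElement object (<class 'selenium.webdriver.remote.webelement.WebElement'>). The error only appears when I call find_element_by_xpath on that element with a leading dot to search from the current node.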
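
To illustrate what a dot-relative path does, here is a small sketch using the standard library's `xml.etree.ElementTree` instead of Selenium, so it runs without a browser. The markup is invented for the example and is not the 1688 page. It shows that a relative path is evaluated inside each `li` separately, and that an item with a different structure simply has no match (ElementTree returns `None` where Selenium's `find_element_by_xpath` would raise `NoSuchElementException`).

```python
import xml.etree.ElementTree as ET

# Invented sample markup: the second <li> has no second <div>.
HTML = """
<ul>
  <li><div>img</div><div><a title="coaster A">A</a></div></li>
  <li><div>ad banner</div></li>
  <li><div>img</div><div><a title="coaster B">B</a></div></li>
</ul>
"""

root = ET.fromstring(HTML)
titles = []
for li in root.findall("./li"):
    # "./div[2]/a" is resolved relative to this <li>, not the whole document.
    link = li.find("./div[2]/a")
    titles.append(link.get("title") if link is not None else None)
print(titles)  # ['coaster A', None, 'coaster B']
```

If the real page mixes regular product cards with differently structured items, the same effect would explain why the list itself is found while the relative lookup fails on some element.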

zhchl163 · posted on 2020-3-21 15:56

Last edited by zhchl163 on 2020-3-21 16:16

Could the cause be that the data has not finished loading when I start reading it?
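
If loading time is the cause, polling for the content instead of relying on fixed `time.sleep(2)` calls is one way to check. Selenium ships `WebDriverWait` together with `expected_conditions` for this purpose. The sketch below is a simplified pure-Python stand-in for the same idea, not Selenium's implementation. `wait_until` and `SlowPage` are hypothetical names used only so the example runs without a browser.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)


class SlowPage:
    """Hypothetical page whose list items appear only after a few polls."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def items(self):
        self.calls += 1
        return ["li-1", "li-2"] if self.calls >= self.ready_after else []


page = SlowPage(ready_after=3)
items = wait_until(page.items, timeout=2.0, interval=0.01)
print(items, page.calls)  # ['li-1', 'li-2'] 3
```

With a real driver the condition could be something like `lambda: driver.find_elements_by_xpath(".//div[2]/div[4]/a")`, which is truthy as soon as at least one title link exists.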

wysyz · posted on 2020-3-21 15:59

This reads like hieroglyphics to me...

zhchl163 · posted on 2020-3-21 16:01

xiaojie96528 posted on 2020-3-21 15:08
item_list = self.web.find_elements_by_xpath("//html/body/div[3]/div[2]/div[1]/div/ ...

That is the XPath from my latest edit. The one before it was: "//div[@class='sw-layout-main sw-layout-grid-first sw-offer-220']//ul/li"

zhchl163 · posted on 2020-3-21 16:17

wysyz posted on 2020-3-21 15:59
This reads like hieroglyphics to me...

I have reworked the post.
