    Python selenium: Scraping Traffic Violation Data from 交管12123 by Simulating Web Page Clicks

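    The previous article, 《Python教程—模拟网页点击爬虫定位系统》, explained how to scrape vehicle location data by simulating clicks. This article shows how to use the same approach to log in to 交管12123 and scrape vehicle violation data. It goes straight through the procedure; for an explanation of the commands used, see the previous article. Like that article, this one is a real scraping case from a company, so it is worth a look if you plan to work in the automotive industry.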

    Tools required: Spyder, the selenium library, Google Chrome, and the chromedriver.exe that matches the browser version

    Result

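    Note: this case is shared to save colleagues in the field some manual work and to help them manage company assets better. The URL, account, and password have been removed from the program in this article. One awkward point with this site is that other pop-ups may appear when you first click to log in. They change the original element paths, and the program then raises an error because it cannot find the expected path; running the program again is enough. Besides logging in by simulated clicks, you can also log in directly with a Cookie, which bypasses the tedious login steps.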
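    The Cookie login mentioned in the note could look roughly like the sketch below. It relies only on selenium's `get_cookies()` and `add_cookie()`; the helper names `save_cookies` and `load_cookies` and the JSON file are assumptions made for illustration, and whether the site accepts a reused session is not something the article confirms.

```python
import json
from pathlib import Path


def save_cookies(driver, path):
    """Write the browser's current cookies to a JSON file (hypothetical helper)."""
    Path(path).write_text(json.dumps(driver.get_cookies()), encoding="utf-8")


def load_cookies(driver, path):
    """Add previously saved cookies to the browser and return how many were added.

    The browser must already be on the cookies' domain, so open the site with
    driver.get(...) before calling this, and refresh the page afterwards.
    """
    cookies = json.loads(Path(path).read_text(encoding="utf-8"))
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

    A typical flow would be to log in by hand once, call `save_cookies(driver, "cookies.json")`, and on later runs open the site, call `load_cookies(driver, "cookies.json")`, and then `driver.refresh()`.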

    Importing the libraries

    from selenium import webdriver
    import time
    import csv
    import datetime
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.wait import WebDriverWait
    import math
    import xlrd

    Read the license plates to query

    data = xlrd.open_workbook('cheliang.xlsx')
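    The line above only opens the workbook; the complete code at the end then takes the first sheet and loops over its rows with `table.row_values(ii)`. As a minimal sketch of turning those rows into a clean list of plates, assuming the plate sits in the first column (the article does not say so), a hypothetical helper `plates_from_rows` could be written as follows. Note also that xlrd 2.0 and later only read the old .xls format, so opening 'cheliang.xlsx' needs an older xlrd or a different reader.

```python
def plates_from_rows(rows):
    """Return the first cell of each row as a stripped string, skipping blanks.

    `rows` is any iterable of row lists, for example the lists returned by
    xlrd's table.row_values(i). The first-column layout is an assumption.
    """
    plates = []
    for row in rows:
        if not row:
            continue  # completely empty row
        plate = str(row[0]).strip()
        if plate:
            plates.append(plate)
    return plates
```

    With the article's variables this would be called as `plates_from_rows(table.row_values(i) for i in range(table.nrows))`.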

    Create the browser and open the page

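    opt = webdriver.ChromeOptions()   # create the browser options
    #opt.add_argument('--headless')    # headless (no-window) mode
    driver = webdriver.Chrome(options=opt)  # create the browser object
    driver.maximize_window()   # maximize the window

    print("Opening the page")
    driver.get('') # open the page (URL removed)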

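    In order: click the organization login button, enter the account and the password, click the captcha field to trigger the captcha image, tick the checkbox, enter the captcha, and click log in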

    time.sleep(3)     # wait for loading
    print("Clicking organization login")
    time.sleep(3)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[1]/div[2]/div/div[2]/div[2]/button").click() # click organization login

    time.sleep(3)     # wait for loading
    print("Entering the account")
    elem = driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[1]/div/input")
    # clear any existing content
    elem.clear()
    # enter the account (removed)
    elem.send_keys("")

    time.sleep(1)     # wait for loading
    print("Entering the password")
    elem = driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[2]/div/input")
    # clear any existing content
    elem.clear()
    # enter the password (removed)
    elem.send_keys("")

    time.sleep(1)     # wait for loading
    print("Showing the captcha")
    driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input").click() # click the captcha field to show the image
    print("Please type the captcha")
    yanzhengma=input()

    time.sleep(1)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[4]/div/label/input").click() # tick the checkbox

    time.sleep(1)     # wait for loading
    # enter the captcha
    elem = driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input")
    elem.clear()
    elem.send_keys(str(yanzhengma))

    time.sleep(1)     # wait for loading
    print("Logging in")
    driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[5]/button").click() # click log in
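    Because an unexpected pop-up can shift the paths during login, as the note earlier points out, one option besides rerunning the whole program is to retry a single step a few times. The helper below is only a sketch of that idea; `retry` and its parameters are hypothetical names, not part of selenium or of the original program.

```python
import time


def retry(action, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call action() until it succeeds or the attempts run out.

    Returns whatever action() returns. If every attempt fails, the last
    exception is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise
            time.sleep(delay)  # give the page time to settle before retrying
```

    With selenium this might be used as `retry(lambda: driver.find_element(By.XPATH, path).click(), exceptions=(NoSuchElementException,))`, where `NoSuchElementException` comes from `selenium.common.exceptions` and `path` is one of the XPaths above. A retry only helps if the pop-up disappears on its own; otherwise it still has to be closed first.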

    Click the violation query and set the query period

    time.sleep(3)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/ul/li[5]/a").click() # click the violation query

    time.sleep(1)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[1]/div/div[1]/span/i").click() # click to choose the date

    for i in range(3):
        time.sleep(0.5)     # wait for loading
        driver.find_element(By.XPATH,"/html/body/div[6]/div[4]/table/thead/tr/th[1]/i").click() # click

    time.sleep(0.5)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[6]/div[4]/table/tbody/tr/td/span[1]").click() # click

    time.sleep(0.5)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[6]/div[3]/table/tbody/tr[2]/td[1]").click() # click

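    Loop over the plates and query each vehicle's violations in turn. Each time, clear the previous input, enter the plate for this query, and work out how many records there are, how many pages that makes (each page shows at most 10 records), and how many records are on the last page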

    for ii in range(0,nrows):
        rowValues= table.row_values(ii) # the cells of one row, as a list
        print('Reading vehicle '+str(ii+1))
        # enter the plate
        time.sleep(0.5)     # wait for loading
        elem = driver.find_element(By.XPATH,"/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[3]/div/input")
        elem.clear()
        elem.send_keys(rowValues) # type the plate
        time.sleep(0.1)     # wait for loading
        driver.find_element(By.XPATH,"/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[4]/button").click() # click query
        time.sleep(0.5)     # wait for loading
        result=driver.find_element(By.XPATH,"/html/body/div[3]/div/div[2]/div[2]/div[1]/div/p/span").text # total number of violations
        result=int(result)
        a=math.ceil(result/10) # total number of pages
        b=result%10 # remainder: records on the last page (0 means the last page is full)
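    The page arithmetic in the last three lines can be checked on its own. The sketch below uses a hypothetical function `page_row_counts`, not part of the original program, that returns how many rows to read on each page. It also makes the edge case visible: a vehicle with 0 violations has no pages at all, so nothing should be read for it.

```python
import math


def page_row_counts(total, page_size=10):
    """Return the number of rows on each result page, in page order."""
    pages = math.ceil(total / page_size)
    counts = [page_size] * pages
    # A non-zero remainder means the last page is only partly filled.
    if pages and total % page_size:
        counts[-1] = total % page_size
    return counts


print(page_row_counts(23))  # [10, 10, 3]
print(page_row_counts(20))  # [10, 10]
print(page_row_counts(0))   # []
```

    Looping over this list, and clicking "下一页" between pages, would replace the separate handling of the remainder being zero or non-zero.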

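    Read the data in the list. The points deducted and the fine are not in the list itself: click "查看详情" (view details) and read them from the pop-up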

    result1=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[1]"))).text
    result2=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[2]"))).text
    result3=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[3]"))).text
    result4=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[4]"))).text
    result5=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[5]"))).text
    result6=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[6]"))).text
    result7=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[7]"))).text
    WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[8]/a"))).click() # view details: opens the pop-up
    time.sleep(1)     # wait for loading
    result8=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[7]/span[2]"))).text
    result9=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[8]/span[2]"))).text
    result=[result1,result2,result3,result4,result5,result6,result7,result8,result9]
    R.append(result)
    WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//div[@class='modal-footer ui_modal']/button"))).click() # close the pop-up
    time.sleep(0.5)     # wait for loading
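    The seven list columns differ only in the column index, so the XPaths can be generated instead of written out. The following is a sketch under that observation; `cell_xpath` and `read_list_columns` are hypothetical names, and the text lookup is passed in as a function so that the helper does not depend on a running browser.

```python
TABLE_ROW = "//table[@id='my-msg-list']/tbody/tr[{row}]"


def cell_xpath(row, col):
    """XPath of column `col` in row `row` of the result list (both 1-based)."""
    return TABLE_ROW.format(row=row) + "/td[{}]".format(col)


def read_list_columns(get_text, row, n_cols=7):
    """Read the first n_cols cells of one row.

    `get_text` is any function that maps an XPath string to the text of the
    element found there.
    """
    return [get_text(cell_xpath(row, col)) for col in range(1, n_cols + 1)]
```

    With the article's objects, `get_text` could be `lambda xp: WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,xp))).text`, and `read_list_columns(get_text, j)` would then return the values stored above as result1 to result7.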

    Write the data to the output file each time one vehicle has been read

    with open(wenjian,'w',encoding='utf-8',newline='') as fp:
        writer = csv.writer(fp)
        writer.writerows(R) # write the data
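    Since the file is opened in 'w' mode and the whole list R is written every time, the file always holds everything collected so far. A small variant is sketched below, with a hypothetical helper `write_rows`: it uses the 'utf-8-sig' encoding, which adds a byte-order mark that helps Excel recognize UTF-8 text, and it accepts an optional header row. Both are suggestions rather than part of the original program.

```python
import csv


def write_rows(path, rows, header=None):
    """Write all rows to a CSV file, optionally preceded by a header row."""
    # utf-8-sig writes a byte-order mark first so that Excel detects UTF-8.
    with open(path, "w", encoding="utf-8-sig", newline="") as fp:
        writer = csv.writer(fp)
        if header:
            writer.writerow(header)
        writer.writerows(rows)
```

    In the loop this would be called as `write_rows(wenjian, R)` after each vehicle.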

    Complete code

    from selenium import webdriver
    import time
    import csv
    import datetime
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.wait import WebDriverWait
    import math
    import xlrd
    data = xlrd.open_workbook('cheliang.xlsx')
    table = data.sheets()[0] # first sheet
    nrows = table.nrows # number of rows
    ncols = table.ncols # number of columns
     
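    opt = webdriver.ChromeOptions()   # create the browser options
    #opt.add_argument('--headless')    # headless (no-window) mode
    driver = webdriver.Chrome(options=opt)  # create the browser object
    driver.maximize_window()   # maximize the window

    print("Opening the page")
    driver.get('') # open the page (URL removed)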
     
    time.sleep(3)     # wait for loading
    print("Clicking organization login")
    time.sleep(3)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[1]/div[2]/div/div[2]/div[2]/button").click() # click organization login

    time.sleep(3)     # wait for loading
    print("Entering the account")
    elem = driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[1]/div/input")
    # clear any existing content
    elem.clear()
    # enter the account (removed)
    elem.send_keys("")

    time.sleep(1)     # wait for loading
    print("Entering the password")
    elem = driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[2]/div/input")
    # clear any existing content
    elem.clear()
    # enter the password (removed)
    elem.send_keys("")

    time.sleep(1)     # wait for loading
    print("Showing the captcha")
    driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input").click() # click the captcha field to show the image
    print("Please type the captcha")
    yanzhengma=input()

    time.sleep(1)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[4]/div/label/input").click() # tick the checkbox

    time.sleep(1)     # wait for loading
    # enter the captcha
    elem = driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input")
    elem.clear()
    elem.send_keys(str(yanzhengma))

    time.sleep(1)     # wait for loading
    print("Logging in")
    driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[5]/button").click() # click log in
     
    time.sleep(3)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[4]/div/div[1]/ul/li[5]/a").click() # click the violation query

    time.sleep(1)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[1]/div/div[1]/span/i").click() # click to choose the date

    for i in range(3):
        time.sleep(0.5)     # wait for loading
        driver.find_element(By.XPATH,"/html/body/div[6]/div[4]/table/thead/tr/th[1]/i").click() # click

    time.sleep(0.5)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[6]/div[4]/table/tbody/tr/td/span[1]").click() # click

    time.sleep(0.5)     # wait for loading
    driver.find_element(By.XPATH,"/html/body/div[6]/div[3]/table/tbody/tr[2]/td[1]").click() # click
     
    wenjian=datetime.datetime.now().strftime('%Y-%m-%d-%H%M%S') # the start time becomes the name of the exported file
    wenjian=wenjian+'.csv'

    def wait_for(xpath):
        # wait up to 10 s for the element at this XPath to become clickable
        return WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,xpath)))

    def read_rows(n):
        # read rows 1..n of the current page of the list and append them to R
        for j in range(1,n+1):
            row="//table[@id='my-msg-list']/tbody/tr["+str(j)+"]"
            result=[wait_for(row+"/td["+str(k)+"]").text for k in range(1,8)] # columns 1 to 7 of the list
            wait_for(row+"/td[8]/a").click() # view details: opens the pop-up
            time.sleep(1)     # wait for loading
            result.append(wait_for("//form[@class='form-horizontal']/div[7]/span[2]").text) # read from the pop-up
            result.append(wait_for("//form[@class='form-horizontal']/div[8]/span[2]").text) # read from the pop-up
            R.append(result)
            wait_for("//div[@class='modal-footer ui_modal']/button").click() # close the pop-up
            time.sleep(0.5)     # wait for loading

    R=[]
    for ii in range(0,nrows):
        rowValues= table.row_values(ii) # the cells of one row, as a list
        print('Reading vehicle '+str(ii+1))
        # enter the plate
        time.sleep(0.5)     # wait for loading
        elem = driver.find_element(By.XPATH,"/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[3]/div/input")
        elem.clear()
        elem.send_keys(rowValues) # type the plate
        time.sleep(0.1)     # wait for loading
        driver.find_element(By.XPATH,"/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[4]/button").click() # click query
        time.sleep(0.5)     # wait for loading
        result=driver.find_element(By.XPATH,"/html/body/div[3]/div/div[2]/div[2]/div[1]/div/p/span").text # total number of violations
        result=int(result)
        a=math.ceil(result/10) # total number of pages
        b=result%10 # remainder: records on the last page (0 means the last page is full)

        for i in range(1,a): # every page except the last holds 10 records
            read_rows(10)
            driver.find_element(By.LINK_TEXT,"下一页").click() # go to the next page ("下一页" is the link text on the site)
            time.sleep(0.5)     # wait for loading

        if b>0: # last page is partly filled
            read_rows(b)
        elif result>0: # last page is full; nothing to read when there are no violations
            read_rows(10)

        time.sleep(0.5)     # wait for loading
        with open(wenjian,'w',encoding='utf-8',newline='') as fp:
            writer = csv.writer(fp)
            writer.writerows(R) # write the data

