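The previous article, "Python Tutorial: Scraping a Positioning System by Simulating Web Page Clicks", explained how to scrape vehicle location data by simulating clicks. This article uses the same simulated-click approach to log in to the 交管12123 traffic management site and scrape vehicle violation records. It goes straight through the process; for an explanation of the commands used, see the previous article. Like that article, this one is a real scraping case from a company setting, so it is worth a look if you plan to work in the automotive industry.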
Tools: Spyder, the selenium library, Google Chrome, and the chromedriver.exe that matches your Chrome version
Result (the screenshot is not reproduced here)
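Note: this case is shared to help colleagues in the field automate manual work and manage company assets better. The URL, account, and password have been removed from the code. One awkward point with this site is that other pop-ups may appear when you first click login. They change the element paths, so the program raises an error because it cannot find the expected path; rerunning the program is enough. Besides logging in by simulated clicks, you can also log in with a Cookie, which skips the tedious login steps.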
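The Cookie-based login mentioned above could look roughly like the sketch below. It assumes you have already logged in once by hand and that the site accepts a replayed session cookie, which this article does not confirm. `save_cookies`, `load_cookies`, and the file name `cookies.json` are hypothetical names; only `get_cookies()` and `add_cookie()` are Selenium WebDriver methods.

```python
import json

def save_cookies(driver, path="cookies.json"):
    """Dump the cookies of the current (already logged-in) session to a JSON file."""
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(driver.get_cookies(), fp)

def load_cookies(driver, path="cookies.json"):
    """Add previously saved cookies to the driver and return how many were added.

    The driver must already be on a page of the same domain,
    otherwise Selenium rejects the cookies.
    """
    with open(path, encoding="utf-8") as fp:
        cookies = json.load(fp)
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

A typical flow would be `driver.get(url)`, then `load_cookies(driver)`, then `driver.refresh()`. Session cookies expire, so a manual login is still needed from time to time.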
Import libraries
from selenium import webdriver
import time
import csv
import datetime
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
import math
import xlrd
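Read the plate numbers to query
data = xlrd.open_workbook('cheliang.xlsx')  # workbook listing the plates; xlrd 2.0+ no longer reads .xlsx, so this needs xlrd<2.0
table = data.sheets()[0]  # first worksheet
nrows = table.nrows  # number of rows, one vehicle per row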
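Later on, `table.row_values(ii)` hands back the whole row as a list. If the sheet has extra columns or stray whitespace, a small normalising step before typing the value into the form may help. The helper below is a hypothetical sketch that assumes the plate sits in the first column; it works on plain lists, so it does not depend on xlrd.

```python
def plate_from_row(row_values):
    """Return the plate number from one worksheet row, or None if the row is empty.

    Assumption: the plate sits in the first column of the sheet.
    """
    if not row_values:
        return None
    plate = str(row_values[0]).strip()
    return plate or None

# Stand-in rows: one with trailing whitespace and an extra column, two empty ones.
rows = [["京A12345 ", "truck"], [""], []]
plates = [p for p in (plate_from_row(r) for r in rows) if p]
print(plates)
```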
Create the browser and open the page
opt = webdriver.ChromeOptions()  # browser options
#opt.add_argument('--headless')  # headless (no window) mode
driver = webdriver.Chrome(options=opt)  # create the browser object
driver.maximize_window()  # maximize the window
print("Opening the page")
driver.get('')  # open the page (URL removed)
Click organization login, enter the account and password, click the CAPTCHA field to trigger the image, tick the checkbox, enter the CAPTCHA, and click login, in that order
time.sleep(3)  # wait for loading
print("Clicking organization login")
time.sleep(3)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[1]/div[2]/div/div[2]/div[2]/button").click()  # click organization login
time.sleep(3)  # wait for loading
print("Entering the account")
elem = driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[1]/div/input")
# clear any existing content
elem.clear()
# enter the account (removed)
elem.send_keys("")
time.sleep(1)  # wait for loading
print("Entering the password")
elem = driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[2]/div/input")
# clear any existing content
elem.clear()
# enter the password (removed)
elem.send_keys("")
time.sleep(1)  # wait for loading
print("Showing the CAPTCHA")
driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input").click()  # click the CAPTCHA field to show the image
print("Please enter the CAPTCHA")
yanzhengma = input()  # the CAPTCHA is read by a person and typed into the console
time.sleep(1)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[4]/div/label/input").click()  # tick the checkbox
time.sleep(1)  # wait for loading
# enter the CAPTCHA
elem = driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input")
elem.clear()
elem.send_keys(str(yanzhengma))
time.sleep(1)  # wait for loading
print("Logging in")
driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[5]/button").click()  # click login
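As noted at the start, a pop-up can shift the absolute paths and make a lookup fail, and the suggested remedy is to run the program again. One way to automate that rerun for a single step is a small retry wrapper. The sketch below is an assumption, not part of the original program: `retry` is a hypothetical helper, and in real use `exceptions` would typically be Selenium's `NoSuchElementException`.

```python
import time

def retry(action, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call action() until it succeeds or attempts run out; re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise
            time.sleep(delay)  # give the page (or a pop-up) time to settle

# Demo with a stand-in action that fails twice before succeeding.
calls = {"n": 0}

def flaky_click():
    calls["n"] += 1
    if calls["n"] < 3:
        raise LookupError("element not found")
    return "clicked"

print(retry(flaky_click, attempts=5, delay=0))
```

Wrapping a whole step, for example `retry(lambda: driver.find_element(By.XPATH, path).click())`, keeps the rest of the script unchanged.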
Click violation query and set the query period
time.sleep(3)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/ul/li[5]/a").click()  # click violation query
time.sleep(1)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[1]/div/div[1]/span/i").click()  # open the date picker
for i in range(3):
    time.sleep(0.5)  # wait for loading
    driver.find_element(By.XPATH, "/html/body/div[6]/div[4]/table/thead/tr/th[1]/i").click()  # click (repeated three times)
time.sleep(0.5)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[6]/div[4]/table/tbody/tr/td/span[1]").click()  # click
time.sleep(0.5)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[6]/div[3]/table/tbody/tr[2]/td[1]").click()  # click
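Loop over the plates and query each one's violations in turn. Each pass clears the previous input, enters the current plate, and reads how many records were found. From that count it works out the number of pages (each page shows at most 10 records) and how many records are on the last page
for ii in range(0, nrows):
    rowValues = table.row_values(ii)  # values of one row
    print('Reading vehicle ' + str(ii+1))
    # enter the plate
    time.sleep(0.5)  # wait for loading
    elem = driver.find_element(By.XPATH, "/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[3]/div/input")
    elem.clear()
    elem.send_keys(rowValues)  # type the plate
    time.sleep(0.1)  # wait for loading
    driver.find_element(By.XPATH, "/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[4]/button").click()  # click query
    time.sleep(0.5)  # wait for loading
    result = driver.find_element(By.XPATH, "/html/body/div[3]/div/div[2]/div[2]/div[1]/div/p/span").text  # total number of violations
    result = int(result)
    a = math.ceil(result/10)  # total number of pages
    b = result % 10  # remainder: rows on the last page (0 means the last page is full)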
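To make the page arithmetic concrete, the sketch below computes how many rows to read on each page for a given record count. `rows_per_page` is a hypothetical helper, not part of the original program; it also covers the case of zero records, where there is nothing to read.

```python
import math

def rows_per_page(total, page_size=10):
    """Return a list with the number of rows on each result page."""
    pages = math.ceil(total / page_size)   # corresponds to a in the code above
    remainder = total % page_size          # corresponds to b in the code above
    sizes = [page_size] * pages
    if pages and remainder:
        sizes[-1] = remainder              # the last page is only partly filled
    return sizes

for total in (0, 7, 20, 23):
    print(total, rows_per_page(total))
```

For 23 records this gives two full pages and a last page of 3 rows; for 20 records the remainder is 0 and both pages are full.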
Read the data from the list. The penalty points and the fine are only shown after clicking "View details", so they are read from the pop-up
# j is the 1-based index of the row on the current page
result1=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[1]"))).text
result2=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[2]"))).text
result3=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[3]"))).text
result4=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[4]"))).text
result5=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[5]"))).text
result6=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[6]"))).text
result7=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[7]"))).text
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[8]/a"))).click()  # view details, opens the pop-up
time.sleep(1)  # wait for loading
result8=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[7]/span[2]"))).text
result9=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[8]/span[2]"))).text
result=[result1,result2,result3,result4,result5,result6,result7,result8,result9]
R.append(result)
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//div[@class='modal-footer ui_modal']/button"))).click()  # close the pop-up
time.sleep(0.5)  # wait for loading
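The seven table lookups above differ only in the row and column index, so the XPaths can be generated instead of written out. The sketch below is a hypothetical refactoring, not the author's code: `cell_xpath` and `read_columns` are illustrative names, and `get_text` stands in for whatever function fetches an element's text (for example the `WebDriverWait(...).until(...).text` expression used above).

```python
def cell_xpath(row, col, table_id="my-msg-list"):
    """XPath of the cell at 1-based (row, col) in the result table."""
    return "//table[@id='%s']/tbody/tr[%d]/td[%d]" % (table_id, row, col)

def read_columns(get_text, row, n_cols=7):
    """Collect the text of columns 1..n_cols of one table row."""
    return [get_text(cell_xpath(row, col)) for col in range(1, n_cols + 1)]

# Stand-in for the browser: returns the tail of the XPath it was asked for.
print(cell_xpath(3, 2))
print(read_columns(lambda xpath: xpath[-6:], row=1, n_cols=2))
```

Passing the lookup in as a function keeps the helper independent of Selenium, which also makes it easy to check without a browser.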
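After each vehicle is read, write the data to the table. The file is opened in 'w' mode and R holds every row read so far, so each write replaces the file with the full result set
with open(wenjian,'w',encoding='utf-8',newline='') as fp:  # wenjian is the output file name, defined in the complete code
    writer = csv.writer(fp)
    writer.writerows(R)  # write the data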
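Rewriting the whole file after every vehicle is simple but repeats work as R grows. An alternative, shown here only as a sketch, is to append each vehicle's rows as they arrive and write a header once. `append_rows` and its `header` argument are hypothetical; the original program writes no header.

```python
import csv
import os

def append_rows(path, rows, header=None):
    """Append rows to a CSV file, writing the header only when the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", encoding="utf-8", newline="") as fp:
        writer = csv.writer(fp)
        if is_new and header:
            writer.writerow(header)
        writer.writerows(rows)
```

With this variant the loop would collect one vehicle's rows in a local list and call `append_rows(wenjian, rows)` at the end of each pass, so a crash partway through still leaves the earlier vehicles on disk.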
Complete code
from selenium import webdriver
import time
import csv
import datetime
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
import math
import xlrd
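data = xlrd.open_workbook('cheliang.xlsx')  # workbook listing the plates; xlrd 2.0+ no longer reads .xlsx, so this needs xlrd<2.0
table = data.sheets()[0]  # first worksheet
nrows = table.nrows  # number of rows
ncols = table.ncols  # number of columns
opt = webdriver.ChromeOptions()  # browser options
#opt.add_argument('--headless')  # headless (no window) mode
driver = webdriver.Chrome(options=opt)  # create the browser object
driver.maximize_window()  # maximize the window
print("Opening the page")
driver.get('')  # open the page (URL removed)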
time.sleep(3)  # wait for loading
print("Clicking organization login")
time.sleep(3)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[1]/div[2]/div/div[2]/div[2]/button").click()  # click organization login
time.sleep(3)  # wait for loading
print("Entering the account")
elem = driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[1]/div/input")
# clear any existing content
elem.clear()
# enter the account (removed)
elem.send_keys("")
time.sleep(1)  # wait for loading
print("Entering the password")
elem = driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[2]/div/input")
# clear any existing content
elem.clear()
# enter the password (removed)
elem.send_keys("")
time.sleep(1)  # wait for loading
print("Showing the CAPTCHA")
driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input").click()  # click the CAPTCHA field to show the image
print("Please enter the CAPTCHA")
yanzhengma = input()  # the CAPTCHA is read by a person and typed into the console
time.sleep(1)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[4]/div/label/input").click()  # tick the checkbox
time.sleep(1)  # wait for loading
# enter the CAPTCHA
elem = driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[3]/div/input")
elem.clear()
elem.send_keys(str(yanzhengma))
time.sleep(1)  # wait for loading
print("Logging in")
driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/div/div[2]/form[1]/div[5]/button").click()  # click login
time.sleep(3)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[4]/div/div[1]/ul/li[5]/a").click()  # click violation query
time.sleep(1)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[1]/div/div[1]/span/i").click()  # open the date picker
for i in range(3):
    time.sleep(0.5)  # wait for loading
    driver.find_element(By.XPATH, "/html/body/div[6]/div[4]/table/thead/tr/th[1]/i").click()  # click (repeated three times)
time.sleep(0.5)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[6]/div[4]/table/tbody/tr/td/span[1]").click()  # click
time.sleep(0.5)  # wait for loading
driver.find_element(By.XPATH, "/html/body/div[6]/div[3]/table/tbody/tr[2]/td[1]").click()  # click
wenjian = datetime.datetime.now().strftime('%Y-%m-%d-%H%M%S')  # name the exported file after the start time
wenjian = wenjian + '.csv'
R = []  # all rows read so far, across all vehicles
for ii in range(0, nrows):
    rowValues = table.row_values(ii)  # values of one row
    print('Reading vehicle ' + str(ii+1))
    # enter the plate
    time.sleep(0.5)  # wait for loading
    elem = driver.find_element(By.XPATH, "/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[3]/div/input")
    elem.clear()
    elem.send_keys(rowValues)  # type the plate
    time.sleep(0.1)  # wait for loading
    driver.find_element(By.XPATH, "/html/body/div[3]/div/div[2]/div[1]/div[2]/form/div[4]/button").click()  # click query
    time.sleep(0.5)  # wait for loading
    result = driver.find_element(By.XPATH, "/html/body/div[3]/div/div[2]/div[2]/div[1]/div/p/span").text  # total number of violations
    result = int(result)
    a = math.ceil(result/10)  # total number of pages
    b = result % 10  # remainder: rows on the last page (0 means the last page is full)
    for i in range(1, a):  # every page except the last holds 10 rows
        for j in range(1, 11):
            result1=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[1]"))).text
            result2=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[2]"))).text
            result3=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[3]"))).text
            result4=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[4]"))).text
            result5=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[5]"))).text
            result6=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[6]"))).text
            result7=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[7]"))).text
            WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[8]/a"))).click()  # view details, opens the pop-up
            time.sleep(1)  # wait for loading
            result8=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[7]/span[2]"))).text
            result9=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[8]/span[2]"))).text
            result=[result1,result2,result3,result4,result5,result6,result7,result8,result9]
            R.append(result)
            WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//div[@class='modal-footer ui_modal']/button"))).click()  # close the pop-up
            time.sleep(0.5)  # wait for loading
        driver.find_element(By.LINK_TEXT, "下一頁").click()  # next page; the link text must match the label shown on the site
        time.sleep(0.5)  # wait for loading
    if b > 0:  # last page, partly filled
        for j in range(1, b+1):
            result1=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[1]"))).text
            result2=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[2]"))).text
            result3=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[3]"))).text
            result4=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[4]"))).text
            result5=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[5]"))).text
            result6=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[6]"))).text
            result7=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[7]"))).text
            WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[8]/a"))).click()  # view details, opens the pop-up
            time.sleep(1)  # wait for loading
            result8=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[7]/span[2]"))).text
            result9=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[8]/span[2]"))).text
            result=[result1,result2,result3,result4,result5,result6,result7,result8,result9]
            R.append(result)
            WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//div[@class='modal-footer ui_modal']/button"))).click()  # close the pop-up
            time.sleep(0.5)  # wait for loading
    if b == 0 and a > 0:  # last page is full; skipped when the vehicle has no records
        for j in range(1, 11):
            result1=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[1]"))).text
            result2=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[2]"))).text
            result3=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[3]"))).text
            result4=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[4]"))).text
            result5=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[5]"))).text
            result6=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[6]"))).text
            result7=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[7]"))).text
            WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//table[@id='my-msg-list']/tbody/tr["+str(j)+"]/td[8]/a"))).click()  # view details, opens the pop-up
            time.sleep(1)  # wait for loading
            result8=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[7]/span[2]"))).text
            result9=WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//form[@class='form-horizontal']/div[8]/span[2]"))).text
            result=[result1,result2,result3,result4,result5,result6,result7,result8,result9]
            R.append(result)
            WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//div[@class='modal-footer ui_modal']/button"))).click()  # close the pop-up
            time.sleep(0.5)  # wait for loading
    time.sleep(0.5)  # wait for loading
    with open(wenjian,'w',encoding='utf-8',newline='') as fp:  # rewrite the file with everything read so far
        writer = csv.writer(fp)
        writer.writerows(R)  # write the data