以python语言为例,演示Xpath的用法
0x00 安装lxml模块
lxml下载地址:https://pypi.python.org/pypi/lxml/
Windows系统的话,下载对应的系统、python版本的安装包,双击安装即可。
检验是否成功,在python IDE中,输入from lxml import etree
,如果没有报错,即是成功。
0x01 使用Xpath
Xpath支持获取xml和html文件。
import requests
from lxml import etree
link = 'http://bj.zu.anjuke.com/fangyuan/jiuxianqiao/'
link_req = requests.get(link)
link_re_etree = etree.HTML(link_req.content)
room_addr = []
for i in range(1,9):
try:
room_addr.append(link_re_etree.xpath('///*[@id="list-content"]/div[@_soj="Filter_'+str(i)+'"]/div[1]/address/a/text()')[0].strip())
a = room_addr[-1]
print a
except :
pass