html - XPath - Extract content between two positions with different indent (Python)
I'm trying to extract data from an HTML file that I fetch via a URL. Here's an example:
<html>
  ...
  <div class="start">
    <!-- here.. -->
    <p></p>
    <p><a href=''></a>
    <span></span>
    <br>
    <!-- ..to here -->
    <div class="end">
      ...
    </div>
    ...
    ...
  </div>
  ...
</html>
I'm trying to get the data directly under div class="start", but I don't know how, since that div contains the whole page. What I do know is that div class="end" comes right after the data I want. Keep in mind that I don't want just the text in between, but the elements themselves, in this case <p>, <span> and <a>. Note that the element types may vary from those shown in the HTML above.
Google gave me different variations of this (without luck):
'//*[preceding-sibling::div[@class="start"] and following-sibling::div[@class="end"]]'
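You got close with your googling. It looks like what you want is
//div[@class="start"]/*[following-sibling::div[@class="end"]]
Since <div class="start"> is the parent (not a sibling) of the data you want to select, use div[@class="start"]/* in the XPath instead of *[preceding-sibling::div[@class="start"]]. The predicate [following-sibling::div[@class="end"]] then keeps only those children that come before <div class="end">.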
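As a rough illustration, here is how the expression might be run from Python. This is a minimal sketch that assumes the third-party lxml library is installed; the sample markup, its text content, and the helper name elements_before_end are made up for the example and are not from the question. It also runs the sibling-based form (written here with `and` joining the two conditions) to show that it selects nothing on this markup.

```python
from lxml import html  # third-party; assumed installed (pip install lxml)

# Hypothetical sample modelled on the question's structure.
SAMPLE = """
<html><body>
<div class="start">
  <p>first</p>
  <p><a href="#">link</a></p>
  <span>note</span>
  <br>
  <div class="end">
    <p>not wanted</p>
  </div>
  <p>after end</p>
</div>
</body></html>
"""

# Children of div.start that have div.end as a later sibling.
PARENT_BASED = '//div[@class="start"]/*[following-sibling::div[@class="end"]]'

# Sibling-based form: needs div.start as an earlier sibling, which never
# holds here because div.start is the parent of the wanted elements.
SIBLING_BASED = ('//*[preceding-sibling::div[@class="start"] '
                 'and following-sibling::div[@class="end"]]')


def elements_before_end(markup, expression=PARENT_BASED):
    """Parse the markup and return the elements matched by the expression."""
    tree = html.fromstring(markup)
    return tree.xpath(expression)


tags = [el.tag for el in elements_before_end(SAMPLE)]
sibling_matches = elements_before_end(SAMPLE, SIBLING_BASED)

print(tags)             # ['p', 'p', 'span', 'br']
print(sibling_matches)  # []
```

The result is a list of element objects, not strings, so each one can be serialized or inspected further (for example with html.tostring(el)). Because the expression matches on tag position and not on tag name, it should keep working if the element types between the two markers change.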