[PYTHON] I can't get the element in Selenium!

Today's troubles

Automated testing of websites and web systems with Selenium.

I can't get this element with find_element_by_xpath! I could get the element just before it, though...!

Background

It started with a website about which I was told, "For some reason, this site can't be tested with Selenium."

"Eh, no way, lol. The person who built it (no longer here) just put it together badly, lol."

That was the reply I got when I asked a team member, "Shall we rebuild it?"

I see. I had handed the whole thing off to them, so let me look at the HTML myself.

This solution

**Everyone, let's add class and id attributes to our HTML!**

...But modifying the screen just to automate its testing makes no sense. And although I keep calling it a website, what sits behind it is a web system anyway; it could be PHP or JSP.

For this explanation, assume the requirement was: get the body text of article 2 from sample.html.

sample.html


<html>
    <head><!--abridgement--></head>
    <body>
        <div id='wrap'>
            <div class='article'>
                <article>
                    <h1>Article 1 title</h1>
                    <div>
                        <p>Body of article 1</p>
                    </div>
                </article>
            </div><!-- .article -->
            <div class='article'>
                <article>
                    <h1>Article 2 title</h1>
                    <div>
                        <p>Body of article 2</p>
                    </div>
                </article>
            </div><!-- .article -->
        </div><!-- #wrap -->
    </body>
</html>

Here is the locator that was reported as "not working".

Doesn't work.py


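from selenium.webdriver.common.by import By

path = "/html/body/div/div[2]/article/div"
elmt = driver.find_element(By.XPATH, path)  # find_element_by_xpath(path) was removed in Selenium 4.3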

The intent seems to be: the div inside the article of the second div (class='article') inside the div (id='wrap') inside body inside html.

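Hmmmm. If the target is simply "the fifth div on the page", isn't that "(//div)[5]"?

I think it works.py


from selenium.webdriver.common.by import By

path = "(//div)[5]"
elmt = driver.find_element(By.XPATH, path)  # find_element_by_xpath(path) was removed in Selenium 4.3

With an expression of the form (//div)[n], **it doesn't matter how the screen elements are nested**. **"How many times has a div appeared, counting from the top?"** is the criterion. An index inside a path, such as div[2], is different: it counts only among the div children of the element selected by the previous step, so every level of the path has to match the real page exactly. Against the abridged sample.html shown here the step-by-step path does resolve, so the live page presumably contains structure that the sample leaves out. The CSS pseudo-class :nth-child() looks similar, but it counts per parent, like the step index, and not across the whole document.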
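To check what "the fifth div from the top" refers to in sample.html without starting a browser, here is a minimal sketch that uses Python's standard xml.etree.ElementTree in place of Selenium (an assumption made for illustration; the original check ran through a WebDriver). SAMPLE_HTML and text_of are hypothetical names, and the HTML comments from sample.html are left out. It lists the div elements in document order and also resolves a step-by-step path, where an index such as div[2] counts only among the children of the preceding step.

```python
import xml.etree.ElementTree as ET

# sample.html from the article, without the comments (hypothetical constant name).
SAMPLE_HTML = """
<html>
    <head></head>
    <body>
        <div id='wrap'>
            <div class='article'>
                <article>
                    <h1>Article 1 title</h1>
                    <div><p>Body of article 1</p></div>
                </article>
            </div>
            <div class='article'>
                <article>
                    <h1>Article 2 title</h1>
                    <div><p>Body of article 2</p></div>
                </article>
            </div>
        </div>
    </body>
</html>
"""


def text_of(element):
    """Join all text inside an element and collapse whitespace (hypothetical helper)."""
    return " ".join("".join(element.itertext()).split())


root = ET.fromstring(SAMPLE_HTML.strip())  # root is the <html> element

# Every div in document order, regardless of nesting depth.
divs_in_document_order = list(root.iter("div"))
fifth_div = divs_in_document_order[4]  # positions are 1-based, list indexes 0-based

# Step-by-step path, relative to <html>; [2] counts among the div children of div#wrap.
nested_div = root.find("./body/div/div[2]/article/div")

print(len(divs_in_document_order))  # 5
print(text_of(fifth_div))           # Body of article 2
print(nested_div is fifth_div)      # True
```

On this abridged sample both lookups land on the body of article 2, so a failure on the real page would presumably come from structure that the sample omits.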

Suppose I want to hide the second <div class='article'>.

Various things disappear.css


div#wrap div:nth-child(2) { display: none; }

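You might expect this to hide only the second div inside div#wrap. In fact it hides every div under div#wrap that is the second child of its own parent. In sample.html that is the second div.article and also the body div of each article (each one is the second child of its article, after the h1), so the body of article 1 and all of article 2 disappear. (Please don't say "it's only a sample, just use the class!")

Hmm. That is not what was intended...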
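To see which elements the rule "div#wrap div:nth-child(2)" picks out of sample.html, here is a minimal sketch that re-implements just that one matching rule by hand with Python's standard xml.etree.ElementTree. It is not a CSS engine, and SAMPLE_HTML and text_of are hypothetical names introduced for the example; the HTML comments from sample.html are left out.

```python
import xml.etree.ElementTree as ET

# sample.html from the article, without the comments (hypothetical constant name).
SAMPLE_HTML = """
<html>
    <head></head>
    <body>
        <div id='wrap'>
            <div class='article'>
                <article>
                    <h1>Article 1 title</h1>
                    <div><p>Body of article 1</p></div>
                </article>
            </div>
            <div class='article'>
                <article>
                    <h1>Article 2 title</h1>
                    <div><p>Body of article 2</p></div>
                </article>
            </div>
        </div>
    </body>
</html>
"""


def text_of(element):
    """Join all text inside an element and collapse whitespace (hypothetical helper)."""
    return " ".join("".join(element.itertext()).split())


root = ET.fromstring(SAMPLE_HTML.strip())
wrap = root.find(".//div[@id='wrap']")

# "div#wrap div:nth-child(2)": any div below div#wrap that is the second
# element child of its own parent, whatever that parent is.
matched = []
for parent in wrap.iter():  # div#wrap itself, then every descendant, in document order
    children = list(parent)
    if len(children) >= 2 and children[1].tag == "div":
        matched.append(children[1])

for element in matched:
    print(text_of(element))
```

The loop reports three matches: the whole second div.article, plus the body div of article 1 and the body div of article 2.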

Conclusion

Programmers are a breed that lives by fretting over indentation and nesting, so this is easy to misunderstand, but an external program has no idea what structure the author of the HTML had in mind. Whether it's CSS, Python, or Java, none of them acts on human intent. Only the **people** who developed the page know that "there are multiple articles, each built from similar blocks", as in this case.

**When specifying an element with find_element_by_xpath, pay attention to how many times the element appears, counting from the top.**
