Selenium: Using Python to Find Elements via XPath

Scraping Robot
October 2, 2024
Community

If you work in test automation or web scraping, you will often need to fetch elements with Selenium. Selenium offers several ways to find elements, and XPath is one of the most important to know. It is often one of the most effective options, even compared with some of the more complicated methods out there, and once you know how to use it, it can handle most element-finding tasks for you. Take a look at how Python and XPath relate and why you should use this tool.

How to Use XPath in Selenium with Python to Find Elements: The XPath Python Connection Explained

Selenium supports numerous programming languages, which is one reason it is so widely used. XPath may not be as well known to you, though.

What is XPath in Selenium, then? XML Path Language, or XPath, is a query syntax for navigating through the elements of the Document Object Model (DOM). Specifically, XPath targets nodes in an XML document. What sets it apart from other locator strategies is that it is not restricted to HTML.

One of the most common reasons to use Selenium by XPath is in web testing automation, though web scraping is another aspect of its functionality. It works quickly to find elements on a page, and once you learn a bit more about how it works, it will quickly become your go-to solution.

You can try it out by opening your browser's developer tools, such as Chrome DevTools, and entering the following in the Console:

$x(".//nav")

The $x() part is the DevTools console helper that evaluates an XPath query. This query locates any <nav> elements on the page. A query can target a single element, but more commonly it returns several matching elements at once.
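The same query can also be tried without a browser. The sketch below uses Python's standard-library xml.etree.ElementTree module, which supports only a limited subset of XPath, on a small made-up document. The markup and the id values are assumptions for illustration and are not taken from any real site.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, used only to illustrate the query.
SAMPLE = """
<html>
  <body>
    <header><nav id="main">Main menu</nav></header>
    <footer><nav id="footer">Footer menu</nav></footer>
  </body>
</html>
""".strip()

root = ET.fromstring(SAMPLE)

# ".//nav" selects every <nav> element below the current node, at any depth.
all_navs = root.findall(".//nav")
nav_ids = [nav.get("id") for nav in all_navs]

# find() returns only the first match in document order.
first_nav = root.find(".//nav")

print(nav_ids)               # ['main', 'footer']
print(first_nav.get("id"))   # main
```

The findall/find pair here plays the same role as find_elements/find_element in Selenium: one returns every match, the other only the first.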

What You Need to Know to Use Selenium by XPath

In the next few minutes, we will walk through a hands-on process for trying out XPath, so you can see how it helps with a wide range of goals in test automation and web scraping.

To use Selenium by XPath, you will need some experience with both Python and Selenium, with recent versions of each installed on your system. You also need a development tool for working with the Google Chrome browser that Selenium will drive. For this, we will use Chrome DevTools, but there are others. Finally, be sure you have set up a Python project with the Selenium bindings installed.

To make sure you’re ready to use this tool, open up your project and then add the following to the main file:

from selenium import webdriver

browser = webdriver.Chrome()

browser.get("https://selenium.dev/")

Assuming your file is named main.py, run it:

python main.py

If it works properly, then a Chrome window should open and go directly to the Selenium website. You will then see it close. This is exactly what you need to happen.

Now, let’s break down how to find a single element with Python Selenium by XPath. To keep things simple, we will stick with the website you already have, Selenium. When you start to try out Selenium XPath and Python together yourself, you can continue to use the same lines of code below with changes.

To find an element using the XPath Selenium Python method, follow these directions:

Choose an element on the page. In our example, we will use “Documentation,” one of the links in the navigation bar of the Selenium website.

Next, consider the page’s DOM hierarchy, which gives you a better idea of how to reach the element you selected. To see the hierarchy, right-click the Documentation item on the page and click “Inspect.” This opens the DevTools Elements panel with the element highlighted in the hierarchy. The path to it follows this flow:

/html/body/header/nav/div/ul/li/a

You will see that there are eight <li> elements in the list on this page. Look for the one you are interested in. In our case, it is the fourth in the list. It contains an <a> element, which is the link to the Documentation page.

Use an absolute path. The query starts at the root element and lists every step down to the target. In our situation, the query could look like this:

"./html/body/header/nav/div/ul/li[4]/a"

Looking at this, you have:

  • ./ at the start means the query begins at the current context node. Because we search from the browser object, that node is the document itself, so the path behaves as an absolute path.
  • All of the elements are listed after this, starting with the root and ending with the target <a> node.
  • We used an index, [4], to reach the fourth list item. The link sits inside it.
  • This index is 1-based instead of 0-based.
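The 1-based indexing is easy to check offline. This is a minimal sketch using the standard-library xml.etree.ElementTree module on made-up markup that mimics the structure described above. The link names other than “Documentation” are placeholders, and the list is shortened to four items. ElementTree starts its paths at the element you search from and does not accept absolute paths on an element, so the leading html step is left out here.

```python
import xml.etree.ElementTree as ET

# Hypothetical navigation markup; "First", "Second" and "Third" are placeholders.
SAMPLE = """
<html><body><header><nav><div><ul>
  <li><a>First</a></li>
  <li><a>Second</a></li>
  <li><a>Third</a></li>
  <li><a>Documentation</a></li>
</ul></div></nav></header></body></html>
""".strip()

root = ET.fromstring(SAMPLE)  # root is the <html> element

# li[4] is the fourth <li>, because XPath positions are 1-based.
link = root.find("./body/header/nav/div/ul/li[4]/a")
print(link.text)  # Documentation

# Python lists are 0-based, so the same item sits at index 3.
items = root.findall("./body/header/nav/div/ul/li")
same_link = items[3].find("a")
print(same_link is link)  # True
```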

Now, let’s see how the Python Selenium XPath solution works for you. We can take the query listed above and update it to a command that will run. To do that, start by adding the Selenium import:

from selenium.webdriver.common.by import By

After this, include the following two lines of code at the bottom of the file:

target_element = browser.find_element(By.XPATH, "./html/body/header/nav/div/ul/li[4]/a")

print(target_element.text)

In this code, we use the find_element method to locate the desired node. The first argument, By.XPATH, tells the method to search using XPath, and the second argument is the path itself. We also added a print statement so the script shows the text of the element it found.

Now that you have all of that, run the code. When you do, you will see “Documentation” listed on the console.

That is certainly a lot of work to find something we can plainly see on the page. However, the same combination of XPath, Selenium, and Python handles more complex queries, and it can find more than one element as well.

In this example, we used an absolute path to find an element. The problem is that an absolute path is brittle: if anything in the page structure changes, which is common, the path becomes invalid and your task stops working.

This is why we often use relative paths, which are relative to a specific element rather than to the document root. These are more resilient when the structure of the page changes.

So, now let’s take the same code as above and use a relative path to do the same thing.

All you need to do is change the query passed to find_element:

"//nav/div/ul/li[4]/a"

Here, the query begins with two slashes (//), which match the named element at any depth in the document. The path therefore starts at the <nav> element instead of the root.

As a result, changes higher up in the hierarchy will not affect this query.
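To see the difference in resilience, here is a minimal offline sketch using the standard-library xml.etree.ElementTree module. Both documents are made-up markup: the second simulates a redesign that wraps the header in an extra <div>. ElementTree needs a leading dot (.//nav rather than //nav) because it searches from an element, which is a limitation of that module and not of Selenium.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the same page before and after a redesign that
# wraps the header in an extra <div>.
ORIGINAL = (
    "<html><body><header><nav><ul>"
    "<li><a>Docs</a></li>"
    "</ul></nav></header></body></html>"
)
REDESIGNED = (
    "<html><body><div><header><nav><ul>"
    "<li><a>Docs</a></li>"
    "</ul></nav></header></div></body></html>"
)

FULL_PATH = "./body/header/nav/ul/li/a"  # spells out every ancestor
SHORT_PATH = ".//nav/ul/li/a"            # starts at <nav>, wherever it is


def link_text(markup, path):
    """Return the text of the first match, or None if nothing matches."""
    match = ET.fromstring(markup).find(path)
    return match.text if match is not None else None


for name, markup in (("original", ORIGINAL), ("redesigned", REDESIGNED)):
    print(name, link_text(markup, FULL_PATH), link_text(markup, SHORT_PATH))
```

In this sketch the full path stops matching after the redesign, while the shorter path keeps finding the link.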

Understanding Selenium by XPath Functions

Finding elements by XPath with Python and Selenium becomes far more efficient once you know a few XPath functions. There are several, but here we will focus on two of the most important.

contains(): The name makes it easy to guess what this function does. Try out this example:

paragraph = browser.find_element(By.XPATH, "//p[contains(text(), 'power')]")

print(paragraph.text)

If you input that properly, you will see the following:

What you do with that power is entirely up to you.

Here, contains() takes two arguments. The first is text(), which selects the text of the element being tested. The second is the string you are looking for.

Put another way, we are asking Selenium to find the <p> elements whose text contains “power.” Because we use find_element, only the first match is returned.
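When the search word comes from a variable, the query string has to be built with care, because XPath 1.0 string literals have no escape character for quotes. The sketch below shows one common way to handle that. The helper names xpath_literal and contains_text_query are hypothetical and are not part of Selenium.

```python
def xpath_literal(text):
    """Quote text as an XPath 1.0 string literal (hypothetical helper)."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types are present: split on the single quotes and glue
    # the pieces back together with concat().
    separator = ", \"'\", "
    pieces = separator.join(f"'{part}'" for part in text.split("'"))
    return f"concat({pieces})"


def contains_text_query(tag, text):
    """Build a contains() query for the given tag and search text."""
    return f"//{tag}[contains(text(), {xpath_literal(text)})]"


print(contains_text_query("p", "power"))
print(contains_text_query("p", "it's"))
```

The first call produces the same query used in the example above, so its result could be passed as the second argument to find_element alongside By.XPATH.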

starts-with(): The next function to use is this one. It is easy to adapt the line of code above to use the starts-with() function:

paragraph = browser.find_element(By.XPATH, "//p[starts-with(text(), 'What')]")

In this case, we use text() to get the text of each element and starts-with() to match elements whose text begins with the substring “What”. The match is case-sensitive.
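As a rough analogy, the two predicates behave like Python's in operator and str.startswith applied to each paragraph's text. The sketch below is plain Python over made-up markup and is not how Selenium evaluates XPath. The first paragraph is invented for illustration, and the second reuses the sentence from the output shown earlier.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; only the second sentence comes from the article.
SAMPLE = """
<div>
  <p>Getting started is simple.</p>
  <p>What you do with that power is entirely up to you.</p>
</div>
""".strip()

paragraphs = ET.fromstring(SAMPLE).findall(".//p")


def first_matching(predicate):
    """Return the text of the first <p> whose text satisfies predicate."""
    for paragraph in paragraphs:
        text = paragraph.text or ""
        if predicate(text):
            return text
    return None


# Rough equivalent of //p[contains(text(), 'power')]
with_power = first_matching(lambda text: "power" in text)

# Rough equivalent of //p[starts-with(text(), 'What')]
starts_what = first_matching(lambda text: text.startswith("What"))

# Both comparisons are case-sensitive, so lowercase 'what' matches nothing.
starts_lower = first_matching(lambda text: text.startswith("what"))

print(with_power)
print(starts_what)
print(starts_lower)  # None
```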

The entire process of using Selenium by XPath gets easier over time, especially as you learn more of the steps necessary to build onto this, including finding multiple elements and navigating the range of functions.
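Finding multiple elements follows the same pattern, with find_elements in place of find_element. The sketch below is a minimal illustration under a few assumptions: the helper name link_texts and the default query //nav//a are hypothetical, and the function passes the literal string "xpath", which is the value of Selenium's By.XPATH constant, so that the helper itself has no import of its own. In a real script you would normally pass By.XPATH.

```python
def link_texts(driver, query="//nav//a"):
    """Return the visible text of every element matching the XPath query.

    driver is expected to be a Selenium WebDriver, such as the browser
    object created earlier. The default query is a hypothetical example.
    """
    # "xpath" is the value of selenium.webdriver.common.by.By.XPATH.
    elements = driver.find_elements("xpath", query)
    # find_elements returns a list, which is empty when nothing matches.
    return [element.text for element in elements]
```

With the browser object from the earlier script, print(link_texts(browser)) would list the text of every link found inside a <nav> element.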

As noted, this is a commonly used solution for test automation, but you can also use it for web scraping. Using XPath with Python and Selenium for web scraping can provide you with a wide range of functional outcomes. Web scraping, which is a critical method for pulling valuable data off of websites for decision-making or other tasks, can be a powerful resource for companies who master it.

Scraping Robot can help you with the process even more so. With our API, the process of finding the information you need is incredibly easy. Check out Scraping Robot’s basic functionality now to see what we mean.
