Introduction to lxml

Learn how to scrape and navigate the HTML DOM using XPath.

Now that we have covered XPath, it's time to put that knowledge into practice by extracting data from static and dynamic websites.

lxml

Beautiful Soup has no built-in support for XPath, but another library fills the gap. lxml is a Python library for parsing XML and HTML that is widely used for web scraping. Although its primary focus is XML, it also supports HTML. lxml lets us query a parsed document with XPath and, when the separate cssselect package is installed, with CSS selectors as well. This makes it a versatile tool for data extraction and a solid alternative to Beautiful Soup.
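
To make the XML side of lxml concrete, here is a minimal sketch that parses a small XML document and queries it with XPath. The catalog markup, the element names, and the variable names are invented for illustration and do not come from a real data source.

```python
from lxml import etree

# Hypothetical XML sample, made up for this sketch.
xml_doc = b"""<catalog>
  <book id="b1"><title>Dune</title><price>9.99</price></book>
  <book id="b2"><title>Emma</title><price>4.50</price></book>
</catalog>"""

# Parse the bytes into an element tree; root is the <catalog> element.
root = etree.fromstring(xml_doc)

# text() selects the text nodes inside every <title> under a <book>.
titles = root.xpath("//book/title/text()")

# A predicate filters books by price; @id selects the attribute value.
cheap_ids = root.xpath("//book[price < 5]/@id")

print(titles)
print(cheap_ids)
```

The xpath() method returns a list, so an expression that matches nothing gives an empty list rather than raising an error.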

Usage

Let's take a look at how we can use it.
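
The following is a minimal sketch of typical usage, assuming lxml is installed (for example with pip install lxml). The HTML snippet, its ids and classes, and the variable names are made up for illustration. In a real scraper the markup would come from an HTTP response instead of a string literal.

```python
from lxml import html

# Hypothetical HTML page, made up for this sketch.
page = """
<html><body>
  <h1>Products</h1>
  <ul id="items">
    <li class="item"><a href="/p/1">Keyboard</a></li>
    <li class="item"><a href="/p/2">Mouse</a></li>
  </ul>
</body></html>
"""

# Parse the HTML string into an element tree.
tree = html.fromstring(page)

# xpath() returns a list; take the first matching text node.
heading = tree.xpath("//h1/text()")[0]

# Select each link inside the list, then read its text and href attribute.
anchors = tree.xpath('//ul[@id="items"]/li[@class="item"]/a')
links = [(a.text_content(), a.get("href")) for a in anchors]

print(heading)
print(links)
```

Selecting elements first and then reading their text and attributes in Python, as done for the links here, keeps each XPath expression short and makes it easy to pull several fields from the same node.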
