Introduction to Requests
Discover the Requests library and header spoofing.
We have covered how a browser communicates with a website server by sending an HTTP request and receiving an HTML response that includes the Document Object Model (DOM) structure. Now we will implement the same procedure in a script; our primary objective is to replicate the behavior of a browser closely enough to receive the same responses it would.
The requests library
requests is a Python library that enables us to send HTTP requests to website servers and work with the response objects they return.
import requests

r = requests.get('https://books.toscrape.com/')

print("Request URL: ", r.url)
print("Request status code: ", r.status_code)
print("Response headers: ", r.headers)

# Prints the text chunk that holds the <title> tag in the HTML DOM returned.
print("Page's title: ", r.text[360:425])
The above code sends an HTTP request to the Books to Scrape website and retrieves the response object.
The response object has several attributes; a short demonstration follows this list:

- object.url: The address of the site being requested.
- object.status_code: The status code of the server's response.
- object.history: A list of the response objects generated during any redirections.
- object.headers: Information about the server's response that does not relate to the content, such as the date.
- object.text: The content of the response as a string.
- object.content: The content of the response in bytes.
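A minimal sketch that prints a few of these attributes, reusing the Books to Scrape request from above:

import requests

r = requests.get('https://books.toscrape.com/')

print(r.url)                      # final URL after any redirects
print(r.status_code)              # e.g., 200
print(r.history)                  # responses from redirects; an empty list if there were none
print(r.headers['Content-Type'])  # one entry from the response headers
print(type(r.text), type(r.content))  # <class 'str'> and <class 'bytes'>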
Header spoofing
Spoofing refers to sending forged headers whose values match those of a typical browser. Headers let us include additional information with a request so the server can understand it and customize its response; however, websites often use this header information to block requests that don't match what a typical browser would send.
Let’s explore the ShellHacks website using the network tool to see these headers.
There are a bunch of headers here; we don't need to learn about all of them. The critical ones we will cover are "Cookie" and "User-Agent."
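Before spoofing anything, it helps to see what requests sends on our behalf by default. A minimal sketch, assuming the public echo service httpbin.org is reachable:

import requests

# httpbin.org/headers echoes back the headers it received
r = requests.get('https://httpbin.org/headers')

# The headers attached to our outgoing request,
# e.g., a User-Agent such as 'python-requests/2.x'
print(r.request.headers)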
User-Agent
It is an identifier that informs the server about the entity making the request. If the server fails to recognize the request as coming from a browser, it may block the request.
import requests

# We provide headers as a dictionary with key:value pairs
headers = {'User-Agent': 'python-requests'}
r = requests.get('https://www.shellhacks.com/', headers=headers)

print(r.text)
We encounter a 403 Forbidden error because the value we send as the User-Agent does not identify our request as coming from a browser. However, by changing the value to one commonly associated with a browser, we can bypass this restriction.
import requests

# This time we change it to a value a real browser would send
headers = {'User-Agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"}
r = requests.get('https://www.shellhacks.com/', headers=headers)

print(r.status_code)
Note: Other headers are also important. Sometimes we will need to experiment to find which ones to send to bypass a block.
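For example, here is a sketch that sends a few extra browser-like headers alongside User-Agent. The values are illustrative, and which headers a site actually checks varies:

import requests

# Illustrative browser-like headers; a real site may check any subset of these
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml',
    'Accept-Language': 'en-US,en;q=0.9',
    'Referer': 'https://www.google.com/',
}
r = requests.get('https://www.shellhacks.com/', headers=headers)
print(r.status_code)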
Try it yourself
Run the code below and check the result: the server doesn't recognize the request. To make it work, fill in the correct headers and send them with the request.
import requests

# We provide headers as a dictionary with key:value pairs
headers = {}
r = requests.get('https://www.scrapethissite.com/pages/advanced/?gotcha=headers', headers=headers)

if not headers:
    print(r.text)

status = r.status_code
print(status)
HTTP cookies
A cookie holds data about the user's browsing session. The server sends it to the browser, which stores it and returns it with subsequent requests to the same server. The values in the cookies can be used to identify the sender.
Note: Check out this blog for additional knowledge about HTTP cookies.
We need to set the cookie values carefully to match those of the browser so that our script can bypass any cookie-based restrictions.
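As a quick illustration, requests collects any cookies the server sets on a response and accepts a cookies argument to send them back. A minimal sketch using Quotes to Scrape (the first dictionary may be empty if the server sets no cookies on a plain GET):

import requests

# Cookies the server set on this response, if any
r = requests.get('http://quotes.toscrape.com/')
print(r.cookies.get_dict())

# Cookies we pass are sent back to the server with the next request
r2 = requests.get('http://quotes.toscrape.com/', cookies=r.cookies)
print(r2.status_code)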
Try it yourself
After logging into Quotes to Scrape, inspect the network tool and copy the value of the session cookie. Then, paste it into the code widget below. We should see the logout label as if we had previously logged in.
import requests

cookies = {'session': ''}
r = requests.get('http://quotes.toscrape.com', cookies=cookies)

print(r.text[520:2000])

# Prints all the URLs visited, ending with the current one, after passing the login wall
print([obj.url for obj in r.history] + [r.url])
POST requests
We can use POST requests to send data, such as login form fields, to the server along with the HTTP request.
import requests

# We provide data as a dictionary with key:value pairs.
# The keys must match the names the server expects.
# We get these key names by inspecting the login page's form.
data = {'username': 'test', 'password': 'test'}
r = requests.post('http://quotes.toscrape.com/login', data=data)

print(r.status_code)

# Prints the history of URLs the user visited up to this point
print([obj.url for obj in r.history] + [r.url])
Note: The <input> tag specifies an input field where the user can enter data, and the name attribute is used as a reference when the data is submitted.
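A natural follow-up is keeping the logged-in state across several requests. requests.Session() stores the cookies returned by the login response and sends them automatically; a sketch using the same test credentials as above:

import requests

# A Session persists cookies across requests, so the login survives
with requests.Session() as s:
    s.post('http://quotes.toscrape.com/login',
           data={'username': 'test', 'password': 'test'})

    # The session cookie set at login is sent automatically here
    r = s.get('http://quotes.toscrape.com/')
    print('Logout' in r.text)  # True if the login wall was passed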
Conclusion
We have covered the basics of the requests library and its ability to send HTTP requests to websites and retrieve the DOM response. While the library has additional features that can be explored in its documentation, these are the essential components for our web scraping journey.