SVM
Learn how to solve classification and regression problems using support vector machines (SVMs).
A support vector machine (SVM) is a popular type of supervised learning algorithm that can be used for both regression and classification tasks. SVM works by finding a hyperplane that maximally separates different classes or fits the data in the case of regression. This hyperplane is chosen such that it has the maximum distance to the nearest data points of each class, also known as the maximum margin.
When the relationships in our data are nonlinear and difficult for linear models to capture, SVM can be a good alternative.
In the case of classification, SVM can be used whether we have just two categories or many. It’s good at handling data that can be separated by straight lines or by more complicated shapes, thanks to kernel functions, which help draw clearer boundaries in a higher-dimensional space.
In regression, SVMs predict continuous values by finding the function that best fits the data. Just like in classification, they use kernel functions to make this fitting process work better in a higher-dimensional space.
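As a quick illustration, here is a minimal sketch of kernelized regression with scikit-learn’s `SVR` class. The synthetic sine-wave data and the specific hyperparameter values are our own assumptions for demonstration, not choices prescribed by this lesson:

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic nonlinear data: a noisy sine wave (assumed for illustration)
rng = np.random.RandomState(42)
X = np.sort(5 * rng.rand(100, 1), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.randn(100)

# SVR with an RBF kernel fits a smooth, nonlinear function;
# epsilon sets the width of the tube within which errors are ignored
svr = SVR(kernel="rbf", C=1.0, epsilon=0.1)
svr.fit(X, y)

# Predict at a couple of new points
print(svr.predict(np.array([[1.5], [3.0]])))
```

A linear model would struggle with this sine-shaped data, but the RBF kernel lets SVR fit the curve without us engineering any nonlinear features by hand.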
In this lesson, we’ll explore the use of SVMs for both regression and classification tasks. We’ll learn about the different types of SVMs available in scikit-learn, such as support vector classifier (SVC) and support vector regression (SVR). Additionally, we’ll delve into the key hyperparameters within SVM, exploring its strengths and limitations.
Intuition
To develop an intuition of how SVM works, imagine someone entering a party filled with two large groups of people talking. The person who just entered doesn’t feel like talking to either group right now. They just want to cross the venue and get to the bar to have a drink. However, they don’t want to cut through either group on the way to the bar because they don’t want to awkwardly walk through the middle of a conversation. They also don’t want to get too close to either group because someone might spot them and start chatting, which would distract them from getting their first drink. As a result, they walk between both groups, maintaining the largest possible distance from each.
This is roughly how SVM works. By drawing a “line” between the different classes, SVM avoids splitting observations of the same class and keeps the largest possible distance from each class. In the scenario above, the two groups represent the two different classes we want to separate, and the path through the room represents the decision boundary created by the SVM.
SVC
C-Support Vector Classification (SVC) is a type of SVM that is specifically designed for classification tasks. The key concepts in SVM classification include the following:
Hyperplane: In a binary classification problem, SVM finds the hyperplane that best separates the data into two classes. The hyperplane is represented as $w^T x + b = 0$, where $w$ is the weight vector and $b$ is the bias term.
Margin: The margin is the distance between the hyperplane and the nearest data point from each class. SVM aims to maximize this margin while ensuring that no data points are misclassified.
Support vectors: Support vectors are the data points that are closest to the hyperplane, and they play a crucial role in determining the position of the hyperplane.
Kernel trick: SVM can handle nonlinearly separable data by mapping the input features into a higher-dimensional space using a kernel function (e.g., the polynomial kernel or the radial basis function kernel). In this higher-dimensional space, SVM tries to find a hyperplane that separates the classes linearly.
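For reference, the margin maximization described above can be written as an optimization problem. This is the textbook hard-margin formulation, stated here for completeness using the $w$ and $b$ defined above:

$$
\min_{w,\,b} \ \frac{1}{2}\lVert w \rVert^2 \quad \text{subject to} \quad y_i \left( w^T x_i + b \right) \ge 1, \quad i = 1, \dots, n
$$

where $y_i \in \{-1, +1\}$ is the class label of the $i$-th data point. Because the margin width equals $2 / \lVert w \rVert$, minimizing $\lVert w \rVert^2$ is the same as maximizing the margin.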
To summarize, SVC aims to find the hyperplane that best separates the different classes in the data while maximizing the margin:
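Here is a minimal sketch of training an SVC with scikit-learn. The iris dataset, the train/test split, and the RBF kernel are illustrative assumptions, not requirements:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load a small, well-known classification dataset (illustrative choice)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# C trades off a wider margin against misclassifying training points;
# the RBF kernel lets the decision boundary be nonlinear
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)

print(f"Test accuracy: {clf.score(X_test, y_test):.3f}")
print(f"Support vectors per class: {clf.n_support_}")
```

After fitting, the support vectors themselves are available via the model’s `support_vectors_` attribute, which ties the code back to the concepts above.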