import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
from sklearn.datasets import make_blobs
Perceptron Blog Post
Mia Tarantola
Introduction
In this blog post we delve into the perceptron algorithm. I have implemented an algorithm that separates binary-labeled data, and I have conducted experiments that push the limits of the algorithm and showcase different patterns.
Importing Packages
# importing the perceptron.py code and reloading it after updates
import perceptron
from perceptron import Perceptron
import importlib
importlib.reload(perceptron)
<module 'perceptron' from '/Users/mtarantola@middlebury.edu/Downloads/Machine Learning/miatarantola.github.io/posts/Perceptron Blog Post/perceptron.py'>
Walk Through of the Update Function
The following function is used to update the perceptron weights:
\(\tilde{w} ^{(t+1)} = \tilde{w} ^ {(t)} + \mathbb{1}(\tilde{y_i} \langle \tilde{w}^{(t)}, \tilde{x_i} \rangle < 0) \tilde{y_i} \tilde{x_i}\)
In order to implement this algorithm, we must follow these steps.
1. Pick a random index \(i \in \{1, \dots, n\}\).
2. Predict the label, \(\hat{y_i}\), of our randomly selected data point \(\tilde{x_i}\).
To do so, we calculate the dot product \(\langle \tilde{w}^{(t)}, \tilde{x_i} \rangle\) and compare the result to 0:
if the result is greater than 0, predict 1; otherwise, predict -1.
3. Compute \(\mathbb{1}(\tilde{y_i} \langle \tilde{w}^{(t)}, \tilde{x_i} \rangle < 0)\).
Given our predicted label, \(\hat{y_i}\), of 1 or -1, we can multiply it by the true label \(\tilde{y_i}\) to check for correctness.
If \(\hat{y_i} \tilde{y_i} < 0\), then the signs of our observed and predicted labels do not match; if \(\hat{y_i} \tilde{y_i} > 0\), they match.
So \(\mathbb{1}(\tilde{y_i} \langle \tilde{w}^{(t)}, \tilde{x_i} \rangle < 0)\) returns 1 if the model predicted incorrectly, in which case \(\tilde{y_i} \tilde{x_i}\) is added to the pre-existing weights.
Otherwise, the expression returns 0 for a correct prediction and no update is performed.
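Putting these steps together, the update can be written compactly in code. The following is a minimal sketch rather than the exact contents of perceptron.py: it assumes the feature matrix X_ already has a column of ones appended (so the last entry of w_ acts as the bias) and that the labels y_ are encoded as -1/+1.

import numpy as np

def perceptron_update(w_, X_, y_):
    # one update step of the perceptron; names here are illustrative only
    i = np.random.randint(X_.shape[0])   # 1. pick a random index i
    s = np.dot(w_, X_[i])                # 2. score <w, x_i>; its sign is the predicted label
    if y_[i] * s < 0:                    # 3. indicator is 1 only on a mistake
        w_ = w_ + y_[i] * X_[i]          #    add y_i * x_i to the weights
    return w_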
Experiments
Data that is linearly separable
np.random.seed(8674)

n = 100
p_features = 3

X1, y1 = make_blobs(n_samples = 100, n_features = 2, centers = [(-1.7, -1.7), (1.7, 1.7)])

fig1 = plt.scatter(X1[:,0], X1[:,1], c = y1)
xlab = plt.xlabel("Feature 1")
ylab = plt.ylabel("Feature 2")

p1 = Perceptron()
p1.fit(X1, y1, 100000)

fig1 = plt.plot(p1.history)
xlab = plt.xlabel("Iteration")
ylab = plt.ylabel("Accuracy")
p1.score(X1,y1)
1.0
def draw_line(w, x_min, x_max):
    # plot the decision boundary w[0]*x + w[1]*y + w[2] = 0
    x = np.linspace(x_min, x_max, 101)
    y = -(w[0]*x + w[2])/w[1]
    plt.plot(x, y, color = "black")
fig = plt.scatter(X1[:,0], X1[:,1], c = y1)
fig = draw_line(p1.w, -2, 2)
xlab = plt.xlabel("Feature 1")
ylab = plt.ylabel("Feature 2")
For this experiment we ran our perceptron algorithm on linearly separable data. We can see that the resulting line is a good separator, as it clearly divides our two labels. It is interesting to see that the accuracy decreases before it increases; this makes sense, since each update only corrects the randomly chosen point and can temporarily misclassify others. Another result that suggests our separator is a good fit is that our algorithm stops before reaching the maximum number of iterations allowed.
Data that is not linearly separable
np.random.seed(7810)

n = 100
p_features = 3

X2, y2 = make_blobs(n_samples = 100, n_features = 2, centers = 2)

fig2 = plt.scatter(X2[:,0], X2[:,1], c = y2)
xlab = plt.xlabel("Feature 1")
ylab = plt.ylabel("Feature 2")

p2 = Perceptron()
p2.fit(X2, y2, 1000)

fig3 = plt.plot(p2.history)
xlab = plt.xlabel("Iteration")
ylab = plt.ylabel("Accuracy")
p2.score(X2,y2)
0.5
fig = plt.scatter(X2[:,0], X2[:,1], c = y2)
fig = draw_line(p2.w, -2, 2)
xlab = plt.xlabel("Feature 1")
ylab = plt.ylabel("Feature 2")
This experiment displays the use of our perceptron algorithm on data that is not linearly separable. As shown above, the resulting line is not a good separator; in fact, it is not separating any of the data. We can also see the inaccuracy by tracking the accuracy over the iterations: the algorithm does not converge to one set of weights, and thus has not found a set of weights that separates the data.
The perceptron algorithm in >2 dimensions
np.random.seed(7810)

n = 100

X3, y3 = make_blobs(n_samples = 100, n_features = 6, centers = 2)

p3 = Perceptron()
p3.fit(X3, y3, 1000)

fig4 = plt.plot(p3.history)
xlab = plt.xlabel("Iteration")
ylab = plt.ylabel("Accuracy")
p3.score(X3,y3)
1.0
Yes, I do believe that my data is linearly separable, because my perceptron's accuracy converges to 1 after 18 iterations. This means that my perceptron reached 100% accuracy and terminated before hitting the maximum number of iterations.
Ending Question
The time complexity of a single iteration of the perceptron algorithm is O(p), because predicting the label of a single data point requires calculating the dot product of x and w. np.dot iterates over the p features of x to calculate this dot product. All other steps of the update involve simple multiplications or additions, which are O(1).
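To make this concrete, here is a hypothetical snippet (the names are illustrative, not taken from perceptron.py) showing that the only per-iteration work that grows with the data is the length-p dot product:

import numpy as np

p = 6
w = np.zeros(p + 1)                     # p feature weights plus a bias entry
x_i = np.append(np.random.rand(p), 1)   # one data point, augmented with a constant 1
y_i = -1

s = np.dot(w, x_i)      # O(p): touches each of the p + 1 entries once
if y_i * s < 0:         # O(1): a single comparison
    w += y_i * x_i      # O(p): elementwise update, still linear in p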