2. Perceptron

Deadline and Submission

📅 14 Sep (Sunday)

🕐 Commits until 23:59

Individual

Submit the GitHub Pages link (yes, only the link to the pages) via insper.blackboard.com.

Activity: Understanding Perceptrons and Their Limitations

This activity tests your understanding of perceptrons and their limitations.


Exercise 1

Data Generation Task:

Generate two classes of 2D data points (1000 samples per class) using multivariate normal distributions. Use the following parameters:

  • Class 0:

    Mean = \([1.5, 1.5]\),

    Covariance matrix = \([[0.5, 0], [0, 0.5]]\) (i.e., variance of \(0.5\) along each dimension, no covariance).

  • Class 1:

    Mean = \([5, 5]\),

    Covariance matrix = \([[0.5, 0], [0, 0.5]]\).

These parameters ensure the classes are mostly linearly separable, with minimal overlap due to the distance between means and low variance. Plot the data points (using libraries like matplotlib if desired) to visualize the separation, coloring points by class.
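A minimal sketch of this generation step, assuming NumPy and matplotlib (the seed and variable names are illustrative, not prescribed by the assignment):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=42)  # illustrative seed; any fixed seed aids reproducibility

n = 1000  # samples per class
cov = np.array([[0.5, 0.0], [0.0, 0.5]])  # variance 0.5 along each dimension, no covariance

X0 = rng.multivariate_normal(mean=[1.5, 1.5], cov=cov, size=n)  # Class 0
X1 = rng.multivariate_normal(mean=[5.0, 5.0], cov=cov, size=n)  # Class 1

X = np.vstack([X0, X1])                        # (2000, 2) feature matrix
y = np.concatenate([-np.ones(n), np.ones(n)])  # labels in {-1, +1}, ready for the perceptron rule

plt.scatter(X0[:, 0], X0[:, 1], s=8, label="Class 0")
plt.scatter(X1[:, 0], X1[:, 1], s=8, label="Class 1")
plt.legend()
plt.title("Two well-separated Gaussian classes")
plt.show()
```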

Perceptron Implementation Task:

Implement a single-layer perceptron from scratch to classify the generated data into the two classes. You may use NumPy only for basic linear algebra operations (e.g., matrix multiplication, vector addition/subtraction, dot products). Do not use any pre-built machine learning libraries (e.g., no scikit-learn) or NumPy functions that directly implement perceptron logic.

  • Initialize the weights \(w\) as a 2D vector, plus a scalar bias term \(b\).
  • Use the perceptron learning rule: for each misclassified sample \((x, y)\), update \(w = w + \eta \, y \, x\) and \(b = b + \eta \, y\), where \(\eta\) is the learning rate (start with \(\eta = 0.01\)). This rule assumes labels \(y \in \{-1, +1\}\), so map class 0 to \(-1\) and class 1 to \(+1\) (see the sketch after this exercise).
  • Train the model until convergence (no weight updates occur in a full pass over the dataset) or for a maximum of 100 epochs, whichever comes first. If convergence is not achieved by 100 epochs, report the accuracy at that point. Track accuracy after each epoch.
  • After training, evaluate accuracy on the full dataset and plot the decision boundary (line defined by \(w·x + b = 0\)) overlaid on the data points. Additionally, plot the training accuracy over epochs to show convergence progress. Highlight any misclassified points in a separate plot or by different markers in the decision boundary plot.

Report the final weights, bias, and accuracy, and discuss why the data's separability leads to quick convergence.
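One possible from-scratch sketch of the training loop and decision-boundary plot described above, reusing the `X` and `y` arrays (labels in \(\{-1, +1\}\)) from the data-generation sketch; `train_perceptron` is an illustrative name, not a required interface:

```python
import numpy as np
import matplotlib.pyplot as plt

def train_perceptron(X, y, eta=0.01, max_epochs=100):
    """Perceptron learning rule; expects labels y in {-1, +1}."""
    w = np.zeros(X.shape[1])  # 2D weight vector
    b = 0.0                   # scalar bias
    history = []              # training accuracy after each epoch
    for epoch in range(max_epochs):
        updates = 0
        for xi, yi in zip(X, y):
            # A sample is misclassified when the signed score disagrees with its label
            if yi * (np.dot(w, xi) + b) <= 0:
                w += eta * yi * xi
                b += eta * yi
                updates += 1
        preds = np.where(X @ w + b >= 0.0, 1.0, -1.0)
        history.append(np.mean(preds == y))
        if updates == 0:  # converged: a full pass with no updates
            break
    return w, b, history

w, b, history = train_perceptron(X, y)
print("weights:", w, "bias:", b, "final accuracy:", history[-1])

# Decision boundary w.x + b = 0 rearranged as x2 = -(w1*x1 + b)/w2 (valid when w2 != 0)
xs = np.linspace(X[:, 0].min(), X[:, 0].max(), 100)
plt.scatter(X[:, 0], X[:, 1], c=y, s=8)
plt.plot(xs, -(w[0] * xs + b) / w[1], "k--", label=r"$w \cdot x + b = 0$")
plt.legend()
plt.show()
```

The accuracy-over-epochs plot follows directly from the returned `history` list.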

Exercise 2

Data Generation Task:

Generate two classes of 2D data points (1000 samples per class) using multivariate normal distributions. Use the following parameters:

  • Class 0:

    Mean = \([3, 3]\),

    Covariance matrix = \([[1.5, 0], [0, 1.5]]\) (i.e., a higher variance of \(1.5\) along each dimension).

  • Class 1:

    Mean = \([4, 4]\),

    Covariance matrix = \([[1.5, 0], [0, 1.5]]\).

These parameters create partial overlap between classes due to closer means and higher variance, making the data not fully linearly separable. Plot the data points to visualize the overlap, coloring points by class.

Perceptron Implementation Task:

Using the same implementation guidelines as in Exercise 1, train a perceptron on this dataset.

  • Follow the same initialization, update rule, and training process.
  • Train the model until convergence (no weight updates occur in a full pass over the dataset) or for a maximum of 100 epochs, whichever comes first. If convergence is not achieved by 100 epochs, report the accuracy at that point and note any oscillation in the updates; consider aggregating results over multiple runs (e.g., the mean accuracy over 5 random initializations; see the sketch after this exercise). Track accuracy after each epoch.
  • Evaluate accuracy after training and plot the decision boundary overlaid on the data points. Additionally, plot the training accuracy over epochs to show convergence progress (or lack thereof). Highlight any misclassified points in a separate plot or by different markers in the decision boundary plot.

Report the final weights, bias, and accuracy, and discuss how the overlap affects training compared to Exercise 1 (e.g., slower convergence or the inability to reach 100% accuracy).
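A sketch of the multi-run protocol suggested above; `X2` and `y2` are hypothetical names for this exercise's overlapping dataset (generated as in Exercise 1, with the parameters above), and random weight initialization plus per-epoch shuffling is one reasonable way to make the 5 runs differ, not something the assignment mandates:

```python
import numpy as np

def run_once(X, y, rng, eta=0.01, max_epochs=100):
    """One training run with random initial weights and shuffled sample order."""
    w = rng.normal(scale=0.01, size=X.shape[1])  # small random initial weights
    b = 0.0
    for _ in range(max_epochs):
        updates = 0
        for i in rng.permutation(len(X)):  # visit samples in a random order each epoch
            if y[i] * (np.dot(w, X[i]) + b) <= 0:
                w += eta * y[i] * X[i]
                b += eta * y[i]
                updates += 1
        if updates == 0:  # unlikely here: overlapping data is not linearly separable
            break
    preds = np.where(X @ w + b >= 0.0, 1.0, -1.0)
    return np.mean(preds == y)

# X2, y2: this exercise's data, labels mapped to {-1, +1}
accs = [run_once(X2, y2, np.random.default_rng(seed)) for seed in range(5)]
print(f"mean accuracy over 5 runs: {np.mean(accs):.3f} (best: {np.max(accs):.3f})")
```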

Evaluation Criteria

Usage of Toolboxes

You may use toolboxes (e.g., NumPy) ONLY for matrix operations and calculations during this activity. All other computations, including activation functions, loss calculations, gradients, and the forward pass, MUST BE IMPLEMENTED within your Perceptron code. The use of third-party libraries for the Perceptron implementation IS STRICTLY PROHIBITED.

Failure to comply with these instructions will result in your submission being rejected.

The deliverable for this activity is a report, published as GitHub Pages, containing your code, plots, and analysis for both exercises.

Important Notes:

  • The deliverable must be submitted in the format specified: GitHub Pages. No other formats will be accepted. A course template is available that you can use to create your GitHub Pages site;

  • There is a strict policy against plagiarism. Any form of plagiarism will result in a zero grade for the activity and may lead to further disciplinary actions as per the university's academic integrity policies;

  • Deadlines are not extended; you are expected to complete each activity within the timeframe provided in the course schedule. NO EXCEPTIONS will be made for late submissions.

  • AI Collaboration is allowed, but each student MUST UNDERSTAND and be able to explain all parts of the code and analysis submitted. Any use of AI tools must be properly cited in your report. ORAL EXAMS may require you to explain your work in detail.

  • All deliverables for individual activities should be submitted through the course platform insper.blackboard.com.

Grade Criteria:

Points | Criterion
------ | ---------
4 pts | Correctness of the perceptron implementation
2 pts | Exercise 1: data generation, training, evaluation, and analysis
2 pts | Exercise 2: data generation, training, evaluation, and analysis
1 pt | Visualizations: quality and clarity of plots (data distribution, decision boundary, accuracy over epochs)
1 pt | Report quality: clarity, organization, and completeness of the report