Week 4: Learning¶
CS50's Introduction to Artificial Intelligence with Python
Machine learning enables computers to recognize patterns from data rather than following explicit instructions. Three primary approaches: supervised learning, reinforcement learning, and unsupervised learning.
Supervised Learning¶
Trains algorithms using input-output pairs to map inputs to outputs. The goal is creating a hypothesis function h that approximates an unknown natural function f.
Classification¶
Maps inputs to discrete outputs.
Nearest-Neighbor Classification¶
Assigns classification based on the single closest observation. Fails when a nearby outlier dominates the decision.
k-Nearest-Neighbors¶
Examines the k nearest data points and selects the most frequent classification among them.
- More accurate than 1-NN
- Requires computing distances to every point — computationally expensive at scale
Perceptron Learning¶
Establishes a decision boundary separating data classes.
- Input vector: x = (1, x₁, x₂)
- Weight vector: w = (w₀, w₁, w₂)
- Decision rule: Rain if
w₀ + w₁x₁ + w₂x₂ ≥ 0
Perceptron learning rule: Weights update iteratively based on prediction errors. - Correct prediction → no update - Underestimate → increase weights - Overestimate → decrease weights
Hard threshold: Binary output (0 or 1) — absolute predictions.
Soft threshold: Logistic function output between 0 and 1 — expresses confidence levels.
Support Vector Machines¶
Creates decision boundaries that maximize distance from both classes — the Maximum Margin Separator.
- Handles non-linear boundaries and higher dimensions
- The maximum margin ensures new data points slightly differing from training examples won't be misclassified
Regression¶
Maps inputs to continuous values rather than discrete classifications.
Example: predicting sales revenue based on advertising expenditure. The hypothesis function generates a line predicting output values rather than separating categories.
Loss Functions¶
Quantify prediction inaccuracy.
| Function | Formula | Notes |
|---|---|---|
| 0-1 Loss | 0 if correct, 1 if wrong | Classification |
| L₁ Loss | |actual − predicted| | Absolute differences |
| L₂ Loss | (actual − predicted)² | Penalizes outliers more severely |
Overfitting and Regularization¶
Overfitting: Model fits training data perfectly but fails on new data.
Regularization penalizes complexity:
Higher λ → discourages complexity → less overfitting risk.
Validation Strategies¶
Holdout Cross Validation: Split data into training and test sets.
k-Fold Cross-Validation: Divide data into k sets; train k times rotating test sets. Averages generalization estimates without losing data.
scikit-learn¶
from sklearn import svm
from sklearn.linear_model import Perceptron
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
# Train on evidence + labels, evaluate on test data
model = Perceptron()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
Reinforcement Learning¶
Trains agents through reward and punishment feedback. Agents observe environmental states, perform actions, and receive rewards (positive) or punishments (negative) based on outcomes.
Markov Decision Processes¶
| Component | Description |
|---|---|
| S | Set of states |
| Actions(S) | Set of actions available in each state |
| P(s' | s, a) | Transition model |
| R(s, a, s') | Reward function |
Q-Learning¶
Uses a function Q(s, a) estimating the value of taking action a in state s.
Update rule:
- α (alpha): Learning rate — controls how quickly new estimates override old ones
- Future rewards incorporate the highest estimated value of the next state's actions
Explore vs. Exploit Tradeoff¶
Greedy decision-making: Always selects the action with the highest Q-value. Prevents discovering potentially better routes.
ε-greedy algorithm: - With probability 1 − ε: exploit (choose best known action) - With probability ε: explore (choose randomly)
Function approximation: Enables Q-Learning in complex domains (like chess) by approximating Q-values using features rather than storing every state-action pair.
Unsupervised Learning¶
Finds patterns in unlabeled input data — no pre-assigned classifications.
k-Means Clustering¶
Groups similar data points together:
- Randomly place k cluster centers
- Assign each point to its nearest cluster center
- Move each center to the mean of its assigned points
- Repeat steps 2–3 until no point changes clusters (equilibrium)
Partitions data into k groups based on spatial similarity.