Neural Network Learning XOR
The XOR Problem
The XOR (exclusive OR) function is a classic problem in neural networks. It outputs 1 when exactly one of its inputs is 1, and 0 otherwise:
| Input 1 | Input 2 | XOR Output |
|---------|---------|------------|
| 0       | 0       | 0          |
| 0       | 1       | 1          |
| 1       | 0       | 1          |
| 1       | 1       | 0          |
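As a concrete reference for the examples that follow, the four cases can be written as a tiny dataset in Python. The names `xor_inputs` and `xor_targets` are illustrative, not taken from the visualization's code:

```python
# The four XOR training cases as (input pair, target) lists.
# Names are illustrative; the visualization may store these differently.
xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_targets = [0, 1, 1, 0]

for (x1, x2), t in zip(xor_inputs, xor_targets):
    assert (x1 ^ x2) == t  # bitwise XOR matches the table above
```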
XOR is not linearly separable, meaning a single neuron cannot solve it. Solving it requires at least one hidden layer, which makes XOR a compact demonstration of what a neural network can learn.
Network Architecture
This visualization uses a 3-layer neural network:
- Input Layer: 2 neurons (receiving the two binary inputs)
- Hidden Layer: 4 neurons (with sigmoid activation)
- Output Layer: 1 neuron (with sigmoid activation)
Each connection between neurons has an associated weight, and each neuron in the hidden and output layers has a bias term.
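Below is a minimal sketch of the parameters this architecture implies, assuming fully connected layers and a hypothetical `params` dictionary; the visualization's own code may organize these differently:

```python
import random

# 2 inputs -> 4 hidden neurons -> 1 output neuron, fully connected.
# Weights and biases start as random values in [-1, 1], matching the
# initialization described in the Training Process section below.
params = {
    "weights_ih": [[random.uniform(-1, 1) for _ in range(2)] for _ in range(4)],  # hidden x input
    "bias_h":     [random.uniform(-1, 1) for _ in range(4)],                      # one bias per hidden neuron
    "weights_ho": [random.uniform(-1, 1) for _ in range(4)],                      # hidden -> output weights
    "bias_o":     random.uniform(-1, 1),                                          # single output bias
}
```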
Forward Pass: Making Predictions
During the forward pass, the network calculates a prediction given the inputs:
Mathematical Formulation:
For each neuron in the hidden and output layers:
1. Calculate weighted sum plus bias:
z = b + ∑(w_i * a_i)
where:
- z is the neuron's pre-activation
- b is the bias term
- w_i are the weights from previous layer neurons
- a_i are the activations from previous layer neurons
2. Apply sigmoid activation function:
a = σ(z) = 1 / (1 + e^(-z))
where a is the neuron's activation (output)
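As a sketch in Python, these two steps map onto two small helper functions; the names `weighted_sum` and `sigmoid` are illustrative:

```python
import math

def weighted_sum(weights, activations, bias):
    """z = b + sum(w_i * a_i) over the previous layer's activations."""
    return bias + sum(w * a for w, a in zip(weights, activations))

def sigmoid(z):
    """a = 1 / (1 + e^(-z)), squashing z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))
```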
Step-by-step:
- Input values are set as activations of the input layer neurons
- For each hidden layer neuron, compute the weighted sum of inputs plus bias
- Apply the sigmoid function to get the hidden layer activations
- For the output neuron, compute the weighted sum of hidden layer outputs plus bias
- Apply sigmoid to get the final prediction (between 0 and 1)
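Putting the steps together, a minimal forward pass over the 2-4-1 network might look like the following, reusing the helpers and `params` dictionary sketched above:

```python
def forward(params, inputs):
    """Forward pass for the 2-4-1 network sketched above.

    Returns the hidden-layer activations (kept for backpropagation)
    and the final prediction, all in (0, 1).
    """
    # Hidden layer: weighted sum of the inputs plus bias, then sigmoid.
    hidden = [sigmoid(weighted_sum(w_row, inputs, b))
              for w_row, b in zip(params["weights_ih"], params["bias_h"])]
    # Output neuron: weighted sum of hidden activations plus bias, then sigmoid.
    prediction = sigmoid(weighted_sum(params["weights_ho"], hidden, params["bias_o"]))
    return hidden, prediction
```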
Backpropagation: Learning
The neural network learns through a process called backpropagation, which adjusts weights and biases to minimize prediction error:
Mathematical Formulation:
1. Calculate the error:
E = target - prediction
2. For the output layer, calculate the delta (error signal):
δ_output = E * σ(z) * (1 - σ(z))
where σ(z) * (1 - σ(z)) is the derivative of the sigmoid function
3. For each hidden neuron, propagate the delta backward:
δ_hidden = (w_output * δ_output) * σ(z_hidden) * (1 - σ(z_hidden))
4. Update weights and biases:
Δw = learning_rate * δ * activation_input
Δb = learning_rate * δ
5. Apply updates:
w_new = w_old + Δw
b_new = b_old + Δb
Step-by-step:
- Calculate the error between predicted and target output
- Calculate the error gradient for the output layer
- Propagate this error backward to the hidden layer
- Update all weights proportionally to their contribution to the error
- Update all biases based on the error gradient
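A minimal sketch of one backpropagation step, following the formulation above and reusing the earlier `params` dictionary; the learning-rate value here is an illustrative assumption, not taken from the visualization:

```python
def backward(params, inputs, hidden, prediction, target, learning_rate=0.5):
    """One backpropagation step; mutates `params` in place and returns |error|.

    Deltas use the sigmoid derivative written as a * (1 - a), and each
    weight/bias moves by learning_rate * delta * incoming activation.
    """
    error = target - prediction                         # E = target - prediction
    delta_out = error * prediction * (1 - prediction)   # output delta

    # Hidden deltas: push the output delta back through each hidden-to-output weight.
    delta_hidden = [params["weights_ho"][j] * delta_out * h * (1 - h)
                    for j, h in enumerate(hidden)]

    # Output layer updates: Δw = lr * δ_output * hidden activation, Δb = lr * δ_output.
    for j, h in enumerate(hidden):
        params["weights_ho"][j] += learning_rate * delta_out * h
    params["bias_o"] += learning_rate * delta_out

    # Hidden layer updates: Δw = lr * δ_hidden * input activation, Δb = lr * δ_hidden.
    for j, d in enumerate(delta_hidden):
        for i, x in enumerate(inputs):
            params["weights_ih"][j][i] += learning_rate * d * x
        params["bias_h"][j] += learning_rate * d

    return abs(error)
```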
Training Process
The network is trained through these steps:
- Initialize weights and biases randomly (between -1 and 1)
- For each training example (all 4 XOR cases):
  - Perform forward pass to get prediction
  - Calculate error
  - Perform backpropagation to update weights and biases
- One complete pass through all training examples is called an "epoch"
- Continue training until the average error across all examples is below a threshold (0.05)
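A training loop matching this description might look like the sketch below, reusing the earlier `forward`, `backward`, and dataset sketches. The learning rate and epoch cap are assumptions; the 0.05 threshold comes from the description above:

```python
def train(params, learning_rate=0.5, threshold=0.05, max_epochs=100_000):
    """Train on all four XOR cases until the average absolute error drops below the threshold."""
    for epoch in range(1, max_epochs + 1):
        total_error = 0.0
        for (x1, x2), target in zip(xor_inputs, xor_targets):
            hidden, prediction = forward(params, [x1, x2])
            total_error += backward(params, [x1, x2], hidden, prediction,
                                    target, learning_rate)
        if total_error / len(xor_inputs) < threshold:
            return epoch  # converged: average error below the threshold
    return max_epochs
```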
Visualization Elements
- Neurons (circles): Brightness indicates activation level (0 to 1)
- Connections (lines): Green indicates positive weights, red indicates negative weights. Line thickness shows the magnitude of the weight.
- Training Stats: Shows current error, epochs completed, and current prediction
- Test Examples: Allow you to see how the network responds to each of the four XOR cases
Why XOR Needs a Hidden Layer
XOR is not linearly separable: no single straight line in the 2D input plane can separate the positive cases, (0,1) and (1,0), from the negative cases, (0,0) and (1,1).
The hidden layer transforms the input space into a representation where the problem becomes linearly separable. Each hidden neuron effectively learns its own linear boundary, and combining them produces the more complex decision boundary needed to solve XOR.
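To make this concrete, here is a hand-picked (not learned) set of weights for a smaller 2-2-1 sigmoid network that solves XOR, reusing the `forward` sketch from earlier. These values are purely illustrative and are not what the visualization learns:

```python
# Hand-picked weights for a 2-2-1 sigmoid network (illustrative, not learned).
# Hidden neuron 1 approximates OR(x1, x2); hidden neuron 2 approximates AND(x1, x2).
# The output fires only when "OR is on and AND is off", which is exactly XOR.
hand_params = {
    "weights_ih": [[10.0, 10.0], [10.0, 10.0]],
    "bias_h":     [-5.0, -15.0],   # OR-like threshold, AND-like threshold
    "weights_ho": [10.0, -20.0],   # reward OR, heavily penalize AND
    "bias_o":     -5.0,
}

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    _, p = forward(hand_params, [x1, x2])
    print(f"{x1} XOR {x2} ≈ {p:.3f}")  # close to 0, 1, 1, 0
```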