Neural Networks from Scratch — 1: What Is a Neural Network?
Neural networks (NNs) are computational models inspired by the structure and function of biological neurons in the human brain. The brain processes information through the coordinated work of billions of interconnected neurons, and this biological mechanism forms the foundation of artificial neural networks.
An artificial neural network consists of units called nodes or artificial neurons. Between these neurons are weights that determine the strength of connections and form the basis of the learning process. Each artificial neuron sums the products of its incoming inputs with their weights, then processes this sum through an activation function.
Neural networks are used in the following areas:
- Recognizing objects in images (computer vision)
- Extracting meaning from text (natural language processing)
- Classifying datasets or identifying patterns

1. Forward Propagation
Forward propagation is the process by which the neural network processes inputs and produces an output. It involves the following steps:
- Initializing parameters: Before training begins, weights and biases are assigned random starting values.
- Processing inputs: Each input is multiplied by the weight of its connection.
- Adding bias: The bias value is added to the weighted sums.
- Applying the activation function: Functions like ReLU or Sigmoid enable learning of non-linear relationships.
Mathematical formula: z = (w₁ × x₁) + (w₂ × x₂) + b, then y = f(z)
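As a quick illustration of the activation functions mentioned above, ReLU and Sigmoid can each be written in a few lines of plain Python (note that the worked example below uses the raw z values without an activation, to keep the arithmetic simple):

```python
import math

def relu(z):
    # ReLU passes positive values through unchanged and zeroes out negatives.
    return max(0.0, z)

def sigmoid(z):
    # Sigmoid squashes any real number into the range (0, 1).
    return 1 / (1 + math.exp(-z))

# Applying them to a pre-activation value z = 0.5:
print(relu(0.5))               # 0.5
print(round(sigmoid(0.5), 3))  # 0.622
```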
Example Calculation (Inputs: x₁ = 1, x₂ = 2)
Given weights and bias values for hidden layer neurons, the calculation proceeds as follows:
![Forward propagation diagram with inputs [1, 2]](https://miro.medium.com/v2/resize:fit:700/0*7tYyk-23QRxCYHr-.png)
- z₁ = (1 × 0.5) + (2 × (−0.1)) + 0.2 = 0.5
- z₂ = (1 × 0.3) + (2 × 0.6) + 0.3 = 1.8

Output layer: y = (0.5 × (−1.2)) + (1.8 × 0.02) + 2.1 = 1.536

2. Backward Propagation
This is the process of learning from errors and updating parameters. Throughout training, the model measures the error in its predictions and updates its weights to reduce that error.
- Computing the error: The difference between the prediction and the actual value is measured with a loss function.
- Computing gradients: The contribution of each parameter to the error is found using the chain rule and derivatives.
- Propagating the error: Gradients are propagated backward through the network from the output to the input.
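The chain rule step can be sketched numerically on the example network, assuming no activation function and the single-sample loss L = (Target − y)². Here the gradient of one output-layer weight is computed by multiplying two local derivatives:

```python
# Gradient of the loss with respect to one output weight (v1), using the
# hidden-layer values and output weights from the worked example above.
z1, z2 = 0.5, 1.8
v1, v2, b_out = -1.2, 0.02, 2.1
target = 2

y = z1 * v1 + z2 * v2 + b_out  # forward pass: y ≈ 1.536

# Chain rule: dL/dv1 = dL/dy * dy/dv1
dL_dy = -2 * (target - y)      # derivative of (target - y)^2 with respect to y
dy_dv1 = z1                    # y depends linearly on v1, with coefficient z1
dL_dv1 = dL_dy * dy_dv1

print(round(dL_dv1, 4))        # -0.464
```

The same pattern repeats backward through every layer: each weight's gradient is the product of the local derivatives along the path from that weight to the loss.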
3. Parameter Update
After each iteration, the weights are updated according to the computed gradients and the learning rate. This loop repeats until the loss drops to the desired level.
Parameter update formula: w = w - η × ∂L/∂w
Here η (eta) represents the learning rate, and ∂L/∂w represents the derivative of the loss with respect to that weight.
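A single numeric step of this update rule might look as follows; the learning rate of 0.1 and the gradient value of −0.464 are assumed for illustration, not taken from the article:

```python
# One gradient-descent step on a single weight.
eta = 0.1                 # learning rate η (assumed value)
v1 = -1.2                 # current weight, from the worked example
grad = -0.464             # ∂L/∂w, assumed for illustration

v1_new = v1 - eta * grad  # w = w - η × ∂L/∂w
print(round(v1_new, 4))   # -1.1536
```

Because the gradient is negative, subtracting η × grad moves the weight upward, in the direction that reduces the loss.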
Loss Computation — MSE (Mean Squared Error)
MSE, commonly used in regression problems, takes the average of the squared differences between predictions and actual values.
MSE = (1/N) × Σ(Target − Prediction)²
For our example: Target = 2, Prediction = 1.536 → (2 − 1.536)² = 0.464² = 0.215296 → MSE = 0.215296 (with a single sample, the mean equals the squared error itself).
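The MSE formula can be written as a small pure-Python function over N samples:

```python
# Mean squared error over N targets and predictions, with no libraries.
def mse(targets, predictions):
    n = len(targets)
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n

# Single-sample case from the example: Target = 2, Prediction = 1.536
print(round(mse([2], [1.536]), 6))  # 0.215296
```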
Implementation from Scratch with Python
The following code performs the above calculation in pure Python without any libraries.
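A listing along these lines, using the weights and bias values from the worked example (the variable names are illustrative):

```python
# Forward pass and MSE loss in pure Python.

# Inputs
x1, x2 = 1, 2

# Hidden-layer weights and biases
w11, w12, b1 = 0.5, -0.1, 0.2   # neuron 1 -> z1
w21, w22, b2 = 0.3, 0.6, 0.3    # neuron 2 -> z2

# Output-layer weights and bias
v1, v2, b_out = -1.2, 0.02, 2.1

# Forward propagation
z1 = x1 * w11 + x2 * w12 + b1
z2 = x1 * w21 + x2 * w22 + b2
prediction = z1 * v1 + z2 * v2 + b_out

# MSE loss (single sample, so the mean equals the squared error itself)
target = 2
mse = (target - prediction) ** 2

print(f"z1: {round(z1, 6)}, z2: {round(z2, 6)}, "
      f"Prediction: {round(prediction, 6)}, MSE: {round(mse, 6)}")
```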
When you run this code, you will get the following output:
Output: z1: 0.5, z2: 1.8, Prediction: 1.536, MSE: 0.215296
Conclusion
In this article, we implemented the basic structure of neural networks, the forward propagation process, and MSE loss computation from scratch. In the next part of the series, we will cover backward propagation and gradient descent together with PyTorch.