What is Forward Propagation?

by Stephen M. Walker II, Co-Founder / CEO


Forward Propagation, also known as a forward pass, is a process in neural networks where input data is fed through the network in a forward direction to generate an output. This process involves the following steps:

  1. Input Data — The input data is fed into the network, starting from the input layer.

  2. Weighted Sum — Each neuron in the hidden layers calculates a weighted sum of its inputs. This is also known as pre-activation. The weighted sum is a linear transformation of the inputs with respect to the weights associated with each input.

  3. Activation Function — An activation function is applied to the weighted sum to introduce non-linearity into the network. Depending on the function, the neuron's output is suppressed, scaled, or passed through unchanged; ReLU, for example, outputs zero for a negative weighted sum and the value itself otherwise. The output of the activation function is then sent to the next layer.

  4. Output Generation — The process continues layer by layer until it reaches the output layer, generating the final output.
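As a rough illustration of steps 2 and 3 for a single neuron, the sketch below computes a weighted sum and applies ReLU. The function name `neuron_forward`, the bias term, and all numeric values are assumptions chosen for the example, not part of any particular model.

```python
import numpy as np

def neuron_forward(inputs, weights, bias):
    """Return the pre-activation z and the ReLU output a for one neuron."""
    z = np.dot(weights, inputs) + bias   # step 2: weighted sum (pre-activation)
    a = np.maximum(0.0, z)               # step 3: ReLU activation
    return z, a

# Illustrative values only (hypothetical, not from a trained network).
inputs = np.array([1.0, 2.0, 3.0])
weights = np.array([0.2, -0.5, 0.4])
bias = 0.1

z, a = neuron_forward(inputs, weights, bias)
print(z, a)  # both approximately 0.5, because z is positive
```

Flipping the sign of the weights and bias makes the weighted sum negative, in which case ReLU returns 0 and the neuron contributes nothing to the next layer.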

Forward propagation moves data in one direction only, from the input layer toward the output layer. If data flowed backward during output generation, it would form a loop that never produces an output. Networks whose connections contain no such loops are known as feed-forward networks.

In Python, a simple forward propagation process can be implemented as follows:

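import numpy as np

def relu(z):
    # Element-wise ReLU, so it works on arrays as well as scalars
    return np.maximum(0, z)

def feed_forward(x, Wh, Wo):
    # Hidden layer
    Zh = np.dot(x, Wh)
    H = relu(Zh)

    # Output layer
    Zo = np.dot(H, Wo)
    return relu(Zo)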

In this code, x is the input, Wh and Wo are the weights for the hidden and output layers respectively, and relu is the activation function.

Forward propagation is a crucial part of training a neural network, as it provides the initial predictions that are then used in backpropagation to update the weights and improve the network's accuracy.

What is the difference between forward propagation and backward propagation in neural networks?

Forward propagation and backward propagation are two distinct phases in the training of neural networks, each with a specific role in the learning process. Forward propagation is about making predictions, and backward propagation is about learning from those predictions to improve the network's accuracy. Both processes are essential for training a neural network, and they are used iteratively to optimize the network's performance.

Forward vs. Backward Propagation

Forward propagation is the process where input data traverses through the network from the input to the output layer, involving steps such as input reception, calculation of weighted sums (pre-activation), application of activation functions to introduce non-linearity, and generation of the final output. This phase aims to predict outputs based on the network's current weights.

In contrast, backward propagation, or backpropagation, is the learning phase where the network adjusts its weights based on the errors from the forward phase. It calculates the error by comparing predicted and actual outputs, computes gradients using the chain rule, and updates the weights to minimize the error, following the principles of gradient descent optimization.

The key distinction between the two lies in their direction and purpose: forward propagation moves data forward to make predictions, while backward propagation uses the resulting errors to inform weight adjustments and improve the model. Backpropagation relies on forward propagation's outcomes, as it needs the predicted outputs and the values computed during the forward pass for gradient calculation.
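To make the interplay concrete, here is a minimal sketch of a training loop for a single linear neuron. It assumes a mean squared error loss, plain gradient descent, toy data, and a hand-picked learning rate; the function names `forward` and `backward` are illustrative.

```python
import numpy as np

def forward(x, w, b):
    """Forward propagation: predictions from the current weights."""
    return x @ w + b

def backward(x, y_pred, y_true):
    """Backward propagation: gradients of the mean squared error."""
    n = x.shape[0]
    d_pred = 2.0 * (y_pred - y_true) / n   # dLoss/dy_pred
    grad_w = x.T @ d_pred                  # chain rule: dLoss/dw
    grad_b = d_pred.sum()                  # chain rule: dLoss/db
    return grad_w, grad_b

# Toy data generated from y = 3x + 1 (illustrative only).
x = np.array([[0.0], [1.0], [2.0], [3.0]])
y_true = 3.0 * x[:, 0] + 1.0

w = np.zeros(1)
b = 0.0
learning_rate = 0.05  # assumed value, small enough to be stable here

losses = []
for _ in range(200):
    y_pred = forward(x, w, b)                        # forward phase
    losses.append(np.mean((y_pred - y_true) ** 2))   # error of the predictions
    grad_w, grad_b = backward(x, y_pred, y_true)     # backward phase
    w = w - learning_rate * grad_w                   # gradient descent update
    b = b - learning_rate * grad_b

print(losses[0], losses[-1])  # the loss shrinks as the two phases alternate
```

Note that `backward` reuses `y_pred` from the forward pass, which is the dependency described above.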

How does Forward Propagation work?

Forward propagation in neural networks is the process by which input data is processed through the network to produce an output. It involves the following steps:

  1. Input Layer — The process begins with the input layer, where the input data is fed into the network.

  2. Hidden Layers — The data then moves through one or more hidden layers. In each layer, two main operations occur:

    • Pre-activation — This is the calculation of the weighted sum of the inputs, which includes the weights associated with the connections between the nodes of the previous layer and the current node, plus a bias term.
    • Activation — An activation function is applied to the pre-activation value to introduce non-linearity into the network, which allows the network to model complex relationships.

  3. Output Layer — After passing through all hidden layers, the data reaches the output layer, where the final prediction or output of the network is produced.

  4. Loss Function — The output is then evaluated using a loss function to determine the error or loss, which measures how far the network's prediction is from the actual target values.
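The four steps above can be sketched end to end as follows. This is a minimal illustration under several assumptions: the bias names `bh` and `bo`, the linear (regression-style) output layer, the mean squared error loss, and the random data are all choices made for the example.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def forward_with_loss(X, y, Wh, bh, Wo, bo):
    """Input -> hidden layer -> output layer -> loss."""
    Zh = X @ Wh + bh                     # pre-activation: weighted sum plus bias
    H = relu(Zh)                         # activation
    y_pred = H @ Wo + bo                 # output layer (assumed linear here)
    loss = np.mean((y_pred - y) ** 2)    # mean squared error (assumed loss)
    return y_pred, loss

rng = np.random.default_rng(0)           # fixed seed so the sketch is repeatable
X = rng.normal(size=(4, 3))              # 4 samples, 3 features
y = rng.normal(size=(4, 1))              # 4 target values
Wh, bh = rng.normal(size=(3, 5)), np.zeros(5)
Wo, bo = rng.normal(size=(5, 1)), np.zeros(1)

y_pred, loss = forward_with_loss(X, y, Wh, bh, Wo, bo)
print(y_pred.shape, loss)  # shape is (4, 1); loss is a non-negative scalar
```

The loss is a single number summarizing how far the predictions are from the targets; it is zero only when they match exactly.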

The forward propagation process is crucial for the network to make predictions, and it is complemented by backpropagation, where the error from the output is used to update the weights and biases in the network to improve the model during training.

In practice, forward propagation can be implemented using libraries like PyTorch or TensorFlow, which provide tools to define neural network architectures and perform the necessary calculations efficiently. The actual implementation involves matrix multiplications and the application of activation functions to these matrices to propagate the data through the network.

For example, a simple forward propagation function in Python using the ReLU activation function might look like this:

import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def forward_pass(X, Wh, Wo):
    # Hidden layer
    Zh = np.dot(X, Wh)  # Weighted sum
    H = relu(Zh)        # Activation

    # Output layer
    Zo = np.dot(H, Wo)  # Weighted sum
    output = relu(Zo)   # Activation

    return output

In this code, X represents the input data, Wh and Wo are the weights for the hidden and output layers, respectively, and Zh and Zo are the pre-activated values for the hidden and output layers.
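As a usage sketch, the same two-layer pass can be run on a small batch to check the shapes and values by hand. The functions are repeated in condensed form so the snippet runs on its own, and the input and weight values are made up for illustration.

```python
import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def forward_pass(X, Wh, Wo):
    H = relu(np.dot(X, Wh))        # hidden layer: weighted sum, then ReLU
    return relu(np.dot(H, Wo))     # output layer: weighted sum, then ReLU

# 2 samples with 2 features each (hypothetical values).
X = np.array([[1.0, 2.0],
              [0.0, -1.0]])
# 2 inputs -> 3 hidden units.
Wh = np.array([[1.0, -1.0, 0.5],
               [0.5, 1.0, -1.0]])
# 3 hidden units -> 1 output.
Wo = np.array([[1.0], [2.0], [-1.0]])

output = forward_pass(X, Wh, Wo)
print(output.shape)  # (2, 1): one output per sample
print(output)        # [[4.], [0.]]
```

For the first sample the hidden activations are [2, 1, 0], giving 2*1 + 1*2 + 0*(-1) = 4. For the second they are [0, 0, 1], giving a pre-activation of -1, which the output ReLU clips to 0.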

Forward propagation is sequential in the sense that data moves in one direction from input to output, but the computation itself is not linear: the activation functions introduce non-linearity, allowing the network to capture complex patterns in the data.

What is the role of activation functions in forward propagation?

Activation functions play a crucial role in forward propagation in neural networks. They are mathematical functions that introduce non-linearity into the network, enabling it to model complex relationships and patterns in the data.

During forward propagation, each node in the hidden and output layers of the network performs two main operations: pre-activation and activation. Pre-activation involves calculating the weighted sum of the inputs, which includes the weights associated with the connections between the nodes of the previous layer and the current node, plus a bias term. The activation function is then applied to this pre-activation value.

The activation function acts as a mathematical "gate" between the input feeding the current neuron and its output. Some functions gate literally: a step function or ReLU outputs zero whenever the pre-activation falls below a threshold (zero for ReLU). Others, such as sigmoid and tanh, squash the value smoothly into a bounded range rather than switching it on or off. Without activation functions, the neural network could only compute linear mappings from inputs to outputs, because a stack of linear layers collapses into a single linear transformation. The introduction of non-linearity by the activation functions allows the network to capture and model more complex, non-linear relationships in the data.
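The claim about linear mappings can be checked numerically. The sketch below, using random matrices with assumed shapes and bias terms omitted for brevity, compares two stacked layers without an activation against a single layer with the combined weights.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))    # 5 samples, 3 features
W1 = rng.normal(size=(3, 4))   # first layer weights
W2 = rng.normal(size=(4, 2))   # second layer weights

# Two layers with no activation function in between...
two_linear_layers = (X @ W1) @ W2
# ...compute the same thing as one layer with weights W1 @ W2.
one_linear_layer = X @ (W1 @ W2)
print(np.allclose(two_linear_layers, one_linear_layer))  # True

# With ReLU between the layers, the result generally differs from
# what the single combined linear layer produces.
with_relu = np.maximum(0, X @ W1) @ W2
print(np.allclose(with_relu, one_linear_layer))
```

The first comparison holds for any weights, since matrix multiplication is associative. The second depends on the data, but it fails whenever ReLU zeroes out at least one hidden value that mattered to the output.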

There are several commonly used activation functions, including the sigmoid, hyperbolic tangent (tanh), Rectified Linear Unit (ReLU), and Softmax functions. The choice of activation function can depend on the specific requirements of the model and the type of problem being addressed.
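For reference, the four functions named above can be written in a few lines of NumPy. This is a minimal sketch; the max-subtraction in `softmax` is a common numerical-stability trick and is an addition of this example.

```python
import numpy as np

def sigmoid(z):
    """Squash values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Squash values into the range (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Zero out negative values, pass positive values through."""
    return np.maximum(0.0, z)

def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = z - np.max(z, axis=-1, keepdims=True)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))
print(tanh(z))
print(relu(z))     # [0. 0. 2.]
print(softmax(z))  # three positive values that sum to 1
```

Sigmoid, tanh, and ReLU are applied element by element and are typical in hidden layers, while softmax operates on a whole vector and is typically used in the output layer of a classifier.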

In short, activation functions are what make forward propagation more than a chain of linear transformations: they shape each node's output before it is passed to the next layer, which lets the network model complex relationships and patterns in the data.

