
Posts tagged "Gradient Descent"

2 posts

May 2023

Feedforward Neural Networks

NOTE: This post is part of my Machine Learning Series where I’m discussing how AI/ML works and how it has evolved over the last few decades.

Feedforward Neural Networks (FNNs), also known as Multi-Layer Perceptrons (MLPs), are one of the most fundamental and widely used neural network architectures in machine learning. They are used for a variety of tasks, including classification, regression, and feature extraction. In this post, we'll explore the architecture, training process, and applications of FNNs.

Architecture of FNNs

An FNN consists of multiple layers of neurons: an input layer, one or more hidden layers, and an output layer. Each neuron is connected to the neurons in the adjacent layers through weighted edges, and data flows in one direction only, from input to output, with no cycles.

Neurons: Building Blocks of FNNs

A neuron receives inputs from other neurons or external sources, computes a weighted sum of those inputs plus a bias, applies an activation function to the result, and produces an output. Common activation functions include the sigmoid, ReLU (Rectified Linear Unit), and tanh functions.
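Here is a minimal sketch of a single neuron, assuming the usual formulation in which the inputs are combined as a weighted sum plus a bias before the activation is applied. The function names and the input values are my own, chosen only for illustration.

```python
import math

def sigmoid(z):
    """Squash z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Pass positive values through and clamp negatives to 0."""
    return max(0.0, z)

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

# Hypothetical values, chosen so that the weighted sum z is exactly 0.
x = [1.0, 2.0]
w = [0.5, -0.25]
b = 0.0

print(neuron(x, w, b, sigmoid))    # 0.5, because sigmoid(0) = 0.5
print(neuron(x, w, b, relu))       # 0.0
print(neuron(x, w, b, math.tanh))  # 0.0
```

The same weighted sum gives different outputs depending on the activation, which is why the choice of activation function matters.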

Activation Functions in Neural Networks

Layers: Input, Hidden, and Output

Input Layer: The input layer receives raw data features and passes them to the hidden layers.
Hidden Layer(s): Hidden layers transform the data through weighted connections followed by activation functions, which lets the network build up intermediate representations of the input.
Output Layer: The output layer provides the final predictions or classifications.
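To make the three kinds of layers concrete, here is a rough sketch of a forward pass through a tiny network with two input features, one hidden layer of two ReLU neurons, and a single linear output neuron. The layer sizes, the weights, and the helper names (`dense`, `forward`) are assumptions made up for this example, not values from a trained model.

```python
def relu(z):
    return max(0.0, z)

def identity(z):
    return z

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]

# Hypothetical parameters, chosen only for illustration.
HIDDEN_W = [[1.0, -1.0], [0.5, 0.5]]  # 2 hidden neurons, 2 inputs each
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[2.0, 1.0]]               # 1 output neuron, 2 hidden inputs
OUTPUT_B = [0.5]

def forward(features):
    """Input layer -> hidden layer (ReLU) -> output layer (linear)."""
    hidden = dense(features, HIDDEN_W, HIDDEN_B, relu)
    return dense(hidden, OUTPUT_W, OUTPUT_B, identity)

print(forward([3.0, 1.0]))  # [5.5]
```

The input layer does no computation here; it is just the list of features handed to the first hidden layer.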

Training FNNs: Backpropagation and Gradient Descent

Training FNNs involves adjusting the weights and biases to minimize the loss function. The loss function measures the difference between the predicted output and the actual target.
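As one example of a loss function, here is a small sketch of mean squared error, a common choice for regression. I'm assuming MSE purely for illustration; other tasks use other losses, and the sample numbers are invented.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical predictions and targets.
predictions = [2.5, 0.0, 2.0]
targets = [3.0, -0.5, 2.0]

print(mean_squared_error(predictions, targets))  # roughly 0.167
print(mean_squared_error(targets, targets))      # 0.0 for a perfect fit
```

The closer the predictions are to the targets, the smaller this number gets, which is what training tries to achieve.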

Backpropagation

Backpropagation is the algorithm used to compute the gradients of the loss function with respect to the weights and biases. It applies the chain rule layer by layer, propagating error signals backward from the output layer toward the input layer so that intermediate results are reused rather than recomputed.
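The chain rule is easiest to see on the smallest possible case. The sketch below assumes a single sigmoid neuron with one input and a squared-error loss (both are my choices for illustration), computes the gradients by hand, and compares one of them against a numerical finite-difference estimate.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, target):
    """Squared error of one sigmoid neuron on one example."""
    return (sigmoid(w * x + b) - target) ** 2

def gradients(w, b, x, target):
    """Chain rule: dL/dw = dL/da * da/dz * dz/dw, and likewise for b."""
    z = w * x + b
    a = sigmoid(z)
    dL_da = 2.0 * (a - target)  # derivative of the squared error
    da_dz = a * (1.0 - a)       # derivative of the sigmoid
    dL_dz = dL_da * da_dz       # error signal passed backward
    return dL_dz * x, dL_dz     # dz/dw = x and dz/db = 1

# Hypothetical values, chosen only for illustration.
w, b, x, target = 0.8, -0.2, 1.5, 1.0
grad_w, grad_b = gradients(w, b, x, target)

# Central finite difference as an independent check on grad_w.
eps = 1e-6
numeric_w = (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)

print(grad_w, numeric_w)  # the two values should agree closely
```

In a multi-layer network the same `dL_dz` term is passed back to the previous layer and the process repeats, which is where the name comes from.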

Backpropagation Explained

Gradient Descent

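Gradient descent is an optimization algorithm that updates the weights and biases by taking small steps in the direction opposite to the gradients calculated during backpropagation. Variants build on this idea: stochastic gradient descent (SGD) estimates the gradient from a small random batch of examples instead of the full dataset, and the Adam optimizer adapts the step size for each parameter.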
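Here is a minimal sketch of plain (full-batch) gradient descent, assuming a model with a single weight `w` that predicts `w * x` and a mean squared error loss. The data, learning rate, and step count are made up for the example.

```python
def gradient(w, xs, ys):
    """Derivative of mean((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, learning_rate=0.05, steps=200):
    """Repeatedly step against the gradient, starting from w = 0."""
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * gradient(w, xs, ys)
    return w

# Hypothetical data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(gradient_descent(xs, ys))  # very close to 2.0
```

An SGD version would compute `gradient` on a random subset of the data at each step instead of on all of it. A learning rate that is too large makes the updates overshoot and diverge, which is one reason it is treated as a hyperparameter to tune.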

Gradient Descent: The Optimization Algorithm

Applications of FNNs

Classification: FNNs can classify data into distinct categories, such as spam or not-spam for email filtering.
Regression: FNNs can predict continuous values, such as house prices based on property features.

...

May 8, 2023

Machine Learning Series: Exploring the World of AI/ML

Machine learning is an exciting and rapidly evolving field that has the potential to transform virtually every industry. From natural language processing to computer vision, machine learning models are becoming an integral part of our daily lives, enabling new levels of automation and understanding. To explore the fascinating world of machine learning and share insights with a broader audience, I am launching a blog series on AI/ML.

In this post, I will discuss the topics I will be covering and what you can expect from the upcoming blog series.

The Topics We Will Explore

Our journey into machine learning will cover a wide range of topics, each diving into a different aspect of this dynamic field:

Introduction

Neural Networks

What are Neural Networks?
Exploring the Different Types of Neural Networks
Feedforward Neural Networks
Convolutional Neural Networks: The Backbone of Image Recognition
Recurrent Neural Networks: Understanding Sequential Data
Autoencoders: Compression, Reconstruction, and Beyond
Generative Adversarial Networks: The Art of AI-Generated Content

Deep Learning Hardware

Neural Networks and the Power of GPUs and TPUs
GPUs and TPUs: Accelerating Machine Learning with Specialized Hardware

Fundamentals of Machine Learning

Tensors in Machine Learning: Understanding Multidimensional Arrays
Layers in Machine Learning: Building Blocks of Neural Networks
Activation Functions: Bringing Nonlinearity to Neural Networks
Parameters in ML
Model Weights and Checkpoints in Machine Learning
Loss Functions in Machine Learning
Overfitting in Machine Learning
Gradient Descent: Optimization in Machine Learning
Hyperparameters and the Art of Tuning: Optimizing ML Models

Natural Language Processing

Tokenization: The Key to Understanding Language in NLP
Embeddings in Large Language Models
Embeddings and Vector Databases in Large Language Models
Understanding Perplexity: A Key Metric in Language Modeling
Attention Mechanisms in Large Language Models
GPT: The Language Model Revolutionizing Natural Language Understanding

...

May 1, 2023