Journal

Blog 2023

Thoughts on photography, technology, music, and creative work.

May 2023

Autoencoders: Compression, Reconstruction, and Beyond

NOTE: This post is part of my Machine Learning Series where I discuss how AI/ML works and how it has evolved over the last few decades.

Autoencoders are a type of neural network architecture used for tasks such as dimensionality reduction, feature extraction, and data denoising. With their ability to learn efficient representations of data, autoencoders have found applications in various fields, from image processing to anomaly detection. In this post, we'll explore the structure and functionality of autoencoders and delve into their use cases.

Understanding Autoencoders

An autoencoder consists of two primary components: an encoder and a decoder. The encoder compresses the input data into a lower-dimensional representation called the latent space, while the decoder reconstructs the original data from this latent representation.

Autoencoder: The Encoder-Decoder Architecture

Encoder: Data Compression

The encoder is a neural network that receives input data and reduces its dimensionality, creating a compressed representation in the latent space. This process captures the most important features of the data.

Decoder: Data Reconstruction

The decoder is another neural network that takes the compressed representation and reconstructs the original data. The goal is to produce a reconstruction that closely resembles the original input.

Training: Minimizing Reconstruction Error

Autoencoders are trained to minimize the reconstruction error between the original input and the reconstructed output. Common loss functions include mean squared error (MSE) and binary cross-entropy.
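To make the encoder, decoder, and reconstruction loss concrete, here is a minimal sketch in plain Python. It is a toy, not a production model: the data, the single latent value, the linear encoder and decoder, the learning rate, and every function name are assumptions chosen for illustration.

```python
# Toy data: 2-D points that all lie on the line y = 2x,
# so a single latent value is enough to describe each point.
data = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]

def encode(x, enc):
    """Encoder: compress a 2-D point into one latent value."""
    return enc[0] * x[0] + enc[1] * x[1]

def decode(z, dec):
    """Decoder: expand the latent value back into a 2-D point."""
    return (dec[0] * z, dec[1] * z)

def mse(data, enc, dec):
    """Mean squared reconstruction error over all points and coordinates."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, enc), dec)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / (2 * len(data))

def train(data, steps=200, lr=0.1):
    """Fit encoder and decoder weights with plain gradient descent on the MSE."""
    enc, dec = [0.1, 0.1], [0.1, 0.1]
    n = 2 * len(data)
    for _ in range(steps):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z = encode(x, enc)
            x_hat = decode(z, dec)
            residual = (x_hat[0] - x[0], x_hat[1] - x[1])
            # Gradient of the MSE with respect to the decoder weights
            g_dec[0] += 2 * residual[0] * z / n
            g_dec[1] += 2 * residual[1] * z / n
            # Gradient with respect to the latent value, then the encoder weights
            dz = 2 * (residual[0] * dec[0] + residual[1] * dec[1]) / n
            g_enc[0] += dz * x[0]
            g_enc[1] += dz * x[1]
        enc = [enc[i] - lr * g_enc[i] for i in range(2)]
        dec = [dec[i] - lr * g_dec[i] for i in range(2)]
    return enc, dec

enc, dec = train(data)
final_error = mse(data, enc, dec)
```

On this toy data the reconstruction error starts at roughly 1.2 and ends very close to zero, because one latent value is enough to describe a point on a line.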

Variants of Autoencoders

Variational Autoencoders (VAEs)

Variational autoencoders (VAEs) are a probabilistic extension of autoencoders that learn the distribution of the latent space. VAEs are used for tasks such as image generation and unsupervised learning.
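One common way to make "learning the distribution of the latent space" concrete is to have the encoder output a mean and a log-variance per latent dimension, sample from that Gaussian, and penalize its distance from a standard normal. The sketch below assumes that Gaussian formulation; the function names are illustrative and not taken from any library.

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, 1), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

rng = random.Random(0)
z = sample_latent([0.0, 1.0], [0.0, -2.0], rng)
penalty = kl_to_standard_normal([0.0, 1.0], [0.0, -2.0])
```

During training, this KL term is typically added to the reconstruction error, which keeps the latent space smooth enough to sample new points from.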

Variational Autoencoders Explained

Denoising Autoencoders

Denoising autoencoders are trained to reconstruct input data that has been intentionally corrupted with noise. They are effective for image denoising and removing artifacts.
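The training data for a denoising autoencoder can be sketched as pairs of a corrupted input and its clean target. The example below assumes additive Gaussian noise, which is only one of several possible corruption schemes; the helper names are hypothetical.

```python
import random

def add_gaussian_noise(clean, std, rng):
    """Return a corrupted copy of `clean`; the original list is left untouched."""
    return [value + rng.gauss(0.0, std) for value in clean]

def make_denoising_pairs(samples, std=0.1, seed=0):
    """Build (noisy_input, clean_target) pairs for training."""
    rng = random.Random(seed)
    return [(add_gaussian_noise(s, std, rng), list(s)) for s in samples]

pairs = make_denoising_pairs([[0.0, 1.0], [0.5, 0.5]])
```

The network sees the noisy input, but the reconstruction error is measured against the clean target, so it has to learn to remove the noise rather than copy it.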

Denoising AutoEncoders In Machine Learning

Applications of Autoencoders

  • Dimensionality Reduction: Autoencoders can reduce the dimensionality of data while preserving essential features, similar to PCA.
  • Anomaly Detection: Autoencoders can detect anomalies by measuring high reconstruction error for atypical data points.
  • Image Generation: Variational autoencoders can generate new images by sampling from the learned latent space.
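The anomaly detection idea in the list above can be sketched as a simple threshold on reconstruction error. This assumes you already have per-sample errors from a trained autoencoder; the mean-plus-k-standard-deviations rule and the numbers are illustrative choices, not a recommendation.

```python
from statistics import mean, pstdev

def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k standard deviations of errors seen on normal data."""
    return mean(normal_errors) + k * pstdev(normal_errors)

def flag_anomalies(errors, threshold):
    """Return the indices of samples whose reconstruction error exceeds the threshold."""
    return [i for i, e in enumerate(errors) if e > threshold]

# Hypothetical reconstruction errors measured on known-normal data
threshold = fit_threshold([0.010, 0.012, 0.011, 0.013, 0.009])
flagged = flag_anomalies([0.011, 0.200, 0.012], threshold)
```

Here only the second sample is flagged, since its error is far above anything seen on the normal data.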

May 11, 2023

Recurrent Neural Networks: Understanding Sequential Data

NOTE: This post is part of my Machine Learning Series where I discuss how AI/ML works and how it has evolved over the last few decades.

Recurrent Neural Networks (RNNs) are a class of neural networks designed to handle sequential data. Whether it's analyzing time series, understanding natural language, or predicting stock prices, RNNs are powerful tools for capturing temporal dependencies in data. In this post, we'll delve into the structure of RNNs, how they process sequences, and their practical applications.

RNN Architecture

An RNN is composed of neurons organized in layers. At every time step, each recurrent neuron receives two things: the current element of the sequence and the hidden state carried over from the previous time step. These recurrent connections are the key feature of RNNs, allowing them to maintain hidden states that capture information from previous time steps.

Hidden States: Memory of the Past

The hidden states in an RNN act as memory, storing relevant information from previous time steps. This memory allows RNNs to effectively process sequences and recognize patterns that depend on temporal context.

Unrolling RNNs: Processing Sequences

An RNN can be unrolled over time to process sequences of varying lengths. At each time step, the RNN updates its hidden state based on the current input and the previous hidden state. The final hidden state is often used for tasks like classification, while the outputs at each time step can be used for tasks like language modeling.
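The hidden-state update can be sketched with a single recurrent unit. The tanh activation and the scalar weights below are assumptions for illustration; real RNNs use weight matrices and vectors.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One time step: combine the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Unroll the RNN over a sequence and return the hidden state at every step."""
    h = h0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

states = run_rnn([1.0, 0.0, 0.0])
```

Even though the second and third inputs are zero, the hidden state stays positive: the network still "remembers" the first input. The last entry of `states` is the final hidden state you would feed to a classifier.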

Challenges and Variants

Vanishing and Exploding Gradients

Training RNNs can be challenging because of vanishing and exploding gradients. Backpropagating through a long sequence multiplies the gradient by a similar factor at every time step, so it can shrink toward zero or grow without bound, making it difficult for the RNN to learn long-term dependencies.
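A rough way to see the problem is to multiply a gradient by the same factor once per time step. This is a deliberate simplification (real gradients involve matrices and activation derivatives), and the factors are made up.

```python
def gradient_scale(factor, steps):
    """Size of a gradient after being multiplied by `factor` once per time step."""
    scale = 1.0
    for _ in range(steps):
        scale *= factor
    return scale

vanishing = gradient_scale(0.5, 50)   # shrinks toward zero
exploding = gradient_scale(1.5, 50)   # grows very large
```

After 50 steps, a factor of 0.5 leaves essentially nothing of the original signal, while a factor of 1.5 blows it up by hundreds of millions.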

LSTM and GRU

To address these challenges, variants of RNNs, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), have been developed. LSTM introduces memory cells and gates to better regulate the flow of information, while GRU simplifies the LSTM architecture with fewer gates.
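For reference, one common way to write the GRU update is shown below. Treat the notation as an assumption on my part: $x_t$ is the input, $h_{t-1}$ the previous hidden state, $\sigma$ the sigmoid function, $\odot$ element-wise multiplication, and some references swap the roles of $z_t$ and $1 - z_t$.

```latex
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) && \text{(update gate)} \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\bigl(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\bigr) && \text{(candidate state)} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(new hidden state)}
\end{aligned}
```

The GRU has two gates and no separate memory cell, whereas the LSTM uses three gates (input, forget, and output) plus a cell state.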

Applications of RNNs

RNNs have been used in a wide range of applications, including:

  • Natural Language Processing: RNNs are used for language modeling, sentiment analysis, machine translation, and more.
  • Time Series Forecasting: RNNs can predict future values in time series data, such as stock prices or weather patterns.
  • Speech Recognition: RNNs are used to transcribe and recognize spoken language.

May 10, 2023

Convolutional Neural Networks: The Backbone of Image Recognition

NOTE: This post is part of my Machine Learning Series where I discuss how AI/ML works and how it has evolved over the last few decades.

Convolutional Neural Networks (CNNs) have become the go-to architecture for image recognition and computer vision tasks. CNNs excel at identifying patterns in images, such as edges, textures, and shapes, making them a key player in applications like image classification, object detection, and facial recognition. In this post, we'll explore the key components of CNNs, how they operate on images, and their use cases.

Key Components of CNNs

Convolutional Layers

The heart of a CNN is the convolutional layer, which applies convolution operations to the input image using kernels (or filters) to extract features. These kernels slide over the input image, detecting patterns and creating feature maps.
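Here is a minimal sketch of a kernel sliding over an image, in plain Python. It assumes no padding and a stride of 1, and, like most deep learning libraries, it does not flip the kernel (strictly speaking a cross-correlation). The tiny image and the edge-detecting kernel are made up for illustration.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Dark left half, bright right half
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Responds where brightness increases from left to right
vertical_edge = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, vertical_edge)  # [[0, 2, 0], [0, 2, 0]]
```

The feature map is zero over the flat regions and lights up only in the column where the dark-to-bright edge sits.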

Convolutional Layers Explained

Pooling Layers

Pooling layers downsample the feature maps created by the convolutional layers, reducing their spatial dimensions. Common pooling methods include max pooling and average pooling.
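Max pooling can be sketched in a few lines. This version assumes non-overlapping square windows and silently drops any rows or columns that do not fill a complete window; the input values are arbitrary.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size-by-size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

pooled = max_pool2d([
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 0],
])  # [[4, 5], [3, 7]]
```

A 4x4 feature map becomes 2x2, keeping only the strongest response in each region. Average pooling would replace `max` with the mean of the window.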

Pooling in Convolutional Neural Networks

Fully Connected Layers

Fully connected layers form the final part of a CNN, using the extracted features for classification or regression tasks. Activation functions, such as the softmax function, are often applied to the final layer for multi-class classification.
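The softmax function turns the raw scores from the final layer into probabilities that sum to one. The sketch below subtracts the maximum score first, a standard trick to avoid overflow; the example scores are made up.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
```

The largest score gets the largest probability, and the predicted class is simply the index of the maximum.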

Putting It All Together: Image Classification

A typical CNN for image classification consists of alternating convolutional and pooling layers, followed by fully connected layers. The convolutional layers detect features in the image, while the pooling layers reduce dimensionality. The fully connected layers interpret the features and provide the final output.

Building a Simple CNN: Image Classification

Applications of CNNs

CNNs are widely used in various applications, including:

  • Image Classification: CNNs can classify images into categories, such as identifying whether an image contains a cat or dog.
  • Object Detection: CNNs can locate and identify multiple objects within an image.
  • Facial Recognition: CNNs are used to recognize faces and verify identities in security applications.

May 9, 2023

Feedforward Neural Networks

NOTE: This post is part of my Machine Learning Series where I’m discussing how AI/ML works and how it has evolved over the last few decades.

Feedforward Neural Networks (FNNs), also known as Multi-Layer Perceptrons (MLPs), are one of the most fundamental and widely used neural network architectures in machine learning. FNNs have been employed for a variety of tasks, including classification, regression, and feature extraction. In this post, we'll explore the architecture, training process, and applications of FNNs.

Architecture of FNNs

An FNN consists of multiple layers of neurons, including an input layer, one or more hidden layers, and an output layer. Each neuron is connected to the neurons in the adjacent layers through weighted edges.

Neurons: Building Blocks of FNNs

A neuron receives inputs from other neurons or external sources, computes a weighted sum of those inputs plus a bias, applies an activation function to the result, and produces an output. Common activation functions include the sigmoid, ReLU (Rectified Linear Unit), and tanh functions.
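In the usual formulation, which I'm assuming here, a neuron computes a weighted sum of its inputs plus a bias and passes that through the activation function. A minimal sketch, with made-up weights:

```python
import math

def relu(v):
    """Rectified Linear Unit: pass positive values, clamp negatives to zero."""
    return max(0.0, v)

def sigmoid(v):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation function."""
    pre_activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(pre_activation)

out_relu = neuron([1.0, 2.0], [0.5, -1.0], 0.0, relu)      # weighted sum is -1.5
out_sigmoid = neuron([0.0], [1.0], 0.0, sigmoid)           # weighted sum is 0.0
out_tanh = neuron([1.0, 2.0], [0.5, 0.25], 0.0, math.tanh)  # weighted sum is 1.0
```

The same weighted sum can produce very different outputs depending on the activation: ReLU clamps the negative sum to zero, while sigmoid maps a sum of zero to 0.5.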

Activation Functions in Neural Networks

Layers: Input, Hidden, and Output

  • Input Layer: The input layer receives raw data features and passes them to the hidden layers.
  • Hidden Layer(s): Hidden layers perform complex transformations on the data using weighted connections and activation functions.
  • Output Layer: The output layer provides the final predictions or classifications.
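The three kinds of layers above can be sketched as a tiny forward pass: two inputs, a hidden layer of two ReLU neurons, and a single linear output. The weights are hard-coded and arbitrary; in a real network they would be learned.

```python
def relu(v):
    return max(0.0, v)

def identity(v):
    return v

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each row of `weights` feeds one output neuron."""
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x):
    """Input layer -> hidden layer (ReLU) -> output layer (linear)."""
    hidden = dense(x, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu)
    return dense(hidden, [[1.0, 2.0]], [0.1], identity)

prediction = forward([2.0, 1.0])
```

For the input `[2.0, 1.0]` the hidden layer produces `[1.0, 1.5]`, and the output neuron combines those into a single value of about 4.1.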

Training FNNs: Backpropagation and Gradient Descent

Training FNNs involves adjusting the weights and biases to minimize the loss function. The loss function measures the difference between the predicted output and the actual target.

Backpropagation

Backpropagation is an algorithm used to calculate the gradients of the loss function with respect to the weights and biases. It uses the chain rule to efficiently propagate error signals from the output layer to the input layer.
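To see the chain rule at work, here is a sketch for the smallest possible "network": one tanh hidden neuron followed by one linear output neuron, with a squared-error loss. The architecture and the names are assumptions chosen to keep the math readable.

```python
import math

def forward(x, w1, w2):
    """Hidden neuron h = tanh(w1 * x), output y = w2 * h."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def loss(x, target, w1, w2):
    """Half the squared error between the prediction and the target."""
    _, y = forward(x, w1, w2)
    return 0.5 * (y - target) ** 2

def gradients(x, target, w1, w2):
    """Propagate the error backward from the output to the first weight."""
    h, y = forward(x, w1, w2)
    d_y = y - target              # dL/dy at the output
    d_w2 = d_y * h                # dL/dw2
    d_h = d_y * w2                # error signal passed back to the hidden neuron
    d_w1 = d_h * (1 - h * h) * x  # tanh'(a) = 1 - tanh(a)^2
    return d_w1, d_w2

g_w1, g_w2 = gradients(0.5, 1.0, 0.3, -0.7)
```

A quick sanity check is to nudge each weight by a tiny amount and confirm that the change in the loss matches these gradients.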

Backpropagation Explained

Gradient Descent

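Gradient descent is an optimization algorithm that updates the weights and biases based on the gradients calculated during backpropagation. Variants such as stochastic gradient descent (SGD), which estimates the gradient from small batches of data, and the Adam optimizer, which adapts the step size for each parameter, often make training faster or more stable.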
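The update rule itself is short enough to sketch on a one-parameter problem. The function being minimized, the learning rate, and the step count are arbitrary choices for illustration.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient: w <- w - lr * grad(w)."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Starting from 0, the parameter converges to the minimum at 3. In a neural network the same update is applied to every weight and bias, with the gradients supplied by backpropagation.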

Gradient Descent: The Optimization Algorithm

Applications of FNNs

  • Classification: FNNs can classify data into distinct categories, such as spam or not-spam for email filtering.
  • Regression: FNNs can predict continuous values, such as house prices based on property features.

...

May 8, 2023

Exploring the Different Types of Neural Networks

NOTE: This post is part of my Machine Learning Series where I’m discussing how AI/ML works and how it has evolved over the last few decades.

Neural networks are the foundation of many artificial intelligence and machine learning applications. There are several types of neural networks, each designed to address specific types of problems. In this post, we'll explore the most common types of neural networks and their applications.

Feedforward Neural Networks (FNNs)

Feedforward neural networks, also known as FNNs, are the simplest type of neural network. They consist of an input layer, one or more hidden layers, and an output layer. Information in FNNs flows in one direction, from the input to the output.

 Understanding Feed Forward Neural Networks With Maths and Statistics 

Convolutional Neural Networks (CNNs)

Convolutional neural networks (CNNs) are designed for image processing and computer vision tasks. CNNs use convolutional layers to scan images for local patterns, and pooling layers to reduce spatial dimensions. They excel at image classification and object detection.

A Comprehensive Guide to Convolutional Neural Networks — the ELI5 way

Recurrent Neural Networks (RNNs)

Recurrent neural networks (RNNs) are designed to process sequential data, such as time series or text. RNNs have connections that loop back, allowing them to capture temporal dependencies. Variants such as LSTMs and GRUs address challenges like vanishing gradients.

Understanding RNN and LSTM

Generative Adversarial Networks (GANs)

Generative adversarial networks (GANs) consist of a generator network and a discriminator network that engage in an adversarial game. The generator creates synthetic data, while the discriminator evaluates its authenticity. GANs have applications in image synthesis and data augmentation.
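The adversarial game is usually written as a minimax objective. In the standard formulation, shown here for reference, $D(x)$ is the discriminator's estimate that $x$ is real, $G(z)$ is the generator's output for random noise $z$, and $p_{\text{data}}$ and $p_z$ are the data and noise distributions:

```latex
\min_G \max_D \;
\mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
+ \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator tries to make this value large by telling real from fake, while the generator tries to make it small by fooling the discriminator.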

 Understanding Generative Adversarial Networks (GANs)

Autoencoders

Autoencoders are neural networks used for dimensionality reduction and feature extraction. They consist of an encoder that compresses input data and a decoder that reconstructs the original data. Autoencoders are used for image denoising and anomaly detection.

 Applied Deep Learning - Part 3: Autoencoders

May 5, 2023

What are Neural Networks?

NOTE: This post is part of my Machine Learning Series where I’m discussing how AI/ML works and how it has evolved over the last few decades.

One of the most transformative developments in the field of artificial intelligence and machine learning was the advent of neural networks. These computational models are designed to mimic the way the human brain processes information and are capable of performing complex tasks such as image recognition, natural language processing, and more. In this blog post, we'll explore what neural networks are, their components, and why specialized hardware like GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units) are highly effective for training and deploying neural networks.

What is a Neural Network?

A neural network is a computational model inspired by the structure and functionality of the biological brain. Composed of interconnected nodes or "neurons" organized into layers, neural networks learn to recognize patterns and make predictions by processing input data and adjusting the strength of connections between neurons.

The key components of a neural network include:

  • Input Layer: Receives input data and passes it to the subsequent layers for processing.
  • Hidden Layers: Layers between the input and output layers that perform various computations and transformations on the data.
  • Output Layer: Produces the final predictions or classifications based on the processed data.
  • Weights and Biases: Parameters that determine the strength of connections between neurons. These are adjusted during training to minimize the prediction error.

Neural networks learn through a process called backpropagation, which involves computing the gradient of the loss function with respect to each weight and adjusting the weights to minimize the loss.

The Role of GPUs and TPUs in Neural Networks

Training and inference with neural networks often involve large volumes of data and computationally intensive operations. Traditional CPUs (Central Processing Units) may struggle to handle these workloads efficiently. Enter GPUs and TPUs, specialized hardware accelerators that excel at parallel processing.

Graphics Processing Units (GPUs)

GPUs are hardware accelerators initially designed for rendering graphics in video games. However, they have been repurposed for general-purpose computing due to their ability to perform parallel computations efficiently. A GPU consists of thousands of small cores capable of executing operations simultaneously, making it highly suitable for the matrix and vector operations common in neural networks.

May 4, 2023

The Evolution of Computer Vision: A Decade of Innovation and Progress

NOTE: This post is part of my Machine Learning Series where I’m discussing how AI/ML works and how it has evolved over the last few decades.

Computer vision, the field of AI that enables computers to interpret and understand visual information from the world, has undergone significant advancements over the past decade. The ability to analyze images and videos, recognize objects, and understand visual scenes has opened up a multitude of applications in fields such as healthcare, autonomous vehicles, and security. In this blog post, we will explore the key milestones and breakthroughs that have shaped the evolution of computer vision over the last ten years.

The Rise of Deep Learning in Computer Vision

ImageNet and the Convolutional Neural Network (CNN) Revolution

One of the most transformative moments in computer vision came in 2012 with the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). The competition, which involved classifying images into 1,000 different categories, was won by AlexNet, a deep convolutional neural network (CNN) designed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton. AlexNet significantly outperformed traditional computer vision algorithms, marking the beginning of the deep learning revolution in computer vision.

Object Detection and Segmentation Advances

Following the success of AlexNet, new architectures and techniques emerged for tasks such as object detection and segmentation. Models like R-CNN, YOLO (You Only Look Once), and Mask R-CNN improved the accuracy and speed of object detection and instance segmentation.

The Expansion of Computer Vision Applications

Healthcare and Medical Imaging

Advancements in computer vision have had a profound impact on healthcare, particularly in medical imaging. Deep learning models can now detect diseases from medical scans with accuracy comparable to human experts, aiding in early diagnosis and treatment.

Autonomous Vehicles and Robotics

Computer vision has played a crucial role in the development of autonomous vehicles, enabling them to perceive their surroundings and make safe driving decisions. Additionally, computer vision is used in robotics for tasks such as navigation, manipulation, and human-robot interaction.

The Emergence of Vision Transformers and Self-Supervised Learning

May 3, 2023

The Evolution of Machine Learning: A Journey Through the Last 50 Years

NOTE: This post is part of my Machine Learning Series where I’m discussing how AI/ML works and how it has evolved over the last few decades.

Machine learning has become an integral part of our lives, powering applications from voice assistants to self-driving cars. However, the field has a rich history that spans over five decades, with foundational ideas that date back even further. In this blog post, we'll explore the key milestones and breakthroughs in the history of machine learning over the last 50 years and how they've shaped the field as we know it today.

The 1970s: The Birth of Symbolic AI and Decision Trees

The 1970s marked the beginning of the modern era of artificial intelligence (AI) and machine learning research. During this time, symbolic AI, also known as rule-based AI, gained popularity. Researchers created expert systems that relied on manually coded rules to mimic human reasoning.

One of the significant advances in machine learning during this period was the development of decision tree algorithms. Decision trees use a tree-like structure to represent decisions and their possible consequences. The ID3 algorithm, developed by Ross Quinlan in the late 1970s, was one of the first algorithms for generating decision trees.

The 1980s: The Emergence of Neural Networks

The 1980s saw the rise of interest in neural networks. One of the most important contributions of this period was the backpropagation algorithm, popularized by Rumelhart, Hinton, and Williams in 1986. Backpropagation enabled efficient training of multi-layer neural networks, paving the way for deep learning.

Despite initial excitement, neural networks faced limitations, including the lack of computational power and the vanishing gradient problem. By the end of the 1980s, research in neural networks slowed down.

The 1990s: Support Vector Machines and Reinforcement Learning

The 1990s witnessed the development of support vector machines (SVMs), introduced by Vapnik and Cortes. SVMs became popular for classification tasks due to their ability to handle high-dimensional data and achieve strong generalization.

In addition, the 1990s saw significant advances in reinforcement learning (RL). Sutton and Barto's book, "Reinforcement Learning: An Introduction," became a foundational text in the field. Q-learning and TD-learning algorithms contributed to the growing interest in RL.

May 2, 2023

Machine Learning Series: Exploring the World of AI/ML

Machine learning is an exciting and rapidly evolving field that has the potential to transform virtually every industry. From natural language processing to computer vision, machine learning models are becoming an integral part of our daily lives, enabling new levels of automation and understanding. To explore the fascinating world of machine learning and share insights with a broader audience, I am launching a blog series on AI/ML.

In this post, I will discuss the topics I will be covering and what you can expect from the upcoming blog series.

The Topics We Will Explore

Our journey into machine learning covers a wide range of topics, each diving into a different aspect of this dynamic field:

Introduction

Neural Networks

Deep Learning Hardware

  • Neural Networks and the Power of GPUs and TPUs
  • GPUs and TPUs: Accelerating Machine Learning with Specialized Hardware

Fundamentals of Machine Learning

  • Tensors in Machine Learning: Understanding Multidimensional Arrays
  • Layers in Machine Learning: Building Blocks of Neural Networks
  • Activation Functions: Bringing Nonlinearity to Neural Networks
  • Parameters in ML
  • Model Weights and Checkpoints in Machine Learning
  • Loss Functions in Machine Learning
  • Overfitting in Machine Learning
  • Gradient Descent: Optimization in Machine Learning
  • Hyperparameters and the Art of Tuning: Optimizing ML Models

Natural Language Processing

  • Tokenization: The Key to Understanding Language in NLP
  • Embeddings in Large Language Models
  • Embeddings and Vector Databases in Large Language Models
  • Understanding Perplexity: A Key Metric in Language Modeling
  • Attention Mechanisms in Large Language Models
  • GPT: The Language Model Revolutionizing Natural Language Understanding

...

May 1, 2023

March 2023

Apple Photo Scores: AI Judges Your Photos

Art critics were around long before the birth of photography and have accompanied photographers through the journey from analog to digital. Now, with the proliferation of machine learning and the integration of on-device ML hardware, such as Apple's Neural Engine, your smartphone has evolved into a discerning critic of your photographic creations.

Apple’s ML Photo Scores

Apple uses AI/ML to give each of your photos a series of scores. This is a hidden and undocumented feature that employs machine learning algorithms to examine your photos and allocate a score based on numerous factors such as quality, sharpness, and more. These scores are listed below (you can click through each of those links below to see all my photos ordered by each score):

These scores are preserved in an SQLite database and remain invisible to the end user. However, they serve to sort and highlight your photos, streamlining the process of finding your best shots. For instance, Apple’s ML model believes this is my number one overall photo:

I mean, it’s pretty good, but best overall? Here are the second and third ones:

Seeing your photo’s scores

...

March 23, 2023

Reverse Engineering Read Later Data from the Apple News App

As we navigate the digital world, we often come across articles we don't have time to read but still want to save for later. One way to accomplish this is by using the Read Later feature in Apple News. But what if you want to access those articles outside the Apple News app, such as on a different device or with someone who doesn't use Apple News? Or what if you want to automatically post links to those articles on your blog? That's where the nerd powers come in.

Reverse Engineering the Data

Initially, I reached out to Rhet Turnbull, the creator of the amazing osxphotos app/Python library that I use to extract the data from Apple Photos. I use that data to power the photo section of my site.

I asked Rhet if he had ever pulled this data from News. While I waited to hear back from him, I used lsof to look for the file that Apple News uses to store Read Later Articles. I discovered that Apple News uses a Binary PList file located in a super obvious place:

/Users/eecue/Library/Containers/com.apple.news/Data/Library/Application Support/com.apple.news/com.apple.news.public-com.apple.news.private-production/reading-list

Simple and obvious, right?! After I found it, I noticed it was in a strange format that a normal binary PList parser couldn’t understand. However, I was able to just run strings on the file and extract the Apple News article IDs, each of which maps to a URL that looks like this: https://apple.news/AbtWOAgVqToW62MeeZ1xkcQ.

I wrote a script that fetches the page at each of those URLs and then uses Beautiful Soup to extract the article data. It wasn’t perfect, but it did the job:

import subprocess
import requests
from bs4 import BeautifulSoup

# Path to the Read Later file that Apple News keeps open (found with lsof)
READING_LIST = '/Users/eecue/Library/Containers/com.apple.news/Data/Library/Application Support/com.apple.news/com.apple.news.public-com.apple.news.private-production/reading-list'

# Run the `strings` command to extract the printable strings from the binary file
result = subprocess.run(['strings', READING_LIST], capture_output=True, text=True, check=True)

# Loop through the output and look for article IDs
article_ids = []
for line in result.stdout.splitlines():
    # Article IDs appear on lines that start with "rl-"
    if line.startswith('rl-'):
        # Remove the "rl-" prefix and, if present, the trailing "_"
        article_id = line.strip()[3:]
        if article_id.endswith('_'):
            article_id = article_id[:-1]
        article_ids.append(article_id)

def extract_info_from_apple_news(news_id):
    # Construct the Apple News URL from the ID
    apple_news_url = f'https://apple.news/{news_id}'

March 13, 2023

AI and The Potential Risks of Autonomous War Bots

In 2008, I had the opportunity to tour SPAWAR, the Space and Naval Warfare Systems Command, now known as NAVWAR. SPA/NAVWAR is a research and development laboratory for the U.S. Navy. During my visit, I was fascinated by the various autonomous military robots that were being developed and tested there. I photographed the tour and wrote about it for WIRED News.

Fast-forward to 2023, and with the emergence of large language models like ChatGPT and Bing AI, it's possible to imagine how these robots could be controlled using AI in ways that are frankly somewhat terrifying. With great power comes great responsibility, and we must consider the potential risks of relying on AI-powered machines in warfare.

How Large Language Models Could Control Autonomous Robots for War

Large language models like ChatGPT are designed to understand and generate human-like language. They work by training on vast amounts of text data, which enables them to recognize patterns and make predictions about what words are likely to come next in a sentence. With this ability, it's possible to use natural language commands to control autonomous robots on the battlefield.

For example, a commander could use a chatbot interface to ask an autonomous drone to perform a specific task, such as "Scan the area for enemy activity and report back." The drone would then use its onboard sensors to perform the task and send the results back to the commander. This type of interaction could reduce the need for human operators in dangerous situations and provide real-time intelligence to decision-makers.

The Potential Risks of Autonomous War Machines

While the idea of using AI-powered machines in warfare may seem appealing, it's important to consider the potential risks. One major concern is the possibility of unintended consequences. Autonomous robots rely on algorithms and programming to make decisions, and there's always the risk of a bug or glitch causing the machine to behave in unexpected ways. This could lead to unintended harm to civilians or friendly forces.

Another concern is the potential for hackers to gain control of autonomous robots. If an adversary were able to gain access to the communication channels used to control the machines, they could potentially cause havoc on the battlefield. They could redirect drones to attack friendly forces or civilians, or use them for reconnaissance to gain a tactical advantage.

March 9, 2023

February 2023

Photographing The Deep Space Network at Goldstone

The Deep Space Network (DSN) is one of the most critical components of space exploration and communication. The network comprises a series of antennas that are used to communicate with interplanetary spacecraft, such as the Mars rovers and the Voyager spacecraft, as they travel through the solar system.

Recently, I uploaded my complete archive of photographs from the Goldstone Deep Space Communications Complex in California's Mojave Desert. I shot the DSN for a WIRED News article in 2008. The experience was truly unique, and I wanted to share my journey with the world.

The Journey Begins

The journey to Goldstone was an adventure in itself. I had to drive through the barren landscape of the Mojave Desert, navigating through dusty roads and dodging tumbleweeds. The drive was long, but the view of the endless desert and the clear night sky was breathtaking.

As I approached the Goldstone Complex, I was greeted by a massive 70-meter antenna, standing tall against the backdrop of the night sky. The sight was awe-inspiring and humbling, reminding me of the vastness of space and the incredible work that the DSN does.

Inside the Complex

Once inside the complex, I was given a tour of the facilities and shown the various antennas that form the DSN. Each antenna is unique, with its own set of challenges and capabilities. Some antennas are used for deep space communication, while others are used for radar imaging.

One of the most interesting antennas I saw was the 70-meter antenna, which is used for deep space communication. The antenna is so massive that it takes a team of engineers to move it, and it can communicate with spacecraft millions of kilometers away. The level of precision and engineering that goes into each antenna is truly mind-boggling.

Capturing the Moment

After the tour, it was time for me to start capturing the beauty of the DSN. I set up my camera and tripod, and began shooting. The light from the antennas illuminated the surrounding area, creating a unique and otherworldly environment. I took advantage of this and used creative lighting techniques to capture the beauty of the complex.

February 21, 2023

Using GPT-3 to Write Captions Based on Image Keywords, People, Albums, and Locations

Have you ever wanted to caption your photos automatically? With the GPT-3 Davinci model from OpenAI, you can do just that! By using image keywords, people, locations, and the album name, you can use AI/ML to generate captions that are not only descriptive, but also entertaining (and frequently hilariously wrong).

In this post, I’ll explore the capabilities of GPT-3 for writing captions based on image data, and how it can add a new dimension to your photos.

February 15, 2023

Back in the Blogging Game: A Decade Later

It's been over a decade since I last wrote a blog post, but I'm excited to be back. I started blogging in 1998, and it's been a wild journey ever since. So much has changed in my life, and I'm eager to share it all with you.

February 6, 2023