EPFL Day 4: Machine Learning & Neural Networks

1. Making a Machine Learn

We started by redefining how we solve problems with computers. In classical programming (like the control laws we used on previous days), we give the robot explicit rules and data, and it computes the answers. In machine learning, we give it data together with the correct answers, and the algorithm works out the rules.

We learned that “learning” in this context is essentially a mathematical optimization problem. A model makes a prediction, calculates its error (loss), and adjusts its parameters to minimize that error over time.
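To make the predict, measure, adjust loop concrete, here is a minimal sketch of gradient descent on a one-parameter model. The toy data, the model `y = w * x`, the learning rate, and the number of steps are all assumptions chosen for illustration; they are not the setup we used in class.

```python
# Toy data generated from the rule y = 3 * x (an assumption for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]


def loss(w):
    """Mean squared error between the predictions w * x and the correct answers."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def train(w=0.0, learning_rate=0.01, steps=200):
    """Repeatedly nudge w in the direction that reduces the loss."""
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


learned_w = train()
print(f"learned w = {learned_w:.4f}, loss = {loss(learned_w):.6f}")
```

The program is never told the rule "multiply by 3"; it only sees inputs and correct answers, and the parameter drifts towards 3 because that value minimizes the error.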

2. Introduction to Artificial Neural Networks (ANNs)

Next, we explored the architecture used to process this learning: Artificial Neural Networks.

An Artificial Neural Network processes data through a series of layers (input, hidden, and output). At the core of every single artificial neuron is a linear combination followed by a non-linear activation function:

\[y = f\left(\sum_{i=1}^{n} w_i x_i + b\right)\]
  • Inputs ($x_i$): The raw numerical data fed into the network.
  • Weights ($w_i$): Coefficients that determine the importance of each input feature.
  • Bias ($b$): A constant that shifts the activation threshold.
  • Activation Function ($f$): A mathematical function (like ReLU or Sigmoid) that introduces non-linearity, allowing the network to solve complex, real-world problems instead of just drawing straight lines.
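The formula above can be sketched in a few lines of Python. The function names and the example numbers below are hypothetical and chosen only to show the weighted sum followed by the activation.

```python
import math


def relu(z):
    """ReLU activation: passes positive values through, clips negatives to 0."""
    return max(0.0, z)


def sigmoid(z):
    """Sigmoid activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Compute y = f(sum(w_i * x_i) + b) for a single artificial neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Example values (assumed): z = 0.5 * 1.0 + (-1.0) * 2.0 + 0.5 = -1.0
x = [1.0, 2.0]
w = [0.5, -1.0]
b = 0.5
print(neuron(x, w, b, relu))
print(neuron(x, w, b, sigmoid))
```

With the same weighted sum, the two activation functions give different outputs, which is why the choice of $f$ matters.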

3. Practical Application: The Light Detector

Before a neural network can process anything, the robot must be able to accurately gather and quantify data from its environment. To apply our understanding of inputs and data collection, our first task of the day was testing a Light Detector Robot that the teachers gave us.

We programmed the robot to execute an autonomous scan:

  1. The Sweep: The robot rotated in small steps, taking an input reading ($x_i$) from its light sensors at each step.
  2. Data Processing: Instead of just reacting to the light instantly, the algorithm stored the values, comparing each new reading to the previous maximum to find the absolute peak intensity.
  3. Actuation: Once the environment was fully mapped, the robot used that processed data to autonomously navigate back to the exact location of the brightest light source.
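The three steps above can be sketched as a simple maximum search. This is not the code that ran on the robot: `read_light`, the 30-degree step, and the simulated sensor are hypothetical stand-ins for the real hardware interface.

```python
def find_brightest_heading(read_light, step_degrees=30):
    """Sweep a full turn and remember the heading with the highest reading."""
    best_heading = 0
    best_reading = float("-inf")
    for heading in range(0, 360, step_degrees):
        reading = read_light(heading)
        # Compare each new reading to the previous maximum.
        if reading > best_reading:
            best_reading = reading
            best_heading = heading
    return best_heading, best_reading


def simulated_sensor(heading):
    """Hypothetical sensor: a lamp at 120 degrees, dimmer the further we turn away."""
    difference = abs(heading - 120) % 360
    difference = min(difference, 360 - difference)
    return 100 - difference


heading, intensity = find_brightest_heading(simulated_sensor)
print(f"brightest light at {heading} degrees (reading {intensity})")
```

On a real robot, the last step would be to turn back to the stored heading and drive towards it.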

Understanding this flow of raw data and algorithmic decision-making laid the perfect groundwork for the afternoon, where we would replace simple light sensors with a camera and train our own deep learning model to steer the robot.

Light Detector Robot

4. Towards Deep Learning

Handling images is tough because every single pixel is a separate input. While a basic sensor feeds the robot only a few numbers, even a small 64×64 grayscale image contains 4,096 pixel values. Deep learning copes with this volume of data by processing the image step by step through several hidden layers. The first layers pick out simple edges and lines, and the deeper layers combine them into more complex shapes until the network can recognize the object.
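As a rough illustration of what spotting an edge means, the sketch below slides a tiny hand-written filter over a toy image. In a real network the filter values are learned during training rather than written by hand, and the image, the kernel, and the function name here are assumptions for illustration. Like most deep learning libraries, it computes a cross-correlation (the kernel is not flipped).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kernel_h)
                for j in range(kernel_w)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 3x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Responds only where brightness increases from left to right.
kernel = [[-1, 1]]

edges = convolve2d(image, kernel)
for row in edges:
    print(row)
```

The output is non-zero only in the column where dark meets bright, so the filter has turned twelve pixel values into a compact description of where the vertical edge is.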

5. Training the Vision Model

For the practical phase, we were given a model to work with. Using the laptop’s webcam, we captured live images to build a dataset and trained the model to distinguish between two custom objects:

  • Object A: A hair claw clip.
  • Object B: A flower clip.

By presenting both objects to the camera at various angles and distances, we gave the network varied examples from which to adjust its classification layer. For each frame, the system computed a probability for each of the two categories and assigned the image to the more likely one.
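A common way to turn a network's raw output scores into a probability distribution is the softmax function. We do not know exactly how the provided model computed its probabilities, so the sketch below is an assumption, and the scores in the example are made up.

```python
import math

LABELS = ["hair claw clip", "flower clip"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most likely label and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return LABELS[best], probabilities[best]


# Made-up scores for one frame: the first class scores higher.
label, probability = classify([2.0, 0.0])
print(f"{label}: {probability:.2f}")
```

The larger the gap between the two scores, the closer the winning probability gets to 1.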

6. Real-Time Inference and Robot Actuation

Finally, we established a closed-loop system between the computer’s vision model and the robot’s physical movement.

The laptop executed continuous real-time inference on the webcam stream. Once the model classified the object with a confidence above a set threshold, the laptop transmitted a specific command to the robot:

  • If Object A (Hair Claw Clip) was detected, the robot received an instruction to execute a turn to the left.
  • If Object B (Flower Clip) was detected, the instruction commanded a turn to the right.
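The decision rule above can be sketched as a small lookup with a confidence check. The threshold value of 0.9 and the command strings are assumptions; the post does not record the actual threshold or the message format sent to the robot.

```python
CONFIDENCE_THRESHOLD = 0.9  # assumed value, not the one used in class

# Hypothetical command names for the two objects.
COMMANDS = {
    "hair claw clip": "TURN_LEFT",
    "flower clip": "TURN_RIGHT",
}


def command_for(label, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Return the robot command for a prediction, or None if it is not confident enough."""
    if confidence < threshold:
        return None  # ignore uncertain frames instead of moving the robot
    return COMMANDS.get(label)


print(command_for("hair claw clip", 0.95))
print(command_for("flower clip", 0.97))
print(command_for("flower clip", 0.60))
```

Ignoring low-confidence frames keeps the robot from twitching left and right while an object is only partly in view.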

This project brought the whole AI process together. We captured a live video feed, used our deep learning model to recognize the objects, and translated that digital ‘thought’ into a physical reaction from the robot.

This post is licensed under CC BY 4.0 by the author.