The second workshop in our MIST101 series, MIST101 #2 Supervised Learning and Neural Networks, was successfully held last Thursday. The workshop covered the major components of Supervised Learning problems and Neural Networks in depth, and we are glad to have received positive feedback from the audience.

During the workshop, we presented the Linear/Logistic Regression models and the Neural Network architecture, and introduced a learning algorithm called Gradient Descent, which searches for a minimum of a loss function so that the model improves as it learns. At the end, we summarized a typical training pipeline as follows:

1. Preprocess the data and split it into training (80%), validation (10%), and test (10%) sets.
2. Choose a model architecture and optimize the model on the training set, using a learning algorithm to minimize the loss function.
3. Evaluate and fine-tune the model on the validation set.
4. Repeat steps 2 and 3 until an optimized model is obtained.
5. Evaluate the model on the test set to get a final performance score.
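
As a rough illustration of the first step of this pipeline, the sketch below shuffles a data set and splits it 80/10/10. The function name `split_dataset` and its parameters are hypothetical and were not part of the workshop material; it uses only the Python standard library.

```python
import random

def split_dataset(samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle samples and split them into train/validation/test lists.

    Hypothetical helper for illustration only.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains (about 10%)
    return train, val, test

train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

The test set is kept aside until the very end so that the final score is not influenced by the tuning done on the validation set.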

(Slide 60)

**Introduction to Neural Networks**

Supervised learning constructs the model that best approximates the underlying target function from given input/output pairs. Supervised learning problems are typically categorized into two types: Regression Problems, which predict continuous values, and Classification Problems, which predict discrete labels.

A loss function measures how far a model's predictions are from the given data set and serves as the learning objective to be minimized. It can take various forms, such as Mean Squared Error (MSE) and Cross-entropy, depending on the task at hand and the modeling principle (MLE, MAP, Bayesian).
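
To make the two loss functions concrete, here is a minimal sketch in plain Python. The function names are our own choices for illustration, and the cross-entropy shown is the binary (two-class) form; the `eps` clipping is an assumption added to avoid taking the logarithm of zero.

```python
import math

def mean_squared_error(targets, predictions):
    """Average of squared differences; commonly used for regression."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

def binary_cross_entropy(labels, probabilities, eps=1e-12):
    """Average negative log-likelihood of 0/1 labels; commonly used for classification."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

mse_value = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
bce_value = binary_cross_entropy([1, 0], [0.5, 0.5])
print(mse_value, bce_value)
```

Both functions return a smaller value when the predictions are closer to the targets, which is what lets them serve as minimization objectives.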

**Artificial Neural Network Models**

A computational graph consists of nodes, which act as functions, and edges, which carry the inputs and outputs of those functions through the neural network. A Linear Regression Model (in standard notation, ŷ = wᵀx + b) and a Logistic Regression Model (ŷ = σ(wᵀx + b), where σ is a nonlinear function, the sigmoid) were shown and explained in detail.
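
The two models can be sketched in a few lines of Python. This assumes the standard textbook forms, with a weight vector `w`, a bias `b`, and the sigmoid as the nonlinear function σ; these names are our assumptions, and the exact notation on the slides may differ.

```python
import math

def sigmoid(z):
    """Numerically stable sigmoid: maps any real number into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear_regression(x, w, b):
    """Weighted sum of the inputs plus a bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def logistic_regression(x, w, b):
    """Linear model followed by the sigmoid, giving a probability-like output."""
    return sigmoid(linear_regression(x, w, b))

linear_output = linear_regression([1.0, 2.0], [0.5, -1.0], 0.5)
logistic_output = logistic_regression([1.0, 2.0], [0.5, -1.0], 1.5)
print(linear_output, logistic_output)
```

In computational-graph terms, the weighted sum and the sigmoid are nodes, and the values passed between them are edges.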

Three major types of Artificial Neural Networks are Feed-forward Neural Networks (FNN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). This workshop covered the fully-connected FNN in depth. An FNN consists of layers of neurons connected without cycles, and each neuron encapsulates a linear transformation followed by a nonlinear activation. CNN and RNN will be introduced in workshops #3 and #4, respectively.
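
A forward pass through a small fully-connected FNN might look like the sketch below. The layer sizes, weights, and the choice of ReLU as the activation are made up for illustration and are not taken from the workshop.

```python
def relu(z):
    """A common nonlinear activation: zero for negative inputs."""
    return max(0.0, z)

def identity(z):
    """No activation; often used on a regression output layer."""
    return z

def dense_layer(inputs, weights, biases, activation):
    """Each neuron applies a linear transformation, then the activation."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Feed the input through the layers in order; there are no cycles."""
    for weights, biases, activation in layers:
        x = dense_layer(x, weights, biases, activation)
    return x

# Illustrative network: 2 inputs -> 2 hidden neurons (ReLU) -> 1 output.
example_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0], relu),
    ([[1.0, 2.0]], [0.1], identity),
]
network_output = forward([2.0, 1.0], example_layers)
print(network_output)
```

Each row of a weight matrix holds the incoming weights of one neuron, so "fully-connected" here means every neuron sees every output of the previous layer.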

**Gradient Descent**

The gradient is the multi-variable generalization of the derivative and points in the direction of the greatest rate of increase of a function. Gradient Descent therefore moves the current point in the opposite direction, stepping downhill toward a valley of the loss function to find a local (ideally global) minimum. The gradients on a computational graph are usually computed with a method called Back-propagation. Batch GD sums the loss across the whole training set, while Stochastic/Mini-batch GD accumulates the loss on only one randomly chosen training sample or on a randomly chosen mini-batch of samples. The latter is normally considered standard practice in applications. The concepts of Momentum and Adaptive Learning Rate were also introduced as ways to augment Gradient Descent methods.
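
The following is a minimal sketch of Mini-batch Gradient Descent fitting a one-variable linear model with an MSE loss. The gradients are written out by hand rather than computed by Back-propagation, and the learning rate, batch size, epoch count, and toy data are assumptions chosen only so the example converges.

```python
import random

def minibatch_gd(xs, ys, lr=0.05, batch_size=4, epochs=200, seed=0):
    """Fit y ~ w * x + b by mini-batch gradient descent on the MSE loss."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # pick mini-batches at random each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                # Derivatives of the batch-averaged squared error.
                grad_w += 2.0 * error * xs[i] / len(batch)
                grad_b += 2.0 * error / len(batch)
            # Step against the gradient, i.e. downhill on the loss.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (no noise).
toy_xs = [i / 10 for i in range(20)]
toy_ys = [2.0 * x + 1.0 for x in toy_xs]
fitted_w, fitted_b = minibatch_gd(toy_xs, toy_ys)
print(fitted_w, fitted_b)
```

Setting `batch_size` to the size of the data set turns this into Batch GD, and setting it to 1 gives Stochastic GD; Momentum and adaptive learning rates would change only the two update lines.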

**Model Evaluation**

Lastly, when we evaluate the model, we may find that it is not powerful enough to learn the data, or that it is so powerful that it fits the training data too closely and generalizes poorly. These scenarios are called Underfitting and Overfitting, respectively. To improve the model, we can tune the neural network architecture, the training schedule, the model regularization, and so on. After many iterations of evaluation and tuning, we can settle on the model that meets our needs.
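
One simple way to tell the two scenarios apart is to compare the training loss with the validation loss. The heuristic below is only a sketch: the function name and both thresholds are hypothetical and would need to be chosen for each problem.

```python
def diagnose_fit(train_loss, val_loss, acceptable_loss, max_gap):
    """Rough heuristic with user-chosen (hypothetical) thresholds.

    - High training loss suggests the model cannot fit the data (underfitting).
    - Low training loss with a much higher validation loss suggests the model
      has memorized the training set (overfitting).
    """
    if train_loss > acceptable_loss:
        return "underfitting"
    if val_loss - train_loss > max_gap:
        return "overfitting"
    return "ok"

print(diagnose_fit(train_loss=0.90, val_loss=0.95, acceptable_loss=0.30, max_gap=0.10))
print(diagnose_fit(train_loss=0.05, val_loss=0.60, acceptable_loss=0.30, max_gap=0.10))
print(diagnose_fit(train_loss=0.20, val_loss=0.25, acceptable_loss=0.30, max_gap=0.10))
```

An "underfitting" result points toward a larger model or longer training, while "overfitting" points toward more regularization or a simpler model.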

**Hands-on TensorFlow Tutorial**

A hands-on tutorial on TensorFlow was given after the lecture session, in which some simple, representative examples were demonstrated. You can gain some hands-on experience yourself by reading the quick tutorial at the following link: https://github.com/ColinQiyangLi/MIST101

Thank you all for coming to our workshop #2. We hope you have gained some insight into supervised learning and Neural Networks! MIST101 #3 will be held from 7:00 to 9:00 pm on October 12th in GB119 and will cover Convolutional Neural Networks (CNN). CNNs are especially effective at image processing, and they are what enable machines to achieve unprecedented success, from distinguishing simple images of cats and dogs to demonstrating super-human performance in object recognition, segmentation, and more.

If you would like to learn more about workshop #2, the slides can be found at this link: https://docs.google.com/presentation/d/1guDvX8jy461qH8SmtdOYj_2BU76QHcugW32MQfwXhQU/edit?usp=sharing