Ricardo de Castro

Technische Universität München

Lehrstuhl für Elektrische Antriebssysteme

Neural Networks: Introduction & Matlab Examples

Agenda

• Introduction & Motivation

• Single-Layer Neural Network

• Fundamentals: neuron, activation function and layer

• Matlab example: constructing & evaluating NN

• Learning algorithms

• Batch solution: least-squares

• Online solution: LMS

• Matlab example: online system identification with NN

• Multi-Layer Neural Network

• Network architecture

• Learning algorithm: backpropagation

• Matlab example: nonlinear fitting with noise

• Overfitting & regularization

• Case Study

• Matlab example: MPC solution via Neural Networks

References

[1] Hagan et al., Neural Network Design, 2nd edition, 2014. Online version: https://hagan.okstate.edu/nnd.html

[2] Abu-Mostafa et al., Learning from Data, a Short Course, 2012.

[3] MathWorks, Neural Network Toolbox User's Guide, 2017 (aka Deep Learning Toolbox), Chapters 2, 3, 10 and 11.

Some Problems…

• Computer vision

• Medical diagnosis
  - inputs: symptoms (fever, headache, blood pressure, …), biochemical analysis, age, height, …
  - output: diagnosis

• Finance (e.g. prediction)
  - inputs: stockPrice[k], stockPrice[k-1], …, stockPrice[k-N]
  - output: stockPrice[k+1]

• Control (e.g. prediction / system identification)
  - inputs: u[k], u[k-1], …, u[k-N], y[k], y[k-1], …, y[k-M]
  - output: y[k+1]
  - u = control input, y = output, k = time index

How to build a system that can learn these tasks?

Motivation Example: Credit Approval

Adapted from “Learning from Data, a Short Course” [2]

[Figure: application information mapped to a decision, credit approved or credit rejected]

Formalism (learning framework and its credit-approval interpretation):

• Input: $p = [p_1, p_2, p_3, \dots] \in \mathbb{R}^R$ (customer application)
• Output: $y \in \{0, 1\}$ (0 = credit rejected, 1 = credit approved)
• Ideal function: $y = h(p)$ ($h$ = ideal credit approval function)
• Data: $(p_1, y_1), (p_2, y_2), \dots, (p_N, y_N)$ (historical records)
• Candidate model: $\hat{y} = f(p)$ (formula to be learned from data, e.g. a neural network)

Remarks:

• The ideal (credit approval) function is unknown,
• … but banks have massive amounts of data (customer info, default history, etc.)

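Learning Framework (II)

[Diagram: unknown function, data set, candidate model and learning algorithm]

• Unknown function: $y = h(p)$, with input $p$ and output $y$
• Data set: $\mathcal{D} = \{(p_1, y_1), \dots, (p_N, y_N)\}$, where $(p_i, y_i)$ is a data point
• Candidate model (e.g. neural net): $\hat{y} = f(p, \theta)$, with model output $\hat{y}$ and parameters $\theta$, such that $f(p, \theta) \approx y$
• Learning algorithm: adapt $\theta$ s.t. $y \approx \hat{y}$

Main goal: learn $f$ (i.e. $f \approx h$) using data $\mathcal{D}$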

Classification vs Regression

Classification problems
• Output takes discrete categories, e.g. $y \in \{0, 1, \dots, 9\}$

Regression problems
• Output is a continuous variable, e.g. $y \in [0, \infty)$

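Neuron (Single Input)

Input
• $p \in \mathbb{R}$

Output
• $a \in \mathbb{R}$

Parameters
• weight: $w \in \mathbb{R}$
• bias: $b \in \mathbb{R}$

Net input
• $n = wp + b$, $n \in \mathbb{R}$

Activation/transfer function
• $f: \mathbb{R} \to \mathbb{R}$

$a = f(n) = f(wp + b)$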

Neuron: Activation Functions

• Linear (Matlab: purelin): $a = f(n) = n$
• Hard limit (Matlab: hardlim): $a = f(n) = 0$ if $n < 0$, $1$ if $n \ge 0$
• Log-sigmoid (Matlab: logsig): $a = f(n) = \dfrac{1}{1 + e^{-n}}$
• Hyperbolic tangent sigmoid (Matlab: tansig): $a = f(n) = \dfrac{e^{n} - e^{-n}}{e^{n} + e^{-n}}$

Neuron: Multiple Inputs

Number of inputs: $R$

Element-wise representation
• Inputs: $p_1, p_2, \dots, p_R$
• Weights: $w_{1,1}, w_{1,2}, \dots, w_{1,R}$
• Bias: $b$
• Net input: $n = w_{1,1} p_1 + w_{1,2} p_2 + \dots + w_{1,R} p_R + b$
• Output: $a = f(n)$

Vector representation
• Inputs: $p = [p_1 \;\; \dots \;\; p_R]^T$
• Weight vector: $W = [w_{1,1} \;\; w_{1,2} \;\; \dots \;\; w_{1,R}]$
• Bias: $b$
• Net input: $n = Wp + b$
• Output: $a = f(Wp + b)$

Single-Layer Neural Network (NN)

Element-wise representation

Number of inputs = $R$, number of outputs = $S$

$a_i = f(w_{i,1} p_1 + w_{i,2} p_2 + \dots + w_{i,R} p_R + b_i)$, $i = 1, \dots, S$

Notation: $w_{i,j}$ = weight connecting neuron $i$ with input $j$

Remark: each of the $R$ inputs is connected to each neuron through a weight $w_{i,j}$

Single-Layer Neural Network (NN)

Vector representation

$R$ = number of inputs, $S$ = number of outputs

$a = f(Wp + b)$, with $W \in \mathbb{R}^{S \times R}$ and $b \in \mathbb{R}^{S}$
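The vector form of a single layer can be evaluated for several input vectors at once. The sketch below is a hand-written illustration in base Matlab (no toolbox calls); the weights, biases and inputs are hypothetical values chosen only to show the matrix dimensions, and the log-sigmoid is implemented as an anonymous function rather than through the toolbox.

```matlab
% Hypothetical layer with R = 3 inputs and S = 2 neurons
W = [1 0 -1;
     2 1  0];          % S-by-R weight matrix
b = [0.5; -1];         % S-by-1 bias vector

% N = 2 input vectors stored column-wise (R-by-N)
P = [1 0;
     2 0;
     3 0];

% Log-sigmoid activation applied element-wise (hand-written, not the toolbox logsig)
f = @(n) 1 ./ (1 + exp(-n));

% Net input for every data point; repmat copies the bias across the N columns
N = size(P, 2);
Nnet = W*P + repmat(b, 1, N);   % S-by-N matrix of net inputs

A = f(Nnet);                    % S-by-N matrix of layer outputs
disp(A);
```

Each column of A is the layer output for the corresponding column of P, which matches the [R-by-N] input and [S-by-N] output convention used in the slides.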

Create and Evaluate Single-Layer NN (I)

Matlab Code

P = [1; 2]; % data set - input (R=2)
Y = 1;      % data set - output (S=1)

net = linearlayer;
% Creates a single layer of linear neurons
% OUTPUT: net = neural network object

net = configure(net, P, Y);
% Configures network dimensions (inputs, outputs, weights,
% bias, etc.) to match the size of the data set (P,Y)
% INPUT
%   net = neural network object
%   P = [R-by-N] matrix with data set (input)
%   Y = [S-by-N] matrix with data set (output)
% OUTPUT
%   net = configured neural network object

Create and Evaluate Single-Layer NN (II)

Matlab Code

% net = neural network object

% define activation function
net.layers{1}.transferFcn = 'hardlim';

% define W, b
net.IW{1,1} = W; % set weights W
net.b{1} = b;    % set bias b

% evaluate NN for input(s) P
A = sim(net, P)
% Simulates neural network
% INPUT
%   P = [R-by-N] input data
% OUTPUT
%   A = [S-by-N] neural network output

Matlab Example

Consider a two-input network with one neuron and the following parameters:

$W = [3 \;\; 2]$, $b = 1.2$, $p = [-1 \;\; 1]^T$

Write a Matlab script that computes the output of the network for the following activation functions:

a) Hard limit
b) Linear
c) Log-sigmoid

Answers (net input $n = Wp + b = 0.2$):

a) 1.000
b) 0.200
c) 0.550
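One way to work through this example without the toolbox is to compute the net input by hand and apply each activation function directly. The sketch below assumes the parameters W = [3 2], b = 1.2 and p = [-1; 1]; the function handles hardlimFcn, purelinFcn, logsigFcn and tansigFcn are hypothetical names for hand-written versions of the activation functions, chosen so that they do not shadow the toolbox functions. The hyperbolic tangent sigmoid is included only for comparison.

```matlab
% Network parameters (one neuron, two inputs)
W = [3 2];
b = 1.2;
p = [-1; 1];

% Net input n = W*p + b
n = W*p + b;

% Hand-written activation functions (hypothetical names, not the toolbox ones)
hardlimFcn = @(n) double(n >= 0);
purelinFcn = @(n) n;
logsigFcn  = @(n) 1 ./ (1 + exp(-n));
tansigFcn  = @(n) (exp(n) - exp(-n)) ./ (exp(n) + exp(-n));

% Neuron output for each activation function
a_hardlim = hardlimFcn(n);
a_purelin = purelinFcn(n);
a_logsig  = logsigFcn(n);
a_tansig  = tansigFcn(n);

fprintf('net input:   %.3f\n', n);
fprintf('hard limit:  %.3f\n', a_hardlim);
fprintf('linear:      %.3f\n', a_purelin);
fprintf('log-sigmoid: %.3f\n', a_logsig);
fprintf('tan-sigmoid: %.3f\n', a_tansig);
```

The same values should be obtained with the toolbox route shown earlier, i.e. by setting net.layers{1}.transferFcn and calling sim(net, p).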

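Learning Algorithm: Problem Setup

Data set: $\mathcal{D} = \{(p_1, y_1), \dots, (p_N, y_N)\}$, with vector input $p_i \in \mathbb{R}^R$ and scalar output $y_i \in \mathbb{R}$

Single-layer linear NN: $\hat{y} = Wp + b$, with parameters $\theta = [W \;\; b]^T$

Error function: $E(\theta) = \frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{y}_i(\theta))^2$, where $\hat{y}_i(\theta) = W p_i + b$

Problem: $\min_\theta E(\theta)$

$\theta^* = [W^* \;\; b^*]^T$ = optimal NN parameters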

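Learning Algorithm: Batch Solution

Problem: $\min_\theta E(\theta) = \min_\theta \frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{y}_i(\theta))^2$, with $\theta = [W \;\; b]^T$ and $\hat{y}_i = W p_i + b = [p_i^T \;\; 1]\,\theta$

Solution 1: least-squares solution

A. Vector representation

$\min_\theta E(\theta) = \min_\theta \frac{1}{2} (Y - X\theta)^T (Y - X\theta)$, with $Y = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}$, $X = \begin{bmatrix} p_1^T & 1 \\ p_2^T & 1 \\ \vdots & \vdots \\ p_N^T & 1 \end{bmatrix}$

B. First-order optimality condition

$\nabla E(\theta) = 0 \;\Rightarrow\; \theta^* = (X^T X)^{-1} X^T Y$ (assuming $X^T X$ is invertible)

Problem: the solution might be costly to compute for large data sets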
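The batch solution can be tried on a small synthetic problem. The sketch below is an illustration under stated assumptions: the data are hypothetical and noise-free, generated from a known linear map so that the recovered weights can be checked, and the regressor matrix is built by stacking one row [p_i' 1] per data point. Matlab's backslash operator is used to solve the least-squares problem instead of forming the normal equations explicitly.

```matlab
% Hypothetical noise-free data from y = 2*p1 - 3*p2 + 0.5 (R = 2 inputs)
N = 20;
P = [linspace(-1, 1, N);          % R-by-N inputs, one data point per column
     linspace(0, 1, N).^2];
Y = [2 -3]*P + 0.5;               % 1-by-N outputs

% Regressor matrix: one row [p_i' 1] per data point (N-by-(R+1))
X = [P' ones(N, 1)];

% Least-squares solution; backslash minimizes ||X*theta - Y'||^2
theta = X \ Y';

W = theta(1:end-1)';   % recovered weights (1-by-R)
b = theta(end);        % recovered bias

fprintf('W = [%.4f %.4f], b = %.4f\n', W(1), W(2), b);
```

With noisy data the same two lines (building X and solving with backslash) apply unchanged; only the recovered parameters are then approximate.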

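Learning Algorithm: On-line Solution

Problem: $\min_\theta E(\theta) = \min_\theta \frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{y}_i(\theta))^2$

Solution 2: aka least-mean squares (LMS) algorithm

A. Split error function

$\min_\theta E(\theta) = \min_\theta \sum_{i=1}^{N} E_i(\theta)$, with $E_i(\theta) = \frac{1}{2}(y_i - \hat{y}_i)^2$ and $\hat{y}_i = [p_i^T \;\; 1]\,\theta$

B. Sequential gradient descent

$\theta(k+1) = \theta(k) - \alpha \, \nabla E_k(\theta)\big|_{\theta = \theta(k)}$, where $\alpha$ is the learning rate

C. Compute gradient

$\nabla E_k(\theta) = -(y_k - \hat{y}_k) \begin{bmatrix} p_k \\ 1 \end{bmatrix}$

$\theta(k+1) = \theta(k) + \alpha \left( y_k - [p_k^T \;\; 1]\,\theta(k) \right) \begin{bmatrix} p_k \\ 1 \end{bmatrix}$

Remarks:

• data points are processed one at a time (useful for real-time applications)
• the learning rate needs to be chosen with care to ensure convergence of the algorithm [Hagan, Chap. 10]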
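A single update of the LMS rule can be checked by hand. The sketch below applies one step to a scalar-input neuron; the data point, the initial parameters and the learning rate are hypothetical values picked so that the arithmetic is easy to follow.

```matlab
% Hypothetical scalar example: parameters theta = [w; b], one data point (p, y)
theta = [0; 0];
alpha = 0.1;           % learning rate
p = 2;
y = 3;

z = [p; 1];            % input augmented with 1 for the bias

% Prediction error before the update
e_before = y - z'*theta;

% One LMS step: theta <- theta + alpha * error * [p; 1]
theta = theta + alpha*e_before*z;

% Prediction error on the same data point after the update
e_after = y - z'*theta;

fprintf('error before: %.2f, error after: %.2f\n', e_before, e_after);
```

For a single data point the error is scaled by (1 - alpha*z'*z) at each step, which is one way to see why the learning rate has to be small enough: here z'*z = 5, so the error halves.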

On-line Learning Algorithm: Matlab API

Matlab Code

% net = neural network object
net.IW{1,1} = …; % set initial weights
net.b{1} = …;    % set initial bias

% define learning algorithm (Widrow-Hoff weight/bias learning = LMS)
net.inputWeights{1,1}.learnFcn = 'learnwh';
net.biases{1}.learnFcn = 'learnwh';

% define learning rate (alpha)
net.inputWeights{1,1}.learnParam.lr = alpha;
net.biases{1}.learnParam.lr = alpha;

% set sequential/online training
net.trainFcn = 'trains';

% apply 1 step of the LMS algorithm
[net] = adapt(net, p, y);
% INPUT
%   p = [R-by-1] data point - input
%   y = [S-by-1] data point - output
% OUTPUT
%   net = updated neural network object (with new weights and bias)

Remark: simplified API for function adapt()

Online System Identification via NN (I)

[Diagram: plant with control input and output; neural network with NN output; learning algorithm adapting $W, b$]

Online System Identification via NN (II)

Consider the following discrete-time model of a plant

$y[k] = u[k] + 1.8\,u[k-1] + 0.9\,u[k-2]$

Goal: design and train a single-layer NN capable of approximating the output of this plant

Proposed structure for the NN

• Single layer
• Inputs: $p = [u[k] \;\; u[k-1] \;\; u[k-2]]^T$
• Activation function: linear

Write a Matlab script that:

a) generates a data set for the problem
   - assume: � � = sin(2��), �� = 1�, � = 0.02��, � = 1,2,3, … , 150
b) plots the data set
c) creates the NN and defines the initial weights and bias: $W = [0 \;\; {-10} \;\; 0]$, $b = -2$
d) adapts the weights and bias of the NN in order to approximate the plant’s output $y[k]$
   - option 1: via adapt(.); option 2: manual implementation of the LMS algorithm
   - learning rate $\alpha = 0.2$
e) plots i) the plant’s output and the NN’s output; ii) the difference between the plant’s output and the NN’s output
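A manual LMS implementation for this identification task could look like the sketch below. It is not the reference solution: the input signal u[k] = sin(2*pi*0.02*k) is an assumed sinusoid, since the exact signal specification of the exercise is not reproduced here, and zero initial conditions are assumed for the delayed inputs. The plant coefficients, the initial weights [0 -10 0], the initial bias -2 and the learning rate 0.2 are taken from the exercise. Because the input signal is assumed, the final weights need not match the values reported in the results.

```matlab
% Input signal: ASSUMED sinusoid, sampled at k = 1..150
N = 150;
k = 1:N;
u = sin(2*pi*0.02*k);

% Delayed inputs, assuming zero initial conditions (u[0] = u[-1] = 0)
u1 = [0, u(1:end-1)];        % u[k-1]
u2 = [0, 0, u(1:end-2)];     % u[k-2]

% Plant output y[k] = u[k] + 1.8 u[k-1] + 0.9 u[k-2]
y = u + 1.8*u1 + 0.9*u2;

% NN inputs p = [u[k]; u[k-1]; u[k-2]], one column per time step
P = [u; u1; u2];

% Initial weights/bias and learning rate from the exercise
W = [0 -10 0];
b = -2;
alpha = 0.2;

yhat = zeros(1, N);
for i = 1:N
    p = P(:, i);
    yhat(i) = W*p + b;       % NN prediction before the update
    e = y(i) - yhat(i);      % prediction error
    W = W + alpha*e*p';      % LMS weight update
    b = b + alpha*e;         % LMS bias update
end

err = y - yhat;              % error signal, e.g. for plot(k, err)
fprintf('W = [%.3f %.3f %.3f], b = %.4f\n', W(1), W(2), W(3), b);
```

The prediction is computed before each update, so err shows how the error evolves while the network is still learning. A single sinusoid does not excite all parameter directions, so the weights may settle away from the plant coefficients even when the prediction error becomes small.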

Online System Identification via NN (III)

Results:

a) Training data [Figure: input signals, including u[k-1] and u[k-2], plotted for k = 0 to 150]

b) Fitting results [Figure: y (true) and y (NN prediction), plotted for k = 0 to 150]

Final weights/bias: $W = [6.23 \;\; {-5.06} \;\; 3.58]$, $b = 0.0883$

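Multi-Layer Neural Network (I)

Example: 2-layer NN, element-wise representation

$R$ = number of inputs

Layer 1
• Number of neurons: $S^1$
• Outputs: $a^1_1, a^1_2, \dots, a^1_{S^1}$
• Weights: $w^1_{i,j}$, $i = 1, \dots, S^1$; $j = 1, \dots, R$
• Bias: $b^1_i$, $i = 1, \dots, S^1$
• Activation function: $f^1$

Layer 2
• Number of neurons: $S^2$
• Outputs: $a^2_1, a^2_2, \dots, a^2_{S^2}$
• Weights: $w^2_{i,j}$, $i = 1, \dots, S^2$; $j = 1, \dots, S^1$
• Bias: $b^2_i$, $i = 1, \dots, S^2$
• Activation function: $f^2$ (linear in this example)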

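Multi-Layer Neural Network (II)

Example: 2-layer NN, vector representation

Layer 1
• Number of neurons: $S^1$
• Outputs: $a^1$
• Weights: $W^1$
• Bias: $b^1$
• Activation function: $f^1$
• $a^1 = f^1(W^1 p + b^1)$

Layer 2
• Number of neurons: $S^2$
• Outputs: $a^2$
• Weights: $W^2$
• Bias: $b^2$
• $a^2 = W^2 a^1 + b^2$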

Multi-Layer Neural Network (III)

Example: 2-layer NN

$a^1 = f^1(W^1 p + b^1)$ (hidden layer), $a^2 = W^2 a^1 + b^2$ (output layer)

$a^2 = W^2 f^1(W^1 p + b^1) + b^2$

Remarks:

• Layer 1: hidden/input layer
• Layer 2: output layer
• $R$, $S^2$ defined by the data set (input and output dimensions)
• $S^1$, $f^1$, $f^2$ defined by the designer
• $W^1$, $b^1$, $W^2$, $b^2$: parameters defined by the learning algorithm
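The two-layer composition can be written as a single function handle in base Matlab. The sketch below uses a hypothetical network with R = 2 inputs, S1 = 3 hidden neurons with a hand-written log-sigmoid, and a linear output layer with S2 = 1; all numerical values are illustrative.

```matlab
% Hypothetical 2-layer network: R = 2 inputs, S1 = 3 hidden neurons, S2 = 1 output
W1 = [ 1.0 -1.0;
       0.5  2.0;
      -1.0  0.0];      % S1-by-R
b1 = [0; 0; 0];        % S1-by-1
W2 = [1 1 1];          % S2-by-S1
b2 = 0.5;              % S2-by-1

% Hidden-layer activation (hand-written log-sigmoid)
f1 = @(n) 1 ./ (1 + exp(-n));

% Network function: hidden layer followed by a linear output layer
nn = @(p) W2*f1(W1*p + b1) + b2;

% With zero input and zero hidden bias, every hidden neuron outputs 0.5
a2_zero = nn([0; 0]);

% Output for another (arbitrary) input vector
a2 = nn([1; -2]);

fprintf('output for p = [0;0]:  %.4f\n', a2_zero);
fprintf('output for p = [1;-2]: %.4f\n', a2);
```

Writing the network as one handle makes the dependence on the parameters explicit, which is the function that the learning algorithm has to differentiate.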

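Learning Algorithm: Problem Setup

Data set: $\mathcal{D} = \{(p_1, y_1), \dots, (p_N, y_N)\}$, with vector input $p_i \in \mathbb{R}^R$ and scalar output $y_i \in \mathbb{R}$

Two-layer NN: $\hat{y} = f(p, \theta) = W^2 f^1(W^1 p + b^1) + b^2$, with parameters $\theta = (W^1, b^1, W^2, b^2)$

Error function: $E(\theta) = \frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{y}_i(\theta))^2$, where $\hat{y}_i(\theta) = f(p_i, \theta)$

Problem: $\min_\theta E(\theta)$

$\theta^*$ = optimal NN parameters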

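Learning Algorithm: Backpropagation

Problem: $\min_\theta E(\theta) = \min_\theta \frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{y}_i(\theta))^2$

Solution:

A. Split error function

$\min_\theta E(\theta) = \min_\theta \sum_{i=1}^{N} E_i(\theta)$, with $E_i(\theta) = \frac{1}{2}(y_i - \hat{y}_i)^2$ and $\hat{y}_i = f(p_i, \theta)$

B. Sequential gradient descent

$\theta(k+1) = \theta(k) - \alpha \, \nabla E_k(\theta)\big|_{\theta = \theta(k)}$

C. Compute gradient

$\nabla E_k(\theta) = -(y_k - \hat{y}_k)\, \dfrac{\partial f(p_k, \theta)}{\partial \theta}$, with $f(p_k, \theta) = W^2 f^1(W^1 p_k + b^1) + b^2$

Backpropagation algorithm: compute the gradients of $f(p_k, \theta)$ using the chain rule (see [Hagan, Chap. 11] for details)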
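The chain-rule computation can be illustrated on a very small network. The sketch below is a hand-written example, not the toolbox implementation: it assumes a scalar input, two log-sigmoid hidden neurons and one linear output neuron, with hypothetical parameter values. The intermediate variables s1 and s2 are the derivatives of the error with respect to the net inputs of each layer (a sign and scaling convention chosen here for readability), and one gradient entry is compared against a central finite difference as a sanity check.

```matlab
% Hypothetical 1-2-1 network and one data point (p, y)
logsigFcn = @(n) 1 ./ (1 + exp(-n));
W1 = [0.5; -0.3];  b1 = [0.1; 0.2];    % hidden layer (S1 = 2, R = 1)
W2 = [1.0 -2.0];   b2 = 0.3;           % linear output layer (S2 = 1)
p = 0.7;  y = 1.5;  alpha = 0.1;

% Forward pass
a1   = logsigFcn(W1*p + b1);
yhat = W2*a1 + b2;
e    = y - yhat;

% Backward pass (chain rule) for E = 0.5*(y - yhat)^2
s2 = -e;                               % dE/dn2 (output layer is linear)
s1 = (a1 .* (1 - a1)) .* (W2' * s2);   % dE/dn1, using logsig' = a1.*(1 - a1)
gW2 = s2 * a1';   gb2 = s2;            % gradients w.r.t. W2, b2
gW1 = s1 * p';    gb1 = s1;            % gradients w.r.t. W1, b1

% Finite-difference check of dE/dW1(1)
E  = @(W1v) 0.5*(y - (W2*logsigFcn(W1v*p + b1) + b2))^2;
h  = 1e-6;
gnum = (E(W1 + [h; 0]) - E(W1 - [h; 0])) / (2*h);
fprintf('analytic: %.8f, numeric: %.8f\n', gW1(1), gnum);

% One sequential gradient-descent step
W1new = W1 - alpha*gW1;   b1new = b1 - alpha*gb1;
W2new = W2 - alpha*gW2;   b2new = b2 - alpha*gb2;
```

The finite-difference comparison is a common way to validate a manual gradient before using it in a training loop.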

Matlab API: Create a 2-layer NN

Matlab Code

P = [1; 2]; % data set - input (R=2)
Y = 1;      % data set - output (S2=1)

hiddensize = [10]; % number of neurons in the hidden layer (S1=10)
net = feedforwardnet(hiddensize);
% Creates a multi-layer neural network
% INPUT
%   hiddensize = row vector with one or more
%                hidden layer sizes
% OUTPUT: net = neural network object

net = configure(net, P, Y);
% Configures network dimensions (inputs, outputs, weights,
% bias, etc.) to match the size of the data set (P,Y)

Matlab API: Setup activation functions, W, b

Matlab Code

% define activation function, weight and bias of Layer 1
net.layers{1}.transferFcn = 'logsig';
net.IW{1,1} = …; % set W1
net.b{1} = …;    % set b1

% define activation function, weight and bias of Layer 2
net.layers{2}.transferFcn = 'purelin';
net.LW{2,1} = …; % set W2
net.b{2} = …;    % set b2

% Remark: net.b{i} = bias of layer i; net.LW{i,j} = weights connecting layer j to layer i

Matlab API: Training Algorithm

Matlab Code

net.trainFcn = 'traingd';   % Gradient descent backpropagation
% net.trainFcn = 'trainlm'; % Levenberg-Marquardt backpropagation (default)

net.trainParam.lr = 0.01;       % learning rate (alpha)
net.trainParam.epochs = 1000;   % maximum number of epochs to train
net.trainParam.goal = 0;        % performance goal (stop criterion): min acceptable value for E(theta)
net.trainParam.min_grad = 1e-5; % minimum performance gradient (stop criterion): min acceptable value for the gradient of E(theta)

Matlab API: Training Algorithm

Matlab Code

net.trainParam.showWindow = 0;      % deactivate interactive GUI
net.trainParam.showCommandLine = 1; % generate command-line output
net.divideFcn = '';                 % use entire data set for training

[net] = train(net, P, Y);
% train trains a network net according to net.trainFcn and net.trainParam.
% INPUT
%   P = [R-by-N] data set - input
%   Y = [S-by-N] data set - output
% OUTPUT
%   net = (trained) neural network object

Matlab Example: Nonlinear Fitting

Consider the following system

• input: $p \in [-1, 1]$
• output: $y \in \mathbb{R}$, with $y = h(p) + \varepsilon$
• function: $h(p) = 1 + \sin(\pi p)$
• random noise: $\varepsilon \sim \mathcal{N}(0, \sigma^2)$, Gaussian distribution with zero mean and variance $\sigma^2 = 0.2$

Write a Matlab script that:

a) generates an artificial data set for this problem, i.e. $\mathcal{D} = \{(p_i, y_i)\}$, where $p_i \in \{-1, -0.9, -0.8, \dots, 1\}$ is the input of the system
b) creates a neural network (NN) with 2 layers:
   - hidden layer with 10 neurons and activation function = logistic sigmoid
   - output layer with 1 neuron and activation function = linear
c) trains the weights of the NN to fit the data set $\mathcal{D}$
d) plots
   - the training results, i.e. the data set $\mathcal{D}$ and the output of the trained NN
   - the testing results, i.e. the data set $\mathcal{D}$, the (true) function and the trained NN evaluated in the domain $p \in \{-1, -0.99, -0.98, -0.97, \dots, 1\}$
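Step a) of this exercise can be sketched in a few lines of base Matlab. The block below is an illustration under assumptions: the target function is taken to be h(p) = 1 + sin(pi*p), the noise is drawn with randn and scaled to variance 0.2, and the variable names (P, Y, Ptest, Htest) are hypothetical. The resulting row vectors follow the [R-by-N] and [S-by-N] layout expected by train and sim.

```matlab
% Assumed target function and noise variance
h = @(p) 1 + sin(pi*p);
sigma2 = 0.2;

% Training inputs on the grid -1, -0.9, ..., 1 (21 points, 1-by-N)
P = linspace(-1, 1, 21);

% Noisy training outputs: y = h(p) + noise, noise ~ N(0, sigma2)
Y = h(P) + sqrt(sigma2)*randn(size(P));

% Finer grid -1, -0.99, ..., 1 for testing, with the noise-free function
Ptest = linspace(-1, 1, 201);
Htest = h(Ptest);

fprintf('%d training points, %d test points\n', numel(P), numel(Ptest));
```

Because the noise is random, each run produces a different data set; fixing the random seed before calling randn makes the experiment repeatable.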

Matlab Example: Nonlinear Fitting (II)

[Figure: training results] Perfect fitting of the training data

[Figure: testing results] We might be fitting noise instead of the true signal ($h$)

Overfitting

Problem: the previous example showed poor generalization, i.e.

• the NN fits the training data set very well,
• … but produces large errors when faced with new inputs

Solution*: Regularization

• Idea: penalize large values of the NN parameters (weights/bias)

Problem: $\min_\theta \; (1 - \lambda)\,E(\theta) + \lambda\,\|\theta\|^2$, with $\theta = (W^1, b^1, W^2, b^2, \dots)$, where $\lambda\,\|\theta\|^2$ is the regularization term

• $\lambda \in [0, 1]$ controls the importance of regularization vs fitting
• larger $\lambda$ → smoother fitting of the neural network

Matlab Code

net.performParam.regularization = 0.1; % sets value of lambda

*Additional solutions: early stopping, reducing the number of neurons (see [Hagan, Chap. 13] for details)
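The trade-off controlled by the regularization parameter can be made concrete with a small calculation. The sketch below evaluates a weighted sum of a fitting term and a parameter-size term for hypothetical error and parameter vectors; using mean squared errors and mean squared parameters for the two terms is an assumption made for this illustration, and regPerf is a hypothetical helper, not a toolbox function.

```matlab
% Hypothetical regularized performance: (1 - lambda)*fit + lambda*penalty
regPerf = @(e, theta, lambda) (1 - lambda)*mean(e.^2) + lambda*mean(theta.^2);

% Hypothetical prediction errors and network parameters
e     = [0.1 -0.2 0.05];
theta = [3 -4 0.5];

J0 = regPerf(e, theta, 0);     % lambda = 0: pure fitting error
J1 = regPerf(e, theta, 0.1);   % lambda = 0.1: large parameters are penalized

fprintf('lambda = 0:   %.4f\n', J0);
fprintf('lambda = 0.1: %.4f\n', J1);
```

With large parameter values the penalty dominates even for a small lambda, which is why the optimizer is pushed towards smaller weights and, in turn, smoother fits.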

Matlab Example: Nonlinear Fitting (III)

(Continuation of previous problem)

• input: $p \in [-1, 1]$
• function: $h(p) = 1 + \sin(\pi p)$
• random noise: $\varepsilon \sim \mathcal{N}(0, \sigma^2)$, Gaussian distribution with zero mean and variance $\sigma^2 = 0.2$

e) adapt the previous Matlab script to train the neural network with regularization
   • use � = 10��
f) plot the training and testing results for both cases

Matlab Example: Nonlinear Fitting (IV)

� = 0

� = 10��

Matlab API: Auxiliary Commands

Matlab Code

% net = neural network object

gensim(net); % generation of a Simulink block for neural network simulation

view(net);   % generation of a graphical view

MPC-NN: Motivation

Consider the following discrete-time model of a mechanical system

Previous lecture: the MPC Toolbox was employed to design a controller that brings the state of this system to the origin while fulfilling input and state constraints.

Issue: the computational time of the MPC is too high due to the numerical optimization routines

Task: design a NN that learns the MPC’s control law, enabling us to quickly compute the optimal control action

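MPC-NN: Design Approach

1) Design MPC controller [Diagram: MPC and plant in closed loop, with control input and states]

2) Design NN [Diagram: neural network whose parameters are set by a training algorithm]

3) Replace MPC with NN [Diagram: neural network and plant in closed loop, with control input and states]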

MPC-NN: Matlab Example

Write a Matlab script that:

a) generates the data set* where …
b) plots the training data
c) trains a neural network to learn the MPC's control law using the following settings
   - 2 layers
   - hidden layer: 20 neurons, 'logsig' activation function
   - output layer: 1 neuron, linear activation function
d) plots the neural network’s fitting of the training data
e) exports the trained network to Simulink and simulates the closed-loop response of the plant with initial state …

*Matlab implementation of MPC controller: i) previous lecture; or ii) https://tinyurl.com/tsfvdyp

MPC-NN: Matlab Example (II)

b) training data d) NN fitting

MPC-NN: Matlab Example (III)

e) Plant response with NN control

Summary

Introduction

• Data sets, learning algorithms, candidate models

Single-Layer Neural Network

• Fundamentals: neuron, activation function, layer

• Matlab example: constructing & evaluating NN

• Learning algorithms

• Batch solution: least-squares

• Online solution: LMS

Multi-Layer Neural Network

• Network architecture

• Learning algorithm: backpropagation

• Overfitting & regularization

Key Matlab Functions

linearlayer()

configure()

net.layers{i}.transferFcn = ...

adapt()

feedforwardnet()

train()

net.performParam.regularization = …

Homework

Consider the following system

• input: � = �� ∈ −1,1 × [−1,1]
• function: ℎ � = 10�� + 3�� + 7��
• random noise: $\varepsilon \sim \mathcal{N}(0, \sigma^2)$, $\sigma^2 = 2$

Write a Matlab script that:

a) generates an artificial data set for this problem, i.e. $\mathcal{D} = \{(p_i, y_i)\}$, where $p_i$ is an input sampled from the grid $\{-1, -0.9, \dots, 1\} \times \{-1, -0.9, \dots, 1\}$
b) plots the data set (hint: use plot3)
c) creates a neural network (NN) with 2 layers:
   - hidden layer with 50 neurons and activation function = hyperbolic tangent sigmoid
   - output layer with 1 neuron and activation function = linear
d) trains the weights and bias of the NN to fit the data set $\mathcal{D}$
   - use all data for training; number of epochs = 2000
e) plots the training results, i.e. the data set $\mathcal{D}$ and the output of the trained NN
f) adds a regularization term to the training ($\lambda = 0.5$) and plots the results

Homework: Expected Results

[Figure: fitting results for $\lambda = 0.0$ and $\lambda = 0.5$]
