19 posts in the 'Deep Learning/Keras' category

  1. 2021.08.14 [IoU added] How to get accuracy, F1, precision and recall, iou, for a keras model?
  2. 2021.03.01 #kerasexamples #allexamples I visited https://keras.io/examples/ and there are a huge number of examples there. Know
  3. 2020.07.06 Deep Learning with Keras Series By Ali Masri 1. Deep Learning with Keras Tutoria
  4. 2019.08.21 [Keras] GPU sync failed,
  5. 2018.01.04 www.learnopencv.com Keras Tutorial : Transfer Learning using pre-trained models
  6. 2017.12.13 The Machine Learning 101 and Deep Learning 101 courses released for free by Intel AI Academy. Covering everything from regression/classification to CNNs and RNNs, they overlap with Prof. Sung Kim's lectures
  7. 2017.12.13 How to Visualize a Deep Learning Neural Network Model in Keras https://machinelearningmastery.com/visualize-deep-learning-neural-network-model-keras/
  8. 2017.11.25 TensorFlow Speech Recognition - Kaggle competition keras
  9. 2017.11.22 An Introduction to different Types of Convolutions in Deep Learning
  10. 2017.10.27 How to Use the Keras Functional API for Deep Learning
  11. 2017.10.18 Keras Tutorial: The Ultimate Beginner's Guide to Deep Learning in Python
  12. 2017.07.12 Keras course contents
  13. 2017.04.11 Trend Prediction using LSTM RNNs with Keras implementation (Tensorflow)
  14. 2017.04.10 My experiments with AlexNet, using Keras and Theano
  15. 2017.03.11 Directory of tutorials and open-source code repositories for working with Keras, the Python deep learning library
  16. 2017.02.28 Keras Tutorial: The Ultimate Beginner's Guide to Deep Learning in Python
  17. 2017.02.28 A good site for studying Keras (Theano)
  18. 2017.02.28 Keras resources
  19. 2016.11.28 Keras Tutorial: The Ultimate Beginner’s Guide to Deep Learning in Python

https://datascience.stackexchange.com/questions/45165/how-to-get-accuracy-f1-precision-and-recall-for-a-keras-model

The site above only computes accuracy, F1, precision, and recall, so I added code to compute IoU as well.

import tensorflow as tf

def iou_m(y_true, y_pred, dtype=tf.float32):
    # tf tensor casting
    y_pred = tf.convert_to_tensor(y_pred)
    y_pred = tf.cast(y_pred, dtype)
    y_true = tf.cast(y_true, y_pred.dtype)

    y_pred = tf.squeeze(y_pred)
    y_true = tf.squeeze(y_true)

    y_true_pos = tf.reshape(y_true, [-1])
    y_pred_pos = tf.reshape(y_pred, [-1])

    area_intersect = tf.reduce_sum(tf.multiply(y_true_pos, y_pred_pos))

    area_true = tf.reduce_sum(y_true_pos)
    area_pred = tf.reduce_sum(y_pred_pos)
    area_union = area_true + area_pred - area_intersect

    return tf.math.divide_no_nan(area_intersect, area_union)

-------------------------------------------------------------------------------------------

import tensorflow as tf
from keras import backend as K

def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def f1_m(y_true, y_pred):
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))

def iou_m(y_true, y_pred, dtype=tf.float32):
    # tf tensor casting
    y_pred = tf.convert_to_tensor(y_pred)
    y_pred = tf.cast(y_pred, dtype)
    y_true = tf.cast(y_true, y_pred.dtype)

    y_pred = tf.squeeze(y_pred)
    y_true = tf.squeeze(y_true)

    y_true_pos = tf.reshape(y_true, [-1])
    y_pred_pos = tf.reshape(y_pred, [-1])

    area_intersect = tf.reduce_sum(tf.multiply(y_true_pos, y_pred_pos))

    area_true = tf.reduce_sum(y_true_pos)
    area_pred = tf.reduce_sum(y_pred_pos)
    area_union = area_true + area_pred - area_intersect

    return tf.math.divide_no_nan(area_intersect, area_union)



# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc', f1_m, precision_m, recall_m, iou_m])

# fit the model
history = model.fit(Xtrain, ytrain, validation_split=0.3, epochs=10, verbose=0)

# evaluate the model
loss, accuracy, f1_score, precision, recall, iou = model.evaluate(Xtest, ytest, verbose=0)
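
A quick sanity check of iou_m on toy binary masks (a sketch; the tensors are made up):

import numpy as np

# intersection = 2 pixels, union = 3 + 3 - 2 = 4 pixels -> IoU = 0.5
y_true = np.array([[1., 1., 0., 0.], [1., 0., 0., 0.]])
y_pred = np.array([[1., 0., 0., 1.], [1., 0., 0., 0.]])
print(float(iou_m(y_true, y_pred)))  # 0.5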

Posted by uniqueone

#kerasexamples #allexamples I went to https://keras.io/examples/ and there are a huge number of examples there, from Knowledge Distillation up to recent models like ViT and the Switch Transformer. (I think I saw an issue on Hugging Face a few days ago asking for a Switch Transformer implementation; third image.) These examples are well worth reading through one by one.

Watching https://youtu.be/Y2K13XDqwiM, which picks one of these examples and walks through it, I had an idea: like our TF-KR PR12, how about forming a team of about ten people called KR12 (Keras example Reading), where each person explains one example and also finds one or two places the example could be applied? With so much AI education going on these days, this would make good teaching material.

If you are interested in KR12, leave a comment below and I will organize teams and run KR12 like PR12. (Twelve people meet on Zoom, explain and discuss two or three examples a week, and publish the recordings.)

Posted by uniqueone

Deep Learning with Keras Series By Ali Masri
1. Deep Learning with Keras Tutorial https://www.marktechpost.com/2019/06/11/deep-learning-with-keras-tutorial-part-1/
2. Data Pre-processing for Deep Learning models https://www.marktechpost.com/2019/06/14/data-pre-processing-for-deep-learning-models-deep-learning-with-keras-part-2/
3. Regression with Keras https://www.marktechpost.com/2019/06/17/regression-with-keras-deep-learning-with-keras-part-3/
4. Classification https://www.marktechpost.com/2019/06/24/deep-learning-with-keras-part-4-classification/
5. Convolutional Neural Networks https://www.marktechpost.com/2019/07/04/deep-learning-with-keras-part-5-convolutional-neural-networks/
6. Textual Data Preprocessing https://www.marktechpost.com/2019/09/13/deep-learning-with-keras-part-6-textual-data-preprocessing/
7. Recurrent Neural Networks https://www.marktechpost.com/2019/10/01/deep-learning-with-keras-part-7-recurrent-neural-networks/

Posted by uniqueone

1. https://m.blog.naver.com/wideeyed/221329619056

When writing GPU-based Keras code, you may run into error messages like the following:

InternalError: Blas GEMM launch failed
CUDA_ERROR_OUT_OF_MEMORY
InternalError: GPU sync failed

The most likely cause is that another session is occupying the memory allocated on the GPU.
1) Stop the session occupying the memory and reclaim it.
2) Allow the backend engine Keras uses (e.g., TensorFlow) to allocate additional memory.

After fixing this, you must restart the session in which the error occurred;
otherwise "InternalError: GPU sync failed" may occur again.


[How to configure the TensorFlow backend engine]

import tensorflow as tf
from keras.backend import tensorflow_backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))
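
For reference, TensorFlow 2.x removed ConfigProto and Session; the equivalent setting there is per-GPU memory growth. A minimal sketch:

import tensorflow as tf

# Grow GPU memory on demand instead of grabbing it all up front.
# Must run before any GPU operation is executed.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)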



For the full source code, see the post below:

[Keras] DNN with the IRIS dataset
A simple DNN trained and evaluated on the IRIS data after splitting it into train/test sets... (blog.naver.com)

Done.

--------------------------------------------------------------------------------------

2. https://zereight.tistory.com/228

This is a GPU synchronization error.

Fix it by running the following code (TF 1.x):

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config)
session.close()

--------------------------------------------------------------------------------------

3. https://emmadeveloper.tistory.com/27

I got a "GPU sync failed" error.

I had confirmed that training started before going to bed, but the error was there in the morning. Checking, something else was already occupying about 50% of the dedicated GPU memory.

I shut the other processes down, but rerunning still failed; simply restarting the Jupyter notebook made everything work again.

From now on I will check the dedicated GPU memory usage before training and bring it down first.

--------------------------------------------------------------------------------------
4. https://stackoverflow.com/questions/51112126/gpu-sync-failed-while-using-tensorflow

TL;DR: If you find that TensorFlow is throwing a "GPU sync failed" error, it may be because the model's inputs are too large (as was my case when first running into this problem) or you don't have cuDNN installed properly. Verify that cuDNN is installed correctly, reset your NVIDIA caches (i.e., sudo rm -rf $HOME/.nv/) if you have not yet done so after initially installing CUDA and cuDNN, and restart your machine.

Posted by uniqueone

http://www.learnopencv.com/keras-tutorial-transfer-learning-using-pre-trained-models/


Posted by uniqueone
https://m.facebook.com/groups/255834461424286?view=permalink&id=572281386446257

Sharing an interesting site for those getting started with machine learning. It seems to have been around for a while, but I could not find it shared in Tensorflow KR before. Posting it in case it helps even one person.

These are the Machine Learning 101 and Deep Learning 101 courses released for free by Intel AI Academy. They cover everything from regression/classification to CNNs and RNNs, so they overlap a lot with Prof. Sung Kim's lectures. Going through them as review is not a bad idea.

They consist of PDFs, sample code, and a few videos, and are cleanly put together. The code uses Keras.

https://software.intel.com/en-us/ai-academy/students/kits
Posted by uniqueone
https://www.facebook.com/MachineLearningMastery/posts/1979228055625051

How to Visualize a Deep Learning Neural Network Model in Keras https://machinelearningmastery.com/visualize-deep-learning-neural-network-model-keras/
Posted by uniqueone
https://m.facebook.com/groups/107107546348803?view=permalink&id=532112330514987

TensorFlow Speech Recognition - Kaggle competition is going on. I wrote a basic tutorial on speech (word) recognition using some of the datasets from the competition.
.
Hope it will be helpful for some of you. Thanks in advance for reading!
.

https://blog.manash.me/building-a-dead-simple-word-recognition-engine-using-convnet-in-keras-25e72c19c12b
Posted by uniqueone

An Introduction to different Types of Convolutions in Deep Learning
https://towardsdatascience.com/types-of-convolutions-in-deep-learning-717013397f4d

By Paul-Louis Pröve (Artificial Intelligence @ PwC), Jul 22

Let me give you a quick overview of different types of convolutions and what their benefits are. For the sake of simplicity, I’m focussing on 2D convolutions only.
Convolutions
First we need to agree on a few parameters that define a convolutional layer.

2D convolution using a kernel size of 3, stride of 1 and padding
Kernel Size: The kernel size defines the field of view of the convolution. A common choice for 2D is 3 — that is 3x3 pixels.
Stride: The stride defines the step size of the kernel when traversing the image. While its default is usually 1, we can use a stride of 2 for downsampling an image similar to MaxPooling.
Padding: The padding defines how the border of a sample is handled. A (half) padded convolution will keep the spatial output dimensions equal to the input, whereas unpadded convolutions will crop away some of the borders if the kernel is larger than 1.
Input & Output Channels: A convolutional layer takes a certain number of input channels (I) and calculates a specific number of output channels (O). The needed parameters for such a layer can be calculated by I*O*K, where K equals the number of values in the kernel.
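As a quick check of the I*O*K rule, here is a sketch using tf.keras (note Keras also adds one bias per output channel on top of the I*O*K weights):

from tensorflow.keras import layers, models

# 16 input channels -> 32 output channels with a 3x3 kernel:
# weights = 16 * 32 * 3 * 3 = 4608, plus 32 biases = 4640.
m = models.Sequential([layers.Conv2D(32, (3, 3), input_shape=(64, 64, 16))])
m.summary()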
Dilated Convolutions
(a.k.a. atrous convolutions)

2D convolution using a 3 kernel with a dilation rate of 2 and no padding
Dilated convolutions introduce another parameter to convolutional layers called the dilation rate. This defines a spacing between the values in a kernel. A 3x3 kernel with a dilation rate of 2 will have the same field of view as a 5x5 kernel, while only using 9 parameters. Imagine taking a 5x5 kernel and deleting every second column and row.
This delivers a wider field of view at the same computational cost. Dilated convolutions are particularly popular in the field of real-time segmentation. Use them if you need a wide field of view and cannot afford multiple convolutions or larger kernels.
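
In Keras a dilated convolution is just the dilation_rate argument; a minimal sketch:

from tensorflow.keras import layers

# A 3x3 kernel with dilation rate 2: the field of view of a 5x5
# kernel, but still only 9 weights per input/output channel pair.
x = layers.Input(shape=(64, 64, 3))
y = layers.Conv2D(16, (3, 3), dilation_rate=2, padding='same')(x)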
Transposed Convolutions
(a.k.a. deconvolutions or fractionally strided convolutions)
Some sources use the name deconvolution, which is inappropriate because it's not a deconvolution. To make things worse, deconvolutions do exist, but they're not common in the field of deep learning. An actual deconvolution reverts the process of a convolution. Imagine inputting an image into a single convolutional layer. Now take the output, throw it into a black box and out comes your original image again. This black box does a deconvolution. It is the mathematical inverse of what a convolutional layer does.
A transposed convolution is somewhat similar because it produces the same spatial resolution a hypothetical deconvolutional layer would. However, the actual mathematical operation that’s being performed on the values is different. A transposed convolutional layer carries out a regular convolution but reverts its spatial transformation.

2D convolution with no padding, stride of 2 and kernel of 3
At this point you should be pretty confused, so let’s look at a concrete example. An image of 5x5 is fed into a convolutional layer. The stride is set to 2, the padding is deactivated and the kernel is 3x3. This results in a 2x2 image.
If we wanted to reverse this process, we’d need the inverse mathematical operation so that 9 values are generated from each pixel we input. Afterward, we traverse the output image with a stride of 2. This would be a deconvolution.

Transposed 2D convolution with no padding, stride of 2 and kernel of 3
A transposed convolution does not do that. The only thing the two have in common is that the output is guaranteed to also be a 5x5 image, while still performing a normal convolution operation. To achieve this, we need to perform some fancy padding on the input.
As you can imagine now, this step will not reverse the process from above. At least not concerning the numeric values.
It merely reconstructs the spatial resolution from before and performs a convolution. This may not be the mathematical inverse, but for Encoder-Decoder architectures, it’s still very helpful. This way we can combine the upscaling of an image with a convolution, instead of doing two separate processes.
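
Keras implements this as Conv2DTranspose; a sketch of the 2x2 -> 5x5 example above:

from tensorflow.keras import layers

# Reverts the spatial transform of a stride-2, kernel-3, unpadded
# convolution: output size = (2 - 1) * 2 + 3 = 5.
x = layers.Input(shape=(2, 2, 1))
y = layers.Conv2DTranspose(1, (3, 3), strides=2, padding='valid')(x)
print(y.shape)  # (None, 5, 5, 1)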
Separable Convolutions
In a separable convolution, we can split the kernel operation into multiple steps. Let’s express a convolution as y = conv(x, k) where y is the output image, x is the input image, and k is the kernel. Easy. Next, let’s assume k can be calculated by: k = k1.dot(k2). This would make it a separable convolution because instead of doing a 2D convolution with k, we could get to the same result by doing 2 1D convolutions with k1 and k2.

Sobel X and Y filters
Take the Sobel kernel for example, which is often used in image processing. You could get the same kernel by multiplying the vector [1, 0, -1] and [1,2,1].T. This would require 6 instead of 9 parameters while doing the same operation.
The example above shows what’s called a spatial separable convolution, which to my knowledge isn’t used in deep learning. I just wanted to make sure you don’t get confused when stumbling upon those. In neural networks, we commonly use something called a depthwise separable convolution.
This will perform a spatial convolution while keeping the channels separate and then follow with a depthwise convolution. In my opinion, it can be best understood with an example.
Let's say we have a 3x3 convolutional layer on 16 input channels and 32 output channels. What happens in detail is that each of the 16 channels is traversed by 32 3x3 kernels, resulting in 512 (16x32) feature maps. Next, we merge one feature map from every input channel by adding them up. Since we can do that 32 times, we get the 32 output channels we wanted.
For a depthwise separable convolution on the same example, we traverse the 16 channels with one 3x3 kernel each, giving us 16 feature maps. Now, before merging anything, we traverse these 16 feature maps with 32 1x1 convolutions each, and only then add them together. This results in 656 (16x3x3 + 16x32x1x1) parameters, as opposed to the 4608 (16x32x3x3) parameters from above.
The example is a specific implementation of a depthwise separable convolution where the so called depth multiplier is 1. This is by far the most common setup for such layers.
We do this because of the hypothesis that spatial and depthwise information can be decoupled. Looking at the performance of the Xception model this theory seems to work. Depthwise separable convolutions are also used for mobile devices because of their efficient use of parameters.
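
Keras ships this as SeparableConv2D; a sketch checking the parameter count of the example above (biases add another 32):

from tensorflow.keras import layers, models

# Depthwise 3x3 over 16 channels (16*3*3 = 144 weights), then 32
# pointwise 1x1 filters (16*32*1*1 = 512 weights), plus 32 biases.
m = models.Sequential([layers.SeparableConv2D(32, (3, 3), input_shape=(64, 64, 16))])
m.summary()  # 688 parameters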
Questions?
This concludes our little tour through different types of convolutions. I hope it helped to get a brief overview of the matter. Drop a comment if you have any remaining questions and check out this GitHub page for more convolution animations.
Posted by uniqueone

https://machinelearningmastery.com/keras-functional-api-deep-learning/

How to Use the Keras Functional API for Deep Learning

The Keras Python library makes creating deep learning models fast and easy.

The sequential API allows you to create models layer-by-layer for most problems. It is limited in that it does not allow you to create models that share layers or have multiple inputs or outputs.

The functional API in Keras is an alternate way of creating models that offers a lot more flexibility, including creating more complex models.

In this tutorial, you will discover how to use the more flexible functional API in Keras to define deep learning models.

After completing this tutorial, you will know:

  • The difference between the Sequential and Functional APIs.
  • How to define simple Multilayer Perceptron, Convolutional Neural Network, and Recurrent Neural Network models using the functional API.
  • How to define more complex models with shared layers and multiple inputs and outputs.

Let’s get started.

Tutorial Overview

This tutorial is divided into 6 parts; they are:

  1. Keras Sequential Models
  2. Keras Functional Models
  3. Standard Network Models
  4. Shared Layers Model
  5. Multiple Input and Output Models
  6. Best Practices

1. Keras Sequential Models

As a review, Keras provides a Sequential model API.

This is a way of creating deep learning models where an instance of the Sequential class is created and model layers are created and added to it.

For example, the layers can be defined and passed to the Sequential as an array:
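
A sketch (tiny layer sizes invented for illustration):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(2, input_dim=2, activation='relu'),
    Dense(1, activation='sigmoid'),
])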

Layers can also be added piecewise:
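
The same sketch, adding the layers one at a time:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(2, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))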

The Sequential model API is great for developing deep learning models in most situations, but it also has some limitations.

For example, it is not straightforward to define models that may have multiple different input sources, produce multiple output destinations or models that re-use layers.

2. Keras Functional Models

The Keras functional API provides a more flexible way for defining models.

It specifically allows you to define multiple input or output models as well as models that share layers. More than that, it allows you to define ad hoc acyclic network graphs.

Models are defined by creating instances of layers and connecting them directly to each other in pairs, then defining a Model that specifies the layers to act as the input and output to the model.

Let’s look at the three unique aspects of Keras functional API in turn:

1. Defining Input

Unlike the Sequential model, you must create and define a standalone Input layer that specifies the shape of input data.

The input layer takes a shape argument that is a tuple that indicates the dimensionality of the input data.

When input data is one-dimensional, such as for a multilayer Perceptron, the shape must explicitly leave room for the mini-batch dimension used when training the network. Therefore, the shape tuple is always defined with a hanging last dimension, for example (2,).
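
A minimal sketch of such a standalone input layer:

from keras.layers import Input

visible = Input(shape=(2,))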


2. Connecting Layers

The layers in the model are connected pairwise.

This is done by specifying where the input comes from when defining each new layer. A bracket notation is used: after the layer is created, the layer that provides its input is specified in brackets.

Let’s make this clear with a short example. We can create the input layer as above, then create a hidden layer as a Dense that receives input only from the input layer.
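
A sketch of that connection:

from keras.layers import Input, Dense

visible = Input(shape=(2,))
hidden = Dense(2)(visible)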

Note the (visible) after the creation of the Dense layer that connects the input layer output as the input to the dense hidden layer.

It is this way of connecting layers piece by piece that gives the functional API its flexibility. For example, you can see how easy it would be to start defining ad hoc graphs of layers.

3. Creating the Model

After creating all of your model layers and connecting them together, you must define the model.

As with the Sequential API, the model is the thing you can summarize, fit, evaluate, and use to make predictions.

Keras provides a Model class that you can use to create a model from your created layers. It requires that you only specify the input and output layers. For example:
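
A sketch, continuing the small example above:

from keras.models import Model
from keras.layers import Input, Dense

visible = Input(shape=(2,))
hidden = Dense(2)(visible)
model = Model(inputs=visible, outputs=hidden)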

Now that we know all of the key pieces of the Keras functional API, let’s work through defining a suite of different models and build up some practice with it.

Each example is executable and prints the structure and creates a diagram of the graph. I recommend doing this for your own models to make it clear what exactly you have defined.

My hope is that these examples provide templates for you when you want to define your own models using the functional API in the future.

3. Standard Network Models

When getting started with the functional API, it is a good idea to see how some standard neural network models are defined.

In this section, we will look at defining a simple multilayer Perceptron, convolutional neural network, and recurrent neural network.

These examples will provide a foundation for understanding the more elaborate examples later.

Multilayer Perceptron

In this section, we define a multilayer Perceptron model for binary classification.

The model has 10 inputs, 3 hidden layers with 10, 20, and 10 neurons, and an output layer with 1 output. Rectified linear activation functions are used in each hidden layer and a sigmoid activation function is used in the output layer, for binary classification.
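
A sketch of this model as described (plot_model needs pydot and graphviz installed):

from keras.models import Model
from keras.layers import Input, Dense
from keras.utils import plot_model

visible = Input(shape=(10,))
hidden1 = Dense(10, activation='relu')(visible)
hidden2 = Dense(20, activation='relu')(hidden1)
hidden3 = Dense(10, activation='relu')(hidden2)
output = Dense(1, activation='sigmoid')(hidden3)
model = Model(inputs=visible, outputs=output)
print(model.summary())
plot_model(model, to_file='multilayer_perceptron_graph.png')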

Running the example prints the structure of the network.

A plot of the model graph is also created and saved to file.

Multilayer Perceptron Network Graph

Convolutional Neural Network

In this section, we will define a convolutional neural network for image classification.

The model receives black and white 64×64 images as input, then has a sequence of two convolutional and pooling layers as feature extractors, followed by a fully connected layer to interpret the features and an output layer with a sigmoid activation for two-class predictions.
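
A sketch matching the description (filter counts are assumptions):

from keras.models import Model
from keras.layers import Input, Dense, Flatten, Conv2D, MaxPooling2D

visible = Input(shape=(64, 64, 1))
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(16, kernel_size=4, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
flat = Flatten()(pool2)
hidden1 = Dense(10, activation='relu')(flat)
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=output)
print(model.summary())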

Running the example summarizes the model layers.

A plot of the model graph is also created and saved to file.

Convolutional Neural Network Graph

Recurrent Neural Network

In this section, we will define a long short-term memory recurrent neural network for sequence classification.

The model expects 100 time steps of one feature as input. The model has a single LSTM hidden layer to extract features from the sequence, followed by a fully connected layer to interpret the LSTM output, followed by an output layer for making binary predictions.
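
A sketch:

from keras.models import Model
from keras.layers import Input, Dense, LSTM

visible = Input(shape=(100, 1))
hidden1 = LSTM(10)(visible)
hidden2 = Dense(10, activation='relu')(hidden1)
output = Dense(1, activation='sigmoid')(hidden2)
model = Model(inputs=visible, outputs=output)
print(model.summary())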

Running the example summarizes the model layers.

A plot of the model graph is also created and saved to file.

Recurrent Neural Network Graph

4. Shared Layers Model

Multiple layers can share the output from one layer.

For example, there may be multiple different feature extraction layers from an input, or multiple layers used to interpret the output from a feature extraction layer.

Let’s look at both of these examples.

Shared Input Layer

In this section, we define multiple convolutional layers with differently sized kernels to interpret an image input.

The model takes black and white images with the size 64×64 pixels. There are two CNN feature extraction submodels that share this input; the first has a kernel size of 4 and the second a kernel size of 8. The outputs from these feature extraction submodels are flattened into vectors and concatenated into one long vector and passed on to a fully connected layer for interpretation before a final output layer makes a binary classification.
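
A sketch, with the two submodels' flattened outputs concatenated (filter counts are assumptions):

from keras.models import Model
from keras.layers import Input, Dense, Flatten, Conv2D, MaxPooling2D, concatenate

visible = Input(shape=(64, 64, 1))
# first feature extractor: kernel size 4
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
flat1 = Flatten()(pool1)
# second feature extractor: kernel size 8
conv2 = Conv2D(16, kernel_size=8, activation='relu')(visible)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
flat2 = Flatten()(pool2)
# merge, interpret, predict
merge = concatenate([flat1, flat2])
hidden1 = Dense(10, activation='relu')(merge)
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=output)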

Running the example summarizes the model layers.

A plot of the model graph is also created and saved to file.

Neural Network Graph With Shared Inputs

Shared Feature Extraction Layer

In this section, we will use two parallel submodels to interpret the output of an LSTM feature extractor for sequence classification.

The input to the model is 100 time steps of 1 feature. An LSTM layer with 10 memory cells interprets this sequence. The first interpretation model is a shallow single fully connected layer, the second is a deep 3 layer model. The output of both interpretation models are concatenated into one long vector that is passed to the output layer used to make a binary prediction.
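
A sketch:

from keras.models import Model
from keras.layers import Input, Dense, LSTM, concatenate

visible = Input(shape=(100, 1))
extract = LSTM(10)(visible)
# first interpretation model: shallow
interp1 = Dense(10, activation='relu')(extract)
# second interpretation model: deep 3-layer
interp2 = Dense(10, activation='relu')(extract)
interp2 = Dense(20, activation='relu')(interp2)
interp2 = Dense(10, activation='relu')(interp2)
# merge and predict
merge = concatenate([interp1, interp2])
output = Dense(1, activation='sigmoid')(merge)
model = Model(inputs=visible, outputs=output)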

Running the example summarizes the model layers.

A plot of the model graph is also created and saved to file.

Neural Network Graph With Shared Feature Extraction Layer

5. Multiple Input and Output Models

The functional API can also be used to develop more complex models with multiple inputs, possibly with different modalities. It can also be used to develop models that produce multiple outputs.

We will look at examples of each in this section.

Multiple Input Model

We will develop an image classification model that takes two versions of the image as input, each of a different size. Specifically a black and white 64×64 version and a color 32×32 version. Separate feature extraction CNN models operate on each, then the results from both models are concatenated for interpretation and ultimate prediction.

Note that in the creation of the Model() instance, we define the two input layers as an array; see the last line of the sketch below.

The complete example is listed below.
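
A sketch of the complete model (layer sizes are assumptions matching the description):

from keras.models import Model
from keras.layers import Input, Dense, Flatten, Conv2D, MaxPooling2D, concatenate

# first input model: 64x64 black-and-white images
visible1 = Input(shape=(64, 64, 1))
conv11 = Conv2D(32, kernel_size=4, activation='relu')(visible1)
pool11 = MaxPooling2D(pool_size=(2, 2))(conv11)
flat1 = Flatten()(pool11)
# second input model: 32x32 color images
visible2 = Input(shape=(32, 32, 3))
conv21 = Conv2D(32, kernel_size=4, activation='relu')(visible2)
pool21 = MaxPooling2D(pool_size=(2, 2))(conv21)
flat2 = Flatten()(pool21)
# merge, interpret, predict
merge = concatenate([flat1, flat2])
hidden1 = Dense(10, activation='relu')(merge)
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=[visible1, visible2], outputs=output)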

Running the example summarizes the model layers.

A plot of the model graph is also created and saved to file.

Neural Network Graph With Multiple Inputs

Multiple Output Model

In this section, we will develop a model that makes two different types of predictions. Given an input sequence of 100 time steps of one feature, the model will both classify the sequence and output a new sequence with the same length.

An LSTM layer interprets the input sequence and returns the hidden state for each time step. The first output model creates a stacked LSTM, interprets the features, and makes a binary prediction. The second output model uses the same output layer to make a real-valued prediction for each input time step.
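
A sketch (return_sequences=True keeps the per-time-step hidden states, and TimeDistributed wraps the Dense output layer so it is applied at every time step):

from keras.models import Model
from keras.layers import Input, Dense, LSTM, TimeDistributed

visible = Input(shape=(100, 1))
extract = LSTM(10, return_sequences=True)(visible)
# classification output
class1 = LSTM(10)(extract)
class2 = Dense(10, activation='relu')(class1)
output1 = Dense(1, activation='sigmoid')(class2)
# sequence output: one real value per input time step
output2 = TimeDistributed(Dense(1, activation='linear'))(extract)
model = Model(inputs=visible, outputs=[output1, output2])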

Running the example summarizes the model layers.

A plot of the model graph is also created and saved to file.

Neural Network Graph With Multiple Outputs

6. Best Practices

In this section, I want to give you some tips to get the most out of the functional API when you are defining your own models.

  • Consistent Variable Names. Use the same variable name for the input (visible) and output layers (output) and perhaps even the hidden layers (hidden1, hidden2). It will help to connect things together correctly.
  • Review Layer Summary. Always print the model summary and review the layer outputs to ensure that the model was connected together as you expected.
  • Review Graph Plots. Always create a plot of the model graph and review it to ensure that everything was put together as you intended.
  • Name the layers. You can assign names to layers that are used when reviewing summaries and plots of the model graph. For example: Dense(1, name='hidden1').
  • Separate Submodels. Consider separating out the development of submodels and combine the submodels together at the end.

Do you have your own best practice tips when using the functional API?
Let me know in the comments.

Further Reading

This section provides more resources on the topic if you are looking go deeper.

Summary

In this tutorial, you discovered how to use the functional API in Keras for defining simple and complex deep learning models.

Specifically, you learned:

  • The difference between the Sequential and Functional APIs.
  • How to define simple Multilayer Perceptron, Convolutional Neural Network, and Recurrent Neural Network models using the functional API.
  • How to define more complex models with shared layers and multiple inputs and outputs.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

Posted by uniqueone

Keras Tutorial: The Ultimate Beginner's Guide to Deep Learning in Python
https://elitedatascience.com/keras-tutorial-deep-learning-in-python?utm_content=bufferbce2c&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer
Posted by uniqueone

Keras course contents
https://tykimos.github.io/Keras/lecture/
Posted by uniqueone
https://m.facebook.com/groups/5582633474?view=permalink&id=10155938232383475

Trend Prediction using LSTM RNNs with Keras implementation (Tensorflow)


Posted by uniqueone

GitHub - duggalrahul/AlexNet-Experiments-Keras: Code examples for training AlexNet using Keras and Theano
https://github.com/duggalrahul/AlexNet-Experiments-Keras
Posted by uniqueone

https://github.com/fchollet/keras-resources

Keras resources

This is a directory of tutorials and open-source code repositories for working with Keras, the Python deep learning library.

If you have a high-quality tutorial or project to add, please open a PR.

Official starter resources

Tutorials

Code examples

Working with text

Working with images

Creative visual applications

Reinforcement learning

  • DQN
  • FlappyBird DQN
  • async-RL: Tensorflow + Keras + OpenAI Gym implementation of 1-step Q Learning from "Asynchronous Methods for Deep Reinforcement Learning"
  • keras-rl: A library for state-of-the-art reinforcement learning. Integrates with OpenAI Gym and implements DQN, double DQN, Continuous DQN, and DDPG.

Miscellaneous architecture blueprints

Third-party libraries

  • Elephas: Distributed Deep Learning with Keras & Spark
  • Hyperas: Hyperparameter optimization
  • Hera: in-browser metrics dashboard for Keras models
  • Kerlym: reinforcement learning with Keras and OpenAI Gym
  • Qlearning4K: reinforcement learning add-on for Keras
  • seq2seq: Sequence to Sequence Learning with Keras
  • Seya: Keras extras
  • Keras Language Modeling: Language modeling tools for Keras
  • Recurrent Shop: Framework for building complex recurrent neural networks with Keras
  • Keras.js: Run trained Keras models in the browser, with GPU support
  • keras-vis: Neural network visualization toolkit for keras.

Projects built with Keras

Posted by uniqueone

Keras Tutorial: The Ultimate Beginner's Guide to Deep Learning in Python
https://elitedatascience.com/keras-tutorial-deep-learning-in-python
Posted by uniqueone

Practical Deep Learning For Coders—18 hours of lessons for free
http://course.fast.ai/
Posted by uniqueone

Keras resources

Deep Learning/Keras 2017. 2. 28. 18:18
TensorFlow is great, but for people like me whose major is not computer science
and whose goal is to apply existing algorithms, I think Keras is more than enough (my major is industrial engineering..)

So I have been studying Keras, but there is not much material out there.

I took a Keras CNN tutorial I found by googling,
https://elitedatascience.com/keras-tutorial-deep-learning-in-python
adapted it to my environment (updating it for the changes in recent Keras versions),

and wrote it up in my own way:
https://byeongkijeong.github.io/Keras-cnn-tutorial/

Since I use TensorFlow as the Keras backend, I believe this does not violate this group's rules. haha

Blogging, Jekyll, and Markdown are all new to me,
so the post is a bit messy.

I will clean it up soon.
Posted by uniqueone

https://elitedatascience.com/keras-tutorial-deep-learning-in-python

In this step-by-step Keras tutorial, you’ll learn how to build a convolutional neural network in Python!

In fact, we’ll be training a classifier for handwritten digits that boasts over 99% accuracy on the famous MNIST dataset.

Before we begin, we should note that this guide is geared toward beginners who are interested in applied deep learning.

Our goal is to introduce you to one of the most popular and powerful libraries for building neural networks in Python. That means we’ll brush over much of the theory and math, but we’ll also point you to great resources for learning those.

Keras Tutorial

Before we start...

Recommended Prerequisites

The recommended prerequisites for this guide are:

To move quickly, we'll assume you have this background.

Why Keras?

Keras is our recommended library for deep learning in Python, especially for beginners. Its minimalistic, modular approach makes it a breeze to get deep neural networks up and running. You can read more about it here:

WTF is Deep Learning?

Deep learning refers to neural networks with multiple hidden layers that can learn increasingly abstract representations of the input data. This is obviously an oversimplification, but it's a practical definition for us right now.

For example, deep learning has led to major advances in computer vision. We're now able to classify images, find objects in them, and even label them with captions. To do so, deep neural networks with many hidden layers can sequentially learn more complex features from the raw input image:

  • The first hidden layers might only learn local edge patterns.
  • Then, each subsequent layer (or filter) learns more complex representations.
  • Finally, the last layer can classify the image as a cat or kangaroo.

These types of deep neural networks are called Convolutional Neural Networks.

WTF are Convolutional Neural Networks?

In a nutshell, Convolutional Neural Networks (CNN's) are multi-layer neural networks (sometimes up to 17 or more layers) that assume the input data to be images.

Typical CNN Architecture

By making this requirement, CNN's can drastically reduce the number of parameters that need to be tuned. Therefore, CNN's can efficiently handle the high dimensionality of raw images.

Their underlying mechanics are beyond the scope of this tutorial, but you can read more about them here.

What this tutorial is not:

This is not a complete course on deep learning. Instead, this tutorial is meant to get you from zero to your first Convolutional Neural Network with as little headache as possible!

If you're interested in mastering the theory behind deep learning, we recommend this great course from Stanford:

A quick tip before we begin:

We tried to make this tutorial as streamlined as possible, which means we won't go into too much detail for any one topic. It's helpful to have the Keras documentation open beside you, in case you want to learn more about a function or module.

Keras Tutorial Contents

Here are the steps for building your first CNN using Keras:

  1. Set up your environment.
  2. Install Keras.
  3. Import libraries and modules.
  4. Load image data from MNIST.
  5. Preprocess input data for Keras.
  6. Preprocess class labels for Keras.
  7. Define model architecture.
  8. Compile model.
  9. Fit model on training data.
  10. Evaluate model on test data.

Step 1: Set up your environment.

First, hang up a motivational poster. (Probably useless.)

Next, make sure you have the following installed on your computer:

  • Python 2.7+ (Python 3 is fine too, but Python 2.7 is still more popular for data science overall)
  • SciPy with NumPy
  • Matplotlib (Optional, recommended for exploratory analysis)
  • Theano* (Installation instructions)

We strongly recommend installing Python, NumPy, SciPy, and matplotlib through the Anaconda Distribution. It comes with all of those packages.

*note: TensorFlow is also supported (as an alternative to Theano), but we stick with Theano to keep it simple. The main difference is that you'll need to reshape the data slightly differently before feeding it to your network.

You can check to see if you've installed everything correctly:

Go to your command line program (Terminal on a Mac) and type in:

You'll see the Python interpreter:

Next, you can import your libraries and print their versions:
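
A sketch of the version check (library set assumed from the list above):

import numpy
import scipy
import matplotlib

print(numpy.__version__)
print(scipy.__version__)
print(matplotlib.__version__)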

Step 2: Install Keras.

It wouldn't be a Keras tutorial if we didn't cover how to install Keras.

The good news is that if you used Anaconda, then you'll already have a nice package management system called pip installed.

You can confirm you have it installed by typing  $ pip in your command line. It should output a list of commands and options. If you don't have pip, you can install it here.

Once you have pip, installing Keras is easy as pie:

You can confirm it's installed correctly:

Oops... looks like that Keras version is outdated. Upgrading the version is easy:

Perfect, now let's start a new Python file and name it keras_cnn_example.py.

Step 3: Import libraries and modules.

Let's start by importing numpy and setting a seed for the computer's pseudorandom number generator. This allows us to reproduce the results from our script:
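
A sketch:

import numpy as np
np.random.seed(123)  # for reproducibility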

Next, we'll import the Sequential model type from Keras. This is simply a linear stack of neural network layers, and it's perfect for the type of feed-forward CNN we're building in this tutorial.

Next, let's import the "core" layers from Keras. These are the layers that are used in almost any neural network:

Then, we'll import the CNN layers from Keras. These are the convolutional layers that will help us efficiently train on image data:

Finally, we'll import some utilities. This will help us transform our data later:
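
Taken together, the imports described in the four paragraphs above (old Keras 1-style module paths, matching this tutorial's vintage):

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils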

Now we have everything we need to build our neural network architecture.

Step 4: Load image data from MNIST.

MNIST is a great dataset for getting started with deep learning and computer vision. It's a big enough challenge to warrant neural networks, but it's manageable on a single computer. We discuss it more in our post: 6 Fun Machine Learning Projects for Beginners.

The Keras library conveniently includes it already. We can load it like so:
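
A sketch:

from keras.datasets import mnist

# Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()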

We can look at the shape of the dataset:
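
print(X_train.shape)  # (60000, 28, 28)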

Great, so it appears that we have 60,000 samples in our training set, and the images are 28 pixels x 28 pixels each. We can confirm this by plotting the first sample in matplotlib:
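
A sketch:

from matplotlib import pyplot as plt

plt.imshow(X_train[0])
plt.show()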

And here's the image output:

MNIST Digit

In general, when working with computer vision, it's helpful to visually plot the data before doing any algorithm work. It's a quick sanity check that can prevent easily avoidable mistakes (such as misinterpreting the data dimensions).

Step 5: Preprocess input data for Keras.

When using the Theano backend, you must explicitly declare a dimension for the depth of the input image. For example, a full-color image with all 3 RGB channels will have a depth of 3.

Our MNIST images only have a depth of 1, but we must explicitly declare that.

In other words, we want to transform our dataset from having shape (n, width, height) to (n, depth, width, height).

Here's how we can do that easily:
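
A sketch (channels-first ordering, as the Theano backend expects):

X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)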

To confirm, we can print X_train's dimensions again:
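
print(X_train.shape)  # (60000, 1, 28, 28)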

The final preprocessing step for the input data is to convert our data type to float32 and normalize our data values to the range [0, 1].
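
A sketch:

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255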

Now, our input data are ready for model training.

Step 6: Preprocess class labels for Keras.

Next, let's take a look at the shape of our class label data:
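
print(y_train.shape)  # (60000,)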

Hmm... that may be problematic. We should have 10 different classes, one for each digit, but it looks like we only have a 1-dimensional array. Let's take a look at the labels for the first 10 training samples:
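
print(y_train[:10])  # [5 0 4 1 9 2 1 3 1 4]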

And there's the problem. The y_train and y_test data are not split into 10 distinct class labels, but rather are represented as a single array with the class values.

We can fix this easily:
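
A sketch using the np_utils helper imported earlier:

Y_train = np_utils.to_categorical(y_train, 10)
Y_test = np_utils.to_categorical(y_test, 10)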

Now we can take another look:
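
print(Y_train.shape)  # (60000, 10)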

There we go... much better!

Step 7:  Define model architecture.

Now we're ready to define our model architecture. In actual R&D work, researchers will spend a considerable amount of time studying model architectures.

To keep this tutorial moving along, we're not going to discuss the theory or math here. This alone is a rich and meaty field, and we recommend the CS231n class mentioned earlier for those who want to learn more.

Plus, when you're just starting out, you can just replicate proven architectures from academic papers or use existing examples. Here's a list of example implementations in Keras.

Let's start by declaring a sequential model format:
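
model = Sequential()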

Next, we declare the input layer:
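
A sketch (old Keras 1-style Convolution2D(filters, rows, cols) call, as this tutorial's parameter description assumes; newer Keras spells it Conv2D(32, (3, 3))):

model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(1, 28, 28)))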

The input shape parameter should be the shape of 1 sample. In this case, it's the same (1, 28, 28) that corresponds to  the (depth, width, height) of each digit image.

But what do the first 3 parameters represent? They correspond to the number of convolution filters to use, the number of rows in each convolution kernel, and the number of columns in each convolution kernel, respectively.

*Note: The step size is (1,1) by default, and it can be tuned using the 'subsample' parameter.

We can confirm this by printing the shape of the current model output:
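
print(model.output_shape)  # (None, 32, 26, 26) with Theano channels-first ordering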

Next, we can simply add more layers to our model like we're building legos:
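
model.add(Convolution2D(32, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))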

Again, we won't go into the theory too much, but it's important to highlight the Dropout layer we just added. This is a method for regularizing our model in order to prevent overfitting. You can read more about it here.

MaxPooling2D is a way to reduce the number of parameters in our model by sliding a 2x2 pooling filter across the previous layer and taking the max of the 4 values in the 2x2 filter.

So far, for model parameters, we've added two Convolution layers. To complete our model architecture, let's add a fully connected layer and then the output layer:
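
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))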

For Dense layers, the first parameter is the output size of the layer. Keras automatically handles the connections between layers.

Note that the final layer has an output size of 10, corresponding to the 10 classes of digits.

Also note that the weights from the Convolution layers must be flattened (made 1-dimensional) before passing them to the fully connected Dense layer.

Here's how the entire model architecture looks together:
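
Putting the steps above together (a sketch; layer sizes follow the choices made step by step):

model = Sequential()

model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(1, 28, 28)))
model.add(Convolution2D(32, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))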

Now all we need to do is define the loss function and the optimizer, and then we'll be ready to train it.

Step 8: Compile model.

Now we're in the home stretch! The hard part is already over.

We just need to compile the model and we'll be ready to train it. When we compile the model, we declare the loss function and the optimizer (SGD, Adam, etc.).

Keras has a variety of loss functions and out-of-the-box optimizers to choose from.
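
A sketch (the usual choices for 10-class classification):

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])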

Step 9: Fit model on training data.

To fit the model, all we have to do is declare the batch size and number of epochs to train for, then pass in our training data.
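
A sketch (nb_epoch is the Keras 1 spelling; newer Keras uses epochs):

model.fit(X_train, Y_train, batch_size=32, nb_epoch=10, verbose=1)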

Easy, huh?

You can also use a variety of callbacks to set early-stopping rules, save model weights along the way, or log the history of each training epoch.

Step 10: Evaluate model on test data.

Finally, we can evaluate our model on the test data:
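
A sketch:

score = model.evaluate(X_test, Y_test, verbose=0)
print(score)  # [test loss, test accuracy]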

Congratulations... you've made it to the end of this Keras tutorial!

We've just completed a whirlwind tour of Keras's core functionality, but we've only really scratched the surface. Hopefully you've gained the foundation to further explore all that Keras has to offer.

For continued learning, we recommend studying other example models in Keras and Stanford's computer vision class.

The complete code, from start to finish.

Here's all the code in one place, in a single script.


Posted by uniqueone