1,027 posts in 'All categories'

  1. 2017.01.21 Learn TensorFlow and deep learning, without a Ph.D. video lecture slides
  2. 2017.01.20 parameter tuning of 'patternnet' function in neural networks based classification using Matlab
  3. 2017.01.19 Watch 'How to Do Linear Regression the Right Way [LIVE]' on YouTube
  4. 2017.01.19 Implementation of "Pointer Networks", a sequence-to-sequence model
  5. 2017.01.18 Linear Regression / Bias-Variance Decomposition
  6. 2017.01.18 Implementation of Grad CAM in tensorflow
  7. 2017.01.17 PCA first or normalization first?
  8. 2017.01.17 GPU-accelerated Theano & Keras on Windows 10 native
  9. 2017.01.17 matlab dist function
  10. 2017.01.17 matlab de2bi source code
  11. 2017.01.16 Notes on Domain Adaptation (DA)
  12. 2017.01.13 dist function substitution code
  13. 2017.01.13 Recognizing Traffic Lights With Deep Learning: How I learned deep learning in 10 weeks and won $5,000
  14. 2017.01.13 Embossing filter
  15. 2017.01.12 Watch 'How to Install Tensorflow on Windows10 with GPU support' on YouTube
  16. 2017.01.12 Installing Theano in Windows 7 64-bit
  17. 2017.01.11 free-programming-books in Korean
  18. 2017.01.11 DeepLab-ResNet-TensorFlow
  19. 2017.01.10 Deep Learning in Action
  20. 2017.01.10 Up and running with Theano (GPU) + PyCUDA on Windows
  21. 2017.01.10 If a CNN denoises images knowing their content, it gets better results. Simple idea, great paper! https://arxiv.org/abs/1701.01698 Deep Class Aware Denoising
  22. 2017.01.09 Digit image recognition with deep learning #1/2: training
  23. 2017.01.08 Top Machine learning Books
  24. 2017.01.08 Following Markov, the video below was the easiest introduction to reinforcement learning.
  25. 2017.01.08 The first hill to climb when studying reinforcement learning is "Markov". I went through many materials on this, and the video below seems to explain it most easily and clearly. The law of large numbers and the Bernoulli process, followed by the Markov process
  26. 2017.01.06 20170106_theano keras installation videos and sites
  27. 2017.01.06 Understanding Jupyter Notebook https://www.slideshare.net/mobile/dahlmoon/jupyter-notebok-20160815?from_m_app=ios
  28. 2017.01.06 Keras, Theano and TensorFlow on Windows and Linux | GetToCode
  29. 2017.01.05 Watch 'Installing CPU and GPU TensorFlow on Windows' on YouTube
  30. 2017.01.05 Watch 'how to natively install tensorflow on windows' on YouTube
https://cloud.google.com/blog/big-data/2017/01/learn-tensorflow-and-deep-learning-without-a-phd?1484940758691=1

I remember Professor Sung Kim posting one of the videos included here a while ago; now the instructor himself has organized and posted them with more detailed explanations.
Posted by uniqueone
,

 

http://stackoverflow.com/questions/22916915/parameter-settings-for-neural-networks-based-classification-using-matlab

Parameter settings for neural networks based classification using Matlab

Recently, I have been trying to use Matlab's built-in Neural Networks Toolbox to solve my classification problem. However, I have some questions about the parameter settings.

a. The number of neurons in the hidden layer:

The example on this page (Matlab neural networks classification example) shows a two-layer (i.e. one-hidden-layer and one-output-layer) feedforward neural network. In this example, it uses 10 neurons in the hidden layer:

net = patternnet(10);

My first question is how to choose the best number of neurons for my classification problem. Should I use cross-validation on a training data set to find the best-performing number of neurons?

b. Is there a method to choose three-layer or more multi-layer neural networks?

c. There are many different training methods we can use in the neural networks toolbox. A list can be found at Training methods list. The page mentions that the fastest training function is generally 'trainlm'; however, generally speaking, which one will perform best? Or does it depend entirely on the data set I am using?

d. In each training method, there is a parameter called 'epochs', which, to my understanding, is the number of training iterations. For each training method, Matlab defines a maximum number of epochs to train. However, from the example, it seems like 'epochs' is another parameter we can tune. Am I right? Or do we just set the maximum number of epochs, or leave it at the default?

Any experience with the Matlab neural networks toolbox is welcome, and thanks very much for your reply. A.


------------------------------------------------------------------

a. You can refer to How to choose number of hidden layers and nodes in neural network? and ftp://ftp.sas.com/pub/neural/FAQ3.html#A_hu
Surely you can use cross-validation to determine the best number of neurons, but it's not generally recommended, as cross-validation is more suitable for the weight-training stage of a given network.

b. Refer to ftp://ftp.sas.com/pub/neural/FAQ3.html#A_hl
And for more layers of neural network, you can refer to Deep Learning, which is very hot in recent years and gets state-of-the-art performances in many of the pattern recognition tasks.

c. It depends on your data. trainlm performs better on function fitting (nonlinear regression) problems than on pattern recognition problems; when training large networks and pattern recognition networks, trainscg and trainrp are good choices. Generally, Gradient Descent and Resilient Backpropagation are recommended. A more detailed comparison can be found here: http://www.mathworks.cn/cn/help/nnet/ug/choose-a-multilayer-neural-network-training-function.html

d. Yes, you're right. We can tune the epochs parameter. Generally, you can output the recognition results/accuracy at every epoch; you will see accuracy improving more and more slowly, and the more epochs, the more computing time. You can make a compromise between accuracy and computation time.
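
Returning to part a: as an illustration of picking the hidden layer size by cross-validation, here is a minimal sketch. It uses Python's scikit-learn rather than Matlab (an assumption for illustration only; the same grid-search idea applies to patternnet):

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Toy data standing in for the real training set
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Grid-search the number of hidden neurons with 5-fold cross-validation
param_grid = {"hidden_layer_sizes": [(5,), (10,), (20,), (50,)]}
search = GridSearchCV(MLPClassifier(max_iter=2000, random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)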


------------------------------------------------------------------

For part b of your question, you can use code like this:

net = patternnet([10 15 20]);

This creates a network with 3 hidden layers: the first layer has 10 neurons, the second 15, and the third 20.

Posted by uniqueone
,
https://youtu.be/uwwWVAgJBcM
Posted by uniqueone
,
We implemented "Pointer Networks" (https://arxiv.org/abs/1506.03134), a sequence-to-sequence model that produces its answers by pointing into the input, in TensorFlow.

https://github.com/devsisters/pointer-network-tensorflow

The paper proposes a model for problems where the answer must be found as an ordering over the input, such as the Travelling Salesman Problem (TSP): finding the shortest route that visits each of a set of random points exactly once and returns to the starting city.

The paper appeared on arXiv back in June 2015, but I implemented it because I consider it one of the most useful models. This time, to minimize I/O time, I kept a separate multithreaded data queue so that execution of the TensorFlow graph is not blocked by I/O, as sketched below.
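
To illustrate the idea (this is a minimal sketch of the general pattern in TensorFlow 1.x, not the repository's actual code; the data shapes are made up), a background thread feeds a FIFOQueue so the training step never waits on data generation:

import threading
import numpy as np
import tensorflow as tf

# A queue holding batches of 10 random 2-D points (e.g. TSP cities)
queue = tf.FIFOQueue(capacity=100, dtypes=[tf.float32], shapes=[[10, 2]])
input_ph = tf.placeholder(tf.float32, shape=[10, 2])
enqueue_op = queue.enqueue(input_ph)
batch = queue.dequeue()  # the training graph would consume this tensor

def feeder(sess, coord):
    # Runs in a background thread; I/O here never blocks the main graph.
    try:
        while not coord.should_stop():
            points = np.random.rand(10, 2).astype(np.float32)
            sess.run(enqueue_op, feed_dict={input_ph: points})
    except tf.errors.CancelledError:
        return  # queue was closed during shutdown

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    thread = threading.Thread(target=feeder, args=(sess, coord))
    thread.start()
    print(sess.run(batch))  # a training loop would run the train op here
    coord.request_stop()
    sess.run(queue.close(cancel_pending_enqueues=True))
    coord.join([thread])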

===

Also, Devsisters, where I am currently doing my alternative military service, is looking for machine learning researchers to work with us :)

At our company you can freely choose the problems you want to solve, such as reinforcement learning, computer vision, and NLP, and we build up the team's skills through paper seminars and code implementations. If you are interested, please contact us anytime!

How to apply: http://www.devsisters.com/jobs
Recruiting inquiries: career@devsisters.com

Here are our team's external presentations and open-source projects.

- Building a CookieRun AI that plays better than I do, using deep learning and reinforcement learning http://www.slideshare.net/carpedm20/ai-67616630
- So you've installed TensorFlow, seen the tutorials, and written the basic examples http://www.slideshare.net/carpedm20/ss-63116251
- Deep and wide deep learning for intellectual conversation http://www.slideshare.net/carpedm20/pycon-korea-2016
- Reinforcement learning basics http://www.slideshare.net/carpedm20/reinforcement-learning-an-introduction-64037079

- Implementation of Apple's Simulated+Unsupervised (S+U) learning https://github.com/carpedm20/simulated-unsupervised-tensorflow
- Implementation of Neural Combinatorial Optimization https://github.com/devsisters/neural-combinatorial-rl-tensorflow
- Implementation of Deep Q-network https://github.com/devsisters/DQN-tensorflow
Posted by uniqueone
,
Hello. It's been a while; I'm sharing a video from a PRML seminar we ran inside our lab.

[PRML 3.1~3.2] Linear Regression / Bias-Variance Decomposition

Starting from a probabilistic view, we derive the mathematical justification for using the least-squares error in linear regression, and then mainly cover the meaning of the regularizer and the analysis of overfitting in regression models via the bias-variance decomposition.
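
For reference, the decomposition covered in PRML 3.2 has the familiar form (a sketch in generic notation, not the seminar's exact derivation): for a predictor y(x; D) trained on dataset D and true target h(x),

E_D[(y(x;D) - h(x))^2] = (E_D[y(x;D)] - h(x))^2 + E_D[(y(x;D) - E_D[y(x;D)])^2],

i.e. expected loss = (bias)^2 + variance + noise. Flexible models have low bias but high variance, which is one way to read overfitting.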

We plan to present a few more chapters, and the videos will keep being uploaded to the playlist at the link below. Thank you.
https://www.youtube.com/watch?v=dt8RvYEOrWw&list=PLzWH6Ydh35ggVGbBh48TNs635gv2nxkFI&sns=em
Posted by uniqueone
,
grad-cam.tensorflow

Implementation of Grad CAM in tensorflow

Gradient class activation maps are a visualization technique for deep learning networks.
https://github.com/Ankush96/grad-cam.tensorflow

also:

Grad-CAM implementation in Keras
https://github.com/jacobgil/keras-grad-cam

Grad-CAM: Gradient-weighted Class Activation Mapping
https://github.com/ramprs/grad-cam (torch)
Posted by uniqueone
,
http://stackoverflow.com/questions/10119913/pca-first-or-normalization-first

When doing regression or classification, what is the correct (or better) way to preprocess the data?

  1. Normalize the data -> PCA -> training
  2. PCA -> normalize PCA output -> training
  3. Normalize the data -> PCA -> normalize PCA output -> training

Which of the above is more correct, or is there a "standardized" way to preprocess the data? By "normalize" I mean standardization, linear scaling, or some other technique.
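
As an illustration of option 1 (normalize, then PCA, then train), here is a minimal sketch in Python's scikit-learn; the dataset and classifier are placeholders:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
# normalize (standardize) -> PCA -> train, fitted as one pipeline so that
# the scaling and PCA are learned on the training folds only
clf = make_pipeline(StandardScaler(), PCA(n_components=2), LogisticRegression())
print(cross_val_score(clf, X, y, cv=5).mean())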


------------

You should normalize the data before doing PCA. For example, consider the following situation. I create a data set X with a known correlation matrix C:

>> C = [1 0.5; 0.5 1];
>> A = chol(C);
>> X = randn(100,2) * A;

If I now perform PCA, I correctly find that the principal components (the rows of the weights vector) are oriented at an angle to the coordinate axes:

>> wts=pca(X)
wts =
    0.6659    0.7461
   -0.7461    0.6659

If I now scale the first feature of the data set by 100, intuitively we think that the principal components shouldn't change:

>> Y = X;
>> Y(:,1) = 100 * Y(:,1);

However, we now find that the principal components are aligned with the coordinate axes:

>> wts=pca(Y)
wts =
    1.0000    0.0056
   -0.0056    1.0000

To resolve this, there are two options. First, I could rescale the data:

>> Ynorm = bsxfun(@rdivide,Y,std(Y))

(The weird bsxfun notation is used to do vector-matrix arithmetic in Matlab - here it simply divides each feature by its standard deviation).

We now get sensible results from PCA:

>> wts = pca(Ynorm)
wts =
   -0.7125   -0.7016
    0.7016   -0.7125

They're slightly different to the PCA on the original data because we've now guaranteed that our features have unit standard deviation, which wasn't the case originally.

The other option is to perform PCA using the correlation matrix of the data, instead of the outer product:

>> wts = pca(Y,'corr')
wts =
    0.7071    0.7071
   -0.7071    0.7071

In fact this is completely equivalent to standardizing the data by subtracting the mean and then dividing by the standard deviation. It's just more convenient. In my opinion you should always do this unless you have a good reason not to (e.g. if you want to pick up differences in the variation of each feature).

 

Posted by uniqueone
,

 

https://github.com/philferriere/dlwin

GitHub - philferriere_dlwin_ GPU-accelerated Deep Learning on Windows 10 native.pdf

 

GPU-accelerated Theano & Keras on Windows 10 native

>> LAST UPDATED JANUARY, 2017 <<

There are certainly a lot of guides to help you build great deep learning (DL) setups on Linux or Mac OS (including with TensorFlow which, unfortunately, as of this posting, cannot be easily installed on Windows), but few care about building an efficient Windows 10-native setup. Most focus on running an Ubuntu VM hosted on Windows or using Docker, unnecessary - and ultimately sub-optimal - steps.

We also found enough misguiding/deprecated information out there to make it worthwhile putting together a step-by-step guide for the latest stable versions of Theano and Keras. Used together, they make for one of the simplest and fastest DL configurations to work natively on Windows.

If you must run your DL setup on Windows 10, then the information contained here may be useful to you.

Dependencies

Here's a summary list of the tools and libraries we use for deep learning on Windows 10 (Version 1607 OS Build 14393.222):

  1. Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0
    • Used for its C/C++ compiler (not its IDE) and SDK
  2. Anaconda (64-bit) w. Python 2.7 (Anaconda2-4.2.0) or Python 3.5 (Anaconda3-4.2.0)
    • A Python distro that gives us NumPy, SciPy, and other scientific libraries
  3. CUDA 8.0.44 (64-bit)
    • Used for its GPU math libraries, card driver, and CUDA compiler
  4. MinGW-w64 (5.4.0)
    • Used for its Unix-like compiler and build tools (g++/gcc, make...) for Windows
  5. Theano 0.8.2
    • Used to evaluate mathematical expressions on multi-dimensional arrays
  6. Keras 1.1.0
    • Used for deep learning on top of Theano
  7. OpenBLAS 0.2.14 (Optional)
    • Used for its CPU-optimized implementation of many linear algebra operations
  8. cuDNN v5.1 (August 10, 2016) for CUDA 8.0 (Conditional)
    • Used to run vastly faster convolution neural networks

For an older setup using VS2013 and CUDA 7.5, please refer to README-2016-07.md (July, 2016 setup)

Hardware

  1. Dell Precision T7900, 64GB RAM
    • Intel Xeon E5-2630 v4 @ 2.20 GHz (1 processor, 10 cores total, 20 logical processors)
  2. NVIDIA GeForce Titan X, 12GB RAM
    • Driver version: 372.90 / Win 10 64

Installation steps

We like to keep our toolkits and libraries in a single root folder boringly called c:\toolkits, so whenever you see a Windows path that starts with c:\toolkits below, make sure to replace it with whatever you decide your own toolkit drive and folder ought to be.

Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0

You can download Visual Studio 2015 Community Edition from here:

Select the executable and let it decide what to download on its own:

Run the downloaded executable to install Visual Studio, using whatever additional config settings work best for you:

  1. Add C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin to your PATH, based on where you installed VS 2015.
  2. Define sysenv variable INCLUDE with the value C:\Program Files (x86)\Windows Kits\10\Include\10.0.10240.0\ucrt
  3. Define sysenv variable LIB with the value C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\um\x64;C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\ucrt\x64

Reference Note: We couldn't run any Theano python files until we added the last two env variables above. We would get a c:\program files (x86)\microsoft visual studio 14.0\vc\include\crtdefs.h(10): fatal error C1083: Cannot open include file: 'corecrt.h': No such file or directory error at compile time and missing kernel32.lib uuid.lib ucrt.lib errors at link time. True, you could probably run C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64\vcvars64.bat (with proper params) every single time you open a MINGW cmd prompt, but, obviously, none of the sysenv vars would stick from one session to the next.

Anaconda (64-bit)

This tutorial was created with Python 2.7, but if you prefer to use Python 3.5 it should work too.

Depending on your installation use c:\toolkits\anaconda3-4.2.0 instead of c:\toolkits\anaconda2-4.2.0.

Download the appropriate Anaconda version from here:

Run the downloaded executable to install Anaconda in c:\toolkits\anaconda2-4.2.0:

Warning: Below, we enabled Register Anaconda as the system Python 2.7 because it works for us, but that may not be the best option for you!

  1. Define sysenv variable PYTHON_HOME with the value c:\toolkits\anaconda2-4.2.0
  2. Add %PYTHON_HOME%, %PYTHON_HOME%\Scripts, and %PYTHON_HOME%\Library\bin to PATH

After anaconda installation open a command prompt and execute:

$ cd $PYTHON_HOME; conda install libpython

Note: The version of MinGW above is old (gcc 4.7.0). Instead, we will use MinGW 5.4.0, as shown below.

CUDA 8.0.44 (64-bit)

Download CUDA 8.0 (64-bit) from the NVidia website

Select the proper target platform:

Download the installer:

Run the downloaded installer. Install the files in c:\toolkits\cuda-8.0.44:

After completion, the installer should have created a system environment (sysenv) variable named CUDA_PATH and added %CUDA_PATH%\bin as well as %CUDA_PATH%\libnvvp to PATH. Check that this is indeed the case. If, for some reason, the CUDA env vars are missing, then:

  1. Define a system environment (sysenv) variable named CUDA_PATH with the value c:\toolkits\cuda-8.0.44
  2. Add %CUDA_PATH%\libnvvp and %CUDA_PATH%\bin to PATH

MinGW-w64 (5.4.0)

Download MinGW-w64 from here:

Install it to c:\toolkits\mingw-w64-5.4.0 with the following settings (second wizard screen):

  1. Define the sysenv variable MINGW_HOME with the value c:\toolkits\mingw-w64-5.4.0
  2. Add %MINGW_HOME%\mingw64\bin to PATH

Run the following to make sure all necessary build tools can be found:

$ where gcc; where g++; where cl; where nvcc; where cudafe; where cudafe++
$ gcc --version; g++ --version
$ cl
$ nvcc --version; cudafe --version; cudafe++ --version

You should get results similar to:

Theano 0.8.2

Version 0.8.2? Why not just install the latest bleeding-edge version of Theano since it obviously must work better, right? Simply put, because it makes reproducible research harder. If your work colleagues or Kaggle teammates install the latest code from the dev branch at a different time than you did, you will most likely be running different code bases on your machines, increasing the odds that even though you're using the same input data (the same random seeds, etc.), you still end up with different results when you shouldn't. For this reason alone, we highly recommend only using point releases, the same one across machines, and always documenting which one you use if you can't just use a setup script.

Clone a stable Theano release (0.8.2) from GitHub into c:\toolkits\theano-0.8.2 using the following commands:

$ cd /c/toolkits
$ git clone https://github.com/Theano/Theano.git theano-0.8.2 --branch rel-0.8.2

Install Theano as follows:

$ cd /c/toolkits/theano-0.8.2
$ python setup.py install --record installed_files.txt

The list of files installed can be found here

Verify Theano was installed by querying Anaconda for the list of installed packages:

$ conda list | grep -i theano

Note: We also tried installing Theano with the following command:

$ pip install git+https://github.com/Theano/Theano.git@rel-0.8.2

In our case, this resulted in conflicts between 32-bit and 64-bit DLL when trying to run Theano code.

OpenBLAS 0.2.14 (Optional)

If we're going to use the GPU, why install a CPU-optimized linear algebra library? With our setup, most of the deep learning grunt work is performed by the GPU, that is correct, but the CPU isn't idle. An important part of image-based Kaggle competitions is data augmentation. In that context, data augmentation is the process of manufacturing additional input samples (more training images) by transformation of the original training samples, via the use of image processing operators. Basic transformations such as downsampling and (mean-centered) normalization are also needed. If you feel adventurous, you'll want to try additional pre-processing enhancements (noise removal, histogram equalization, etc.). You certainly could use the GPU for that purpose and save the results to file. In practice, however, those operations are often executed in parallel on the CPU while the GPU is busy learning the weights of the deep neural network and the augmented data discarded after use. For this reason, we highly recommend installing the OpenBLAS library.

According to the Theano documentation, the multi-threaded OpenBLAS library performs much better than the un-optimized standard BLAS (Basic Linear Algebra Subprograms) library, so that's what we use.

Download OpenBLAS from here and extract the files to c:\toolkits\openblas-0.2.14-int32

  1. Define sysenv variable OPENBLAS_HOME with the value c:\toolkits\openblas-0.2.14-int32
  2. Add %OPENBLAS_HOME%\bin to PATH

Switching between CPU and GPU mode

Next, create the two following sysenv variables:

  • sysenv variable THEANO_FLAGS_CPU with the value:

floatX=float32,device=cpu,lib.cnmem=0.8,blas.ldflags=-LC:/toolkits/openblas-0.2.14-int32/bin -lopenblas

  • sysenv variable THEANO_FLAGS_GPU with the value:

floatX=float32,device=gpu,dnn.enabled=False,lib.cnmem=0.8,blas.ldflags=-LC:/toolkits/openblas-0.2.14-int32/bin -lopenblas

Theano only cares about the value of the sysenv variable named THEANO_FLAGS. All we need to do to tell Theano to use the CPU or GPU is to set THEANO_FLAGS to either THEANO_FLAGS_CPU or THEANO_FLAGS_GPU. You can verify those variables have been successfully added to your environment with the following command:

$ env | grep -i theano

Validating our OpenBLAS install (Optional)

We can use the following program from the Theano documentation:

import numpy as np
import time
import theano

print('blas.ldflags=', theano.config.blas.ldflags)

A = np.random.rand(1000, 10000).astype(theano.config.floatX)
B = np.random.rand(10000, 1000).astype(theano.config.floatX)
np_start = time.time()
AB = A.dot(B)
np_end = time.time()
X, Y = theano.tensor.matrices('XY')
mf = theano.function([X, Y], X.dot(Y))
t_start = time.time()
tAB = mf(A, B)
t_end = time.time()
print("numpy time: %f[s], theano time: %f[s] (times should be close when run on CPU!)" % (
np_end - np_start, t_end - t_start))
print("Result difference: %f" % (np.abs(AB - tAB).max(), ))

Save the code above to a file named openblas_test.py in the current directory (or download it from this GitHub repo) and run the next commands:

$ THEANO_FLAGS=$THEANO_FLAGS_CPU
$ python openblas_test.py

Note: If you get a failure of the kind NameError: global name 'CVM' is not defined, it may be because, like us, you've messed with the value of THEANO_FLAGS_CPU and switched back and forth between floatX=float32 and floatX=float64 several times. Cleaning your C:\Users\username\AppData\Local\Theano directory (replace username with your login name) will fix the problem (See here, for reference)

Checking our PATH sysenv var

At this point, the PATH environment variable should look something like:

%MINGW_HOME%\mingw64\bin;
%CUDA_PATH%\bin;
%CUDA_PATH%\libnvvp;
%OPENBLAS_HOME%\bin;
%PYTHON_HOME%;
%PYTHON_HOME%\Scripts;
%PYTHON_HOME%\Library\bin;
C:\ProgramData\Oracle\Java\javapath;
C:\WINDOWS\system32;
C:\WINDOWS;
C:\WINDOWS\System32\Wbem;
C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin;
C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\;
C:\Program Files\Git\cmd;
C:\Program Files\Git\mingw64\bin;
C:\Program Files\Git\usr\bin
...

Validating our GPU install with Theano

We'll run the following program from the Theano documentation to compare the performance of the GPU install vs using Theano in CPU-mode. Save the code to a file named cpu_gpu_test.py in the current directory (or download it from this GitHub repo):

from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

First, let's see what kind of results we get running Theano in CPU mode:

$ THEANO_FLAGS=$THEANO_FLAGS_CPU
$ python cpu_gpu_test.py

Next, let's run the same program on the GPU:

$ THEANO_FLAGS=$THEANO_FLAGS_GPU
$ python cpu_gpu_test.py

Note: If you get a c:\program files (x86)\microsoft visual studio 14.0\vc\include\crtdefs.h(10): fatal error C1083: Cannot open include file: 'corecrt.h': No such file or directory with the above, please see the Reference Note at the end of the Visual Studio 2015 Community Edition Update 3 section.

Almost a 68:1 improvement. It works! Great, we're done with setting up Theano 0.8.2.

Keras 1.1.0

Clone a stable Keras release (1.1.0) to your local machine from GitHub using the following commands:

$ cd /c/toolkits
$ git clone https://github.com/fchollet/keras.git keras-1.1.0 --branch 1.1.0

This should clone Keras 1.1.0 in c:\toolkits\keras-1.1.0:

Install it as follows:

$ cd /c/toolkits/keras-1.1.0
$ python setup.py install --record installed_files.txt

The list of files installed can be found here

Verify Keras was installed by querying Anaconda for the list of installed packages:

$ conda list | grep -i keras

Recent builds of Keras can either use Tensorflow or Theano as a backend. At the time of this writing, TensorFlow supports only 64-bit Python 3.5 on Windows. This doesn't work for us, but if you are using Python 3.5, then by all means, feel free to give it a try. By default, we will use Theano as our backend, using the commands below:

$ cp ~/.keras/keras.json ~/.keras/keras.json.bak
$ echo -e '{\n\t"image_dim_ordering": "th",\n\t"epsilon": 1e-07,\n\t"floatx": "float32",\n\t"backend": "theano"\n}' >> ~/.keras/keras_theano.json
$ echo -e '{\n\t"image_dim_ordering": "tf",\n\t"epsilon": 1e-07,\n\t"floatx": "float32",\n\t"backend": "tensorflow"\n}' >> ~/.keras/keras_tensorflow.json
$ cp -f ~/.keras/keras_theano.json ~/.keras/keras.json

Validating our GPU install with Keras

We can train a simple convnet (convolutional neural network) on the MNIST dataset by using one of the example scripts provided with Keras. The file is called mnist_cnn.py and can be found in the examples folder:

$ THEANO_FLAGS=$THEANO_FLAGS_GPU
$ cd /c/toolkits/keras-1.1.0/examples
$ python mnist_cnn.py

Without cuDNN, each epoch takes about 20s. If you install TechPowerUp's GPU-Z, you can track how well the GPU is being leveraged. Here, in the case of this convnet (no cuDNN), we max out at 92% GPU usage on average:

cuDNN v5.1 (August 10, 2016) for CUDA 8.0 (Conditional)

If you're not going to train convnets then you might not really benefit from installing cuDNN. Per NVidia's website, "cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers," hallmarks of convolution network architectures. Theano is mentioned in the list of frameworks that support cuDNN v5 for GPU acceleration.

If you are going to train convnets, then download cuDNN from here. Choose the cuDNN Library for Windows10 dated August 10, 2016:

The downloaded ZIP file contains three directories (bin, include, lib). Extract those directories and copy the files they contain to the identically named folders in C:\toolkits\cuda-8.0.44.

To enable cuDNN, create a new sysenv variable named THEANO_FLAGS_GPU_DNN with the following value:

floatX=float32,device=gpu,optimizer_including=cudnn,lib.cnmem=0.8,dnn.conv.algo_bwd_filter=deterministic,dnn.conv.algo_bwd_data=deterministic,blas.ldflags=-LC:/toolkits/openblas-0.2.14-int32/bin -lopenblas

Then, run the following commands:

$ THEANO_FLAGS=$THEANO_FLAGS_GPU_DNN
$ cd /c/toolkits/keras-1.1.0/examples
$ python mnist_cnn.py

Note: If you get a cuDNN not available message after this, try cleaning your C:\Users\username\AppData\Local\Theano directory (replace username with your login name). If you get an error similar to cudnn error: Mixed dnn version. The header is from one version, but we link with a different version (5010, 5005), try cuDNN v5.0 instead of cuDNN v5.1. Windows will sometimes also helpfully block foreign .dll files from running on your computer. If that is the case, right click and unblock the files to allow them to be used.

Here's the (cleaned up) execution log for the simple convnet Keras example, using cuDNN:

Now, each epoch takes about 3s, instead of 20s, a large improvement in speed, with slightly lower GPU usage:

The Your cuDNN version is more recent than the one Theano officially supports message certainly sounds ominous but a test accuracy of 0.9899 would suggest that it can be safely ignored. So...

...we're done!

References

Setup a Deep Learning Environment on Windows (Theano & Keras with GPU Enabled), by Ayse Elvan Aydemir

Installation of Theano on Windows, by Theano team

A few tips to install theano on Windows, 64 bits, by Kagglers

How do I install Keras and Theano in Anaconda Python 2.7 on Windows?, by S.O. contributors

Additional Thanks Go To...

Kaggler Vincent L. for recommending adding dnn.conv.algo_bwd_filter=deterministic,dnn.conv.algo_bwd_data=deterministic to THEANO_FLAGS_GPU_DNN in order to improve reproducibility with no observable impact on performance.

If you'd rather use Python3, conda's built-in MinGW package, or pip, please refer to @stmax82's note here.

Suggested viewing/reading

Intro to Deep Learning with Python, by Alec Radford

@ https://www.youtube.com/watch?v=S75EdAcXHKk

@ http://slidesha.re/1zs9M11

@ https://github.com/Newmu/Theano-Tutorials

About the Author

For information about the author, please visit:

https://www.linkedin.com/in/philferriere

Posted by uniqueone
,

dist_func.m

 

matlab dist function

Other posts in the 'Matlab > Source Code' category

Top 10 most popular MATLAB & Simulink file downloads from last year  (0) 2017.03.18
random forest using matlab  (0) 2017.03.12
Some Matlab Code  (0) 2017.03.07
plot standard deviation and mean  (0) 2017.02.08
matlab de2bi source code  (0) 2017.01.17
Posted by uniqueone
,

de2bi1.m

matlab de2bi source code

Other posts in the 'Matlab > Source Code' category

Top 10 most popular MATLAB & Simulink file downloads from last year  (0) 2017.03.18
random forest using matlab  (0) 2017.03.12
Some Matlab Code  (0) 2017.03.07
plot standard deviation and mean  (0) 2017.02.08
matlab dist function  (0) 2017.01.17
Posted by uniqueone
,

I'm posting a summary of Domain Adaptation (DA). It is material I originally presented in my deep learning study group, but since more people seem to be interested in DA these days, I'm sharing it here. Most of the content covers the paper Analysis of Representations for Domain Adaptation, and its ideas were used in the loss functions of Domain Adversarial Training of Neural Networks and Domain Separation Networks.

First, the DA problem is defined as follows. In a source domain S we get labeled data; in a target domain T we only get unlabeled inputs. We want to find a classifier that works well on T. I think this is a very practical setting, applicable to any problem where data is collected in a synthetic environment but the system must operate in the real one.

The goal of DA is to find a good mapping from the input space X to a space Z of features. For the CNNs we are familiar with, this means finding a good convolutional feature map.

To set up the analysis a bit more mathematically, we represent the inputs as a measurable space (X, D) and look for a mapping R into a measurable space (Z, \tilde{D}) of features.

The difference between S and T is expressed as follows. If we are dealing with images, X is the space of images, and the difference between the source and target domains is defined as the difference between the distributions, over this space, of the images we want to classify. That is, there are distributions D_{S} and D_{T} defined on X.

The paper proves two main theorems. The first theorem bounds the expected error on the target domain of any classifier h by the expected error of h on the source domain, a term like the one appearing in VC bounds, the distance between the S and T domains, and a term corresponding to the intrinsic loss of the target function of interest. In the attached notes I wrote out the proof of Theorem 1 in a bit more detail than the paper, so take a look; if you have seen VC-dimension or PAC results before, you will be able to follow it easily.
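
In rough form (a sketch from memory of the Ben-David et al. result, not a verbatim statement), the bound reads

\epsilon_{T}(h) \le \epsilon_{S}(h) + d_{H}(D_{S}, D_{T}) + \lambda,

where \epsilon_{S}, \epsilon_{T} are the source and target expected errors, d_{H} is a distance between the two input distributions, and \lambda captures the intrinsic error of the best joint hypothesis; the VC-style term appears when \epsilon_{S} and d_{H} are estimated from finite samples.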

Thinking once more about the physical meaning of Theorem 1: to build a classifier that works well on T, we must first build one that works well on S, and we must reduce the 'distance' between S and T. The problem is that the definition of this distance involves a sup, so it cannot be approximated from finite samples. The paper therefore proposes another metric that can be approximated well from samples (more precisely, a convex upper bound of it), and remarkably, under this approximation the distance becomes smaller the harder it is to 'distinguish' inputs from S from inputs from T.

The later papers Domain Adversarial Training of Neural Networks and Domain Separation Networks borrow this concept to propose new loss functions: they add a domain classifier that decides whether an input came from S or T, and train so as to 'degrade' that classifier's performance.

Personally, I find the interpretation in the first experiment section of Domain Adversarial Training of Neural Networks particularly nice. It shows each algorithm's decision boundary, illustrating the difference with and without DA.

 

Domain-Adaptation.pdf

 

Posted by uniqueone
,
https://www.mathworks.com/matlabcentral/answers/181820-how-dist-function-works


The dist function is a 'Euclidean distance weight function', which applies weights to an input to get weighted inputs. In your example:

W is the (random) weight matrix, P is the input vector, and Z is the weighted input.

If you type 'edit dist.apply' at the Matlab prompt, you find the formula behind this function. For your example, the weight matrix is subtracted from the transposed and copied vector; the result is squared, and then the square root is taken. This is how the Euclidean norm is defined: Norm = sqrt((a-b)^2)

I have copied the code and simplified the example to make it easier to understand:

clc; clear all; close all;
p=[1;2;3]                      % input vector (R x Q)
w=[1 1 2;1 1 1;1 2 1;1 1 1]    % weight matrix (S x R)
z1=dist(w,p)                   % built-in Euclidean distance weight function
% Equivalent computation, adapted from dist.apply:
S = size(w,1);
Q = size(p,2);
z2 = zeros(S,Q);
if (Q<S)
  p = p';
  copies = zeros(1,S);
  for q=1:Q
    z2(:,q) = sum((w-p(q+copies,:)).^2,2);
  end
else
  w = w';
  copies = zeros(1,Q);
  for i=1:S
    z2(i,:) = sum((w(:,i+copies)-p).^2,1);
  end
end
z2 = sqrt(z2)

Here z1 and z2 should give the same answer.

I hope this makes it clearer.

Kind regards, Christiaan

Posted by uniqueone
,
https://medium.com/@davidbrai/recognizing-traffic-lights-with-deep-learning-23dae23287cc#.8jeuztjfi

https://medium.freecodecamp.com/recognizing-traffic-lights-with-deep-learning-23dae23287cc#.bq4dwhjf4

I recently won first place in the Nexar Traffic Light Recognition Challenge, a computer vision competition organized by a company that's building an AI dash cam app.

In this post, I’ll describe the solution I used. I’ll also explore approaches that did and did not work in my effort to improve my model.

Don’t worry — you don’t need to be an AI expert to understand this post. I’ll focus on the ideas and methods I used as opposed to the technical implementation.

Demo of a deep learning based classifier for recognizing traffic lights

The challenge

The goal of the challenge was to recognize the traffic light state in images taken by drivers using the Nexar app. In any given image, the classifier needed to output whether there was a traffic light in the scene, and whether it was red or green. More specifically, it should only identify traffic lights in the driving direction.

Here are a few examples to make it clearer:

The images above are examples of the three possible classes I needed to predict: no traffic light (left), red traffic light (center) and green traffic light (right).

The challenge required the solution to be based on Convolutional Neural Networks, a very popular method used in image recognition with deep neural networks. The submissions were scored based on the model’s accuracy along with the model’s size (in megabytes). Smaller models got higher scores. In addition, the minimum accuracy required to win was 95%.

Nexar provided 18,659 labeled images as training data. Each image was labeled with one of the three classes mentioned above (no traffic light / red / green).

Software and hardware

I used Caffe to train the models. The main reason I chose Caffe was because of the large variety of pre-trained models.

Python, NumPy & Jupyter Notebook were used for analyzing results, data exploration and ad-hoc scripts.

Amazon’s GPU instances (g2.2xlarge) were used to train the models. My AWS bill ended up being $263 (!). Not cheap. 😑

The code and files I used to train and run the model are on GitHub.

The final classifier

The final classifier achieved an accuracy of 94.955% on Nexar’s test set, with a model size of ~7.84 MB. To compare, GoogLeNet uses a model size of 41 MB, and VGG-16 uses a model size of 528 MB.

Nexar was kind enough to accept 94.955% as 95% to pass the minimum requirement 😁.

The process of getting higher accuracy involved a LOT of trial and error. Some of it had some logic behind it, and some was just “maybe this will work”. I’ll describe some of the things I tried to improve the model that did and didn’t help. The final classifier details are described right after.

What worked?

Transfer learning

I started off with trying to fine-tune a model which was pre-trained on ImageNet with the GoogLeNet architecture. Pretty quickly this got me to >90% accuracy! 😯

Nexar mentioned in the challenge page that it should be possible to reach 93% by fine-tuning GoogLeNet. Not exactly sure what I did wrong there, I might look into it.

SqueezeNet

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size.

Since the competition rewards solutions that use small models, early on I decided to look for a compact network with as few parameters as possible that can still produce good results. Most of the recently published networks are very deep and have a lot of parameters. SqueezeNet seemed to be a very good fit, and it also had a pre-trained model trained on ImageNet available in Caffe’s Model Zoo which came in handy.

SqueezeNet network architecture. Slides

The network manages to stay compact by:

  • Using mostly 1x1 convolution filters and some 3x3
  • Reducing number of input channels into the 3x3 filters

For more details, I recommend reading this blog post by Lab41 or the original paper.

After some back and forth with adjusting the learning rate I was able to fine-tune the pre-trained model as well as training from scratch with good accuracy results: 92%! Very cool! 🙌

Rotating images

Source: Nexar

Most of the images were horizontal like the one above, but about 2.4% were vertical, and with all kinds of directions for “up”. See below.

Different orientations of vertical images. Source: Nexar challenge

Although it’s not a big part of the data-set, we want our model classify them correctly too.

Unfortunately, there was no EXIF data in the jpeg images specifying the orientation. At first I considered doing some heuristic to identify the sky and flip the image accordingly, but that did not seem straightforward.

Instead, I tried to make the model invariant to rotations. My first attempt was to train the network with random rotations of 0°, 90°, 180°, 270°. That didn’t help 🤔. But when averaging the predictions of 4 rotations for each image, there was improvement!

92% → 92.6% 👍

To clarify: by “averaging the predictions” I mean averaging the probabilities the model produced of each class across the 4 image variations.
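
A minimal sketch of that averaging, assuming a predict_fn that maps an image to a vector of class probabilities (a hypothetical interface, not the actual competition code):

import numpy as np

def predict_with_rotations(predict_fn, image):
    # Average class probabilities over the four right-angle rotations.
    probs = [predict_fn(np.rot90(image, k)) for k in range(4)]
    return np.mean(probs, axis=0)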

Oversampling crops

During training the SqueezeNet network first performed random cropping on the input images by default, and I didn’t change it. This type of data augmentation makes the network generalize better.

Similarly, when generating predictions, I took several crops of the input image and averaged the results. I used 5 crops: 4 corners and a center crop. The implementation was free by using existing caffe code for this.

92% → 92.46% 👌

Rotating images together with oversampling crops showed very slight improvement.
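
The crop-averaging at prediction time can be sketched the same way (the 227 x 227 crop size follows SqueezeNet's input; predict_fn is again a hypothetical interface):

import numpy as np

def five_crops(image, size=227):
    # 4 corner crops plus a center crop, as used at prediction time.
    h, w = image.shape[:2]
    offsets = [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size),
               ((h - size) // 2, (w - size) // 2)]
    return [image[t:t + size, l:l + size] for t, l in offsets]

def predict_with_crops(predict_fn, image):
    return np.mean([predict_fn(c) for c in five_crops(image)], axis=0)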

Additional training with lower learning rate

All models were starting to overfit after a certain point. I noticed this by watching the validation-set loss start to rise at some point.

Validation loss rising from around iteration 40,000

I stopped the training at that point because the model was probably not generalizing any more. This meant that the learning rate didn’t have time to decay all the way to zero. I tried resuming the training process at the point where the model started overfitting with a learning rate 10 times lower than the original one. This usually improved the accuracy by 0-0.5%.

More training data

At first, I split my data into 3 sets: training (64%), validation (16%) & test (20%). After a few days, I thought that giving up 36% of the data might be too much. I merged the training & validations sets and used the test-set to check my results.

I retrained a model with “image rotations” and “additional training at lower rate” and saw improvement:

92.6% → 93.5% 🤘

Relabeling mistakes in the training data

When analyzing the mistakes the classifier had on the validation set, I noticed that some of the mistakes have very high confidence. In other words, the model is certain it’s one thing (e.g. green light) while the training data says another (e.g. red light).

Notice that in the plot above, the right-most bar is pretty high. That means there’s a high number of mistakes with >95% confidence. When examining these cases up close I saw these were usually mistakes in the ground-truth of the training set rather than in the trained model.

I decided to fix these errors in the training set. The reasoning was that these mistakes confuse the model, making it harder for it to generalize. Even if the final testing-set has mistakes in the ground-truth, a more generalized model has a better chance of high accuracy across all the images.

I manually labeled 709 images that one of my models got wrong. This changed the ground-truth for 337 out of the 709 images. It took about an hour of manual work with a python script to help me be efficient.

Above is the same plot after re-labeling and retraining the model. Looks better!

This improved the previous model by:

93.5% → 94.1% ✌️

Ensemble of models

Using several models together and averaging their results improved the accuracy as well. I experimented with different kinds of modifications in the training process of the models involved in the ensemble. A noticeable improvement was achieved by using a model trained from scratch even though it had lower accuracy on its own together with the models that were fine-tuned on pre-trained models. Perhaps this is because this model learned different features than the ones that were fine-tuned on pre-trained models.

The ensemble used 3 models with accuracies of 94.1%, 94.2% and 92.9% and together got an accuracy of 94.8%. 👾

What didn’t work?

Lots of things! 🤕 Hopefully some of these ideas can be useful in other settings.

Combatting overfitting

While trying to deal with overfitting I tried several things, none of which produced significant improvements:

  • increasing the dropout ratio in the network
  • more data augmentation (random shifts, zooms, skews)
  • training on more data: using 90/10 split instead of 80/20

Balancing the dataset

The dataset wasn’t very balanced:

  • 19% of images were labeled with no traffic light
  • 53% red light
  • 28% green light.

I tried balancing the dataset by oversampling the less common classes but didn’t notice any improvement.

Separating day & night

My intuition was that recognizing traffic lights in daylight and nighttime is very different. I thought maybe I could help the model by separating it into two simpler problems.

It was fairly easy to separate the images to day and night by looking at their average pixel intensity:

You can see a very natural separation of images with low average values, i.e. dark images, taken at nighttime, and bright images, taken at daytime.
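
A sketch of that split; the threshold value is hypothetical and would be read off the histogram of your own data:

import numpy as np

def is_daytime(image, threshold=90):
    # Mean pixel intensity over all pixels and channels; dark (night)
    # images fall below the threshold, bright (day) images above it.
    return float(np.mean(image)) > threshold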

I tried two approaches, both didn’t improve the results:

  • Training two separate models for day images and night images
  • Training the network to predict 6 classes instead of 3 by also predicting whether it’s day or night

Using better variants of SqueezeNet

I experimented a little bit with two improved variants of SqueezeNet. The first used residual connections and the second was trained with dense→sparse→dense training (more details in the paper). No luck. 😕

Localization of traffic lights

After reading a great post by deepsense.io on how they won the whale recognition challenge, I tried to train a localizer, i.e. identify the location of the traffic light in the image first, and then identify the traffic light state on a small region of the image.

I used sloth to annotate about 2,000 images which took a few hours. When trying to train a model, it was overfitting very quickly, probably because there was not enough labeled data. Perhaps this could work if I had annotated a lot more images.

Training a classifier on the hard cases

I chose 30% of the "harder" images by selecting images which my classifier was less than 97% confident about. I then tried to train a classifier just on these images. No improvement. 😑

Different optimization algorithm

I experimented very shortly with using Caffe’s Adam solver instead of SGD with linearly decreasing learning rate but didn’t see any improvement. 🤔

Adding more models to ensemble

Since the ensemble method proved helpful, I tried to double-down on it. I tried changing different parameters to produce different models and add them to the ensemble: initial seed, dropout rate, different training data (different split), different checkpoint in the training. None of these made any significant improvement. 😞

Final classifier details

The classifier uses an ensemble of 3 separately trained networks. A weighted average of the probabilities they give to each class is used as the output. All three networks were using the SqueezeNet network but each one was trained differently.

Model #1 — Pre-trained network with oversampling

Trained on the re-labeled training set (after fixing the ground-truth mistakes). The model was fine-tuned based on a pre-trained model of SqueezeNet trained on ImageNet.

Data augmentation during training:

  • Random horizontal mirroring
  • Randomly cropping patches of size 227 x 227 before feeding into the network

At test time, the predictions of 10 variations of each image were averaged to calculate the final prediction. The 10 variations were made of:

  • 5 crops of size 227 x 227: 1 for each corner and 1 in the center of the image
  • for each crop, a horizontally mirrored version was also used

Model accuracy on validation set: 94.21%
Model size: ~2.6 MB

Model #2 — Adding rotation invariance

Very similar to Model #1, with the addition of image rotations. During training time, images were randomly rotated by 90°, 180°, 270° or not at all. At test-time, each one of the 10 variations described in Model #1 created three more variations by rotating it by 90°, 180° and 270°. A total of 40 variations were classified by our model and averaged together.

Model accuracy on validation set: 94.1%
Model size: ~2.6 MB

Model #3 — Trained from scratch

This model was not fine-tuned, but instead trained from scratch. The rationale behind it was that even though it achieves lower accuracy, it learns different features on the training set than the previous two models, which could be useful when used in an ensemble.

Data augmentation during training and testing are the same as Model #1: mirroring and cropping.

Model accuracy on validation set: 92.92%
Model size: ~2.6 MB

Combining the models together

Each model output three values, representing the probability that the image belongs to each one of the three classes. We averaged their outputs with the following weights:

  • Model #1: 0.28
  • Model #2: 0.49
  • Model #3: 0.23

The values for the weights were found by doing a grid-search over possible values and testing it on the validation set. They are probably a little overfitted to the validation set, but perhaps not too much since this is a very simple operation.
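
A sketch of the weighted average and the grid search over weights (my own reconstruction of the procedure described above, with a coarse hypothetical step size):

import numpy as np

def ensemble_predict(probs_list, weights):
    # probs_list: one (n_samples, n_classes) probability array per model.
    stacked = np.stack(probs_list)                   # (n_models, n_samples, n_classes)
    w = np.asarray(weights, dtype=float)[:, None, None]
    return (stacked * w).sum(axis=0) / w.sum()

def grid_search_weights(probs_list, labels, step=0.05):
    # Brute-force all weight triples summing to 1, scored on the validation set.
    best, best_acc = None, 0.0
    for w1 in np.arange(0.0, 1.0 + 1e-9, step):
        for w2 in np.arange(0.0, 1.0 - w1 + 1e-9, step):
            w3 = 1.0 - w1 - w2
            preds = ensemble_predict(probs_list, [w1, w2, w3]).argmax(axis=1)
            acc = float((preds == labels).mean())
            if acc > best_acc:
                best, best_acc = (w1, w2, w3), acc
    return best, best_acc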

Model accuracy on validation set: 94.83%
Model size: ~7.84 MB
Model accuracy on Nexar’s test set: 94.955% 🎉

Examples of the model mistakes

Source: Nexar

The green dot in the palm tree produced by the glare probably made the model predict there’s a green light by mistake.

Source: Nexar

The model predicted red instead of green. Tricky case when there is more than one traffic light in the scene.

The model said there’s no traffic light while there’s a green traffic light ahead.

Conclusion

This was the first time I applied deep learning on a real problem! I was happy to see it worked so well. I learned a LOT during the process and will probably write another post that will hopefully help newcomers waste less time on some of the mistakes and technical challenges I had.

I want to thank Nexar for providing this great challenge and hope they organize more of these in the future! 🙌


If you enjoyed reading this post, please tap below!

Would love to get your feedback and questions below!


Posted by uniqueone
,
http://archive.gamedev.net/archive/reference/programming/features/imageproc/page2.html


Algorithm 9 : Embossing effect filter
  • Convert the image into grayscale
  • For every pixel ( i , j ) on the output bitmap
    • Compute its color using formula (R13)
    • Set the pixel
#define emboss_w 3
#define emboss_h 3    

sumr=0;
sumg=0;
sumb=0;
    
int emboss_filter[emboss_w][emboss_h]={{2,0,0},{0,-1,0},{0,0,-1}};
int emboss_sum=1;
 
/* Pass 1: convert the image to grayscale (stored in temp1). */
for(i=1;i<temp->w-1;i++){
  for(j=1;j<temp->h-1;j++){
    color=getpixel(temp,i,j);
    r=getr32(color);
    g=getg32(color);
    b=getb32(color);
    h=(r+g+b)/3;
    if(h>255)
      h=255;
    if(h<0)
      h=0;
    putpixel(temp1,i,j,makecol(h,h,h));
  }
}

/* Pass 2: convolve with the 3x3 emboss kernel, add a 128 bias,
   clamp to [0,255], and write the result to temp2. */
for(i=1;i<temp->w-1;i++){
  for(j=1;j<temp->h-1;j++){
    sumr=0;
    for(k=0;k<emboss_w;k++){
      for(l=0;l<emboss_h;l++){
        color=getpixel(temp1,i-((emboss_w-1)>>1)+k,j-((emboss_h-1)>>1)+l);
        r=getr32(color);
        sumr+=r*emboss_filter[k][l];
      }
    }
    sumr/=emboss_sum;
    sumr+=128;
    if(sumr>255)
      sumr=255;
    if(sumr<0)
      sumr=0;
    putpixel(temp2,i,j,makecol(sumr,sumr,sumr));
  }
}

Here are the effects of this algorithm:


Picture 9: Embossing filter

Posted by uniqueone
,

https://youtu.be/cF7tIo6Njo4
Posted by uniqueone
,

http://www.gergltd.com/home/2015/04/installing-theano-in-windows-7-64-bit/


Installing Theano in Windows 7 64-bit

My instructions for installing Theano 0.6 with

  • Windows 7-64 bit
  • Anaconda 2.1.0 (Python 2.7).  This tutorial only works with 2.1.0.  I tested it with 2.2.0 and it did not work.  I have no plans to fix this issue.
  • CUDA 7.0

Steps

  1. Download Anaconda 2.1.0 from here.
  2. Install Theano via the command line using “pip install https://pypi.python.org/packages/source/T/Theano/Theano-0.6.0.zip#md5=0a2211b250c358809014adb945dd0ba7”
  3. Create a .theanorc.txt file in your user area (C:\Users\username\.theanorc.txt) with the specified text listed below.
  4. Open Anaconda
  5. Import and test/build theano by typing import theano and then theano.test()
  6. Sit back and relax while everything builds.

.theanorc.txt file contents (you must create at %USERDIR%/.theanorc.txt, for me this is c:\users\username\.theanorc.txt)

[global]
openmp=False
device = gpu0
floatX = float32

[blas]
ldflags=

Notes

If you get an error about “CVM,” you must delete the cache files that are in C:\Users\MyUsername\AppData\Local\Theano. Once you delete everything, start python again and continue from there.

If you have path issues when trying to import theano, try using the Visual Studio 64-bit command prompt if you have it.  It sets a bunch of paths for you and “just works” for me.  For reference, the path I use is:

PATH=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\amd64;C:\Windows\Microsoft.NET\Framework64\v4.0.30319;C:\Windows\Microsoft.NET\Framework64\v3.5;C:\Program Files (x86)\Microsoft Visual
Studio 10.0\VC\VCPackages;C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE;C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Tools;C:\Program Files (x86)\HTML Help Workshop;C:
\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64;C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\x64;C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin;C:\Program
 Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\libnvvp;C:\Python34\Lib\site-packages\PyQt5;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Co
mmon;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\libnvvp;C:\Program Files (x86)\Intel\iCLS Client\;C:\Program Files\Intel\iCLS C
lient\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel
(R) Management Engine Components\IPT;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\MATL
AB\R2014a\bin;C:\Program Files\TortoiseHg\;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\D
TS\Binn\;C:\Users\username\AppData\Local\Continuum\Anaconda;C:\Users\username\AppData\Local\Continuum\Anaconda\Scripts

Update June 11, 2015
Added link to Anaconda download


Posted by uniqueone
,
Python

A Byte of Python, Korean translation by Jeongbin Park (PDF)
Python for Everyone (모두의 파이썬): programming basics in 20 days
Python 2.7 for absolute beginners (왕초보를 위한 Python 2.7)
Jump to Python (점프 투 파이썬) - Python 3
R

Practical Data Processing & Analysis with R (R을 이용한 데이터 처리 & 분석 실무) - Minkoo Seo (HTML, PDF - older version)
The R Manuals (translated in Korean)
Posted by uniqueone
,

This is a (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset.

 

https://github.com/DrSleep/tensorflow-deeplab-resnet

Posted by uniqueone
,
http://www.datasciencecentral.com/m/blogpost?id=6448529%3ABlogPost%3A504738
Posted by uniqueone
,
https://lepisma.github.io/articles/2015/07/30/up-with-theano-and-cuda/


Up and running with Theano (GPU) + PyCUDA on Windows


Getting CUDA to work with Python on Windows is really frustrating. It's not exactly hard, but it sure is irritating when you start doing it. If you have ever tried it, you might know that many possible combinations of compilers, CUDA toolkit, Python etc. don't work.

This post describes the steps that I followed for a working setup of theano working with GPU acceleration and PyCUDA for general access to GPU from python. Hopefully, it will help if you haven’t found the sweet spot yet.

Setting up

Starting with my machine: it is a Pavilion DV6 7012tx laptop with an Nvidia GeForce GT 630m card. Right now it's running Windows 10 x64. If you already have a cygwin- or mingw-based gcc in place, you might want to remove it, since our scientific python stack will provide that.

1. Install Visual Studio

This is needed to get Nvidia’s CUDA compiler (nvcc) working. For choosing the version, go to the latest CUDA on Windows doc and see which version of visual studio the current CUDA toolkit supports.

At the time of writing, CUDA 7 was the latest release and Visual Studio 2013 was the latest supported version. You also don't need to install the 2008 or 2010 version of the compiler for python. This will be taken care of later; just go with everything latest.

After installation, you don’t actually need to add cl.exe (usually in a directory like C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin, depending on your Visual Studio version) to PATH for theano since we will define this explicitly in .theanorc, but it is better to do this as many other tools might be using it.

2. Install CUDA toolkit

This should be easy, get the latest CUDA and install it. Keep the samples while installing, they are nice for checking if things are working fine.

3. Setup Python

This is where most of the trouble is. It's easy to get lost while setting up vanilla Python for theano, especially since you are also setting up gcc and related tools. The theano installation tutorial will bog you down in this phase if you don't read it carefully. Most likely you would end up downloading lots of legacy Visual Studio versions and other stuff. We won't be going that way.

Install a scientific Python distribution like Anaconda. I haven't tried setting up theano using other distributions, but this should be one of the easier ways because of the conda package manager. It really relieves you from setting up a separate mingw environment and from handling commonly used libraries, which are as easy as conda install boost in Anaconda.

If you feel Anaconda is a bit too heavy, try miniconda and adding basic packages like numpy on top of it.

Once you install Anaconda, install additional dependencies.

conda install mingw libpython

4. Install theano

Install theano using pip install theano and create a .theanorc file in your HOME directory with following contents.

[global]
floatX = float32
device = gpu

[nvcc]
flags=-LC:\Anaconda\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin

Make sure to change the path C:\Anaconda\libs according to your Anaconda install directory and compiler_bindir to the path with cl.exe in it.

5. Install PyCUDA

Best way to install PyCUDA is to get the Unofficial Windows Binaries by Christoph Gohlke here.

For quick setup, you can use pipwin which basically automates the process of installing Gohlke’s packages.

pip install pipwin
pipwin install pycuda

6. Testing it out

Theano

A very basic test is to simply import theano.

import theano
Using gpu device 0: GeForce GT 630M (CNMeM is disabled)

This should tell you if the GPU is getting used.

One error says that CUDA is installed, but device gpu is not available. For me, this was solved after installing mingw and libpython via conda since Anaconda doesn’t setup gcc along with it as it used to do earlier.

For a more extensive test, try the following snippet taken from theano docs.

from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print f.maker.fgraph.toposort()
t0 = time.time()
for i in xrange(iters):
    r = f()
t1 = time.time()
print 'Looping %d times took' % iters, t1 - t0, 'seconds'
print 'Result is', r
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print 'Used the cpu'
else:
    print 'Used the gpu'

You should see something like this.

Using gpu device 0: GeForce GT 630M (CNMeM is disabled)
[GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 1.42199993134 seconds
Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761
  1.62323296]
Used the gpu
PyCUDA

Here is a quick test snippet from the PyCUDA web page here

import pycuda.autoinit
import pycuda.driver as drv
import numpy

from pycuda.compiler import SourceModule
mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
  const int i = threadIdx.x;
  dest[i] = a[i] * b[i];
}
""")

multiply_them = mod.get_function("multiply_them")

a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)

dest = numpy.zeros_like(a)
multiply_them(
        drv.Out(dest), drv.In(a), drv.In(b),
        block=(400,1,1), grid=(1,1))

print dest-a*b

Seeing an array of zeros? It's working fine then.

Hopefully, this should give you a working pythonish CUDA setup using the latest versions of VS, Windows, Python etc.

Posted by uniqueone
,
If a CNN denoises images knowing their content, it gets better results. Simple idea, great paper! https://arxiv.org/abs/1701.01698 Deep Class Aware Denoising
Posted by uniqueone
,
http://bcho.tistory.com/m/1156
Posted by uniqueone
,
http://www.kalitut.com/2017/01/machine-learning-book.html?m=1
Posted by uniqueone
,


https://m.youtube.com/watch?feature=share&v=ggqnxyjaKe4
Posted by uniqueone
,
Received a YouTube video: https://www.youtube.com/watch?v=Ws63I3F7Moc
Posted by uniqueone
,

 

https://www.youtube.com/watch?v=-9XyUr2F9dU

 

 

https://gettocode.com/2016/12/02/keras-on-theano-and-tensorflow-on-windows-and-linux/

 

 

https://www.youtube.com/watch?v=r7-WPbx8VuY&feature=youtu.be


Other posts in the 'ref_sites' category

20170201_Calling Matlab from Java  (0) 2017.02.01
20170130_svm params tuning  (0) 2017.01.30
20170106_matlab parameter tuning  (0) 2017.01.05
20170103 matlab colorbar  (0) 2017.01.03
20161215_svm parameter optimization matlab  (0) 2016.12.15
Posted by uniqueone
,

Understanding Jupyter Notebook
https://www.slideshare.net/mobile/dahlmoon/jupyter-notebok-20160815?from_m_app=ios
Posted by uniqueone
,

https://gettocode.com/2016/12/02/keras-on-theano-and-tensorflow-on-windows-and-linux/
Posted by uniqueone
,

https://youtu.be/r7-WPbx8VuY
Posted by uniqueone
,

https://youtu.be/BtDgICVvkHE
Posted by uniqueone
,