78 posts for '2019/10'

2019.10.13 ICYMI: NADS-Net: Driver and Seat Belt Detection via Convolutional Neural Network! https://www.profillic.com/paper/arxiv:1910.03695
2019.10.11 Hello, this is Hoseong Lee from SUALAB. I have written a Korean-language review of the Object Detection paper "Gaussian YOLOv3. An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving", accepted to ICCV 2019 ..
2019.10.10 Computer Vision, Deep Learning, and OpenCV step-by-step guides! - pyimagesearch
2019.10.08 Hello :) The Full Stack Deep Learning Bootcamp lectures that Chanseong shared on Friday were so good that I took the course right away and wrote up my notes. Not just limited content such as how to do serving in production ..
2019.10.07 Hi guys, Do you want to build computer vision models for cattle monitoring? I the COCO json, masks, and images freely available here: https://nsmb.me/aw0f I'm planning on sharing more, maybe writing tutorials if anybody is interested. Would love to g..
2019.10.04 State of the art in Gait recognition: Novel AutoEncoder framework, GaitNet https://www.profillic.com/paper/arxiv:1909.03051 "With extensive experiments on CASIA-B, USF, and FVG datasets, our method demonstrates superior performance to the SOTA quantit..
2019.10.04 ICYMI: Breast Cancer Diagnosis with Transfer Learning and Global Pooling https://www.profillic.com/paper/arxiv:1909.11839 (Breast cancer is one of the most common causes of cancer-related death in women worldwide) (The proposed network architecture u..
2019.10.04 Modern problems require modern solutions: Protecting privacy using deepfakes https://www.profillic.com/paper/arxiv:1909.04538
2019.10.02 Built by Stanford researchers: TunaGAN: Modify high-resolution face images with good qualitative and quantitative performance. https://www.profillic.com/paper/arxiv:1908.06163
2019.10.02 Precisely estimating a robot’s pose in a prior, global map is a challenge in unstructured, dynamic environments. https://www.profillic.com/paper/arxiv:1909.12837 Solution: SegMap: a map representation solution for localization and mapping
2019.10.02 The background of RAdam, an improvement on Adam, and how to run it
2019.10.02 Great applications for the fashion industry: Poly-GAN: Garments are automatically placed on images of human models at an arbitrary pose https://www.profillic.com/paper/arxiv:1909.02165
2019.10.01 Top 20 Data Science interview questions and answers.
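2019.10.01 Hello, this is Hoseong Lee from SUALAB. A while ago I shared a simple visualization of the ICCV 2019 conference; this time, based on my own interests, I picked 22 of the 1077 papers and, in 2-3 lines per paper..
2019.10.01 In this study session we covered XNOR-Net, which approaches BNN training by defining and optimizing a cost function, and SqueezeNet, which reduces the network through a combination of 1x1 and 3x3 convolutions! Both presenters ..
2019.10.01 This is the 197th paper review from PR12, the TensorFlow Korea paper reading group
2019.10.01 A while ago, while converting a Keras model (SavedModel) to a tflite model, all of the bn (batch normalization) layers were dropped; when I looked into it, I came across a post saying that tflite does not support bn. By any chance, does tflite still ..
2019.10.01 I am writing this post to share what I have felt while studying AI, along with my study materials.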

ICYMI: NADS-Net: Driver and Seat Belt Detection via Convolutional Neural Network! https://www.profillic.com/paper/arxiv:1910.03695

Deep Learning/Papers2read 2019. 10. 13. 12:34

ICYMI: NADS-Net: Driver and Seat Belt Detection via Convolutional Neural Network!

https://www.profillic.com/paper/arxiv:1910.03695
https://www.facebook.com/groups/PyTorchKR/permalink/1497904133682596/?sfnsn=mo

Posted by uniqueone

Hello, this is Hoseong Lee from SUALAB. I have written a Korean-language review of the Object Detection paper "Gaussian YOLOv3. An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving", accepted to ICCV 2019 ..

Deep Learning/Papers2read 2019. 10. 11. 18:35

Hello, this is Hoseong Lee from SUALAB.

I have written up a Korean-language review of an Object Detection paper accepted to ICCV 2019,
"Gaussian YOLOv3. An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving".

https://hoya012.github.io/blog/Tutorials-of-Object-Detection-Using-Deep-Learning-GaussianYOLOv3/

Most Object Detection algorithms, including the existing YOLO, predict class and objectness information as probabilities but predict bounding box coordinates deterministically. The paper proposes a method to address this shortcoming.

Using this approach, the paper also predicts localization uncertainty and uses it to reduce the model's false positives and improve overall accuracy.
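
As a rough illustration of the idea (not the authors' code), each box coordinate can be modeled as a Gaussian with a predicted mean and variance and trained with a negative log-likelihood loss, so the predicted variance acts as the localization uncertainty. The function names below are hypothetical, and the way the uncertainty scales the detection score is a simplified assumption that each variance lies in [0, 1].

```python
import math

def gaussian_nll(target, mean, variance, eps=1e-9):
    """Negative log-likelihood of `target` under a Gaussian N(mean, variance)."""
    variance = max(variance, eps)  # guard against log(0) and division by zero
    return 0.5 * math.log(2.0 * math.pi * variance) + (target - mean) ** 2 / (2.0 * variance)

def uncertainty_weighted_score(objectness, class_prob, coord_variances):
    """Scale a detection score down by the mean localization uncertainty.

    Assumption: each variance is in [0, 1], e.g. the output of a sigmoid.
    """
    mean_uncertainty = sum(coord_variances) / len(coord_variances)
    return objectness * class_prob * (1.0 - mean_uncertainty)

# A confident box keeps most of its score; an uncertain box is suppressed.
confident = uncertainty_weighted_score(0.9, 0.8, [0.05, 0.05, 0.05, 0.05])
uncertain = uncertainty_weighted_score(0.9, 0.8, [0.6, 0.6, 0.6, 0.6])
print(confident > uncertain)
```

A box whose coordinates are predicted with high variance ends up with a lower score, which is one way an uncertain detection can fall below the confidence threshold and stop counting as a false positive.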

The full review is available on my blog! I hope it helps with your studies. Thank you!
https://www.facebook.com/groups/PyTorchKR/permalink/1497904133682596/?sfnsn=mo

Posted by uniqueone

Computer Vision, Deep Learning, and OpenCV step-by-step guides! - pyimagesearch

Uncategorized 2019. 10. 10. 14:07

https://www.pyimagesearch.com/start-here/

What do you need help with?

How Do I Get Started?

You’re interested in Computer Vision, Deep Learning, and OpenCV…but you don’t know how to get started.

Follow these steps to get OpenCV configured/installed on your system, learn the fundamentals of Computer Vision, and graduate to more advanced topics, including Deep Learning, Face Recognition, Object Detection, and more!

Step #1: Install OpenCV + Python on Your System (Beginner)
- Before you can start learning OpenCV you first need to install the OpenCV library on your system.
- By far the easiest way to install OpenCV is via pip:
  - Install OpenCV the “easy way” using pip
- However, for the full, optimized install I would recommend compiling from source:
- Compiling from source will take longer and requires basic Unix command line and Operating System knowledge (but is worth it for the full install).
- If you’re brand new to OpenCV and/or Computer Science in general, I would recommend you follow the pip install. Otherwise, you can compile from source.
  - If you run into any problems compiling from source you should revert to the pip install method.
- Please note that I do not support Windows.
  - I do not recommend Windows for Computer Vision, Deep Learning, and OpenCV.
  - Furthermore, I have not used the Windows OS in over 10 years so I cannot provide support for it.
  - If you are using Windows and want to install OpenCV, be sure to follow the official OpenCV documentation.
    - Once you have OpenCV installed on your Windows system all code examples included in my tutorials should work (just understand that I cannot provide support for them if you are using Windows).
- If you are struggling to configure your development environment be sure to take a look at my book, Practical Python and OpenCV, which includes a pre-configured VirtualBox Virtual Machine.
  - All you need to do is install VirtualBox, download the VM file, import it and load the pre-configured development environment.
  - And best of all, this VM will work on Linux, macOS, and Windows!
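
Once the install finishes, a quick sanity check can confirm that Python sees the library. This is a minimal sketch rather than part of the original guide; the helper name `opencv_version` is hypothetical, and it assumes the pip route installs the `opencv-contrib-python` package.

```python
import importlib

def opencv_version():
    """Return the installed OpenCV version string, or None if cv2 cannot be imported."""
    try:
        cv2 = importlib.import_module("cv2")
    except ImportError:
        return None
    return cv2.__version__

version = opencv_version()
if version is None:
    print("OpenCV not found; try: pip install opencv-contrib-python")
else:
    print(f"OpenCV {version} is installed")
```
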
Step #2: Understand Command Line Arguments (Beginner)
- Command line arguments aren’t a Computer Vision concept but they are used heavily here on PyImageSearch and elsewhere online.
- If you intend on studying advanced Computer Science topics such as Computer Vision and Deep Learning then you need to understand command line arguments:
  - Python, argparse, and command line arguments
- Take the time now to understand them as they are a crucial Computer Science topic that cannot, under any circumstance, be overlooked.
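
To make the idea concrete, here is a small `argparse` sketch. The `--image` and `--width` flags are illustrative choices, not taken from a specific tutorial, and the argument list is passed explicitly so the snippet also runs outside a terminal.

```python
import argparse

def build_parser():
    """Create a parser with one required and one optional argument."""
    parser = argparse.ArgumentParser(description="Report which image would be processed.")
    parser.add_argument("-i", "--image", required=True, help="path to the input image")
    parser.add_argument("-w", "--width", type=int, default=300, help="resize width in pixels")
    return parser

# In a real script you would call parse_args() with no arguments so that it
# reads sys.argv; an explicit list is used here to keep the example self-contained.
args = vars(build_parser().parse_args(["--image", "example.png"]))
print(f"image={args['image']} width={args['width']}")
```

Saved as a script that calls `parse_args()` without a list, this would be run as `python script.py --image example.png --width 500`.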
Step #3: Learn OpenCV by Example (Beginner)
- Congrats, you are now ready to learn the fundamentals of Computer Vision and the OpenCV library!
- This OpenCV Tutorial will teach you the basics of the OpenCV library, including:
  - Loading an image
  - Accessing individual pixels
  - Array/Region of Interest (ROI) cropping
  - Resizing images
  - Rotating an image
  - Edge detection
  - Thresholding
  - Drawing lines, rectangles, circles, and text on an image
  - Masking and bitwise operations
  - Contour and shape detection
  - …and more!
- Additionally, if you want a consolidated review of the OpenCV library that will get you up to speed in less than a weekend, you should take a look at my book, Practical Python and OpenCV.
Step #4: Build OpenCV Mini-Projects (Beginner)
- At this point you have learned the basics of OpenCV and have a solid foundation to build upon.
- Take the time now to follow these guides and practice building mini-projects with OpenCV.
  - To start, I highly recommend you follow this guide on debugging common “NoneType” errors with OpenCV:
    - OpenCV: Resolving NoneType errors
    - You’ll see these types of errors when (1) your path to an input image is incorrect, resulting in cv2.imread returning None or (2) OpenCV cannot properly access your video stream.
    - Trust me, at some point in your Computer Vision/OpenCV career you’ll see this error — take the time now to read the article above to learn how to diagnose and resolve the error.
  - The following tutorials will help you extend your OpenCV knowledge and build on the fundamentals:
  - Contours are a very basic image processing technique — but they are also very powerful if you use them correctly.
    - The following tutorials will teach you the basics of contours with OpenCV:
    - From there, follow this guide to build a document scanner using OpenCV:
      - How to Build a Document Scanner in Just 5 Minutes
    - This tutorial extends the document scanner to create an automatic standardized test (i.e., bubble multiple choice) scanner and grader:
      - Bubble sheet multiple choice scanner and test grader using OMR, Python and OpenCV
- Additionally, I recommend that you take these projects and extend them in some manner, enabling you to gain additional practice.
  - As you work through each tutorial, keep a notepad handy and jot down inspiration as it comes to you.
  - For example:
    - How might you apply the algorithm covered in a tutorial to your particular dataset of images?
    - What would you change if you wanted to filter out specific objects using contours?
  - Make notes to yourself and come back and try to solve these mini-projects later.
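
The "NoneType" errors mentioned in this step are easier to diagnose if you fail early with a clear message. A minimal sketch, assuming a hypothetical helper named `require_image` that wraps whatever `cv2.imread` returned:

```python
def require_image(image, path):
    """Raise a descriptive error instead of letting a None image fail later."""
    if image is None:
        raise FileNotFoundError(
            f"Could not read image at '{path}'; check the path and the file format"
        )
    return image

# Typical use with OpenCV (requires cv2 to be installed):
#   image = require_image(cv2.imread(path), path)
```

Without a check like this, the failure usually surfaces several lines later as an `AttributeError` on `None`, far from the line that caused it.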
Step #5: Solve More Advanced OpenCV Projects (Intermediate)
- Practice makes perfect and Computer Vision/OpenCV are no different.
- After working through the tutorials in Step #4 (and ideally extending them in some manner), you are now ready to apply OpenCV to more intermediate projects.
- My first suggestion is to learn how to access your webcam using OpenCV.
  - The following tutorial will enable you to access your webcam in a threaded, efficient manner:
    - Unifying picamera and cv2.VideoCapture into a single class with OpenCV
    - Again, refer to the resolving NoneType errors post if you cannot access your webcam
- Next, you should learn how to write to video using OpenCV as well as capture “key events” and log them to disk as video clips:
  - Saving key event video clips with OpenCV
  - Writing to video with OpenCV
- Let’s now access a video stream and combine it with contour techniques to build a real-world project:
  - Finding targets in drone and quadcopter video streams using Python and OpenCV
- One of my favorite algorithms to teach computer vision is image stitching:
  - OpenCV panorama stitching
  - Real-time panorama and image stitching with OpenCV
  - Image Stitching with OpenCV and Python
  - These algorithms utilize keypoint detection, local invariant descriptor extraction, and keypoint matching to build a program capable of stitching multiple images together, resulting in a panorama.
- There is a dedicated Optical Character Recognition (OCR) section later in this guide, but it doesn’t hurt to gain some experience with it now:
  - Recognizing digits with OpenCV and Python
- You should also gain some experience using image gradients:
  - Detecting Barcodes in Images with Python and OpenCV
  - Real-time barcode detection in video with Python and OpenCV
- Eventually, you’ll want to build an OpenCV project that can stream your output to a web browser — this tutorial will show you how to do exactly that:
  - OpenCV – Stream video to web browser/HTML page
- The following guides are miscellaneous tutorials that I recommend you work through to gain experience working with various Computer Vision algorithms:
- Again, keep a notepad handy as you work through these projects.
- Practice extending them in some manner to gain additional experience.
Step #6: Pick Your Niche (Intermediate)
- Congratulations, you have now learned the fundamentals of Image Processing, Computer Vision, and OpenCV!
- The Computer Vision field is composed of subfields (i.e., niches), including Deep Learning, Medical Computer Vision, Face Applications, and many others.
  - Many of these fields overlap and intertwine as well — they are not mutually exclusive.
  - That said, as long as you follow this page you’ll always have the proper prerequisites for a given niche, so don’t worry!
- Most readers jump immediately into Deep Learning as it’s one of the most popular fields in Computer Science; however,
Where to Next?
- If you need additional help learning the basics of OpenCV, I would recommend you read my book, Practical Python and OpenCV.
  - This book is meant to be a gentle introduction to the world of Computer Vision and Image Processing through the OpenCV library.
  - And if you don’t know Python, don’t worry!
    - Since I explain every code example in the book line-by-line, 1000s of PyImageSearch readers have used this book to not only learn OpenCV, but also Python at the same time!
- If you’re looking for a more in-depth treatment of the Computer Vision field, I would instead recommend the PyImageSearch Gurus course.
  - The PyImageSearch Gurus course is similar to a college survey course in Computer Vision, but much more hands-on and practical (including well documented source code examples).
- Otherwise, my personal recommendation would be to jump into the Deep Learning section — most PyImageSearch readers who are interested in Computer Vision are also interested in Deep Learning as well.

Deep Learning

Deep Learning algorithms are capable of obtaining unprecedented accuracy in Computer Vision tasks, including Image Classification, Object Detection, Segmentation, and more.

Follow these steps and you’ll have enough knowledge to start applying Deep Learning to your own projects.

Step #1: Configure your Deep Learning environment (Beginner)
- Before you can apply Deep Learning to your projects, you first need to configure your Deep Learning development environment.
- The following guides will help you install Keras, TensorFlow, OpenCV, and all other necessary CV and DL libraries you need to be successful when applying Deep Learning to your own projects:
  - Ubuntu 18.04: Install TensorFlow and Keras for Deep Learning
  - macOS: Install TensorFlow and Keras for Deep Learning
- Again, I do not provide support for the Windows OS.
  - I do not recommend Windows for Computer Vision and Deep Learning.
  - Definitely consider using a Unix-based OS (i.e., Ubuntu, macOS, etc.) when building your Computer Vision and Deep Learning projects.
- If you are struggling to configure your Deep Learning development environment, you can:
  - Use my Pre-configured Amazon AWS deep learning AMI with Python
  - Pick up a copy of my book, Deep Learning for Computer Vision with Python, which includes a VirtualBox Virtual Machine with all the DL and CV libraries you need pre-configured and pre-installed.
    - All you need to do is install VirtualBox, download the VM file, import it and load the pre-configured development environment.
    - And best of all, this VM will work on Linux, macOS, and Windows!
Step #2: Train Your First Neural Network (Beginner)
- Provided that you have successfully configured your Deep Learning development environment, you can move now to training your first Neural Network!
- I recommend starting with this tutorial which will teach you the basics of the Keras Deep Learning library:
  - Keras Tutorial: How to get started with Keras, Deep Learning, and Python
- After that, you should read this guide on training LeNet, a classic Convolutional Neural Network that is both simple to understand and easy to implement:
  - LeNet – Convolutional Neural Network in Python
  - Implementing LeNet by hand is often the “Hello, world!” of deep learning projects.
Step #3: Understand Convolutional Neural Networks (Beginner)
- Convolutional Neural Networks rely on a Computer Vision/Image Processing technique called convolution.
- A CNN automatically learns kernels that are applied to the input images during the training process.
- But what exactly are kernels and convolution?
  - To answer that, you should read this tutorial:
    - Convolutions with OpenCV and Python
- Now that you understand what kernels and convolution are, you should move on to this guide which will teach you how Keras utilizes convolution to build a CNN:
  - Keras Conv2D and Convolutional Layers
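
For intuition, a kernel is just a small grid of weights that is slid across the image. The sketch below is written in plain Python so it runs without dependencies; the zero padding at the borders is an assumption made for brevity, and, like CNN layers, it does not flip the kernel.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (both lists of rows), zero-padding the borders."""
    kh, kw = len(kernel), len(kernel[0])
    pad_y, pad_x = kh // 2, kw // 2
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for ky in range(kh):
                for kx in range(kw):
                    iy, ix = y + ky - pad_y, x + kx - pad_x
                    # Pixels outside the image count as zero.
                    if 0 <= iy < height and 0 <= ix < width:
                        total += image[iy][ix] * kernel[ky][kx]
            output[y][x] = total
    return output

# A 3x3 box blur: each output pixel is the mean of its 3x3 neighborhood.
box_blur = [[1 / 9] * 3 for _ in range(3)]
image = [[9, 9, 9], [9, 9, 9], [9, 9, 9]]
blurred = convolve2d(image, box_blur)
```

A CNN performs the same sliding-window operation, except that the kernel weights are learned during training instead of being written by hand.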
Step #4: Build Your Own Image Dataset (Intermediate)
- So far you’ve learned how to train CNNs on pre-compiled datasets — but what if you wanted to work with your own custom data?
- But how are you going to train a CNN to accomplish a given task if you don’t already have a dataset of such images?
- The short answer is you can’t — you need to gather your image dataset first:
  - How to create a deep learning dataset using Google Images
  - How to (quickly) build a deep learning image dataset
- The Google Images method is fast and easy, but can also be a bit tedious at the same time.
- If you are an experienced programmer you will likely prefer the Bing API method as it’s “cleaner” and you have more control over the process.
Step #5: Train a CNN on Your Dataset (Intermediate)
- At this point you have used Step #4 to gather your own custom dataset.
- Let’s now learn how to train a CNN on top of that data:
  - Keras and Convolutional Neural Networks (CNNs)
- You’ll also want to refer to this guide which will give you additional practice training CNNs with Keras:
  - Image classification with Keras and deep learning
- Along the way you should learn how to save and load your trained models, ensuring you can make predictions on images after your model has been trained:
  - Keras – Save and Load Your Deep Learning Models
Step #6: Tuning Your Learning Rate (Intermediate)
- So, you trained your own CNN from Step #5, but your accuracy isn’t as good as what you want it to be.
- What now?
- In order to obtain a highly accurate Deep Learning model, you need to tune your learning rate, the most important hyperparameter when training a Neural Network.
- The following tutorial will teach you how to start training, stop training, reduce your learning rate, and continue training, a critical skill when training neural networks:
  - Keras: Starting, stopping, and resuming training
- This guide will teach you about learning rate schedules and decay, a method that can be quickly implemented to slowly lower your learning rate when training, allowing it to descend into lower areas of the loss landscape, and ideally obtain higher accuracy:
  - Keras learning rate schedules and decay
- You should also read about Cyclical Learning Rates (CLRs), a technique used to oscillate your learning rate between an upper and lower bound, enabling your model to break out of local minima:
  - Cyclical Learning Rates with Keras and Deep Learning
- But what if you don’t know what your initial learning rate should be?
  - Don’t worry, I have a simple method that will help you out:
    - Keras Learning Rate Finder
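
The two scheduling ideas in this step can be written as plain functions of the epoch or iteration number. This is a sketch under assumed default values (the initial rate, decay factor, bounds, and step size are illustrative), not the exact code from the linked tutorials.

```python
import math

def step_decay(epoch, initial_lr=0.01, factor=0.5, drop_every=10):
    """Multiply the learning rate by `factor` once every `drop_every` epochs."""
    return initial_lr * (factor ** math.floor(epoch / drop_every))

def triangular_clr(iteration, base_lr=1e-4, max_lr=1e-2, step_size=100):
    """Triangular cyclical learning rate.

    The rate climbs from base_lr to max_lr over `step_size` iterations,
    then descends back to base_lr over the next `step_size` iterations.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

for epoch in (0, 10, 25):
    print(epoch, step_decay(epoch))
```

Functions of this shape can be plugged into a training loop or a scheduler callback that accepts the current epoch and returns the learning rate to use.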
Step #7: Data Augmentation (Intermediate)
- If you haven’t already, you will run into two important terms in Deep Learning literature:
  - Generalization: The ability of your model to correctly classify images that are outside the training set used to train the model.
    - Your model is said to “generalize well” if it can correctly classify images that it has never seen before.
    - Generalization is absolutely critical when training a Deep Learning model.
      - Imagine if you were working for Tesla and needed to train a self-driving car application used to detect cars on the road.
      - Your model worked well on the training set…but when you evaluated it on the testing set you found that the model failed to detect the majority of cars on the road!
      - In such a situation we would say that your model “failed to generalize”.
        
        To fix this problem you need to apply regularization.
  - Regularization: The term “regularization” is used to encompass all techniques used to (1) prevent your model from overfitting and (2) generalize well to your validation and testing sets.
    - Regularization techniques include:
      - L2 regularization (also called weight decay)
      - Updating the CNN architecture to include dropout
    - You can read the following tutorial for an introduction/motivation to regularization:
      - Understanding regularization for image classification and machine learning
- Data augmentation is a type of regularization technique.
  - There are three types of data augmentation, including:
    - Type #1: Dataset generation and expanding an existing dataset (less common)
    - Type #2: In-place/on-the-fly data augmentation (most common)
    - Type #3: Combining dataset generation and in-place augmentation
  - Unless you have a good reason not to apply data augmentation, you should always utilize data augmentation when training your own CNNs.
  - You can read more about data augmentation here:
    - Keras ImageDataGenerator and Data Augmentation
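
In-place (on-the-fly) augmentation means each batch is randomly transformed just before it reaches the network, so the model rarely sees exactly the same image twice. A minimal sketch using only a random horizontal flip on images stored as lists of rows; the function names and the `flip_prob` parameter are hypothetical, and real pipelines typically add rotations, shifts, and zooms as well.

```python
import random

def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]

def augment_batch(batch, flip_prob=0.5, rng=None):
    """Return a new batch in which each image is flipped with probability `flip_prob`."""
    if rng is None:
        rng = random.Random()
    return [horizontal_flip(img) if rng.random() < flip_prob else img for img in batch]

batch = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [0, 0, 1]]]
augmented = augment_batch(batch, rng=random.Random(42))
```

The original batch is left untouched, so the same source images can be augmented differently on every epoch.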
Step #8: Feature Extraction and Fine-tuning Pre-trained Networks (Intermediate)
- So far we’ve trained our CNNs from scratch — but is it possible to take a pre-trained model and use it to classify images it was never trained on?
- Yes, it absolutely is!
- Taking a pre-trained model and using it to classify data it was never trained on is called transfer learning.
- There are two types of transfer learning:
  - Feature extraction: Here we treat our CNN as an arbitrary feature extractor.
    - An input image is presented to the CNN.
    - The image is forward-propagated to an arbitrary layer of the network.
    - We take those activations as our output and treat them like a feature vector.
    - Given feature vectors for all input images in our dataset we train an arbitrary Machine Learning model (e.g., Logistic Regression or a Support Vector Machine (SVM)) on top of our extracted features.
    - When making a prediction, we:
      - Forward-propagate the input image.
      - Take the output features.
      - Pass them to our ML classifier to obtain our output prediction.
    - You can read more about feature extraction here:
      - Keras: Feature extraction on large datasets with Deep Learning
      - Online/Incremental Learning with Keras and Creme
  - Fine-tuning: Here we modify the CNN architecture itself by performing network surgery.
    - Think of yourself as a “CNN Surgeon”.
    - We start by removing the Fully-Connected (FC) layer head from the pre-trained network.
    - Next, we add a brand new, randomly initialized FC layer head to the network
    - Optionally, we freeze layers earlier in the CNN prior to training
      - Keep in mind that CNNs are hierarchical feature learners:
        
        Layers earlier in the CNN can detect “structural building blocks”, including blobs, edges, corners, etc.
        
        Intermediate layers use these building blocks to start learning actual shapes
        
        Finally, higher-level layers of the network learn abstract concepts (such as the objects themselves).
      - We freeze layers earlier in the network to ensure we retain our structural building blocks
    - Training is then started using a very low learning rate.
    - Once our new FC layer head is “warmed up” we may then optionally unfreeze our earlier layers and continue training
    - You can learn more about fine-tuning here:
      - Fine-tuning with Keras and Deep Learning
      - Change input shape dimensions for fine-tuning with Keras
- I’ll wrap up this section by saying that transfer learning is a critical skill for you to properly learn.
  - Use the above tutorials to help you get started, but for a deeper dive into my tips, suggestions, and best practices when applying Deep Learning and Transfer Learning, be sure to read my book:
    - Deep Learning for Computer Vision with Python
  - Inside the text I not only explain transfer learning in detail, but also provide a number of case studies to show you how to successfully apply it to your own custom datasets.
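
The feature extraction workflow described in this step (forward-propagate, take the features, pass them to a classifier) can be sketched without any deep learning library. Everything here is a stand-in: `extract_features` is a hypothetical placeholder for a CNN forward pass, and a nearest-centroid classifier is swapped in for the Logistic Regression or SVM mentioned in the text, purely to keep the example dependency-free.

```python
def extract_features(image):
    """Hypothetical stand-in for a CNN forward pass to an intermediate layer.

    A real pipeline would return flattened layer activations. Here the
    "features" are the mean intensity and left/right contrast of a tiny image.
    """
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    contrast = sum(abs(row[0] - row[-1]) for row in image) / len(image)
    return [mean, contrast]

def fit_centroids(features, labels):
    """Train a nearest-centroid classifier: one mean feature vector per label."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, vec):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], vec))
    return min(centroids, key=distance)

train_images = [[[0, 0], [0, 0]], [[1, 1], [1, 1]], [[9, 9], [9, 9]], [[8, 8], [8, 8]]]
train_labels = ["dark", "dark", "bright", "bright"]
centroids = fit_centroids([extract_features(img) for img in train_images], train_labels)
prediction = predict(centroids, extract_features([[7, 7], [7, 7]]))
print(prediction)
```

The key point is the separation of concerns: the network is only used to turn images into vectors, and a separate, cheaper model is trained on those vectors.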
Step #9: Video Classification (Advanced)
- At this point you have a good understanding of how to apply CNNs to images — but what about videos?
- Can the same algorithms and techniques be applied?
- Video classification is an entirely different beast — typical algorithms you may want to use here include Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs).
- However, before you start breaking out the “big guns” you should read this guide:
  - Video classification with Keras and Deep Learning
  - Inside you’ll learn how to use prediction averaging to reduce “prediction flickering” and create a CNN capable of applying stable video classification.
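
Prediction averaging can be sketched as a rolling mean over the most recent per-frame probability vectors. The class name, labels, and window size below are hypothetical; the per-frame probabilities would normally come from your CNN.

```python
from collections import deque

class RollingPredictor:
    """Average the last `window` probability vectors before picking a label."""

    def __init__(self, labels, window=3):
        self.labels = labels
        self.history = deque(maxlen=window)  # old frames drop off automatically

    def update(self, probabilities):
        """Add one frame's class probabilities and return the smoothed label."""
        self.history.append(probabilities)
        count = len(self.history)
        averaged = [
            sum(frame[i] for frame in self.history) / count
            for i in range(len(self.labels))
        ]
        best = max(range(len(averaged)), key=averaged.__getitem__)
        return self.labels[best]

# Frame-by-frame argmax would flicker: tennis, soccer, tennis.
frames = [[0.9, 0.1], [0.4, 0.6], [0.8, 0.2]]
predictor = RollingPredictor(["tennis", "soccer"], window=3)
smoothed = [predictor.update(frame) for frame in frames]
print(smoothed)
```

A larger window gives steadier labels at the cost of reacting more slowly when the activity in the video really does change.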
Step #10: Multi-Input and Multi-Output Networks (Advanced)
- Imagine you are hired by a large clothing company (e.g., Nordstrom, Neiman Marcus, etc.) and are tasked with building a CNN to classify two attributes of an input clothing image:
  - Clothing Type: Shirt, dress, pants, shoes, etc.
  - Color: The actual color of the item of clothing (i.e., blue, green, red, etc.).
- To get started building such a model, you should refer to this tutorial:
  - Multi-label classification with Keras
- As you’ll find out in the above guide, building a more accurate model requires you to utilize a multi-output network:
  - Keras: Multiple outputs and multiple losses
- Now, let’s imagine that for your next job you are hired by a real estate company to automatically predict the price of a house based solely on input images.
  - You are given images of the bedroom, bathroom, living room, and house exterior.
  - You now need to train a CNN to predict the house price using just those images.
  - To accomplish that task you’ll need a multi-input network:
- Both multi-input and multi-output networks are a bit on the “exotic” side.
  - You won’t need them often, but when you do, you’ll be happy you know how to use them!
Step #11: Improve Your Deep Learning Models (Advanced)
- The best way to improve your Deep Learning model performance is to learn via case studies.
- The following case studies and tutorials will help you learn techniques that you can apply to your projects.
  - To start, I would familiarize yourself with common state-of-the-art architectures including VGGNet, ResNet, Inception/GoogLeNet, Xception, and others:
    - ImageNet: VGGNet, ResNet, Inception, and Xception with Keras
  - If you want to learn how to implement your own custom data generators when training Keras models, refer here:
    - How to use Keras fit and fit_generator (a hands-on tutorial)
  - For training your Keras models with multiple GPUs, you’ll want to read this guide:
    - How-To: Multi-GPU training with Keras, Python, and deep learning
  - You can also use Keras for regression problems:
  - The OpenCV library ships with a number of pre-trained models for neural style transfer, black and white image colorization, holistically-nested edge detection and others — you can learn about these models using the links below:
  - While SGD is the most popular optimizer used to train deep neural networks, others exist, including Adam, RMSprop, Adagrad, Adadelta and others
    - These two tutorials cover the Rectified Adam (RAdam) optimizer, including comparing Rectified Adam to the standard Adam optimizer:
      - Rectified Adam (RAdam) optimizer with Keras
      - Is Rectified Adam actually *better* than Adam?
  - If you intend on deploying your models to production, and more specifically, behind a REST API, I’ve authored three tutorials on the topic, each building on top of each other:
- Take your time practicing and working through them — the experience you gain will be super valuable when you go off on your own!
Step #12: AutoML and Auto-Keras (Advanced)
- What if you…
  - Didn’t have to select and implement a Neural Network architecture?
  - Didn’t have to tune your learning rate?
  - Didn’t have to tune your regularization parameters?
- What if you instead could treat the training process like a “black box”:
  - Input your data to an API
  - And let the algorithms inside automatically train the model for you!
- Sound too good to be true?
- In some cases it is…
- …but in others it works just fine!
- We call these sets of algorithms Automatic Machine Learning (AutoML) — you can read more about these algorithms here:
  - Auto-Keras and AutoML: A Getting Started Guide
- The point here is that AutoML algorithms aren’t going to be replacing you as a Deep Learning practitioner anytime soon.
  - They are super important to learn about, but they have a long way to go if they are ever going to replace you!
Where to Next?
- Congratulations! If you followed the above steps then you now have enough Deep Learning knowledge to consider yourself a “practitioner”!
- But where should you go from here?
  - If you’re interested in a deeper dive into the world of Deep Learning, I would recommend reading my book, Deep Learning for Computer Vision with Python.
  - Inside the book you’ll find:
    - Super practical walkthroughs that present solutions to actual, real-world image classification problems, challenges, and competitions.
    - Hands-on tutorials (with lots of code) that not only show you the algorithms behind deep learning for computer vision but their implementations as well.
    - A no-nonsense teaching style that is guaranteed to help you master deep learning for image understanding and visual recognition
  - You can learn more about the book here.
- Otherwise, I would recommend reading the following sections of this guide:
  - Object Detection: State-of-the-art object detectors, including Faster R-CNN, Single Shot Detectors (SSDs), YOLO, and RetinaNet all rely on Deep Learning.
    - If you want to learn how to not only classify an input image but also locate where in the image the object is, then you’ll want to read these guides.
  - Embedded and IoT Computer Vision and Computer Vision on the Raspberry Pi: If you’re interested in applying DL to resource-constrained devices such as the Raspberry Pi, Google Coral, and NVIDIA Jetson Nano, these are the sections for you!
  - Medical Computer Vision: Apply Computer Vision and Deep Learning to medical image analysis and learn how to classify blood cells and detect cancer.

Face Applications

Using Computer Vision we can perform a variety of facial applications, including facial recognition, building a virtual makeover system (i.e., makeup, cosmetics, eyeglasses/sunglasses, etc.), or even aiding in law enforcement to help detect, recognize, and track criminals.

Computer Vision is powering facial recognition at a massive scale — just take a second to consider that over 350 million images are uploaded to Facebook every day.

For each of those images, Facebook is running face detection (to detect the presence of faces) followed by face recognition (to actually tag people in photos).

In this section you’ll learn the basics of facial applications using Computer Vision.

Step #1: Install OpenCV, dlib, and face_recognition (Beginner)
- Before you can build facial applications, you first need to configure your development environment.
  - Start by following Step #1 of the How Do I Get Started? section to install OpenCV on your system.
- From there, you’ll need to install the dlib and face_recognition libraries.
  - The Install your face recognition libraries section of this tutorial will help you install both dlib and face_recognition.
- Make sure you have installed OpenCV, dlib, and face_recognition before continuing!
Step #2: Detect Faces in Images and Video (Beginner)
- In order to apply Computer Vision to facial applications you first need to detect and find faces in an input image.
- Face detection is different than face recognition.
  - During face detection we are simply trying to locate where in the image faces are.
  - Our face detection algorithms do not know who is in the image, simply that a given face exists at a particular location.
  - Once we have our detected faces, we pass them into a facial recognition algorithm which outputs the actual identity of the person/face.
- Thus, all Computer Vision and facial applications must start with face detection.
- There are a number of face detectors that you can use, but my favorite is OpenCV’s Deep Learning-based face detector:
  - Face detection with OpenCV and deep learning
- OpenCV’s face detector is accurate and able to run in real-time on modern laptops/desktops.
  - That said, if you’re using a resource-constrained device (such as the Raspberry Pi), the Deep Learning-based face detector may be too slow for your application.
  - In that case, you may want to utilize Haar cascades or HOG + Linear SVM instead:
    - Detecting cats in images with OpenCV
    - Histogram of Oriented Gradients and Object Detection
  - Haar cascades are very fast but prone to false-positive detections.
    - It can also be a pain to properly tune the parameters to the face detector.
  - HOG + Linear SVM is a nice balance between the Haar cascades and OpenCV’s Deep Learning-based face detector.
    - This detector is slower than Haar but is also more accurate.
- Here’s my suggestion:
  - If you need accuracy, go with OpenCV’s Deep Learning face detector.
  - If you need pure speed, go with Haar cascades.
  - And if you need a balance between the two, go with HOG + Linear SVM.
- Finally, make sure you try all three detectors before you decide!
  - Gather a few example images and test out the face detectors.
  - Let your empirical results guide you — apply face detection using each of the algorithms, examine the results, and double-down on the algorithm that gave you the best results.
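One way to let empirical results guide the choice is a tiny comparison harness. The sketch below assumes each detector has been wrapped in a callable that takes an image and returns a list of boxes. Those wrappers, and the dummy detectors used in the demo, are hypothetical stand-ins rather than OpenCV or dlib APIs.

```python
import time


def compare_detectors(detectors, images):
    """Run every detector on every image; report detections and mean time.

    `detectors` maps a name to a callable that takes an image and returns a
    list of (x, y, w, h) boxes. The callables are hypothetical wrappers
    around Haar cascades, HOG + Linear SVM, or a deep learning model.
    """
    report = {}
    for name, detect in detectors.items():
        total_boxes = 0
        start = time.perf_counter()
        for image in images:
            total_boxes += len(detect(image))
        elapsed = time.perf_counter() - start
        report[name] = {
            "detections": total_boxes,
            # Guard against an empty image list.
            "seconds_per_image": elapsed / max(len(images), 1),
        }
    return report


if __name__ == "__main__":
    # Dummy detectors so the sketch runs without any model files.
    fake_images = [object(), object(), object()]
    dummy_detectors = {
        "always_one_face": lambda image: [(10, 10, 50, 50)],
        "never_fires": lambda image: [],
    }
    for name, stats in compare_detectors(dummy_detectors, fake_images).items():
        print(name, stats)
```

Timing and detection counts are only a starting point; you would still inspect the drawn boxes by eye to spot false positives.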
Step #3: Discover Facial Landmarks (Intermediate)
- At this point you can detect the location of a face in an image.
- But what if we wanted to localize various facial structures, including:
  - Nose
  - Eyes
  - Mouth
  - Jawline
- Using facial landmarks we can do exactly that!
  - And best of all, facial landmark algorithms are capable of running in real-time!
  - Most of your computation is going to be spent detecting the actual face — once you have the face detected, facial landmarks are quite fast!
  - Start by reading the following tutorials to learn how to localize facial structures on a detected face:
Step #4: Create Face Application Mini-Projects (Intermediate)
- Now that you have some experience with face detection and facial landmarks, let’s practice these skills and continue to hone them.
- I suggest going through the following guides to help you apply Computer Vision to facial applications:
Step #5: Build a Face Recognition Dataset (Intermediate)
- Are you ready to build your first facial recognition system?
- Hold up — I get that you’re eager, but before you can build a face recognition system, you first need to gather your dataset of example images.
- The following tutorials will help you create a face recognition dataset:
  - How to build a custom face recognition dataset
  - How to create a deep learning dataset using Google Images
- You can then take the dataset you created and proceed to the next step to build your actual face recognition system.
- Note: If you don’t want to build your own dataset you can proceed immediately to Step #6 — I’ve provided my own personal example datasets for the tutorials in Step #6 so you can continue to learn how to apply face recognition even if you don’t gather your own images.
Step #6: Face Recognition (Intermediate)
- At this point you have either (1) created your own face recognition dataset using the previous step or (2) elected to use my own example datasets I put together for the face recognition tutorials.
- To build your first face recognition system, follow this guide:
  - Face recognition with OpenCV, Python, and deep learning
  - This tutorial utilizes OpenCV, dlib, and face_recognition to create a facial recognition application.
- The problem with the first method is that it relies on a modified k-Nearest Neighbor (k-NN) search to perform the actual face identification.
  - k-NN, while simple, can easily fail as the algorithm doesn’t “learn” any underlying patterns in the data.
- To remedy the situation (and obtain probabilities associated with the face recognition), you should follow this guide:
  - OpenCV Face Recognition
  - You’ll note that this tutorial does not rely on the dlib and face_recognition libraries — instead, we use OpenCV’s FaceNet model.
  - A great project for you would be to:
    - Replace OpenCV’s FaceNet model with the dlib and face_recognition packages.
    - Extract the 128-d facial embeddings
    - Train a Logistic Regression or Support Vector Machine (SVM) on the embeddings extracted by dlib/face_recognition
  - Take your time when implementing the above project; it will be a great learning experience for you.
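To make the k-NN style matching concrete, here is a minimal sketch of vote-based identification over facial embeddings. It is not the tutorial's code: the embeddings are plain lists of floats (3-d toys instead of 128-d vectors), and the `tolerance` value of 0.6 is an assumption chosen for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(query, known_embeddings, known_names, tolerance=0.6):
    """Return the name with the most known embeddings within `tolerance`.

    Nothing is learned here: every query is compared against every stored
    embedding, which is why this approach can fail on hard cases.
    """
    votes = Counter(
        name
        for embedding, name in zip(known_embeddings, known_names)
        if euclidean(query, embedding) <= tolerance
    )
    if not votes:
        return "Unknown"
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy 3-d embeddings; real face embeddings would be 128-d.
    known = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [1.0, 1.0, 1.0]]
    names = ["alice", "alice", "bob"]
    print(identify([0.05, 0.0, 0.0], known, names))
    print(identify([5.0, 5.0, 5.0], known, names))
```

Swapping the vote for a trained classifier (Logistic Regression or an SVM) is what gives you probabilities instead of a hard match.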
Step #7: Improve Your Face Recognition Accuracy (Intermediate)
- Whenever I write about face recognition the #1 question I get asked is:
  - “How can I improve my face recognition accuracy?”
- I’m glad you asked — and in fact, I’ve already covered the topic.
  - Make sure you refer to the Drawbacks, limitations, and how to obtain higher face recognition accuracy section (right before the Summary) of the following tutorial:
    - OpenCV Face Recognition
  - Inside that section I discuss how you can improve your face recognition accuracy.
  - You should also read up on face alignment as proper face alignment can improve your face recognition accuracy:
    - Face Alignment with OpenCV and Python
Step #8: Detect Fake Faces and Perform Anti-Face Spoofing
- You may have noticed that it’s possible to “trick” and “fool” your face recognition system by holding up a printed photo of a person or photo of the person on your screen.
  - In those situations your face recognition correctly recognizes the person, but fails to realize that it’s a fake/spoofed face!
  - What do you do then?
- The answer is to apply liveness detection:
  - Liveness Detection with OpenCV
  - Liveness detection algorithms are used to detect real vs. fake/spoofed faces.
    - Once you have determined that the face is indeed real, then you can pass it into your face recognition system.
Where to Next?
- Congrats on making it all the way through the Facial Applications section!
  - That was quite a lot of content to cover and you did great.
  - Take a second now to be proud of yourself and your accomplishments.
- But what now — where should you go next?
  - My recommendation would be the PyImageSearch Gurus course.
    - The PyImageSearch Gurus course includes additional modules and lessons on face recognition.
    - Additionally, you’ll also find:
      - An actionable, real-world course on OpenCV and computer vision (similar to a college survey course on Computer Vision but much more hands-on and practical).
      - The most comprehensive computer vision education online today. The PyImageSearch Gurus course covers 13 modules broken out into 168 lessons, with over 2,161 pages of content. You won’t find a more detailed computer vision course anywhere else online, I guarantee it.
      - A community of like-minded developers, researchers, and students just like you, who are eager to learn computer vision and level-up their skills.
  - To learn more about the PyImageSearch Gurus course, just use the link below:
    - Tell me more about the PyImageSearch Gurus course

Optical Character Recognition (OCR)

One of the first applications of Computer Vision was Optical Character Recognition (OCR).

OCR algorithms seek to (1) take an input image and then (2) recognize the text/characters in the image, returning a human-readable string to the user (in this case a “string” is assumed to be a variable containing the text that was recognized).

While OCR is a simple concept to comprehend (input image in, human-readable text out) it’s actually an extremely challenging problem that is far from solved.

The steps in this section will arm you with the knowledge you need to build your own OCR pipelines.

Step #1: Install OpenCV (Beginner)
- Before you can apply OCR to your own projects you first need to install OpenCV.
- Follow Step #1 of the How Do I Get Started? section above to install OpenCV on your system.
- Once you have OpenCV installed you can move on to Step #2.
Step #2: Discover Tesseract for OCR (Beginner)
- Tesseract is an OCR engine/API that was originally developed by Hewlett-Packard in the 1980s.
- The library was open-sourced in 2005 and later adopted by Google in 2006.
- Tesseract supports over 100 written languages, ranging from English to Punjabi to Yiddish.
- Combining OpenCV with Tesseract is by far the fastest way to get started with OCR.
- First, make sure you have Tesseract installed on your system:
  - Installing Tesseract for OCR
- From there, you can create your first OCR application using OpenCV and Tesseract:
  - Using Tesseract OCR with Python
Step #3: OCR Without Tesseract (Intermediate)
- It’s entirely possible to perform OCR without libraries such as Tesseract.
- To accomplish this task you need to combine feature extraction along with a bit of heuristics and/or machine learning.
- The following guide will give you experience recognizing digits on a 7-segment display using just OpenCV:
  - Recognizing digits with OpenCV and Python
- Take your time and practice with that tutorial — it will help you learn how to approach OCR projects.
Step #4: Practice OCR with Mini-Projects (Intermediate)
- Let’s continue our study of OCR by solving mini-projects:
- Again, follow the guides and practice with them — they will help you learn how to apply OCR to your tasks.
Step #5: Text Detection in Natural Scenes (Intermediate)
- So far we’ve applied OCR to images that were captured under controlled environments (i.e., no major changes in lighting, viewpoint, etc.).
- But what if we wanted to apply OCR to images in uncontrolled environments?
  - Imagine we were tasked with building a Computer Vision system for Facebook to handle OCR’ing the 350+ million new images uploaded to their new system.
  - In that case, we can make zero assumptions regarding the environment in which the images were captured.
  - Some images may be captured using a high quality DSLR camera, others with a standard iPhone camera, and even others with a decade old flip phone — again, we can make no assumptions regarding the quality, viewing angle, or even contents of the image.
- In that case, we need to break OCR into a two stage process:
  - Stage #1: Use the EAST Deep Learning-based text detector to locate where text resides in the input image.
  - Stage #2: Use an OCR engine (e.g., Tesseract) to take the text locations and then actually recognize the text itself.
- To perform Stage #1 (Text Detection) you should follow this tutorial:
  - OpenCV Text Detection (EAST text detector)
- If you’ve read the Face Applications section above you’ll note that our OCR pipeline is similar to our face recognition pipeline:
  - First, we detect the text in the input image (akin to detecting/locating a face in an image)
  - And then we take the regions of the image that contain the text, and then actually recognize it (which is similar to taking the location of a face and then actually recognizing who is in the face).
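The two-stage structure can be sketched as a small pipeline function. This is only an illustration of the control flow: `detect_text` and `recognize_text` are hypothetical callables standing in for a text detector such as EAST and an OCR engine such as Tesseract, and the "image" in the demo is a toy grid of characters.

```python
def crop(image, box):
    """Cut the region (x, y, w, h) out of an image stored as a list of rows."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def two_stage_ocr(image, detect_text, recognize_text):
    """Stage 1 finds text boxes; stage 2 reads each cropped region."""
    results = []
    for box in detect_text(image):
        roi = crop(image, box)
        results.append((box, recognize_text(roi)))
    return results


def toy_detector(image):
    """Hypothetical detector: every run of non-'.' characters is a text box."""
    boxes = []
    for y, row in enumerate(image):
        x = 0
        while x < len(row):
            if row[x] != ".":
                start = x
                while x < len(row) and row[x] != ".":
                    x += 1
                boxes.append((start, y, x - start, 1))
            else:
                x += 1
    return boxes


def toy_recognizer(roi):
    """Hypothetical recognizer: the toy image already stores characters."""
    return "".join(roi)


if __name__ == "__main__":
    scene = [
        "..........",
        "..STOP....",
        "..........",
        "......EXIT",
    ]
    for box, text in two_stage_ocr(scene, toy_detector, toy_recognizer):
        print(box, text)
```

Keeping the two stages behind separate callables makes it easy to swap either one without touching the other.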
Step #6: Combine Text Detection with OCR (Advanced)
- Now that we know where in the input image text resides, we can then take those text locations and actually recognize the text.
- To accomplish this task we’ll again be using Tesseract, but this time we’ll want to use Tesseract v4.
  - The v4 release of Tesseract contains an LSTM-based OCR engine that is far more accurate than previous releases.
- You can learn how to combine Text Detection with OCR using Tesseract v4 here:
  - OpenCV OCR and text recognition with Tesseract
Where to Next?
- Keep in mind that OCR, while widely popular, is still far from being solved.
  - It is likely, if not inevitable, that your OCR results will not be 100% accurate.
    - Commercial OCR engines anticipate results not being 100% correct as well.
    - These engines will sometimes apply auto-correction/spelling correction to the returned results to make them more accurate.
      - The pyspellchecker package would likely be a good starting point for you if you’re interested in spell checking the OCR results.
    - Additionally, you may want to look at the Google Vision API:
      - While the Google Vision API requires (1) an internet connection and (2) payment to utilize, in my opinion it’s one of the best OCR engines available to you.
- OCR is undoubtedly one of the most challenging areas of Computer Vision.
  - If you need help building your own custom OCR systems or increasing the accuracy of your current OCR system, I would recommend joining the PyImageSearch Gurus course.
    - The course includes private forums where I hang out and answer questions daily.
    - It’s a great place to get expert advice, both from me, as well as the more advanced students in the course.
  - Click here to learn more about the PyImageSearch Gurus course.
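The auto-correction idea mentioned above can be sketched with nothing but the standard library. Note the swap: this uses `difflib.get_close_matches` against a small, made-up vocabulary instead of the pyspellchecker package, and the `cutoff` value is an assumption you would tune on your own data.

```python
import difflib


def correct_tokens(ocr_text, vocabulary, cutoff=0.75):
    """Replace each unknown token with its closest vocabulary word, if any.

    `vocabulary` is a list of lowercase words expected in the documents.
    Tokens with no sufficiently close match are left untouched.
    """
    known = set(vocabulary)
    corrected = []
    for token in ocr_text.split():
        lowered = token.lower()
        if lowered in known:
            corrected.append(token)
            continue
        matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return " ".join(corrected)


if __name__ == "__main__":
    # Hypothetical vocabulary for an invoice-reading application.
    words = ["total", "amount", "due", "invoice"]
    print(correct_tokens("t0tal am0unt due", words))
```

A domain-specific vocabulary tends to work better here than a general dictionary, since OCR errors often land on another valid word.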

Object Detection

Object detection algorithms seek to detect the location of where an object resides in an image.

These algorithms can be as simple as basic color thresholding or as advanced as training a complex deep neural network from scratch.

In the first part of this section we’ll look at some basic methods of object detection, working all the way up to Deep Learning-based object detectors including YOLO and SSDs.

Step #1: Configure Your Development Environment (Beginner)
- Prior to working with object detection you’ll need to configure your development environment.
- To start, make sure you:
  - Follow Step #1 of the How Do I Get Started? section to install OpenCV.
  - Install Keras and TensorFlow via Step #1 of the Deep Learning section.
- Provided you have OpenCV, TensorFlow, and Keras installed, you are free to continue with the rest of this tutorial.
Step #2: Create a Basic Object Detector/Tracker (Beginner)
- We’ll keep our first object detector/tracker super simple.
- We’ll rely strictly on basic image processing concepts, namely color thresholding.
  - To apply color threshold we define an upper and lower range in a given color space (such as RGB, HSV, L*a*b*, etc.)
  - Then, for an incoming image/frame, we use OpenCV’s cv2.inRange function to apply color thresholding, yielding a mask, where:
    - All foreground pixels are white
    - And all background pixels are black
  - Therefore, all pixels that fall into our upper and lower boundaries will be marked as foreground.
- Color thresholding methods, as the name suggests, are super useful when you know the color of the object you want to detect and track will be different from all other colors in the frame.
- Furthermore, color thresholding algorithms are very fast, enabling them to run in super real-time, even on resource constrained devices, such as the Raspberry Pi.
- Let’s go ahead and implement your first object detector now:
  - Ball Tracking with OpenCV
- Then, when you’re done, you can extend it to track object movement (north, south, east, west, etc.):
  - OpenCV Track Object Movement
- Once you’ve implemented the above two guides I suggest you extend the project by attempting to track your own objects.
  - Again, keep in mind that this object detector is based on color, so make sure the object you want to detect has a different color than the other objects/background in the scene!
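To show what the thresholding step computes, here is a pure-Python sketch of the same inclusive range test that cv2.inRange applies, followed by the centroid of the resulting mask. The image is a tiny nested list of HSV-like tuples and the color bounds are made-up values, so treat both as assumptions; in practice you would call cv2.inRange on real frames.

```python
def in_range(image, lower, upper):
    """Return a mask: 255 where every channel lies in [lower, upper], else 0."""
    return [
        [
            255 if all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper)) else 0
            for pixel in row
        ]
        for row in image
    ]


def mask_centroid(mask):
    """Return the mean (x, y) of the foreground pixels, or None if empty."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))


if __name__ == "__main__":
    # 2x3 toy frame of (H, S, V) pixels; the bounds are hypothetical "green".
    frame = [
        [(30, 200, 200), (0, 0, 0), (35, 180, 220)],
        [(90, 50, 50), (32, 255, 255), (0, 0, 0)],
    ]
    mask = in_range(frame, lower=(29, 86, 6), upper=(64, 255, 255))
    print(mask)
    print(mask_centroid(mask))
```

Tracking then amounts to recomputing the mask and its centroid on every frame and following how the centroid moves.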
Step #3: Basic Person Detection (Beginner)
- Color-based object detectors are fast and efficient, but they do nothing to understand the semantic contents of an image.
- For example, how would you go about defining a color range to detect an actual person?
  - Would you attempt to track based on skin tone?
    - That would fail pretty quickly: humans have a large variety of skin tones, which vary with ethnicity and exposure to the sun. Defining such a range would be impossible.
    - Would clothing work?
      - Well, maybe if you were at a soccer/football game and wanted to track players on the pitch via their jersey colors.
      - But for general purpose applications that wouldn’t work either — clothing comes in all shapes, sizes, colors, and designs.
- I think you get my point here — trying to detect a person based on color thresholding methods alone simply isn’t going to work.
- Instead, you need to use a dedicated object detection algorithm.
  - One of the most common object detectors is the Viola-Jones algorithm, also known as Haar cascades.
  - The Viola-Jones algorithm was published back in 2001 but is still used today (although Deep Learning-based object detectors obtain far better accuracy).
  - To try out a Haar cascade, follow this guide:
    - Detecting cats in images with OpenCV
- In 2005, Dalal and Triggs published the seminal paper, Histograms of Oriented Gradients for Human Detection.
  - This paper introduces what we call the HOG + Linear SVM Object Detector:
    - Histogram of Oriented Gradients and Object Detection
- Let’s gain some experience applying HOG + Linear SVM to pedestrian detection:
  - Pedestrian Detection OpenCV
- You’ll then want to understand the parameters to OpenCV’s detectMultiScale function, including how to tune them to obtain higher accuracy:
  - HOG detectMultiScale parameters explained
Step #4: Improving Our Basic Object Detector (Beginner)
- Now that we’ve seen how HOG + Linear SVM works in practice, let’s dissect the algorithm a bit.
- To start, the HOG + Linear SVM object detector uses a combination of sliding windows, HOG features, and a Support Vector Machine to localize objects in images.
  - Image pyramids allow us to detect objects at different scales (i.e., objects that are closer to the camera as well as objects farther away):
    - Image Pyramids with Python and OpenCV
  - Sliding windows enable us to detect objects at different locations in a given scale of the pyramid:
    - Sliding Windows for Object Detection with Python and OpenCV
- Finally, you need to understand the concept of non-maxima suppression, a technique used in both traditional object detection as well as Deep Learning-based object detection:
  - Non-Maxima Suppression for Object Detection in Python
  - (Faster) Non-Maxima Suppression in Python
  - When performing object detection you’ll end up locating multiple bounding boxes surrounding a single object.
  - This behavior is actually a good thing — it implies that your object detector is working correctly and is “activating” when it gets close to objects it was trained to detect.
  - The problem is that we now have multiple bounding boxes for one object.
    - To rectify the problem we can apply non-maxima suppression, which as the name suggests, suppresses (i.e., ignores/deletes) weak, overlapping bounding boxes.
      - The term “weak” here is used to indicate bounding boxes of low confidence/probability.
- If you are interested in learning more about the HOG + Linear SVM object detector, including:
  - How to train your own custom HOG + Linear SVM object detector
  - The inner-workings of the HOG + Linear SVM detector
  - …then you’ll want to refer to the PyImageSearch Gurus course
    - Inside the course you’ll find 30+ lessons on HOG feature extraction and the HOG + Linear SVM object detection algorithm.
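The three building blocks from this step (image pyramids, sliding windows, and non-maxima suppression) can be sketched in a few short functions. This is a simplified illustration, not the code from the linked tutorials: the pyramid only yields layer sizes instead of resizing pixels, the window size and step are assumed values, and the overlap measure (intersection area divided by the candidate box's own area) is one common formulation among several.

```python
def pyramid_sizes(width, height, scale=1.5, min_size=(64, 128)):
    """Yield the (width, height) of each pyramid layer, largest first."""
    while width >= min_size[0] and height >= min_size[1]:
        yield (width, height)
        width, height = int(width / scale), int(height / scale)


def sliding_windows(width, height, window=(64, 128), step=32):
    """Yield the top-left (x, y) of every window that fits inside a layer."""
    for y in range(0, height - window[1] + 1, step):
        for x in range(0, width - window[0] + 1, step):
            yield (x, y)


def non_max_suppression(boxes, scores, overlap_thresh=0.3):
    """Keep high-scoring boxes; drop boxes that overlap a kept box too much.

    Boxes are (x1, y1, x2, y2) and `scores` holds one confidence per box.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        x1, y1, x2, y2 = boxes[i]
        area = (x2 - x1) * (y2 - y1)
        suppressed = False
        for j in keep:
            kx1, ky1, kx2, ky2 = boxes[j]
            inter_w = max(0, min(x2, kx2) - max(x1, kx1))
            inter_h = max(0, min(y2, ky2) - max(y1, ky1))
            if area > 0 and (inter_w * inter_h) / area > overlap_thresh:
                suppressed = True  # a stronger box already covers this one
                break
        if not suppressed:
            keep.append(i)
    return [boxes[i] for i in keep]


if __name__ == "__main__":
    for layer in pyramid_sizes(256, 512, scale=2):
        print(layer, len(list(sliding_windows(*layer))), "windows")
    detections = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260)]
    print(non_max_suppression(detections, scores=[0.9, 0.8, 0.7]))
```

In a full detector, each window would be described with HOG features and scored by the SVM before the surviving boxes are passed to the suppression step.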
Step #5: Your First Deep Learning Object Detector (Intermediate)
- For ~10 years HOG + Linear SVM (including its variants) was considered the state-of-the-art in terms of object detection.
- However, Deep Learning-based object detectors, including Faster R-CNN, Single Shot Detectors (SSDs), You Only Look Once (YOLO), and RetinaNet have obtained unprecedented object detection accuracy.
- The OpenCV library is compatible with a number of pre-trained object detectors — let’s start by taking a look at this SSD:
  - Object detection with deep learning and OpenCV
Step #6: Real-time Object Detection with Deep Learning (Intermediate)
- In Step #5 you learned how to apply object detection to images — but what about video?
- Is it possible to apply object detection to real-time video streams?
- On modern laptops/desktops you’ll be able to run some (but not all) Deep Learning-based object detectors in real-time.
- This tutorial will get you started:
  - Real-time object detection with deep learning and OpenCV
Step #7: Deep Learning Object Detectors (Intermediate)
- For a deeper dive into Deep Learning-based object detection, including how to filter/remove classes that you want to ignore/not detect, refer to this tutorial:
  - A gentle guide to deep learning object detection
- Next, you’ll want to practice applying the YOLO object detector:
  - YOLO object detection with OpenCV
- The YOLO object detector is designed to be super fast; however, it appears that the OpenCV implementation is actually far slower than the SSD counterparts.
  - I’m not entirely sure why that is.
- Furthermore, OpenCV’s Deep Neural Network (dnn) module does not yet support NVIDIA GPUs, meaning that you cannot use your GPU to improve inference speed.
  - OpenCV is reportedly working on NVIDIA GPU support but that support may not be available until 2020.
Step #8: Evaluate Deep Learning Object Detector Performance (Intermediate)
- If you decide you want to train your own custom object detectors from scratch you’ll need a method to evaluate the accuracy of the model.
- To do that we use two metrics: Intersection over Union (IoU) and mean Average Precision (mAP) — you can read about them here:
  - Intersection over Union (IoU) for object detection
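IoU is the area of overlap between a predicted box and a ground-truth box divided by the area of their union. A minimal sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates and using 0.5 as the match threshold (a common convention, but still a choice you should make per project):

```python
def intersection_over_union(box_a, box_b):
    """Return the IoU of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def is_true_positive(predicted, ground_truth, threshold=0.5):
    """Count a prediction as correct when its IoU meets the threshold."""
    return intersection_over_union(predicted, ground_truth) >= threshold


if __name__ == "__main__":
    truth = (0, 0, 10, 10)
    print(intersection_over_union(truth, (5, 5, 15, 15)))
    print(is_true_positive((1, 1, 10, 10), truth))
```

This sketch covers IoU only; mAP is built on top of per-detection decisions like these and is not implemented here.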
Step #9: From Object Detection to Semantic/Instance Segmentation (Intermediate)
- If you’ve followed along so far, you know that object detection produces bounding boxes that report the location and class label of each detected object in an image.
- But what if you wanted to extend object detection to produce pixel-wise masks?
  - These masks would not only report the bounding box location of each object, but would report which individual pixels belong to the object.
- These types of algorithms are covered in the Instance Segmentation and Semantic Segmentation section.
Step #10: Object Detection on Embedded Devices (Advanced)
- Deep Learning-based object detectors, while accurate, are extremely computationally hungry, making it incredibly challenging to apply them to resource-constrained devices such as the Raspberry Pi, Google Coral, and NVIDIA Jetson Nano.
- If you would like to apply object detection to these devices, make sure you read the Embedded and IoT Computer Vision and Computer Vision on the Raspberry Pi sections, respectively.
Where to Next?
- Congratulations, you now have a solid foundation on how object detection algorithms work!
- If you’re looking to study object detection in more detail, I would recommend you:
  - Join the PyImageSearch Gurus course
    - Inside the course I cover the inner-workings of the HOG + Linear SVM algorithm, including how to train your own custom HOG + Linear SVM detector.
  - Take a look at Deep Learning for Computer Vision with Python
    - That book covers Deep Learning-based object detection in-depth, including how to (1) annotate your dataset and (2) train the following object detectors:
      - Faster R-CNNs
      - Single Shot Detectors (SSDs)
      - RetinaNet
    - If you’re interested in instance/semantic segmentation, the text covers Mask R-CNN as well.
  - Read through Raspberry Pi for Computer Vision
    - As the name suggests, this book is dedicated to developing and optimizing Computer Vision and Deep Learning algorithms on resource-constrained devices, including the:
      - Raspberry Pi
      - Google Coral
      - Intel Movidius NCS
      - NVIDIA Jetson Nano
    - Inside you’ll learn how to train your own object detectors, optimize/convert them for the RPi, Coral, NCS, and/or Nano, and then run the detectors in real-time!

Object Tracking

Object Tracking algorithms are typically applied after an object has already been detected; therefore, I recommend you read the Object Detection section first. Once you’ve read those sets of tutorials, come back here and learn about object tracking.

Object detection algorithms tend to be accurate, but computationally expensive to run.

It may be infeasible/impossible to run a given object detector on every frame of an incoming video stream and still maintain real-time performance.

Therefore, we need an intermediary algorithm that can accept the bounding box location of an object, track it, and then automatically update itself as the object moves about the frame.

We’ll learn about these types of object tracking algorithms in this section.

Step #1: Install OpenCV on Your System (Beginner)
- Prior to working through this section you’ll need to install OpenCV on your system.
  - Make sure you follow Step #1 of How Do I Get Started? to configure and install OpenCV.
- Additionally, I recommend reading the Object Detection section first as object detection tends to be a prerequisite to object tracking.
Step #2: Your First Object Tracker (Beginner)
- The first object tracker we’ll cover is a color-based tracker.
- This algorithm combines both object detection and tracking into a single step, and in fact, is the simplest object tracker possible.
- You can read more about color-based detection and tracking here:
  - Ball Tracking with OpenCV
  - OpenCV Track Object Movement
Step #3: Discover Centroid Tracking (Intermediate)
- Our color-based tracker was a good start, but the algorithm will fail if there is more than one object we want to track.
- For example, let’s assume there are multiple objects in our video stream and we want to associate unique IDs with each of them — how might we go about doing that?
  - The answer is to apply a Centroid Tracking algorithm:
    - Simple object tracking with OpenCV via Centroid Tracking
  - Using Centroid Tracking we can not only associate unique IDs with a given object, but also detect when an object is lost and/or has left the field of view.
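The core of centroid tracking fits in a small class. The sketch below is a simplified illustration rather than the tutorial's implementation: it pairs old and new centroids greedily by Euclidean distance (closest pairs first), and the class name, method names, and `max_disappeared` default are all assumptions made for this example.

```python
import math


class CentroidTracker:
    """Assign persistent IDs to centroids by nearest-neighbor matching."""

    def __init__(self, max_disappeared=5):
        self.next_id = 0
        self.objects = {}      # object ID -> last known centroid (x, y)
        self.disappeared = {}  # object ID -> consecutive unmatched frames
        self.max_disappeared = max_disappeared

    def _register(self, centroid):
        self.objects[self.next_id] = centroid
        self.disappeared[self.next_id] = 0
        self.next_id += 1

    def _deregister(self, object_id):
        del self.objects[object_id]
        del self.disappeared[object_id]

    def update(self, centroids):
        """Match this frame's centroids to known objects; return ID -> centroid."""
        # All (distance, object ID, centroid index) pairs, closest first.
        pairs = sorted(
            (math.dist(old, new), object_id, index)
            for object_id, old in self.objects.items()
            for index, new in enumerate(centroids)
        )
        matched_ids, matched_indexes = set(), set()
        for _, object_id, index in pairs:
            if object_id in matched_ids or index in matched_indexes:
                continue
            self.objects[object_id] = centroids[index]
            self.disappeared[object_id] = 0
            matched_ids.add(object_id)
            matched_indexes.add(index)
        # Known objects with no match this frame may have left the view.
        for object_id in list(self.objects):
            if object_id not in matched_ids:
                self.disappeared[object_id] += 1
                if self.disappeared[object_id] > self.max_disappeared:
                    self._deregister(object_id)
        # Centroids with no match are treated as new objects.
        for index, centroid in enumerate(centroids):
            if index not in matched_indexes:
                self._register(centroid)
        return dict(self.objects)


if __name__ == "__main__":
    tracker = CentroidTracker(max_disappeared=1)
    print(tracker.update([(10, 10), (100, 100)]))
    print(tracker.update([(102, 101), (12, 11)]))
    print(tracker.update([(13, 12)]))
```

The centroids would normally come from the bounding boxes produced by your object detector on each frame.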
Step #4: Better Object Tracking Algorithms (Intermediate)
- OpenCV comes with eight object tracking algorithms built-in to the library, including:
  - BOOSTING Tracker
  - MIL Tracker
  - KCF Tracker
  - CSRT Tracker
  - MedianFlow Tracker
  - TLD Tracker
  - MOSSE Tracker
  - GOTURN Tracker
- You can learn how to use each of them in this tutorial:
  - OpenCV Object Tracking
- The dlib library also has an implementation of correlation tracking:
  - Object tracking with dlib
- When utilizing object tracking in your own applications you need to balance speed with accuracy.
  - My personal recommendation is to:
    - Use CSRT when you need higher object tracking accuracy and can tolerate slower FPS throughput.
    - Use KCF when you need faster FPS throughput but can handle slightly lower object tracking accuracy.
    - Use MOSSE when you need pure speed.
Step #5: Multi-Object Tracking (Intermediate)
- Step #4 handled single object tracking using OpenCV and dlib’s object trackers — but what about multi-object tracking?
- You should start by reading about multi-object tracking with OpenCV:
  - Tracking multiple objects with OpenCV
- Multi-object tracking is, by definition, significantly more complex in terms of the underlying programming, API calls, and computational efficiency.
  - Most multi-object tracking implementations instantiate a brand new Python/OpenCV class to handle object tracking, meaning that if you have N objects you want to track, you therefore have N object trackers instantiated — which quickly becomes a problem in crowded scenes.
  - Your CPU will choke on the load and your object tracking system will come to a grinding halt.
- One way to overcome this problem is to use multiprocessing and distribute the load across multiple processes/cores, thus enabling you to reclaim some speed:
  - Multi-object tracking with dlib
Step #6: Applied Object Tracking and Counting (Intermediate)
- So far you’ve learned how to apply single object tracking and multi-object tracking.
- Let’s put all the pieces together and build a person/footfall counter application capable of detecting, tracking, and counting the number of people that enter/exit a given area (i.e., convenience store, grocery store, etc.):
  - OpenCV People Counter
- In particular, you’ll want to note how the above implementation takes a hybrid approach to object detection and tracking, where:
  - The object detector is only applied every N frames.
  - One object tracker is created per detected object.
  - The trackers enable us to track the objects.
  - Then, once we reach the N-th frame, we apply object detection, associate centroids, and then create new object trackers.
- Such a hybrid implementation enables us to balance speed with accuracy.
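The scheduling behind the hybrid approach can be sketched as a generator. This is an outline of the control flow only: `detect` and `make_tracker` are hypothetical callables standing in for a real detector and tracker, the `skip_frames` default is an assumed value, and the centroid association step is left out to keep the example short.

```python
def hybrid_schedule(frames, detect, make_tracker, skip_frames=30):
    """Run the detector every `skip_frames` frames and trackers in between.

    `detect(frame)` returns a list of boxes. `make_tracker(frame, box)`
    returns an object whose `update(frame)` method returns the new box.
    """
    trackers = []
    for index, frame in enumerate(frames):
        if index % skip_frames == 0:
            # Expensive step: rebuild the trackers from fresh detections.
            boxes = detect(frame)
            trackers = [make_tracker(frame, box) for box in boxes]
            phase = "detect"
        else:
            # Cheap step: let each tracker follow its own object.
            boxes = [tracker.update(frame) for tracker in trackers]
            phase = "track"
        yield index, phase, boxes


class ShiftTracker:
    """Toy tracker that assumes the object moves 1 pixel right per frame."""

    def __init__(self, frame, box):
        self.box = box

    def update(self, frame):
        x, y, w, h = self.box
        self.box = (x + 1, y, w, h)
        return self.box


if __name__ == "__main__":
    # Frames are just integers here; the toy detector places a box at x=frame.
    toy_detect = lambda frame: [(frame, 0, 10, 10)]
    for step in hybrid_schedule(range(5), toy_detect, ShiftTracker, skip_frames=3):
        print(step)
```

A larger `skip_frames` value buys speed at the cost of reacting more slowly to objects that enter or leave the scene.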
Where to Next?
- Object tracking algorithms are more of an advanced Computer Vision concept.
- If you’re interested in studying Computer Vision in more detail, I would recommend the PyImageSearch Gurus course.
  - This course is similar to a college survey in Computer Vision, but way more practical, including hands-on coding and implementations.

Instance Segmentation and Semantic Segmentation

There are three primary types of algorithms used for image understanding:

Image classification algorithms enable you to obtain a single label that represents the contents of an image. You can think of image classification as inputting a single image to a network and obtaining a single label as output.
Object detection algorithms are capable of telling you not only what is in an image, but also where in the image a given object is. Object detectors thus accept a single input image and then return multiple values as an output. The output itself is a list of values containing (1) the class label and (2) the bounding box (x, y)-coordinates of where the particular object is in the image.
Instance segmentation and semantic segmentation take object detection further. Rather than returning only bounding box coordinates, instance/semantic segmentation methods yield pixel-wise masks that tell us (1) the class label of an object, (2) the bounding box coordinates of the object, and (3) the coordinates of the pixels that belong to the object.

These segmentation algorithms are intermediate/advanced techniques, so make sure you read the Deep Learning section above to ensure you understand the fundamentals.

Step #1: Configure Your Development Environment (Beginner)
- In order to perform instance segmentation you need to have OpenCV, TensorFlow, and Keras installed on your system.
- Make sure you follow Step #1 from the How Do I Get Started? section to install OpenCV.
- From there, follow Step #1 from the Deep Learning section to ensure TensorFlow and Keras are properly configured.
Step #2: Segmentation vs. Object Detection (Intermediate)
- Now that you have your deep learning machine configured, you can learn about instance segmentation.
- Follow this guide to utilize your first instance segmentation network using OpenCV:
  - Instance segmentation with OpenCV
- That guide will also teach you how instance segmentation is different from object detection.
Step #3: Applying Mask R-CNN (Intermediate)
- Mask R-CNN is arguably the most popular instance segmentation architecture.
  - Mask R-CNNs have been successfully applied to self-driving cars (vehicle, road, and pedestrian detection), medical applications (automatic tumor detection/segmentation), and much more!
- This guide will show you how to use Mask R-CNN with OpenCV:
  - Mask R-CNN with OpenCV
- And this tutorial will teach you how to use the Keras implementation of Mask R-CNN:
  - Keras Mask R-CNN
Step #4: Semantic Segmentation with OpenCV (Intermediate)
- When performing instance segmentation our goal is to (1) detect objects and then (2) compute pixel-wise masks for each object detected.
- Semantic segmentation is a bit different — instead of labeling just the objects in an input image, semantic segmentation seeks to label every pixel in the image.
  - That means that if a given pixel doesn’t belong to any category/class, we label it as “background” (meaning that the pixel does not belong to any semantically interesting object).
- Semantic segmentation algorithms are very popular for self-driving car applications as they can segment an input image/frame into components, including road, sidewalk, pedestrian, bicyclist, sky, building, background, etc.
- To learn more about semantic segmentation algorithms, refer to this tutorial:
  - Semantic segmentation with OpenCV and deep learning
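To make "label every pixel" concrete, here is a minimal sketch of the last step of a semantic segmentation pipeline: turning per-class score maps into a label map by picking the highest-scoring class at each pixel. The class list and the tiny 2x2 score maps are made up for illustration; a real network would produce these scores for a full-size frame.

```python
def label_pixels(class_scores):
    """Given score maps indexed as [class][row][col], return a [row][col] label map
    holding the index of the highest-scoring class at each pixel."""
    num_classes = len(class_scores)
    height = len(class_scores[0])
    width = len(class_scores[0][0])
    label_map = []
    for r in range(height):
        row = []
        for c in range(width):
            best = max(range(num_classes), key=lambda k: class_scores[k][r][c])
            row.append(best)
        label_map.append(row)
    return label_map


CLASSES = ["background", "road", "person"]  # illustrative class list
scores = [
    [[0.7, 0.1], [0.2, 0.3]],  # background scores
    [[0.2, 0.8], [0.7, 0.1]],  # road scores
    [[0.1, 0.1], [0.1, 0.6]],  # person scores
]
label_map = label_pixels(scores)
for row in label_map:
    print([CLASSES[k] for k in row])
```

Every pixel receives exactly one label, including "background", which is the key difference from instance segmentation, where only detected objects get masks.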
Where to Next?
- Congratulations, you now understand how to work with instance segmentation and semantic segmentation algorithms!
- However, we worked only with pre-trained segmentation networks — what if you wanted to train your own?
  - That is absolutely possible — and to do so, you’ll want to refer to Deep Learning for Computer Vision with Python.
  - Inside the book you’ll discover
    - The annotation tools I recommend (and how to use them) when labeling your own image dataset for instance/semantic segmentation.
    - How to train a Mask R-CNN on your own custom dataset.
    - How to take your trained Mask R-CNN and apply it to your own images.
    - My best practices, tips, and suggestions when training your own Mask R-CNN.
  - To learn more about the book just click here.

Embedded and IoT Computer Vision

Applying Computer Vision and Deep Learning algorithms to resource constrained devices such as the Raspberry Pi, Google Coral, and NVIDIA Jetson Nano can be super challenging due to the fact that state-of-the-art CV/DL algorithms are computationally hungry — these resource constrained devices just don’t have enough CPU power and sufficient RAM to feed these hungry algorithm beasts.

But don’t worry!

You can still apply CV and DL to these devices — you just need to follow these guides first.

Step #1: Configure Your Embedded/IoT Device (Beginner)
- Before you start applying Computer Vision and Deep Learning to embedded/IoT applications you first need to choose a device.
  - I suggest starting with the Raspberry Pi — it’s a super cheap ($35) and easily accessible device for your initial forays into embedded/IoT Computer Vision and Deep Learning.
  - These guides will help you configure your Raspberry Pi:
    - Install OpenCV on your RPi the “easy way” with pip
    - Compile and install OpenCV 4 from source on Raspberry Pi 4 and Raspbian Buster
  - Another option to consider is NVIDIA’s Jetson Nano, what many call the “Raspberry Pi for Artificial Intelligence”.
    - At $99 it’s still reasonably affordable and packs a Maxwell 128 CUDA core GPU capable of 472 GFLOPS of computation.
    - To get started with the NVIDIA Jetson Nano, follow this guide:
      - Getting started with the NVIDIA Jetson Nano
  - You may also want to consider Google’s Coral Platform along with the Movidius NCS.
    - The Google Coral USB Accelerator is a particularly attractive option as it’s essentially a Deep Learning USB Stick (similar to Intel’s Movidius NCS).
    - Both the Movidius NCS and Google Coral USB Accelerator plug into a USB port on your embedded device (such as a Raspberry Pi or Jetson Nano).
    - You can then perform inference (i.e., prediction) on the USB stick, yielding faster throughput than using the CPU alone.
    - We’ll cover both the Movidius NCS and Google Coral USB Accelerator later in this section.
Step #2: Your First Embedded Computer Vision Project (Beginner)
- Again, I strongly recommend the Raspberry Pi as your first embedded vision platform — it’s super cheap and very easy to use.
- To get started, I would recommend that you understand how to:
  - Access the Raspberry Pi Camera with OpenCV and Python
  - Diagnose common errors using the Raspberry Pi camera module
- Next, build your first motion detector using the Raspberry Pi:
  - Basic motion detection and tracking with Python and OpenCV
- And then extend it to build an IoT home surveillance system:
  - Home surveillance and motion detection with the Raspberry Pi, Python, and OpenCV
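The core idea behind a basic motion detector is frame differencing: compare each new frame against a background frame and flag pixels that changed by more than a threshold. The sketch below shows only that idea on tiny grayscale grids in plain Python. It is not the code from the tutorials above, and the `threshold` and `min_pixels` values are arbitrary assumptions.

```python
def motion_mask(background, frame, threshold=25):
    """Mark pixels whose grayscale intensity changed by more than `threshold`."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def motion_detected(background, frame, threshold=25, min_pixels=2):
    """Report motion when enough pixels differ from the background frame."""
    mask = motion_mask(background, frame, threshold)
    changed = sum(sum(row) for row in mask)
    return changed >= min_pixels


background = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
still = [[12, 9, 10], [10, 11, 10], [10, 10, 13]]        # sensor noise only
moving = [[12, 9, 10], [10, 200, 180], [10, 190, 13]]    # a bright object entered

print(motion_detected(background, still))
print(motion_detected(background, moving))
```

The small `min_pixels` guard is what keeps sensor noise from triggering false alarms; on real frames you would typically also blur the image first and look at connected regions rather than raw pixel counts.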
Step #3: Create Embedded/IoT Mini-Projects (Intermediate)
- If I’ve said it once, I’ve said it a hundred times: the best way to learn Computer Vision is through practical, hands-on projects.
  - The same is true for Embedded Vision and IoT projects as well.
- To gain additional experience building embedded CV projects, follow these guides to work with video on embedded devices, including working with multiple cameras and live streaming video over a network:
- To gain experience working with hardware, check out this pan/tilt face tracker:
  - Pan/tilt face tracking with a Raspberry Pi and OpenCV
- There is a dedicated Face Applications section in this guide, but there’s no harm in getting experience with face applications on the RPi now:
  - Raspberry Pi: Facial landmarks + drowsiness detection with OpenCV and dlib
  - Raspberry Pi Face Recognition
- If you’re eager to gain some initial experience using deep learning on embedded devices, start with this guide:
  - Keras and deep learning on the Raspberry Pi
  - From there you’ll want to go through the steps in the Deep Learning section.
- Finally, if you want to integrate text message notifications into the Computer Vision security system we built in the previous step, then read this tutorial:
  - Building a Raspberry Pi security camera with OpenCV
Step #4: Image Classification on Embedded Devices (Intermediate)
- If you followed Step #3 then you found out that running Deep Learning models on resource constrained devices such as the Raspberry Pi can be computationally prohibitive, preventing you from obtaining real-time performance.
- In order to boost your Frames Per Second (FPS) throughput rate, you should consider using a coprocessor such as Intel’s Movidius NCS or Google’s Coral USB Accelerator:
  - Getting started with the Intel Movidius Neural Compute Stick
  - Getting started with Google Coral’s TPU USB Accelerator
- Or, you may want to switch to a different board entirely! For that I would recommend NVIDIA’s Jetson Nano:
  - Getting started with the NVIDIA Jetson Nano
- These devices/boards can substantially boost your FPS throughput!
Step #5: Object Detection on Embedded Devices (Intermediate)
- Just as image classification can be slow on embedded devices, the same is true for object detection as well.
  - And in fact, object detection is actually slower than image classification given the additional computation required.
- To see how object detection on the RPi CPU can be a challenge, start by reading this guide:
  - Raspberry Pi: Deep learning object detection with OpenCV
- To get around this limitation we can once again lean on the Movidius NCS, Google Coral, and NVIDIA Jetson Nano:
Where to Next?
- At this point you should:
  - Understand how to apply basic Computer Vision algorithms to resource constrained devices.
  - And more importantly, appreciate how challenging it can be to apply these algorithms given limited CPU, RAM, and power.
- If you’d like a deeper understanding of this material, including how to:
  - Build practical, real-world computer vision applications on the Raspberry Pi
  - Create computer vision and Internet of Things (IoT) projects and applications with the RPi
  - Optimize your OpenCV code and algorithms on the resource constrained Pi
  - Perform Deep Learning on the Raspberry Pi (including utilizing the Movidius NCS and OpenVINO toolkit)
  - Utilize the Google Coral and NVIDIA Jetson Nano to build embedded computer vision and deep learning applications
  - ...then you should definitely take a look at my book, Raspberry Pi for Computer Vision!
  - This book is your one-stop shop for learning how to master Computer Vision and Deep Learning on embedded devices.

Computer Vision on the Raspberry Pi

At only $35, the Raspberry Pi (RPi) is a cheap, affordable piece of hardware that can be used by hobbyists, educators, and professionals/industry alike.

The Raspberry Pi 4 (the current model as of this writing) includes a quad-core Cortex-A72 running at 1.5GHz and either 1GB, 2GB, or 4GB of RAM (depending on which model you purchase), all running on a computer the size of a credit card.

But don’t let its small size fool you!

The Raspberry Pi can absolutely be used for Computer Vision and Deep Learning (but you need to know how to tune your algorithms first).

Step #1: Install OpenCV on the Raspberry Pi (Beginner)
- Prior to working through these steps I recommend that you first work through the How Do I Get Started? section.
  - Not only will that section teach you how to install OpenCV on your Raspberry Pi, but it will also teach you the fundamentals of the OpenCV library.
- If you find yourself struggling to get OpenCV installed on your Raspberry Pi, take a look at both:
  - Practical Python and OpenCV
  - Raspberry Pi for Computer Vision
- Both of those books contain a pre-configured Raspbian .img file.
  - All you need to do is download the .img file, flash it to your micro-SD card, and boot your RPi.
  - From there you’ll have a pre-configured development environment with OpenCV and all other CV/DL libraries you need pre-installed.
    - This .img file can save you days of heartache trying to get OpenCV installed.
  - The documentation for the .img file can be found here.
Step #2: Development on the RPi (Beginner)
- Assuming you now have OpenCV installed on your RPi, you might be wondering about development best practices — what is the best way to write code on the RPi?
  - Should you install a dedicated IDE, such as PyCharm, directly on the Pi itself and code there?
  - Should you use a lightweight code editor such as Sublime Text?
  - Or should you SSH/VNC in to the RPi and edit the code that way?
- You could potentially do all three of those, but my favorite is to use either PyCharm or Sublime Text on my laptop/desktop with an SFTP plugin:
  - Remote development on the Raspberry Pi
- Doing so enables me to code using my favorite IDE on my laptop/desktop.
  - Once I’m done editing a file, I save it, after which the file is automatically uploaded to the RPi.
- It does take some additional time to configure your RPi and laptop/desktop in this manner, but once you do, it’s so worth it!
Step #3: Access your Raspberry Pi Camera or USB Webcam (Beginner)
- Now that your development environment is configured, you should verify that you can access your camera, whether that be a USB webcam or the Raspberry Pi camera module:
Step #4: Your First Computer Vision App on the Raspberry Pi (Beginner)
- The Raspberry Pi is naturally suited for home security applications, so let’s learn how we can utilize motion detection to detect when there is an intruder in our home:
Step #5: OpenCV, GPIO, and the Raspberry Pi (Beginner)
- If you want to use the GPIO to control additional hardware, specifically Hardware Attached on Top (HATs), you should study how OpenCV and GPIO can be used together on the Raspberry Pi:
  - Accessing RPi.GPIO and GPIO Zero with OpenCV + Python
  - OpenCV, RPi.GPIO, and GPIO Zero on the Raspberry Pi
Step #6: Facial Applications on the Raspberry Pi (Intermediate)
- Facial applications, including face recognition, can be extremely tricky on the Raspberry Pi due to the limited computational horsepower.
- Algorithms that worked well on our laptop/desktop may not translate well to our Raspberry Pi, so we need to take care to perform additional optimizations.
- These tutorials will get you started applying facial applications on the RPi:
  - Raspberry Pi: Facial landmarks + drowsiness detection with OpenCV and dlib
  - Raspberry Pi Face Recognition
Step #7: Apply Deep Learning on the Raspberry Pi (Intermediate)
- Deep Learning algorithms are notoriously computationally hungry, and given the resource constrained nature of the RPi, CPU and memory come at a premium.
- To discover why Deep Learning algorithms are slow on the RPi, start by reading these tutorials:
  - Raspberry Pi: Deep learning object detection with OpenCV
- Then, when you’re done, come back and learn how to implement a complete, end-to-end deep learning project on the RPi:
Step #8: Work with Servos and Additional Hardware (Intermediate)
- One of the benefits of using the Raspberry Pi is that it makes it so easy to work with additional hardware, especially for robotics applications.
- In this tutorial you will learn how to apply face tracking using a pan/tilt servo:
  - Pan/tilt face tracking with a Raspberry Pi and OpenCV
Step #9: Utilize Intel’s NCS for Faster Deep Learning (Advanced)
- In order to speed up Deep Learning model inference on the Raspberry Pi we can use a coprocessor.
  - Think of a coprocessor as a USB stick that contains a specialized chip used to make Deep Learning models run faster.
  - We plug the stick into our RPi, integrate with the coprocessor API, and then push all Deep Learning prediction to the USB stick.
- One of the most popular Deep Learning coprocessors is Intel’s Movidius NCS.
  - Using the NCS we can obtain upwards of a 1,200% speedup in our algorithms!
- To learn more about the NCS, and use it for your own embedded vision applications, read these guides:
- Additionally, my new book, Raspberry Pi for Computer Vision, includes detailed guides on how to:
  - Train your own Deep Learning model on your own custom dataset
  - Optimize the model using the OpenVINO Toolkit
  - Deploy the optimized model to the RPi
  - Enjoy faster inference on the Raspberry Pi!
- To learn more about the book, just click here.
Step #10: Utilize Google Coral USB Accelerator for Faster Deep Learning (Advanced)
- Google’s Coral USB accelerator is a competitor to Intel’s Movidius NCS coprocessor.
- One of the benefits of combining the Google Coral USB Accelerator with the RPi 4 is USB 3.0.
  - Using USB 3 we can obtain faster inference than the Movidius NCS.
- The Google Coral USB Accelerator is also very easy to use — you can read more about it here:
  - Getting started with Google Coral’s TPU USB Accelerator
  - Object detection and image classification with Google Coral USB Accelerator
Where to Next?
- Congrats on using the Raspberry Pi to apply Computer Vision algorithms!
- If you would like to take the next step, I would suggest reading my new book, Raspberry Pi for Computer Vision.
  - That book will teach you how to use the RPi, Google Coral, Intel Movidius NCS, and NVIDIA Jetson Nano for embedded Computer Vision and Deep learning applications.
  - And just like all my tutorials, each chapter of the text includes well documented code and detailed walkthroughs, ensuring that you understand exactly what’s going on.
  - To learn more about the book, just click here.

Medical Computer Vision

Computer Vision and Deep Learning algorithms have touched nearly every facet of Computer Science.

One area that CV and DL algorithms are making a massive impact on is the field of Medical Computer Vision.

Using Medical Computer Vision algorithms, we can now automatically analyze cell cultures, detect tumors, and even predict cancer before it even metastasizes!

Step #1: Configure Your Development Environment (Beginner)
- Step #2 and #3 of this section will require that you have OpenCV configured and installed on your machine.
  - Make sure you follow Step #1 from the How Do I Get Started? section to install OpenCV.
- Step #4 covers how to use Deep Learning for Medical Computer Vision.
  - You will need to have TensorFlow and Keras installed on your system for those guides.
  - You should follow Step #1 from the Deep Learning section to ensure TensorFlow and Keras are properly configured.
Step #2: Your First Medical Computer Vision Project (Beginner)
- Our first Medical Computer Vision project uses only basic Computer Vision algorithms, thus demonstrating how even basic techniques can make a profound impact on the medical community:
  - Detecting Parkinson’s Disease with OpenCV, Computer Vision, and the Spiral/Wave Test
- Fun fact: I wrote the above tutorial in collaboration with PyImageSearch reader, Joao Paulo Folador, a PhD student from Brazil.
  - We then published a paper detailing the method in CLAIB 2019!
  - It’s just further proof that PyImageSearch tutorials can lead to publishable results!
Step #3: Create Medical Computer Vision Mini-Projects (Intermediate)
- Now that you have some experience, let’s move on to a slightly more advanced Medical Computer Vision project.
- Here you will learn how to use Deep Learning to analyze root health of plants:
  - Deep learning and hydroponics
Step #4: Solve Real-World Medical Computer Vision Projects (Advanced)
- Our previous sections dealt with applying Deep Learning to a small medical image dataset.
- But what about larger medical datasets?
- Can we apply DL to those datasets as well?
- You bet we can!
  - The following two guides will show you how to use Deep Learning to automatically classify malaria in blood cells and perform automatic breast cancer detection:
    - Deep Learning and Medical Image Analysis with Keras
    - Breast cancer classification with Keras and Deep Learning
- Take your time working through those guides and make special note of how we compute the sensitivity and specificity of the model, two key metrics when working with medical imaging tasks that directly impact patients.
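As a quick refresher, sensitivity is the true positive rate, TP / (TP + FN), and specificity is the true negative rate, TN / (TN + FP). The sketch below computes both from a list of labels in plain Python; the labels are made up for illustration and the helper name `sensitivity_specificity` is my own, not a function from the guides above.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # sick patient correctly flagged
            else:
                fn += 1  # sick patient missed
        else:
            if pred == positive:
                fp += 1  # healthy patient incorrectly flagged
            else:
                tn += 1  # healthy patient correctly cleared
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Illustrative labels: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Overall accuracy would hide the difference between the two error types, which is why medical tasks usually report both numbers: a missed diagnosis and a false alarm have very different consequences for a patient.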
Where to Next?
- As I mention in my About page, Medical Computer Vision is a topic near and dear to my heart.
- Previously, my company has consulted with the National Cancer Institute and National Institutes of Health to develop image processing and machine learning algorithms to automatically analyze breast histology images for cancer risk factors.
- I’ve also developed methods to automatically recognize prescription pills in images, thereby reducing the number of injuries and deaths that happen each year due to the incorrect medication being taken.
- I continue to write about Medical Computer Vision, so if you’re interested in the topic, be sure to keep an eye on the PyImageSearch blog.
- Otherwise, you should take a look at my book, Deep Learning for Computer Vision with Python, which includes chapters on:
  - Automatic cancer/skin lesion segmentation using Mask R-CNNs
  - Prescription pill detection/localization using Mask R-CNNs
- To learn more about my deep learning book, just click here.

Working with Video

Most tutorials I have on the PyImageSearch blog involve working with images — but what if you wanted to work with videos instead?

If that’s you, make sure you pay attention to this section.

Step #1: Install OpenCV on Your System (Beginner)
- Prior to working with video (both on file and live video streams), you first need to install OpenCV on your system.
- You should follow Step #1 of the How Do I Get Started? section to configure and install OpenCV on your machine.
Step #2: Accessing your Webcam (Beginner)
- Now that you have OpenCV installed, let’s learn how to access your webcam.
  - If you are using either a USB webcam or built-in webcam (such as the camera on your laptop), you can use OpenCV’s cv2.VideoCapture class.
  - The problem with this method is that it will block your main execution thread until the next frame is read from the camera sensor.
    - That can be a big problem as it can dramatically decrease the Frames Per Second (FPS) throughput of your system.
    - To resolve the issue, I have implemented a threaded VideoStream class that more efficiently reads frames from a camera:
      - Learn how to use the VideoStream class
    - I would also suggest reading the following tutorial which provides a direct comparison of the cv2.VideoCapture class to my VideoStream class:
      - Increasing webcam FPS with Python and OpenCV
  - If you are using a Raspberry Pi camera module then you should follow this getting started guide to access the RPi camera:
    - Accessing the Raspberry Pi Camera with OpenCV and Python
    - Common errors using the Raspberry Pi camera module
  - Once you’ve confirmed you can access the RPi camera module you can use the VideoStream class which is compatible with both built-in/USB webcams and the RPi camera module:
    - Unifying picamera and cv2.VideoCapture into a single class with OpenCV
  - Inevitably, there will be a time where OpenCV cannot access your camera and your script errors out, resulting in a “NoneType” error — this tutorial will help you diagnose and resolve such errors:
    - OpenCV: Resolving NoneType errors
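The reason a threaded reader helps is that the blocking read moves to a background thread, so the main loop can always grab the most recent frame without waiting. Below is a minimal sketch of that pattern using only the standard library. It is not the VideoStream class mentioned above: `ThreadedReader` and `FakeCamera` are hypothetical names, and the fake camera simply sleeps to imitate a blocking read. With OpenCV you might pass in a function that wraps the `read()` method of a `cv2.VideoCapture` object instead.

```python
import threading
import time


class ThreadedReader:
    """Polls `read_frame` on a background thread and keeps only the latest frame."""

    def __init__(self, read_frame):
        self._read_frame = read_frame
        self._frame = None
        self._lock = threading.Lock()
        self._stopped = threading.Event()
        self._thread = threading.Thread(target=self._update, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _update(self):
        while not self._stopped.is_set():
            frame = self._read_frame()  # the blocking call happens off the main thread
            with self._lock:
                self._frame = frame

    def read(self):
        """Return the most recent frame immediately (None until the first frame arrives)."""
        with self._lock:
            return self._frame

    def stop(self):
        self._stopped.set()
        self._thread.join()


class FakeCamera:
    """Stand-in for a camera whose read blocks for about 10 ms per frame."""

    def __init__(self):
        self.count = 0

    def read(self):
        time.sleep(0.01)
        self.count += 1
        return self.count


camera = FakeCamera()
stream = ThreadedReader(camera.read).start()
time.sleep(0.1)          # the main thread is free to do other work here
latest = stream.read()   # returns at once with the newest frame number
stream.stop()
print("latest frame:", latest)
```

The trade-off is that intermediate frames are dropped when the main loop is slower than the camera, which is usually what you want for live video but not for processing every frame of a file.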
Step #3: Face Detection in Video (Beginner)
- I’m a strong believer in learning by doing through practical, hands-on applications, and it’s hard to get more practical than face detection!
- This tutorial will teach you how to apply face detection to video streams:
  - Face detection with OpenCV and deep learning
Step #4: Face Applications in Video (Intermediate)
- Building on face detection, let’s learn how to apply face applications to video streams as well:
Step #5: Object Detection in Video (Intermediate)
- Face detection is a special class of object detection.
- Object detectors can be trained to recognize just about any type of object.
- The OpenCV library enables us to use pre-trained object detectors to detect common objects we encounter in our daily lives (people, cars, trucks, dogs, cats, etc.).
- The following tutorials will teach you how to apply object detection to video streams:
  - A gentle guide to deep learning object detection
  - Real-time object detection with deep learning and OpenCV
Step #6: Create OpenCV and Video Mini-Projects (Beginner/Intermediate)
- At this point you have a fair amount of experience applying Computer Vision and OpenCV to videos — let’s continue practicing using these tutorials:
- Take your time working through them and take notes as you do so.
  - You should pay close attention to the tutorials that interest you and excite you the most.
  - Take note of them and then revisit your ideas after you finish these tutorials.
    - Ask yourself how you could extend them to work with your own projects.
    - What if you tried a different video source?
    - Or how might you integrate one of these video applications into a home security system?
    - Brainstorm these ideas and then try to implement them yourself — the best way to learn is to learn by doing!
Step #7: Image/Video Streaming with OpenCV (Intermediate)
- So far we’ve looked at how to process video streams with OpenCV, provided that we have physical access to the camera.
- But what if we wanted to access a network or an IP camera? How might we do that?
- Accessing RTSP streams with OpenCV is a big pain and not something I recommend doing.
- Instead, you should use ImageZMQ to stream frames directly from a camera to a server for processing:
  - Live video streaming over network with OpenCV and ImageZMQ
  - An interview with Jeff Bass, creator of ImageZMQ
Step #8: Video Classification with Deep Learning (Advanced)
- For this step I’ll be making the assumption that you’ve worked through the first half of the Deep Learning section.
- Provided that you have, you may have noticed that applying image classification to video streams results in a sort of prediction flickering.
  - A “prediction flicker” occurs when an image classification model reports Label A for Frame N, but then reports Label B (i.e., a different class label) for Frame N + 1 (i.e., the next frame in the video stream), despite the frames having near-identical contents!
  - Prediction flickering is a natural phenomenon in video classification.
    - It happens due to noise in the input frames confusing the classification model.
  - One simple method to rectify prediction flickering is to apply prediction averaging:
    - Video classification with Keras and Deep Learning
  - Using prediction averaging you can overcome the prediction flickering problem.
    - Additionally, you may want to look into more advanced Deep Learning-based image/video classifiers, including Recurrent Neural Networks (RNNs) and Long Short-Term Memory Networks (LSTMs).
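Prediction averaging can be sketched as a rolling mean over the last few per-frame probability vectors, followed by an argmax on the averaged result. The example below is a minimal, library-free sketch; the window size of 3 and the two-class probabilities are assumptions chosen to show a single noisy frame being smoothed out.

```python
from collections import deque


def smooth_predictions(frame_probs, window=3):
    """Average the last `window` probability vectors, then take the argmax per frame."""
    recent = deque(maxlen=window)  # old predictions fall off automatically
    labels = []
    for probs in frame_probs:
        recent.append(probs)
        num_classes = len(probs)
        averaged = [sum(p[k] for p in recent) / len(recent) for k in range(num_classes)]
        labels.append(max(range(num_classes), key=lambda k: averaged[k]))
    return labels


frame_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.4, 0.6],   # a single noisy frame that flips the raw label
    [0.9, 0.1],
    [0.85, 0.15],
]
raw = [max(range(2), key=lambda k: p[k]) for p in frame_probs]
smoothed = smooth_predictions(frame_probs, window=3)
print("raw:     ", raw)
print("smoothed:", smoothed)
```

A larger window gives steadier labels but reacts more slowly when the scene genuinely changes, so the window size is something to tune per application.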
Where to Next?
- If you’re brand new to the world of Computer Vision and Image Processing, I would recommend you read Practical Python and OpenCV.
  - That book will teach you the basics of Computer Vision through the OpenCV library — and best of all, you can complete that book in only a single weekend.
  - It’s by far the fastest way to get up and running with OpenCV.
  - And furthermore, the book includes complete code templates and examples for working with video files and live video streams with OpenCV.
- For a more detailed review of the Computer Vision field, I would recommend the PyImageSearch Gurus course.
  - The PyImageSearch Gurus course is a comprehensive dive into the world of Computer Vision.
  - You can think of the Gurus course as similar to a college survey course on CV (but much more hands-on and practical).
- Finally, you’ll note that we utilized a number of pre-trained Deep Learning image classifiers and object detectors in this section.
  - If you’re interested in training your own custom Deep Learning models you should look no further than Deep Learning for Computer Vision with Python.
  - You’ll learn how to create your own datasets, train models on top of your data, and then deploy the trained models to solve real-world projects.
  - It’s by far the most comprehensive, detailed, and complete Computer Vision and Deep Learning education you can find online today.
  - Click here to learn more.

Image Search Engines

Content-based Image Retrieval (CBIR) encompasses all algorithms, techniques, and methods to build an image search engine.

An image search engine functions similarly to a text search engine (e.g., Google, Bing, etc.).

A user visits the search engine website, but instead of having a text query (e.g., “How do I learn OpenCV?”) they instead have an image as a query.

The goal of the image search engine is to accept the query image and find all visually similar images in a given dataset.

CBIR is the primary reason I started studying Computer Vision in the first place. I found the topic fascinating and am eager to share my knowledge with you.

Step #1: Install OpenCV on your System (Beginner)
- Before you can perform CBIR or build your first image search engine, you first need to install OpenCV on your system.
- Follow Step #1 of the How Do I Get Started? section above to configure OpenCV and install it on your machine.
Step #2: Build Your First Image Search Engine (Beginner)
- The first image search engine you’ll build is also one of the first tutorials I wrote here on the PyImageSearch blog.
- Using this tutorial you’ll learn how to search for visually similar images in a dataset using color histograms:
  - A How-To Guide to Building Your First Image Search Engine in Python
Step #3: Understand Image Quantification (Beginner)
- In Step #2 we built an image search engine that characterized the contents of an image based on color — but what if we wanted to quantify the image based on texture, shape, or some combination of all three?
- How might we go about doing that?
- In order to describe the contents of an image, we first need to understand the concept of image quantification:
  - How To Describe and Quantify an Image Using Feature Vectors
  - Image quantification is the process of:
    - Accepting an input image
    - Applying an algorithm to characterize the contents of the image based on shape, color, texture, etc.
    - Returning a list of values representing the quantification of the image (we call this our feature vector).
    - The algorithm that performs the quantification is our image descriptor or feature descriptor.
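As a concrete instance of those three steps, here is a minimal sketch of a color histogram image descriptor in plain Python. It accepts an image as a list of (R, G, B) pixels, bins the colors, and returns a normalized feature vector. The function name `color_histogram`, the choice of 2 bins per channel, and the four sample pixels are illustrative assumptions.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Quantify an image as a normalized 3D color histogram, flattened into a feature vector."""
    bin_width = 256 / bins_per_channel
    histogram = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index.
        ri = int(r // bin_width)
        gi = int(g // bin_width)
        bi = int(b // bin_width)
        index = (ri * bins_per_channel + gi) * bins_per_channel + bi
        histogram[index] += 1
    total = len(pixels)
    # Normalize so images of different sizes produce comparable vectors.
    return [count / total for count in histogram]


# A tiny "image": two reddish pixels, one blue, one green.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 255, 0)]
features = color_histogram(pixels)
print(len(features), features)
```

Two images can then be compared by computing a distance between their feature vectors, which is what lets a search engine rank results by visual similarity.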
Step #4: The 4 Steps of Any Image Search Engine (Beginner)
- There are four key steps to building any image search engine:
- As your CBIR system becomes more advanced you’ll start to include sub-steps between the main steps, but for now, understand that those four steps will be present in any image search engine you build.
Step #5: Build Image Search Engine Mini-Projects (Beginner)
- Now that you understand the fundamentals of CBIR, let’s apply it to a mini-project:
  - The complete guide to building an image search engine with Python and OpenCV
  - In the above tutorial you’ll learn how to combine color with locality, leading to a more accurate image search engine.
Step #6: Image Hashing (Intermediate)
- So far we’ve learned how to build an image search engine to find visually similar images in a dataset.
- But what if we wanted to find duplicate or near-duplicate images in a dataset?
  - Such an application is a subset of the CBIR field called image hashing:
    - Image hashing with OpenCV and Python
  - Image hashing algorithms compute a single integer to quantify the contents of an image.
  - The goal of applying image hashing is to find all duplicate/near-duplicate images.
  - Practical use cases of image hashing include:
    - De-duping a set of images you obtained by crawling the web.
      - You may be using my Google Images scraper or my Bing API crawler to build a dataset of images to train your own custom Convolutional Neural Network.
      - In that case, you will want to find all duplicate/near-duplicate images in your dataset (as these duplicates provide no additional value to the dataset itself).
    - Building TinEye, a reverse image search engine.
      - Reverse image search engines:
        - Accept an input image
        - Compute its hash
        - And tell you everywhere on the web that the input image appears
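To show how an image can be reduced to a single integer, here is a minimal sketch of one common hashing scheme, difference hashing (dHash), together with the Hamming distance used to compare two hashes. The text above does not say which hashing algorithm is used, so treat dHash as my assumption for illustration. The tiny grayscale grids stand in for images that have already been converted to grayscale and resized.

```python
def dhash(gray):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `gray` is a small grayscale image given as rows of intensities, assumed to be
    already resized to (hash_size + 1) columns by hash_size rows.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


original = [[10, 20, 30], [90, 60, 30]]
brighter = [[20, 30, 40], [100, 70, 40]]   # same image, uniformly brighter
different = [[30, 20, 10], [30, 60, 90]]   # gradients reversed

print(dhash(original), dhash(brighter), dhash(different))
print(hamming(dhash(original), dhash(different)))
```

Because the hash depends only on whether each pixel is brighter than its neighbor, a uniform brightness change leaves it untouched, while a small Hamming distance between two hashes suggests a near-duplicate.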
Step #7: Scaling Image Hashing Search Engines (Intermediate)
- At this point you know how image hashing algorithms work — but how can we scale them like TinEye has?
- The answer is to utilize specialized data structures, such as VP-Trees.
- This tutorial will show you how to efficiently use VP-Trees to scale your image hashing search engine:
  - Building an Image Hashing Search Engine with VP-Trees and OpenCV
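
To make the idea concrete, the following is a minimal sketch of a vantage-point tree over integer hashes compared by Hamming distance. It is not the code from the linked tutorial; the class name `VPTree`, the method `within`, and the random choice of vantage point are assumptions made for illustration.

```python
import random


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


class VPTree:
    """Minimal vantage-point tree over items compared with a metric `dist`."""

    def __init__(self, points, dist):
        self.dist = dist
        self.vp = None
        self.radius = 0
        self.inside = self.outside = None
        points = list(points)
        if not points:
            return
        # Pick a vantage point, then split the rest at the median distance.
        self.vp = points.pop(random.randrange(len(points)))
        if not points:
            return
        distances = sorted(dist(self.vp, p) for p in points)
        self.radius = distances[len(distances) // 2]
        near = [p for p in points if dist(self.vp, p) <= self.radius]
        far = [p for p in points if dist(self.vp, p) > self.radius]
        self.inside = VPTree(near, dist) if near else None
        self.outside = VPTree(far, dist) if far else None

    def within(self, query, max_dist):
        """Return every stored point whose distance to `query` is <= max_dist."""
        if self.vp is None:
            return []
        d = self.dist(query, self.vp)
        found = [self.vp] if d <= max_dist else []
        # Triangle inequality: skip a subtree that cannot contain a match.
        if self.inside is not None and d - max_dist <= self.radius:
            found += self.inside.within(query, max_dist)
        if self.outside is not None and d + max_dist > self.radius:
            found += self.outside.within(query, max_dist)
        return found
```

A query such as `VPTree(hashes, hamming).within(query_hash, 3)` returns the same hashes as a linear scan would, but each node lets the search discard the half of the data that the triangle inequality rules out.
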
Where to Next?
- The techniques covered here will help you build your own basic image search engines.
- The problem with these algorithms is they do not scale.
- If you want to build more advanced image search engines that scale to millions of images you’ll want to look into:
  - The Bag-of-Visual-Words model (BOVW)
  - k-Means clustering and forming a “codebook”
  - Vector quantization
  - Tf-idf weighting
  - Building an inverted index
- The PyImageSearch Gurus course includes over 40 lessons on building image search engines, including how to scale your CBIR system to millions of images.
- If you’re interested in learning more about the course, and extending your own CBIR knowledge, just use the link below:
  - Tell me more about the PyImageSearch Gurus course

Interviews, Case Studies, and Success Stories

You can learn Computer Vision, Deep Learning, and OpenCV — I am absolutely confident in that.

And if you’ve been following this guide, you’ve seen for yourself how far you’ve progressed.

However, we cannot spend all of our time neck deep in code and implementation — we need to come up for air, rest, and recharge our batteries.

When that happens, I suggest supplementing your technical education with a bit of light reading to open your mind to what the world of Computer Vision and Deep Learning offers you.

After 5 years running the PyImageSearch blog I’ve seen countless readers dramatically change their lives, including changing their careers to CV/DL/AI, being awarded funding, winning Kaggle competitions, and even becoming CTOs of funded companies!

It’s truly a privilege and an honor to be taking this journey with you — thank you for letting me accompany you on it.

Below you’ll find some of my favorite interviews, case studies, and success stories.

Step #1: A Day in the Life of Adrian Rosebrock (Beginner)
- Ever wonder what it’s like to work as a Computer Vision/Deep Learning researcher and developer?
- You’re not alone.
- Over the past 5 years running PyImageSearch, I have received 100s of emails and inquiries that are “outside” traditional CV, DL, and OpenCV questions.
- They instead focus on something much more personal — my daily life.
- To give you an idea of what it’s like to be me, I’m giving you a behind the scenes look at:
  - How I spend my day.
  - What it’s like balancing my role as a (1) computer vision researcher/developer and (2) a writer and owner of PyImageSearch.
  - The habits and practices I’ve spent years perfecting to help me get shit done.
- You can read the full post here:
  - A day in the life of Adrian Rosebrock: computer vision researcher, developer, and entrepreneur.
Step #2: Intro to Computer Vision (Beginner)
- Back in 2015 I was interviewed on Scott Hanselman’s legendary podcast, Hanselminutes:
  - I was featured on the Hanselminutes podcast: Computer vision and the impact it has on our daily lives.
- Inside the podcast Scott and I discuss the types of problems Computer Vision can solve, from medical issues to gaming, retail to surveillance.
- This podcast is an excellent listen if you’re brand new to the world of Computer Vision (or if you want something entertaining to listen to).
Step #3: Computer Vision — Where are We Going Next? (Beginner)
- A more recent podcast (April 2019) comes from an interview on the Super Data Science Podcast, hosted by Kirill Eremenko:
  - SDS 255: Diving Into Computer Vision
- In the podcast we discuss Computer Vision, Deep Learning, and what the future holds for the fields.
- I highly recommend listening to this podcast, whether you are brand new to Computer Vision or already a seasoned expert. It’s both entertaining and educational.
Step #4: From Developer to CTO (Beginner)
- Saideep Talari’s story holds a special place in my heart.
- He started his career as a network tester, found his first job in Computer Vision after completing the PyImageSearch Gurus course, and then after completing Deep Learning for Computer Vision with Python is now the CTO of a tech company with over $2M in funding.
- He’s also an incredibly nice person: he used his earnings to clear his family’s debts and start fresh.
- Saideep is one of my favorite people I’ve ever had the privilege of knowing, and there’s a lot you can learn from this interview:
  - PyImageSearch Gurus member spotlight: Saideep Talari
Step #5: $30,500 in Grant Funding (Beginner)
- Tuomo Hiippala was awarded a $30,500 research grant for his work in Computer Vision, Optical Character Recognition, and Document Understanding.
- Find out how he landed the grant in the interview with him:
  - PyImageSearch Gurus member spotlight: Tuomo Hiippala
Step #6: Winning Kaggle’s Most Competitive Image Classification Challenge Ever (Beginner)
- David Austin and his teammate, Weimin Wang, took home 1st place (and $25,000) in Kaggle’s Iceberg Classifier Challenge (Kaggle’s most competitive challenge ever).
- David and Weimin used techniques from both the PyImageSearch Gurus course and Deep Learning for Computer Vision with Python to come up with their winning solution — read the full interview, including how they did it, here:
  - An interview with David Austin: 1st place and $25,000 in Kaggle’s most popular image classification competition
Step #7: Landing a Research and Development (R&D) Position (Beginner)
- Kapil Varshney was recently hired at Esri R&D as a Data Scientist focusing on Computer Vision and Deep Learning.
- Kapil’s story is really important as it shows that, no matter what your background is, you can be successful in computer vision and deep learning — you just need the right education first!
- You see, Kapil is a long-time PyImageSearch reader who read Deep Learning for Computer Vision with Python (DL4CV) last year.
- Soon after reading DL4CV, Kapil competed in a challenge sponsored by Esri to detect and localize objects in satellite images (including cars, swimming pools, etc.).
- He finished in 3rd-place out of 53 competitors.
- Esri was so impressed with Kapil’s work that after the contest they called him in for an interview.
- Kapil nailed the interview and was hired full-time at Esri R&D.
- His work on satellite image analysis at Esri now impacts millions of people across the world daily — and it’s truly a testament to his hard work.
- You can read the full interview with Kapil here:
  - An interview with Kapil Varshney, Data Scientist at Esri R&D
Where to Next?
- I can’t promise you that you’ll win a Kaggle competition like David or become the CTO of a Computer Vision company like Saideep did, but I can guarantee you that the books and courses I offer here on PyImageSearch are the best resources available today to help you master computer vision and deep learning.
  - If you’d like to follow in their steps, you can see what books and courses I offer here:
    - What books and courses do you offer?
    - What do each of your books/courses cover? How are they similar and how are they different?
  - If you need help choosing a book/course, I suggest starting here:
    - Which book or course is right for me/which one do I start with?
  - And if you have any questions on my books/courses, feel free to reach out to me:
    - Contact me

Need More Help?

I’m dedicated to helping you learn Computer Vision, Deep Learning, and OpenCV.

If you need more help from me, here are a few options:

Books and Courses
- Practical Python and OpenCV
  - My gentle introduction to the world of computer vision and image processing through OpenCV.
  - If you’re brand new to the world of computer vision and image processing, start with this book so you can learn the fundamentals first.
- Deep Learning for Computer Vision with Python
  - In-depth dive into the world of computer vision and deep learning.
  - Whether this is the first time you’ve worked with machine learning and neural networks or you’re already a seasoned deep learning practitioner, DL4CV is engineered from the ground up to help you reach expert status.
- PyImageSearch Gurus
  - Similar to a college survey course in computer vision but much more hands-on, and practical.
  - Covers 13 modules broken out into 168 lessons, with over 2,161 pages of content.
  - Includes private community forums which I participate in daily.
    - Great way to get faster, more detailed answers to your questions.
- Raspberry Pi for Computer Vision
  - Apply Computer Vision and Deep Learning algorithms to embedded devices, including the Raspberry Pi, Google Coral, and NVIDIA Jetson Nano.
  - I recommend reading this book together with Practical Python and OpenCV and/or Deep Learning for Computer Vision with Python.
    - RPI for CV uses both Computer Vision and Deep Learning algorithms so some previous experience is suggested but not required.
Blog
- I’ve authored over 350 free tutorials on the PyImageSearch.com blog.
- It’s likely that I have already authored a tutorial to help you with your question or project.
  - Make sure you use the “Search” bar to search for keywords related to your topic.
  - The search bar can be found on the top-right of the sidebar on every page.
FAQ
- I’ve compiled answers to the most common questions I receive on my official FAQ page.
- Please check the FAQ as it’s possible that your question has been addressed there.
Contact
- Feel free to ask me a question, but kindly keep it to one question per email.
- My contact form

Attribution-NonCommercial-ShareAlike

Posted by uniqueone

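Hello :) The Full Stack Deep Learning Bootcamp lectures that Chansung (찬성) shared on Friday were so good that I watched them right away and wrote up a summary. They do not just teach narrow topics such as how to handle serving in production ..

Deep Learning/resources 2019. 10. 8. 13:24

Hello :)
The Full Stack Deep Learning Bootcamp lectures that Chansung (찬성) shared on Friday were so good that I watched them right away and wrote up a summary.

The lectures do not just teach narrow topics such as how to handle serving in production. They also cover what to think about across the whole project, how to troubleshoot deep learning projects, and common mistakes (shape mismatch, casting issues, OOM, and so on), which makes them very useful.

I think the course does a good job of painting the big picture :)

If you are interested, you will not regret watching it!

---

The goal of the bootcamp is to teach everything needed to take a model to production:
- Define the problem clearly and estimate the project's cost
- Find, preprocess, and label data
- Choose an appropriate framework and infrastructure
- Troubleshoot training reproducibility
- Deploy large-scale models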
https://www.facebook.com/groups/1738168866424224/permalink/2415463898694714/?sfnsn=mo


Posted by uniqueone

Hi guys, Do you want to build computer vision models for cattle monitoring? I made the COCO json, masks, and images freely available here: https://nsmb.me/aw0f I'm planning on sharing more, maybe writing tutorials if anybody is interested. Would love to g..

Deep Learning/dataset 2019. 10. 7. 09:43

Hi guys,

Do you want to build computer vision models for cattle monitoring?
I made the COCO json, masks, and images freely available here: https://nsmb.me/aw0f

I'm planning on sharing more, maybe writing tutorials if anybody is interested. Would love to get your feedback on this. 😊
https://www.facebook.com/groups/1738168866424224/permalink/2415463898694714/?sfnsn=mo


Posted by uniqueone

State of the art in Gait recognition: Novel AutoEncoder framework, GaitNet https://www.profillic.com/paper/arxiv:1909.03051 "With extensive experiments on CASIA-B, USF, and FVG datasets, our method demonstrates superior performance to the SOTA quantit..

Deep Learning/Papers2read 2019. 10. 4. 16:23

State of the art in Gait recognition: Novel AutoEncoder framework, GaitNet

https://www.profillic.com/paper/arxiv:1909.03051

"With extensive experiments on CASIA-B, USF, and FVG datasets, our method demonstrates superior performance to the SOTA quantitatively, the ability of feature disentanglement qualitatively, and promising computational efficiency."
https://m.facebook.com/groups/1738168866424224?view=permalink&id=2414785935429177&sfnsn=mo


Posted by uniqueone

ICYMI: Breast Cancer Diagnosis with Transfer Learning and Global Pooling https://www.profillic.com/paper/arxiv:1909.11839 (Breast cancer is one of the most common causes of cancer-related death in women worldwide) (The proposed network architecture u..

Deep Learning/Papers2read 2019. 10. 4. 13:18

ICYMI: Breast Cancer Diagnosis with Transfer Learning and Global Pooling

https://www.profillic.com/paper/arxiv:1909.11839

(Breast cancer is one of the most common causes of cancer-related death in women worldwide)

(The proposed network architecture using a pre-trained Xception model yields 92.50% average classification accuracy)
https://www.facebook.com/groups/1738168866424224/permalink/2414747192099718/?sfnsn=mo


Posted by uniqueone

Modern problems require modern solutions: Protecting privacy using deepfakes https://www.profillic.com/paper/arxiv:1909.04538

Deep Learning/Papers2read 2019. 10. 4. 11:22

Modern problems require modern solutions: Protecting privacy using deepfakes

https://www.profillic.com/paper/arxiv:1909.04538
https://www.facebook.com/groups/1738168866424224/permalink/2415463898694714/?sfnsn=mo


Posted by uniqueone

Built by Stanford researchers: TunaGAN: Modify high-resolution face images with good qualitative and quantitative performance. https://www.profillic.com/paper/arxiv:1908.06163

Deep Learning/Papers2read 2019. 10. 2. 13:37

https://www.facebook.com/groups/DeepNetGroup/permalink/962998307426385/

Built by Stanford researchers: TunaGAN: Modify high-resolution face images with good qualitative and quantitative performance.

https://www.profillic.com/paper/arxiv:1908.06163


Posted by uniqueone

Precisely estimating a robot’s pose in a prior, global map is a challenge in unstructured, dynamic environments. https://www.profillic.com/paper/arxiv:1909.12837 Solution: SegMap: a map representation solution for localization and mapping

Deep Learning/Papers2read 2019. 10. 2. 13:35

https://www.facebook.com/groups/1738168866424224/permalink/2414086682165769/

Precisely estimating a robot’s pose in a prior, global map is a challenge in unstructured, dynamic environments.

https://www.profillic.com/paper/arxiv:1909.12837

Solution: SegMap: a map representation solution for localization and mapping


Posted by uniqueone

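RAdam, an Improvement on Adam: Background and How to Run It

Deep Learning/resources 2019. 10. 2. 13:33

https://www.facebook.com/groups/726202694419379/permalink/909151656124481/

RAdam, an Improvement on Adam: Background and How to Run It

- Deep Learning Science and Technology Group (딥러닝 과학기술 그룹)

Adam was representative enough among the advanced gradient descent methods that you could end an optimizer lecture with "So, the conclusion is Adam."

But a few weeks ago RAdam (Rectified Adam), an improvement on Adam, came out, and it is causing a stir with claims that it is better. The best method has been improved and re-released, so is this the birth of a new front-runner?

Optimizers were published in the order SGD, momentum, RMSprop, and so on. Because later methods build on earlier ones, later methods tend to perform better than earlier ones. To understand RAdam, it therefore helps to first trace how optimizers have evolved. Here is a brief summary of the direction of those improvements and of RAdam, which sits at the end of that line.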

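1. SGD, an improvement on gradient descent

Gradient descent (GD) is something you inevitably learn when studying deep learning: it updates the weights using derivatives. However, gradient descent has to compute the gradient over the entire dataset for every single update, so it is computationally very expensive. To address this, stochastic gradient descent (SGD) was introduced, which uses only a randomly sampled subset of the data instead of all of it. As SGD replaced plain GD, a series of "advanced gradient descent" methods that improve on earlier ones began to appear.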
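
The difference is easy to see on a toy problem. Below is a minimal sketch, assuming a one-parameter model y = w * x fitted with mean squared error; the function names `gradient` and `train`, the learning rate, and the data are all illustrative choices rather than anything from the original post.

```python
import random


def gradient(w, batch):
    """Gradient of the mean squared error of y = w * x over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def train(data, lr=0.05, steps=200, batch_size=None, seed=0):
    """batch_size=None uses the full dataset (GD); otherwise a random subset (SGD)."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        batch = data if batch_size is None else rng.sample(data, batch_size)
        w -= lr * gradient(w, batch)
    return w


data = [(x / 10, 3.0 * x / 10) for x in range(1, 21)]  # points on y = 3x
w_gd = train(data)                 # every step touches all 20 points
w_sgd = train(data, batch_size=4)  # every step touches only 4 points
```

Both runs approach w = 3, but the SGD run evaluates only a fifth of the data per update, which is the saving the paragraph above describes.
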

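2. Momentum, which accounts for inertia and direction

Momentum still computes the gradient by differentiation, but before each update it looks at the size and direction (+/-) of the previous update and carries a fixed fraction of it forward in the same direction. This reduces the large zigzagging between the positive and negative directions, and because the next step takes a fixed fraction of the previous step into account, it produces an inertia effect.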
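
A minimal sketch of this update, assuming the common formulation in which a velocity term keeps a fraction `mu` of the previous step; the function name `momentum_descent` and the default values are illustrative.

```python
def momentum_descent(grad, w, lr=0.1, mu=0.9, steps=300):
    """Gradient descent with classical momentum on a scalar parameter."""
    velocity = 0.0
    for _ in range(steps):
        # Keep a fraction mu of the previous step, then add the new gradient step.
        velocity = mu * velocity - lr * grad(w)
        w += velocity
    return w


# Minimise f(w) = w**2, whose gradient is 2 * w, starting from w = 5.
w_final = momentum_descent(lambda w: 2.0 * w, 5.0)
```
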

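3. Nesterov momentum, an improvement on momentum

Nesterov momentum was proposed as a refinement: once momentum has determined the direction to move in, it first moves that way and computes the gradient at the new position, which reduces unnecessary computation and improves accuracy. It is another of the advanced gradient descent methods.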
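
The only change from classical momentum is where the gradient is evaluated. A minimal sketch under the same assumptions as a plain momentum loop (scalar parameter, illustrative name `nesterov_descent` and defaults):

```python
def nesterov_descent(grad, w, lr=0.1, mu=0.9, steps=100):
    """Gradient descent with Nesterov momentum on a scalar parameter."""
    velocity = 0.0
    for _ in range(steps):
        lookahead = w + mu * velocity  # where momentum alone would carry us
        velocity = mu * velocity - lr * grad(lookahead)
        w += velocity
    return w


# Minimise f(w) = w**2, whose gradient is 2 * w, starting from w = 5.
w_final = nesterov_descent(lambda w: 2.0 * w, 5.0)
```
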

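4. Adagrad, which improves the step size, and RMSprop, which improves on Adagrad

Next came Adagrad, built on the idea of adjusting the step size by lowering the learning rate for variables that are updated frequently, followed by RMSprop, which compensates for Adagrad's step-size sensitivity.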
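
The two rules differ only in how they remember past squared gradients. The sketch below uses the standard textbook forms of both updates; the function names, the `cache` variable, and the constant-gradient demonstration are illustrative assumptions.

```python
import math


def adagrad_step(w, g, cache, lr=0.1, eps=1e-8):
    cache += g * g  # running SUM of all past squared gradients
    return w - lr * g / (math.sqrt(cache) + eps), cache


def rmsprop_step(w, g, cache, lr=0.1, decay=0.9, eps=1e-8):
    cache = decay * cache + (1 - decay) * g * g  # moving AVERAGE instead of a sum
    return w - lr * g / (math.sqrt(cache) + eps), cache


def step_sizes(step_fn, n=100):
    """Sizes of n successive steps under a constant gradient of 1.0."""
    w, cache, sizes = 0.0, 0.0, []
    for _ in range(n):
        new_w, cache = step_fn(w, 1.0, cache)
        sizes.append(abs(new_w - w))
        w = new_w
    return sizes


adagrad_sizes = step_sizes(adagrad_step)
rmsprop_sizes = step_sizes(rmsprop_step)
```

With a constant gradient, Adagrad's step keeps shrinking in proportion to 1/sqrt(t), while RMSprop's step settles near the learning rate because old gradients are gradually forgotten.
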

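5. Adam, which combines the strengths of momentum and RMSprop

Adam uses momentum to improve accuracy and RMSprop to compensate for step-size sensitivity. The reason "the conclusion is Adam" held for so long is that Adam was built by gathering the strengths of the advanced gradient descent methods that came before it.

6. RAdam, an improvement on that Adam

At last, RAdam. While studying Adam, researchers at the University of Illinois, Georgia Tech, and Microsoft investigated why Adam performs better when it is warmed up with a low learning rate early in training or when momentum is briefly turned off. RAdam is the method introduced in the paper (reference 2) that reports their findings.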
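
For readers who want to see the mechanics, here is a scalar, framework-free sketch of the RAdam update as given in the paper cited as reference 2. The function names `rectification` and `radam_minimize` are illustrative, and the small `eps` term is an assumption added for numerical safety; the official repository is the authoritative implementation and may differ in details such as the exact threshold.

```python
import math


def rectification(t, beta2=0.999):
    """Rectification term r_t, or None while the variance is not yet tractable."""
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    rho_t = rho_inf - 2.0 * t * beta2 ** t / (1.0 - beta2 ** t)
    if rho_t <= 4.0:
        return None
    return math.sqrt((rho_t - 4) * (rho_t - 2) * rho_inf
                     / ((rho_inf - 4) * (rho_inf - 2) * rho_t))


def radam_minimize(grad, w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimise a scalar function with RAdam, given its gradient."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g      # momentum term (first moment)
        v = beta2 * v + (1 - beta2) * g * g  # RMSprop-style second moment
        m_hat = m / (1 - beta1 ** t)         # bias correction, as in Adam
        r_t = rectification(t, beta2)
        if r_t is None:
            # Early steps: plain momentum update, which acts as a built-in warmup.
            w -= lr * m_hat
        else:
            v_hat = math.sqrt(v / (1 - beta2 ** t))
            w -= lr * r_t * m_hat / (v_hat + eps)
    return w


# Minimise f(w) = w**2, whose gradient is 2 * w, starting from w = 5.
w_final = radam_minimize(lambda w: 2.0 * w, 5.0)
```

The rectification term starts small and grows toward 1 as training proceeds, so the adaptive learning rate is phased in gradually instead of relying on a hand-tuned warmup schedule.
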

[Running RAdam]

If you use Keras, you can apply it to your own project as follows.

1. Install it with pip install keras-rectified-adam,

2. import it with from keras_radam import RAdam,

3. and swap it in as the optimizer when compiling, for example model.compile(RAdam(), loss="sparse_categorical_crossentropy", metrics=["accuracy"]).

The official GitHub repository, implemented in PyTorch, is at the address below.
https://github.com/LiyuanLucasLiu/RAdam

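Reference (3) below gives a brief introduction to how RAdam works and compares its results with Adam's. (Edited/added) I had originally attached two result figures from that post, but they were examples in which RAdam's loss was not lower than Adam's, and the post announced that results on several datasets would be reviewed and published in a follow-up, so I replaced them with figures from the RAdam paper.

References

1. 모두의 딥러닝 (Deep Learning for Everyone), Gilbut (길벗), pp. 116-119, chapter "Advanced gradient descent methods that solve the speed and accuracy problems"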

2. Liu, Liyuan, et al. "On the variance of the adaptive learning rate and beyond." arXiv preprint arXiv:1908.03265 (2019). https://arxiv.org/pdf/1908.03265.pdf

3. "Rectified Adam (RAdam) optimizer with Keras" https://www.pyimagesearch.com/2019/09/30/rectified-adam-radam-optimizer-with-keras/


Posted by uniqueone

Great applications for the fashion industry - Poly-GAN: Garments are automatically placed on images of human models at an arbitrary pose https://www.profillic.com/paper/arxiv:1909.02165

Deep Learning/Papers2read 2019. 10. 2. 13:31

https://www.facebook.com/groups/DeepAI/permalink/2280373625587435/

Great applications for the fashion industry-

Poly-GAN: Garments are automatically placed on images of human models at an arbitrary pose

https://www.profillic.com/paper/arxiv:1909.02165


Posted by uniqueone


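Hello, this is Hoseong Lee from SUALAB. A while ago I shared some simple visualizations about the ICCV 2019 conference. This time, based on my own interests, I picked 22 of the 1077 papers and summarized each in 2-3 lines..

Deep Learning/Papers2read 2019. 10. 1. 18:00

Hello, this is Hoseong Lee from SUALAB.
A while ago I shared some simple visualizations about the ICCV 2019 conference. This time, based on my own interests, I picked 22 of the 1077 papers and wrote a post summarizing the key points of each in about 2-3 lines.

https://hoya012.github.io/blog/ICCV-2019-paper-preview/

I selected papers on topics such as Image Classification, Object Detection, Segmentation, Generative Models, Super-Resolution, and Adversarial Attacks, so it may be a useful reference if you are wondering which papers to read!
I hope it helps with your studies. Thank you!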

https://www.facebook.com/groups/PyTorchKR/permalink/1487343908071952/?sfnsn=mo

“ICCV 2019 paper preview”

October 01, 2019 | 12 Minute Read

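Hello. In this post I present visualizations from my analysis of the papers accepted to ICCV 2019, which will be held in Seoul, Korea from October 27 to November 2, 2019, and briefly review 22 accepted papers chosen according to my own interests. As with every conference lately, the total number of accepted papers is growing explosively. With so many papers, reading each one would take too long, so I narrowed down the list based on titles alone.

Please note that a paper's absence from my list absolutely does not mean it is uninteresting or not recommended. I want to stress that the list simply reflects my own subjective taste!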

ICCV 2019 Paper Statistics

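I have written several preview-style blog posts about major conferences, and this is the fifth.

As always, I visualized how many papers were submitted to and accepted at ICCV 2019.

ICCV is held every other year, and through 2017 the number of submissions grew only slightly each time. This year, however, submissions roughly doubled compared with two years ago, and the acceptance rate dropped sharply to around 25%. Interestingly, this trend is almost identical to that of CVPR 2019. (Submissions doubled compared with 2017; the acceptance rate fell from 30% to 25%.)

To analyze which keywords appear most often in submitted papers, I also wrote a simple Python script.

It simply analyzes and visualizes the keywords in the paper titles, and the code is available in the linked repository. (A star means a lot to me!)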
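
The author's actual script lives in their repository and is not reproduced here. As a rough idea of what title-keyword counting can look like, here is a minimal sketch; the function name `keyword_counts`, the stopword list, and the tokenizing regex are all assumptions, and the sample titles are taken from the paper list in this post.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real script would likely use a longer one.
STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to", "via", "with"}


def keyword_counts(titles):
    """Count in how many titles each keyword appears (case-insensitive)."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z][a-z0-9\-]+", title.lower())
        # set() so a word repeated within one title is counted once.
        counts.update(w for w in set(words) if w not in STOPWORDS)
    return counts


titles = [
    "FCOS: Fully Convolutional One-Stage Object Detection",
    "Unsupervised Out-of-Distribution Detection by Maximum Classifier Discrepancy",
    "YOLACT: Real-Time Instance Segmentation",
]
counts = keyword_counts(titles)
```

Calling `counts.most_common(20)` on the full list of accepted titles would give the kind of ranking that a keyword chart is built from.
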

Because this is a Computer Vision conference, general keywords such as image, video, and object dominate, while papers with keywords such as attention, unsupervised, and re-identification have become more frequent. This keyword information can help you grasp the trends in recently submitted papers.

For reference, a total of 1077 papers were accepted this year, and I will briefly introduce 22 of them.

Introduction to Notable ICCV 2019 Papers

As mentioned above, I lack the time and stamina to go through every accepted paper, so I skimmed them and picked out the ones that looked interesting. There are 22 papers in total: 8 oral papers and 14 poster papers. I will briefly introduce the method proposed in each paper along with figures; for the details, I recommend reading the papers themselves.

1. Human uncertainty makes classification more robust

Topic: Image Classification, Robustness
Built CIFAR-10H, a soft-label dataset obtained by collecting human labels for the CIFAR-10 dataset, and showed experimentally that training with it improves the model's generalization.
I summarized the paper in a PPT; please refer to it for details.
Paper review PPT

2. Exploring Randomly Wired Neural Networks for Image Recognition (Oral)

Topic: Image Classification
Proposes Randomly Wired Neural Networks, which in Neural Architecture Search (NAS) generate all layers randomly instead of relying on human-set constraints.
Uses three random graph models (ER, BA, WS) to generate randomly wired network structures and confirms that they perform well.
Korean review video by Jinwon Lee (PR-12)

3. Searching for MobileNetV3 (Oral)

Topic: Image Classification
The third version of MobileNet, a representative efficiency-oriented CNN. Proposes an architecture based on layers used in MobileNetV2, MnasNet, and others, and proposes the hard-swish activation function, which optimizes the swish nonlinearity for fixed-point arithmetic.
Outperforms existing methods and also performs well when applied to object detection and semantic segmentation in addition to classification. Also proposes Lite Reduced Atrous Spatial Pyramid Pooling (LR-ASPP), a decoder structure for efficient segmentation.
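
For reference, hard-swish replaces the sigmoid in swish with a piecewise-linear ReLU6, as defined in the MobileNetV3 paper: h-swish(x) = x * ReLU6(x + 3) / 6. A minimal scalar sketch follows (the function names are illustrative):

```python
import math


def relu6(x):
    """ReLU clipped at 6."""
    return min(max(x, 0.0), 6.0)


def hard_swish(x):
    """Piecewise-linear approximation of swish: x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


def swish(x):
    """Original swish: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))
```

Because `hard_swish` needs only additions, a clamp, and a multiplication, it avoids the exponential in `swish`, which is what makes it friendlier to fixed-point hardware.
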

4. Universally Slimmable Networks and Improved Training Techniques

Topic: Image Classification
A follow-up to the Slimmable Neural Networks paper covered in my earlier ICLR 2019 image recognition paper list guide post.
The original slimmable networks could run only at predefined widths. This paper removes that limitation by proposing universally slimmable networks (US-Nets), which can run at arbitrary widths, along with the sandwich rule and inplace distillation for training them well.

5. Unsupervised Pre-Training of Image Features on Non-Curated Data (Oral)

Topic: Image Classification, Unsupervised learning
Proposes an unsupervised pre-training method that uses large datasets without annotations (non-curated data) to reach performance similar to pre-training on curated data such as ImageNet. Self-supervision and clustering are the main techniques.

6. Understanding Deep Networks via Extremal Perturbations and Smooth Masks (Oral)

Topic: Image attribution
Identifying which part of the input a model looks at to produce its output is called the attribution problem. Instead of the widely used back-propagation approach, this paper proposes Extremal Perturbations, a perturbation-based approach.
It constrains the area and smoothness of the mask, and shows that it can be applied not only to images but also to a network's intermediate layers.

7. CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features (Oral)

Topic: Image Classification, Data augmentation
Region-based dropout helps improve a model's classification performance but has the drawback of losing information. To address this, the paper proposes the CutMix augmentation technique, which brings in the Mixup approach.
Official Code (PyTorch)
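
To show the idea without a deep learning framework, here is a minimal sketch of CutMix on images stored as nested lists. The box sampling (a Beta-distributed mixing ratio and a box whose side scales with the square root of one minus that ratio) follows the recipe described in the CutMix paper; the function name `cutmix`, the list-based image format, and the `rng` parameter are assumptions made for illustration. The official code linked above is the authoritative version.

```python
import math
import random


def cutmix(img_a, label_a, img_b, label_b, alpha=1.0, rng=random):
    """Paste a random box from img_b into a copy of img_a and mix the one-hot labels."""
    h, w = len(img_a), len(img_a[0])
    lam = rng.betavariate(alpha, alpha)          # target share kept from img_a
    cut_h = int(h * math.sqrt(1.0 - lam))
    cut_w = int(w * math.sqrt(1.0 - lam))
    cy, cx = rng.randrange(h), rng.randrange(w)  # box centre
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = [row[:] for row in img_a]
    for y in range(top, bottom):
        mixed[y][left:right] = img_b[y][left:right]
    # Recompute the ratio from the clipped box so labels match the actual pixels.
    lam = 1.0 - (bottom - top) * (right - left) / (h * w)
    label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, label
```

Unlike region dropout, the removed region is filled with real pixels from another training image, and the label is mixed in proportion to the pasted area.
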

8. Online Hyper-Parameter Learning for Auto-Augmentation Strategy

Topic: Image Classification, Data augmentation
Proposes Online Hyper-parameter Learning (OHL-Auto-Aug) for automatic data augmentation.
Existing auto-augmentation methods are offline, so they must repeat search and retraining. The proposed method runs online and, as a result, greatly reduces the search cost.

9. Unsupervised Out-of-Distribution Detection by Maximum Classifier Discrepancy

Topic: Image Classification, Out-of-distribution detection, Anomaly detection
In image classification, filtering out inputs whose class lies entirely outside the predefined classes is called out-of-distribution detection. Unlike existing methods, this paper follows an unsupervised setting that uses unlabeled data, and it outperforms existing methods.
The model consists of one feature extractor and two classifiers, and performs unsupervised training with a Discrepancy Loss that pushes the two classifiers toward different decision boundaries.

10. FCOS: Fully Convolutional One-Stage Object Detection

Topic: Object Detection
Moves away from the anchor-box-based and proposal-based approaches that dominate existing object detection and proposes the Fully Convolutional One-Stage detector (FCOS), which makes pixel-wise predictions.
Resolves several side effects of anchor boxes (training computation, performance that is sensitive to hyper-parameters, and so on) and outperforms existing methods.
Official Code (PyTorch)
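
To illustrate what pixel-wise prediction means here, the sketch below computes the per-location targets described in the FCOS paper: the distances from a location to the four sides of its ground-truth box, plus the centerness score the paper uses to down-weight locations far from the box centre. Centerness is taken from the paper rather than from the summary above, and the function name `fcos_targets` and the tuple layout are illustrative.

```python
import math


def fcos_targets(px, py, box):
    """Targets for location (px, py) against box = (x0, y0, x1, y1).

    Returns ((l, t, r, b), centerness), or None if the location is not
    strictly inside the box and would be treated as background.
    """
    x0, y0, x1, y1 = box
    l, t, r, b = px - x0, py - y0, x1 - px, y1 - py
    if min(l, t, r, b) <= 0:
        return None
    centerness = math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))
    return (l, t, r, b), centerness
```

Every location inside a box predicts its own (l, t, r, b), so no anchor shapes or anchor hyper-parameters are needed.
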

11. AutoFocus: Efficient Multi-Scale Inference

Topic: Object Detection
지난 NeurIPS 2018 image recognition paper guide 게시물 에서 다루었던 SNIPER 논문의 inference 과정에서 발생하는 문제를 개선하기 위한 방법론을 제안함.
Small object가 존재할 법한 위치를 추출한 결과물인 FocusPixels과 이를 둘러싼 FocusChips를 생성하고 FocusChips에 대해 한 번 더 detect를 수행하여 검출 성능을 높이는 Multi-scale inference 방법을 제안함.
SNIPER보다 빠른 처리 속도로 비슷한 성능을 낼 수 있는 것이 장점.

12. Where Is My Mirror?

Topic: Semantic Segmentation
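So far, most computer vision work has paid little attention to mirrors. Yet mirrors are common in everyday life and have the distinctive property of reflecting what is in front of them.
This paper builds a dataset for segmenting mirror regions in images and proposes a network architecture called MirrorNet.
MSD, the first mirror dataset, consists of 4,018 images with masks. The novelty of the problem setting makes this an interesting paper.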

13. YOLACT: Real-Time Instance Segmentation (Oral)

Topic: Instance Segmentation
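Proposes YOLACT, a method for real-time instance segmentation. Much like the YOLO paper, it is less accurate than existing methods, but its main contribution is running in real time (above 30 fps) on a single GPU.
It also proposes Fast NMS, which speeds up processing at the cost of a small loss in accuracy.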
Official Code (PyTorch)
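
The trade-off behind Fast NMS can be sketched in a few lines. In standard NMS a box that has already been removed no longer suppresses anything. In the sketch below every higher-scoring box can suppress a lower-scoring one, even if it was itself suppressed, which removes slightly more boxes but lets the real method be computed as one batched matrix operation. The plain-Python loops and the names `iou` and `fast_nms` are assumptions for illustration, not the official code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix1 - ix0, 0.0) * max(iy1 - iy0, 0.0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fast_nms(boxes, scores, iou_threshold=0.5):
    """Keep a box only if no higher-scoring box overlaps it above the threshold.

    Suppressed boxes are NOT removed from the comparison, unlike standard NMS.
    Returns the indices of kept boxes in descending score order.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for col, j in enumerate(order):
        # Maximum IoU against every box with a higher score (the "column max"
        # of the upper-triangular IoU matrix in the batched formulation).
        max_iou = max((iou(boxes[order[row]], boxes[j]) for row in range(col)),
                      default=0.0)
        if max_iou <= iou_threshold:
            keep.append(j)
    return keep
```

For three boxes in a chain where A overlaps B and B overlaps C, but A barely overlaps C, standard NMS would keep A and C while this sketch keeps only A. That is the small accuracy loss mentioned above.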

14. Joint Learning of Saliency Detection and Weakly Supervised Semantic Segmentation

Topic: Semantic Segmentation
Existing Weakly Supervised Semantic Segmentation (WSSS) studies have mostly relied on the output of a pre-trained Saliency Detection (SD) model.
This paper proposes a multi-task learning approach that trains WSSS and SD simultaneously with a single network (SS-Net).

15. SC-FEGAN: Face Editing Generative Adversarial Network With User’s Sketch and Color

Topic: Generative Model
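As the demo images show, the paper proposes a GAN architecture that, given a sketch drawn on the region to edit, generates a plausible image from the sketch and the surrounding context.
It takes as input the color image, a mask of the region to edit, and a sketch obtained with the HED edge detector, and trains stably using PartialConv-based padding together with per-pixel loss, perceptual loss, style loss, and total variation loss.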
Official Code (TensorFlow)
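
Of the losses listed above, total variation loss is the simplest to write down: it penalizes differences between neighboring pixels so that the generated region stays smooth. The sketch below uses the unnormalized anisotropic (L1) form on a single-channel image stored as nested lists. The function name and the lack of normalization or masking are assumptions, and the paper's exact formulation may differ.

```python
def total_variation(image):
    """Sum of absolute differences between vertically and horizontally adjacent pixels."""
    height, width = len(image), len(image[0])
    vertical = sum(abs(image[y + 1][x] - image[y][x])
                   for y in range(height - 1) for x in range(width))
    horizontal = sum(abs(image[y][x + 1] - image[y][x])
                     for y in range(height) for x in range(width - 1))
    return vertical + horizontal
```

A constant image has zero total variation, while every sharp edge adds to the penalty.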

16. AutoGAN: Neural Architecture Search for Generative Adversarial Networks

Topic: Generative Model, AutoML
Proposes a method for applying AutoML's Neural Architecture Search to GANs.
The Inception score is used as the reward, and Multi-level architecture search (MLAS) is applied to perform NAS in stages.
Official Code (PyTorch)

17. Seeing What a GAN Cannot Generate (Oral)

Topic: Generative Model
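To analyze mode collapse, a chronic problem of GANs, the paper proposes ways to visualize mode collapse at the distribution level and at the instance level. In other words, the goal is to identify what a GAN generator cannot generate.
To compare the distribution of objects in target images and generated images, a semantic segmentation network is used to measure Generated Image Segmentation Statistics, and the GAN is analyzed on that basis (distribution level).
It also performs an instance-level analysis that examines failure cases by comparing, image by image, real images against GAN-generated images in which particular classes are missing.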

18. Everybody Dance Now

Topic: Generative Model
Extracts pose from a video and generates a video from that pose, using a GAN to transfer the dance style of the source video.
A separate FaceGAN is also used to improve the quality of face synthesis, raising the overall quality of the generated video.
Demo Video

19. SROBB: Targeted Perceptual Loss for Single Image Super-Resolution

Topic: Single Image Super-Resolution
Like paper 17, it makes use of segmentation information: Object, Background, and Boundary (OBB) labels are derived from segmentation labels and then used to apply the perceptual loss more effectively.
It is notable that the loss targets edge regions, where people are actually sensitive to degradation, and the super-resolved images are indeed of high quality.

20. Toward Real-World Single Image Super-Resolution: A New Benchmark and a New Model (Oral)

Topic: Single Image Super-Resolution
Most existing Single Image Super-Resolution papers are trained on simulated datasets, a problem that was also raised in the “Single Image Super Resolution using Deep Learning Overview” post.
However, the degradations in real LR images are far more complex than simple schemes such as bicubic downsampling. To address this, the paper builds the RealSR dataset by capturing LR-HR image pairs of the same scene while adjusting the focal length of a digital camera, and proposes a new model, the Laplacian pyramid based kernel prediction network (LP-KPN).

21. Evaluating Robustness of Deep Image Super-Resolution Against Adversarial Attacks

Topic: Single Image Super-Resolution, Adversarial attack
Analyzes the robustness of deep-learning-based single image super-resolution against adversarial attacks. The attacks add a slight perturbation to the LR image.
Three attack methods are proposed. The paper demonstrates that state-of-the-art deep super-resolution models are vulnerable to adversarial attacks and analyzes the robustness of several methods both theoretically and experimentally.

[Adversarial attack results on several deep SR models]

22. Adversarial Robustness vs. Model Compression, or Both?

Topic: Adversarial attack, Model Compression, Network Pruning
It is well known that deep neural networks are vulnerable to adversarial attacks, and that adversarial training based on min-max robust optimization can improve adversarial robustness. However, this requires a network with large capacity.
This paper proposes a concurrent adversarial training and weight pruning technique that compresses the model while preserving adversarial robustness.

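Conclusion

In this post I wrote up an analysis of ICCV 2019 and brief reviews of 22 notable papers.
Beyond the papers I covered, many high-quality papers were presented at ICCV 2019, so if you are interested I recommend reading the others as well. That's all for this post. Thank you!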

Posted by uniqueone

In this study session we covered XNOR-Net, which approaches BNN training by defining and optimizing a cost function, and SqueezeNet, which reduces the network by combining 1x1 and 3x3 convs! Both presenters..

Deep Learning/Papers2read 2019. 10. 1. 10:43

https://www.facebook.com/groups/TensorFlowKR/permalink/997362213938170/

AI Robotics KR Neural Network Quantization & Compact Network Design Study

WEEK4: XNOR-NET & SQUEEZENET !!

Paper: XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

Presenter: 오휘건
Video: https://youtu.be/N6oP-8E5cWA
PPT: https://drive.google.com/open?id=1bz3C-fFVSCrOdnbi-8lf_2NS1yhpGdVO

Paper: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5MB model size

Presenter: Martin Hwang
Video: https://youtu.be/eH5O5nDiFoY
PPT: https://drive.google.com/open?id=1HNRhl1lxb7oe0gFsbv9f2fduCr_f_G4O

Description:

On Sunday, September 29, we held the 4th meeting of the Neural Network Quantization & Compact Network Design Study.

In this session we covered XNOR-Net, which approaches BNN training by defining and optimizing a cost function, and SqueezeNet, which reduces the network by combining 1x1 and 3x3 convs! Both presenters gave great talks, and it was a rewarding session.
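
For readers who have not seen the paper, the cost-function view of weight binarization can be sketched as follows: a real-valued filter W is approximated by alpha * B with B in {-1, +1}, and minimizing the squared error gives B = sign(W) and alpha = mean(|W|). The code is a small sketch with hypothetical function names, operating on a flat list of weights rather than a real convolution filter.

```python
def binarize_weights(weights):
    """Approximate weights by alpha * signs, the minimizer of sum((w - alpha * s) ** 2)."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1.0 if w >= 0 else -1.0 for w in weights]
    return alpha, signs


def approximation_error(weights, alpha, signs):
    """Squared error between the real-valued weights and their binary approximation."""
    return sum((w - alpha * s) ** 2 for w, s in zip(weights, signs))


# Example: the scale factor is the mean absolute value of the weights.
alpha, signs = binarize_weights([0.5, -1.5, 1.0, -1.0])
```

Storing only the signs and one scale factor per filter is what gives the large memory saving discussed in the talk.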

Sharing the recording and materials from the session!
Have a great week! 🤩

License: Attribution, NonCommercial, ShareAlike


This is the review of the 197th paper in PR12, the TensorFlow Korea paper reading group

Deep Learning/Papers2read 2019. 10. 1. 10:38

https://www.facebook.com/groups/TensorFlowKR/permalink/997406130600445/

#PR12 #197번째논문 (197th paper)

This is the review of the 197th paper in PR12, the TensorFlow Korea paper reading group

(Only 3 papers left until the season 2 goal of 200!!)

The paper I presented this time is One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers, from FAIR (Facebook AI Research).

Wouldn't it be great if a single ticket could win first prize in every lottery?

Typical network pruning methods keep the weights of the network trained before pruning and then fine-tune them.

Randomly initializing the weights of the pruned network and training from there did not yield good performance.

Last year's Lottery Ticket Hypothesis paper from MIT showed how the pruned network can be initialized to reach high performance, namely by reusing the initial values the surviving weights had before training, and named a pruned network with this initialization the winning ticket of the lottery.
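
The winning-ticket recipe described above can be sketched in a few lines: train, prune the smallest-magnitude weights, then reset the surviving weights to their values at initialization. This is a one-shot sketch on flat weight lists. The function names and the single global pruning step are assumptions for illustration; practical implementations often prune iteratively and per layer.

```python
def magnitude_mask(trained, sparsity):
    """Return a 0/1 mask that prunes the given fraction of smallest-magnitude weights."""
    n_prune = int(len(trained) * sparsity)
    order = sorted(range(len(trained)), key=lambda i: abs(trained[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(trained))]


def winning_ticket(initial, trained, sparsity):
    """Mask chosen from the trained weights, values taken from the original initialization."""
    mask = magnitude_mask(trained, sparsity)
    ticket = [w0 * m for w0, m in zip(initial, mask)]
    return ticket, mask
```

The question the presented paper asks is whether a `mask` and `initial` pair found on one dataset or optimizer still trains well when reused on another.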

But would this winning ticket also work well with a different dataset or a different optimizer?

For example, can a winning ticket found on CIFAR10 also perform like a winning ticket on ImageNet?

This paper answers these questions through experiments and offers a range of insights about initialization.

See the presentation video for details!

Video: https://youtu.be/YmTNpF2OOjA

Slides: https://www.slideshare.net/…/pr197-one-ticket-to-win-them-a…

Paper: https://arxiv.org/abs/1906.02773

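A while ago, when converting a keras model (SavedModel) to a tflite model, all of the bn (batch normalization) layers disappeared. When I looked into it, I came across a post saying that tflite does not support bn. Does tflite still not support bn..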

Deep Learning/TensorFlow 2019. 10. 1. 10:35

https://www.facebook.com/groups/TensorFlowKR/permalink/997343660606692/

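Hello, TF-KR members

This time I have a question about tflite and mobile deep learning.

A while ago, when converting a keras model (SavedModel) to a tflite model, all of the bn (batch normalization) layers disappeared. When I looked into it, I came across a post saying that tflite does not support bn.

Does tflite still not support bn?

If so, is there no way to run a tensorflow model that contains bn on Android or iOS?

Inference doesn't have to be fast (well.. even waiting 10 seconds is fine :( ). I'd just like to try running it once, but I couldn't find any material, so I'm asking here.

이승현: I haven't used it myself, so I can't say for sure, but I think tflite does not support batchnorm
because the parameters of a convolutional layer and batch normalization can be merged into a single layer.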
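
The merging mentioned in the reply is plain arithmetic, sketched below for a layer followed by inference-time batch normalization. For each output channel, the weights are scaled by gamma / sqrt(var + eps) and the bias becomes (bias - mean) * scale + beta. The sketch uses a dot product per channel as a stand-in for a convolution, and all function names are hypothetical. It shows why the bn layers can vanish from a converted model without changing its output, but it makes no claim about how the tflite converter itself is implemented.

```python
import math


def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batchnorm statistics into the preceding layer's weights and bias."""
    folded_w, folded_b = [], []
    for c, kernel in enumerate(weights):
        scale = gamma[c] / math.sqrt(var[c] + eps)
        folded_w.append([w * scale for w in kernel])
        folded_b.append((bias[c] - mean[c]) * scale + beta[c])
    return folded_w, folded_b


def linear(x, weights, bias):
    """One dot product per output channel (stand-in for a convolution at one position)."""
    return [sum(w * v for w, v in zip(kernel, x)) + b
            for kernel, b in zip(weights, bias)]


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization using stored running statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, g, b, m, s in zip(y, gamma, beta, mean, var)]
```

Applying `linear` with the folded parameters gives the same result as `linear` followed by `batchnorm`, up to floating-point rounding, so the fused model needs no separate bn operation at inference time.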

I'm writing this post to share what I've felt while studying AI, along with the study materials I used.

Deep Learning/course 2019. 10. 1. 09:03

https://www.facebook.com/groups/TensorFlowKR/permalink/997795710561487/

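Hello, TensorFlow Korea members!

I first got into AI in August 2017, and two years later I have already graduated. While enjoying a short break before starting work, I'm writing this post to share what I've felt while studying AI, along with study materials.

1. Changes around me

Many of you have been at this longer than I have, but just two years ago there were not many people around me doing deep learning. These days many departments, such as mechanical engineering, materials, and chemistry, are doing deep learning, and the hurdle to getting hired as a deep learning engineer or data scientist seems to be getting a little lower. Even in my own school and department everyone is doing deep learning (the teams that won the capstone design award last year and this year all used deep learning lol), and my older brother, who is in his last semester of university, asked me to help build a deep learning model that predicts material properties lol. It's a really interesting phenomenon.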

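2. Math vs. coding

Until six months ago I was on the math side, but these days I think well-balanced people are needed more. Also, if you want to get hired by focusing on math rather than coding, you seem to need a master's or a PhD. To decide on this, it's best to first decide whether you want to work as a scientist or as an engineer. In general, scientists would do well to study more math, and engineers more computer science courses. AI requires a lot of statistical and mathematical knowledge. You can of course code without it and apply what you code, but without math the limits are clear. Conversely, being good at math is of no use if you cannot implement it, so I'd suggest picking one of the two to focus on while still studying the basics of the other.

Personally, for math I think it's worth taking linear algebra, mathematical statistics, and regression analysis,

and for computer science (though I don't know it well) I think you should at least know data structures, algorithms, and computer architecture (not that I've taken them all, of course lol).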

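3. TensorFlow vs. PyTorch

I still code in TensorFlow. TensorFlow has the advantages of being fast and having a lot of open-source code, but the GPU version is hard to install and parallel processing is hard. PyTorch, on the other hand, makes parallel processing much easier than TensorFlow and is also more comfortable to write code in. Personally, I think it's enough to study one library in depth and learn the others just well enough to read them.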

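4. Computer vision vs. NLP vs. reinforcement learning

This is a very sensitive topic, but my view is: do what you want to do, but study at least the well-known models in each field (too cliché?). Since many students are here, let me start from the job perspective: right now there are more jobs in the order NLP > computer vision > reinforcement learning, but if you stand out in your field this doesn't seem to be a problem. GAN is a well-known model in computer vision. Yet as a way of learning the probability distribution of data, it also appears often in speech synthesis within NLP, and BERT, the recent hot model in NLP, has been adapted and proposed as a pre-training method called SELFIE in computer vision. So I think that if you pick the domain you want and study it while also studying the hot models of other fields, you can apply them and get good results.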

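5. Thoughts on implementations

When we study a machine learning model, we often search GitHub for “Generative adversarial networks tensorflow” and use the code that comes up. But what I came to feel while studying is that there are a great many fake (incorrect) implementations. In my own case, I needed code for Spectral Normalization GANs, so I downloaded an implementation from someone with a fair number of GitHub stars and used it in my research. I only found out later that it was an incorrect implementation: when I measured FID and Inception score, the results fell far short of the values reported in the paper. So whether you bring in someone else's code or write your own for research, I think a thorough verification step is essential.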

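6. List of machine learning and deep learning courses

These days there are so many great courses you can take as long as your English is good. As someone who isn't good at English I could cry :(, but I've gathered courses that seemed decent, by topic.

Deep Learning for Everyone, Season 2 (모두를 위한 딥러닝 시즌 2)

(Many thanks to everyone who produced it. The best of the Korean introductory deep learning courses!)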

https://www.youtube.com/watch?v=7eldOrjQVi0&list=PLQ28Nx3M4Jrguyuwg4xe9d9t2XE639e5C

Python Warm-up for Machine Learning (Korean)

https://www.edwith.org/aipython

Linear Algebra for Machine Learning (Korean)

https://www.edwith.org/linearalgebra4ai

Data Structures and Analysis (Prof. 문일철)

https://kaist.edwith.org/datastructure-2019s

Introduction to Artificial Intelligence and Machine Learning (Prof. 문일철)

https://kaist.edwith.org/machinelearning1_17

Optimization Techniques for Image Understanding (Prof. 김창익)

https://kaist.edwith.org/optimization2017

<English>

UC Berkeley AI course

https://www.youtube.com/watch?v=Va8WWRfw7Og&list=PLZSO_6-bSqHQHBCoGaObUljoXAyyqhpFW

CS231n

https://www.youtube.com/results?search_query=cs213n

Toronto Machine Learning course

https://www.youtube.com/watch?v=FvAibtlARQ8&list=PL-Mfq5QS-s8iS9XqKuApPE1TSlnZblFHF

CS224N(NLP 강좌)

https://www.youtube.com/playlist?list=PLoROMvodv4rOhcuXMZkNm7j3fVwBBY42z

Deep Learning for Natural Language Processing(Oxford, DeepMind)

https://www.youtube.com/watch?v=RP3tZFcC2e8&list=PL613dYIGMXoZBtZhbyiBqb0QtgK6oJbpm

Advanced Deep Learning, Reinforcement Learning(DeepMind)

https://www.youtube.com/watch?v=iOh7QUZGyiU&list=PLqYmG7hTraZDNJre23vqCGIVpfZ_K2RZs&index=1

Have a great day, everyone!
