Be the only one, not the best one :: How to Run Deep Learning on Smartphones - Squeezing Deep Learning into Mobile Phones - A Practitioner's Guide

Deep Learning 2017. 3. 18. 08:04

Squeezing Deep Learning into Mobile Phones - A Practitioner's guide
Posted by Vincent Granville on March 17, 2017 at 7:30am
This is a SlideShare presentation by Anirudh Koul. Anirudh is a deep learning data scientist at Microsoft AI & Research. He earned a master's degree in computational data science at Carnegie Mellon University and a graduate certificate in data mining from Stanford University. He currently lives in the Bay Area. Anirudh leads projects such as Seeing AI (for the blind community).

Yesterday, I gave a talk at the Strata+Hadoop World Conference on “Squeezing Deep Learning into Mobile Phones - A Practitioner's guide”. Luckily, it seems to have organically gone viral on Twitter, with 3000 views in 12 hours. I thought your readers might find it interesting too, hence sharing it with you.
Tweet: https://twitter.com/petewarden/status/842169469401104384
Slideshare: https://www.slideshare.net/anirudhkoul/squeezing-deep-learning-into...
My Twitter ID: @anirudhkoul
Below is a transcript of the presentation:
1. Squeezing Deep Learning into Mobile Phones - A Practitioner's Guide. Anirudh Koul
2. Anirudh Koul, @anirudhkoul, http://koul.ai. Project Lead, Seeing AI. Applied Researcher, Microsoft AI & Research. Akoul at Microsoft dot com. Currently working on applying artificial intelligence for productivity, augmented reality and accessibility. Along with Eugene Seleznev, Saqib Shaikh, Meher Kasam.
3. Why Deep Learning on Mobile? Latency, privacy.
4. Mobile Deep Learning Recipe: (Efficient) Mobile Inference Engine + (Efficient) Pretrained Model = DL App
5. Building a DL App in _ time
6. Building a DL App in 1 hour
7. Use Cloud APIs: Microsoft Cognitive Services, Clarifai, Google Cloud Vision, IBM Watson Services, Amazon Rekognition
8. Microsoft Cognitive Services: models won the 2015 ImageNet Large Scale Visual Recognition Challenge. Vision, Face, Emotion, Video and 21 other topics.
9. Building a DL App in 1 day
10. Energy to train a Convolutional Neural Network vs. energy to use a Convolutional Neural Network. http://deeplearningkit.org/2015/12/28/deeplearningkit-deep-learning...
11. Base pre-trained model: ImageNet, 1000-object categorizer. Inception, ResNet.
12. Running pre-trained models on mobile: MXNet, TensorFlow, CNNdroid, DeepLearningKit, Caffe, Torch
13. MXNet Amalgamation: pack all the code in a single source file. Pro: cross-platform (iOS, Android), easy porting; usable in any programming language. Con: CPU only, slow. https://github.com/Leliana/WhatsThis
14. TensorFlow: easy pipeline to bring TensorFlow models to mobile. Great documentation. Optimizations to bring models to mobile. Upcoming: XLA (Accelerated Linear Algebra) compiler to optimize for hardware.
15. CNNdroid: GPU-accelerated CNNs for Android. Supports Caffe, Torch and Theano models. ~30-40x speedup using mobile GPU vs CPU (AlexNet). Internally, CNNdroid expresses data parallelism for different layers instead of leaving it to the GPU's hardware scheduler.
16. DeepLearningKit. Platform: iOS, OS X and tvOS (Apple TV). DNN type: CNN models trained in Caffe. Runs on mobile GPU, uses Metal. Pro: fast, directly ingests Caffe models. Con: unmaintained.
17. Caffe. Caffe for Android: https://github.com/sh1r0/caffe-android-lib Sample app: https://github.com/sh1r0/caffe-android-demo Caffe for iOS: https://github.com/aleph7/caffe Sample app: https://github.com/noradaiko/caffe-ios-sample Pro: usually a couple of lines to port a pretrained model to mobile CPU. Con: unmaintained.
18. Running pre-trained models on mobile

| Mobile Library | Platform | GPU | DNN Architecture Supported | Trained Models Supported |
|---|---|---|---|---|
| TensorFlow | iOS/Android | Yes | CNN, RNN, LSTM, etc. | TensorFlow |
| CNNdroid | Android | Yes | CNN | Caffe, Torch, Theano |
| DeepLearningKit | iOS | Yes | CNN | Caffe |
| MXNet | iOS/Android | No | CNN, RNN, LSTM, etc. | MXNet |
| Caffe | iOS/Android | No | CNN | Caffe |
| Torch | iOS/Android | No | CNN, RNN, LSTM, etc. | Torch |

19. Building a DL App in 1 week
20. Learning to play an accordion: 3 months
21. Learning to play an accordion: 3 months. Knows piano, fine-tune skills: 1 week
22. I got a dataset, now what? Step 1: Find a pre-trained model. Step 2: Fine-tune a pre-trained model. Step 3: Run using existing frameworks. “Don’t Be A Hero” - Andrej Karpathy
23. How to find pretrained models for my task? Search “Model Zoo”: Microsoft Cognitive Toolkit (previously called CNTK) – 50 models, Caffe Model Zoo, Keras, TensorFlow, MXNet
24. AlexNet, 2012 (simplified) [Krizhevsky, Sutskever, Hinton ’12]. n-dimensional feature representation. Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Ng, “Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks”, 11
25-28. Deciding when to fine-tune

| Size of New Dataset | Similarity to Original Dataset | What to do? |
|---|---|---|
| Large | High | Fine-tune. |
| Small | High | Don't fine-tune, it will overfit. Train a linear classifier on CNN features. |
| Small | Low | Train a classifier from activations in lower layers. Higher layers are specific to the original dataset. |
| Large | Low | Train a CNN from scratch. |

http://blog.revolutionanalytics.com/2016/08/deep-learning-part-2.html
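
The fine-tuning decisions on the slide above can be written as a small lookup. This is only an illustrative sketch: the function name `fine_tune_strategy` and the boolean inputs are assumptions made here, and what counts as "large" or "similar" is left to the practitioner, as on the slide.

```python
def fine_tune_strategy(dataset_is_large: bool, similar_to_original: bool) -> str:
    """Map the two questions from the slide to its recommended action."""
    if dataset_is_large and similar_to_original:
        return "fine-tune the pre-trained network"
    if not dataset_is_large and similar_to_original:
        # Fine-tuning a small dataset tends to overfit, so freeze the CNN.
        return "train a linear classifier on CNN features"
    if not dataset_is_large and not similar_to_original:
        # Higher layers are specific to the original dataset.
        return "train a classifier on lower-layer activations"
    return "train a CNN from scratch"


# Example: a small dataset that resembles ImageNet.
print(fine_tune_strategy(dataset_is_large=False, similar_to_original=True))
```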
29. Building a DL Website in 1 week
30. Less Data + Smaller Networks = Faster browser training
31. Several JavaScript libraries. Run large CNNs: Keras-JS, MXNetJS, CaffeJS. Train and run CNNs: ConvNetJS. Train and run LSTMs: Brain.js, Synaptic.js. Train and run NNs: Mind.js, DN2A.
32. ConvNetJS: both train and test NNs in the browser. Train CNNs in the browser.
33. Keras.js: run Keras models in the browser, with GPU support.
34. Brain.js: train and run NNs in the browser. Supports feedforward, RNN, LSTM, GRU. No CNNs. Demo: http://brainjs.com/ (trained an NN to recognize color contrast)
35. MXNetJS On Firefox and Microsoft Edge, performance is 8x faster than Chrome. Optimization difference because of ASM.js.
36. Building a DL App in 1 month (and get featured in Apple App store)
37. Response Time Limits – Powers of 10. 0.1 second: reacting instantly. 1.0 second: user’s flow of thought. 10 seconds: keeping the user’s attention. [Miller 1968; Card et al. 1991; Jakob Nielsen 1993]
38. Apple frameworks for deep learning inference: BNNS – Basic Neural Network Subroutines; MPS – Metal Performance Shaders
39. Metal Performance Shaders (MPS). Fast; provides GPU acceleration for the inference phase. Faster app load times than TensorFlow (Jan 2017). About one third of the run-time memory of TensorFlow on Inception-V3 (Jan 2017). ~130 ms on iPhone 7 Plus to run Inception-V3. Cons: limited documentation; no easy way to programmatically port models; no batch normalization. Solution: join Conv and BatchNorm weights.
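
The "join Conv and BatchNorm weights" workaround can be sketched in plain Python. This is not MPS code; it only illustrates the standard arithmetic of folding an inference-time batch normalization into the preceding convolution. All function names are hypothetical, and each output channel's kernel is assumed to be flattened into a list.

```python
import math


def fold_batchnorm(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BatchNorm parameters into the preceding convolution.

    weights: one flat list of kernel values per output channel.
    Returns (folded_weights, folded_biases) so that conv alone equals conv + BN.
    """
    folded_w, folded_b = [], []
    for w, b, g, bt, m, v in zip(weights, biases, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        folded_w.append([x * scale for x in w])
        folded_b.append((b - m) * scale + bt)
    return folded_w, folded_b


def conv_at_one_position(patch, weights, biases):
    """Convolution output for a single flattened input patch."""
    return [sum(p * x for p, x in zip(patch, w)) + b for w, b in zip(weights, biases)]


def batchnorm(values, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization, one value per channel."""
    return [g * (y - m) / math.sqrt(v + eps) + bt
            for y, g, bt, m, v in zip(values, gamma, beta, mean, var)]


# Made-up numbers for two output channels and a three-value patch.
patch = [1.0, 2.0, -1.0]
weights = [[0.5, -0.25, 1.0], [2.0, 0.0, -1.0]]
biases = [0.1, -0.2]
gamma, beta = [1.5, 0.5], [0.0, 1.0]
mean, var = [0.2, -0.3], [4.0, 0.25]

reference = batchnorm(conv_at_one_position(patch, weights, biases), gamma, beta, mean, var)
fw, fb = fold_batchnorm(weights, biases, gamma, beta, mean, var)
folded = conv_at_one_position(patch, fw, fb)
```

After folding, the network needs no separate batch normalization layer at inference time, which is what makes the model usable on a runtime that lacks one.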
40. Putting out more frames than an art gallery
41. Basic Neural Network Subroutines (BNNS): runs on CPU. BNNS is faster than MPS for smaller networks but slower for bigger networks.
42. BrainCore: NN framework for iOS. Provides LSTM functionality. Fast, uses Metal, runs on iPhone GPU. https://github.com/aleph7/braincore
43. Building a DL App in 6 months
44. What you want ($200,000) vs. what you can afford ($2000). Images: https://www.flickr.com/photos/kenjonbro/9075514760/ and http://www.newcars.com/land-rover/range-rover-sport/2016
45. Revolution of Depth. [Architecture diagram] AlexNet, 8 layers (ILSVRC 2012): 11x11 conv, 96, /4, pool/2; 5x5 conv, 256, pool/2; 3x3 conv, 384; 3x3 conv, 384; 3x3 conv, 256, pool/2; fc, 4096; fc, 4096; fc, 1000. Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”, 2015
46. Revolution of Depth. [Architecture diagrams] AlexNet, 8 layers (ILSVRC 2012); VGG, 19 layers (ILSVRC 2014); GoogleNet, 22 layers (ILSVRC 2014). Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”, 2015
47. Revolution of Depth. [Architecture diagrams] AlexNet, 8 layers (ILSVRC 2012); VGG, 19 layers (ILSVRC 2014); ResNet, 152 layers (ILSVRC 2015), labeled “Ultra deep”. Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”, 2015
48. Revolution of Depth. [Architecture diagram] ResNet, 152 layers. Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”, 2015
49. Revolution of Depth. ImageNet classification top-5 error (%): ILSVRC'10: 28.2 (shallow); ILSVRC'11: 25.8 (shallow); ILSVRC'12 AlexNet: 16.4 (8 layers); ILSVRC'13: 11.7 (8 layers); ILSVRC'14 VGG: 7.3 (19 layers); ILSVRC'14 GoogleNet: 6.7 (22 layers); ILSVRC'15 ResNet: 3.57 (152 layers). Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”, 2015
50. Your Budget - Smartphone Floating Point Operations Per Second (2015) http://pages.experts-exchange.com/processing-power-compared/
51. Accuracy vs Operations Per Image Inference. Size is proportional to the number of parameters. [Chart annotations: 552 MB, 240 MB, “What we want”] Alfredo Canziani, Adam Paszke, Eugenio Culurciello, “An Analysis of Deep Neural Network Models for Practical Applications” 2016
52. Accuracy Per Parameter Alfredo Canziani, Adam Paszke, Eugenio Culurciello, “An Analysis of Deep Neural Network Models for Practical Applications” 2016
53. Pick your DNN architecture for your mobile architecture: ResNet family. Under 150 ms on iPhone 7 using Metal GPU. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, “Deep Residual Learning for Image Recognition”, 2015
54. Strategies to make DNNs even more efficient: shallow networks; compressing pre-trained networks; designing compact layers; quantizing parameters; network binarization.
55. Pruning. Aim: remove all connections with absolute weights below a threshold. Song Han, Jeff Pool, John Tran, William J. Dally, "Learning both Weights and Connections for Efficient Neural Networks", 2015
56. Observation: most parameters are in the fully connected layers. AlexNet (240 MB): 96% of all parameters. VGG-16 (552 MB): 90% of all parameters.
57. Pruning gets the quickest model compression without accuracy loss (AlexNet 240 MB, VGG-16 552 MB). The first layer, which directly interacts with the image, is sensitive and cannot be pruned too much without hurting accuracy.
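
As a rough illustration of the pruning aim above, the following sketch zeroes every weight whose absolute value falls below a threshold and reports the resulting sparsity. The function name and the sample weights are made up for this example. The retraining step that Han et al. apply after pruning is omitted here.

```python
def prune_by_magnitude(weights, threshold):
    """Zero out weights with |w| < threshold; return (pruned, fraction_zeroed)."""
    pruned = [w if abs(w) >= threshold else 0.0 for w in weights]
    sparsity = pruned.count(0.0) / len(pruned)
    return pruned, sparsity


# Illustrative weights for one small layer.
layer = [0.8, -0.05, 0.02, -0.6, 0.1]
pruned_layer, sparsity = prune_by_magnitude(layer, threshold=0.1)
print(pruned_layer, sparsity)
```

A sparse storage format is still needed to turn the zeros into an actual size reduction; this sketch only shows the thresholding step.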
58. Weight Sharing. Idea: cluster weights with similar values together and store them in a dictionary. Techniques: codebook, Huffman coding, HashedNets. Simplest implementation: round all weights into 256 levels. The TensorFlow export script reduces the Inception zip file from 87 MB to 26 MB with a 1% drop in precision.
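
A minimal sketch of the "round all weights into 256 levels" idea, assuming evenly spaced levels between the smallest and largest weight. This is not the TensorFlow export script; the function names are hypothetical. Each weight is stored as a one-byte index into a shared codebook.

```python
def quantize_to_levels(weights, levels=256):
    """Map each weight to the index of the nearest of `levels` evenly spaced values."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return [0] * len(weights), [lo]
    step = (hi - lo) / (levels - 1)
    codebook = [lo + i * step for i in range(levels)]
    # Clamp to guard against floating-point overshoot at the top of the range.
    indices = [min(levels - 1, round((w - lo) / step)) for w in weights]
    return indices, codebook


def dequantize(indices, codebook):
    """Recover approximate weights from indices and the shared codebook."""
    return [codebook[i] for i in indices]


weights = [-1.0, -0.5, 0.0, 0.3, 1.0]
indices, codebook = quantize_to_levels(weights)
restored = dequantize(indices, codebook)
```

With 256 levels each index fits in 8 bits instead of a 32-bit float, which is where the roughly 4x size reduction before zip compression comes from.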
59. Selective training to keep networks shallow. Idea: limit data augmentation to how your network will be used. Example: if making a selfie app, there is no benefit in rotating training images beyond ±45 degrees; your phone will rotate anyway. Followed by WordLens / Google Translate. Example: add blur if analyzing mobile phone frames.
60. Design consideration for custom architectures: small filters. Three layers of 3x3 convolutions >> one layer of 7x7 convolution. Replace large 5x5, 7x7 convolutions with stacks of 3x3 convolutions. Replace NxN convolutions with a stack of 1xN and Nx1. Fewer parameters, less compute, more non-linearity: better, faster, stronger. Andrej Karpathy, CS-231n Notes, Lecture 11
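
The parameter savings behind the small-filter advice can be checked with simple arithmetic. The sketch below assumes every layer has C input and C output channels and ignores biases; the channel count of 64 is an arbitrary choice for illustration.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in one convolution layer (biases ignored)."""
    return kernel_h * kernel_w * in_channels * out_channels


C = 64  # assumed channel count, the same for every layer

one_7x7 = conv_params(7, 7, C, C)        # 49 * C * C
three_3x3 = 3 * conv_params(3, 3, C, C)  # 27 * C * C, same 7x7 receptive field

one_5x5 = conv_params(5, 5, C, C)                               # 25 * C * C
factorized_5 = conv_params(1, 5, C, C) + conv_params(5, 1, C, C)  # 10 * C * C

print(one_7x7, three_3x3, one_5x5, factorized_5)
```

The stack of three 3x3 layers covers the same 7x7 receptive field with about 45% fewer weights, and it applies a non-linearity after each layer rather than only once.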
61. SqueezeNet: AlexNet-level accuracy in 0.5 MB. SqueezeNet base: 4.8 MB. SqueezeNet compressed: 0.5 MB. 80.3% top-5 accuracy on ImageNet. 0.72 GFLOPS/image. Fire Block. Forrest N. Iandola, Song Han et al, "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size"
62. Reduced precision. Reduce precision from 32 bits to 16 bits or fewer. Use stochastic rounding for best results. In practice: Ristretto + Caffe (automatic network quantization; finds a balance between compression rate and accuracy); Apple Metal Performance Shaders automatically quantize to 16 bits; TensorFlow has 8-bit quantization support; Gemmlowp, a low-precision matrix multiplication library.
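
Stochastic rounding, mentioned above, rounds a value up or down at random so that the result is correct on average. The following is a minimal sketch, assuming a uniform grid with spacing `step`; the function name and the grid are illustrative and not taken from any of the libraries listed on the slide.

```python
import math
import random


def stochastic_round(x, step, rng):
    """Round x to a multiple of step.

    Rounds up with probability equal to the fractional remainder, so the
    expected value of the result equals x.
    """
    scaled = x / step
    lower = math.floor(scaled)
    prob_up = scaled - lower
    return (lower + (1 if rng.random() < prob_up else 0)) * step


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [stochastic_round(0.3, 1.0, rng) for _ in range(10000)]
average = sum(samples) / len(samples)
```

Round-to-nearest would map 0.3 to 0 every time. Stochastic rounding returns 1 about 30% of the time, so small values are not systematically lost when many rounded terms are accumulated.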
63-65. Binary weighted networks. Idea: reduce the weights to -1, +1. Speedup: the convolution operation can be approximated by only summation and subtraction. Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi, “XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks”
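
A small sketch of the binary-weight idea above: each weight is replaced by its sign, and a single scaling factor (the mean absolute weight, as in the cited XNOR-Net paper) restores the overall magnitude. The dot product then needs only additions and subtractions plus one final multiply. Function names are hypothetical.

```python
def binarize(weights):
    """Approximate weights as alpha * sign(w), with alpha = mean |w|."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs


def binary_dot(inputs, alpha, signs):
    """Dot product using only additions and subtractions, scaled once by alpha."""
    total = 0.0
    for x, s in zip(inputs, signs):
        if s > 0:
            total += x
        else:
            total -= x
    return alpha * total


weights = [0.5, -0.5, 0.5, -0.5]
inputs = [1.0, 2.0, 3.0, 4.0]
alpha, signs = binarize(weights)
approx = binary_dot(inputs, alpha, signs)
exact = sum(x * w for x, w in zip(inputs, weights))
```

In this toy case all weights have the same magnitude, so the approximation is exact; with real weights it is only an approximation.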
66-68. XNOR-Net. Idea: reduce both weights and inputs to -1, +1. Speedup: the convolution operation can be approximated by XNOR and bitcount operations. Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi, “XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks”
69. XNOR-Net on Mobile
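
When both weights and inputs are restricted to -1 and +1, a dot product reduces to XNOR followed by a bit count. The sketch below packs +1 as bit 1 and -1 as bit 0 into a Python integer; the packing convention and function names are assumptions for illustration, and the scaling factors used in the XNOR-Net paper are left out.

```python
def pack_signs(signs):
    """Pack a list of +1/-1 values into an integer, +1 -> bit 1, -1 -> bit 0."""
    bits = 0
    for i, s in enumerate(signs):
        if s > 0:
            bits |= 1 << i
    return bits


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where the signs match
    matches = bin(agree).count("1")     # bit count
    return 2 * matches - n              # matches contribute +1, mismatches -1


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
fast = xnor_dot(pack_signs(a), pack_signs(b), len(a))
slow = sum(x * y for x, y in zip(a, b))
```

On real hardware a single XNOR and popcount instruction handles 32 or 64 elements at once, which is the source of the speedup the slides refer to.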
70. Building a DL App and get $10 million in funding (or a PhD)
71-72. Minerva
73. DeepX Toolkit. Nicholas D. Lane et al, “DXTK: Enabling Resource-efficient Deep Learning on Mobile and Embedded Devices with the DeepX Toolkit”, 2016
74. EIE: Efficient Inference Engine on Compressed DNNs. 189x faster than CPU, 13x faster than GPU. Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark Horowitz, William Dally, "EIE: Efficient Inference Engine on Compressed Deep Neural Network", 2016
75. One Last Question
76. How to access the slides in 1 second Link posted here -> @anirudhkoul

Posted by uniqueone