From ICCV 2019: Accurate, robust and fast method for registration of 3D scans
https://www.profillic.com/paper/arxiv:1904.05519
(approach significantly outperforms the state-of-the-art robust 3D registration method based on a line process in terms of both speed and accuracy)
This example shows how to use Procrustes analysis to compare two handwritten number threes. Visually and analytically explore the effects of forcing size and reflection changes.
Load and Display the Original Data
Input landmark data for two handwritten number threes.
Use Procrustes analysis to find the transformation that minimizes distances between landmark data points.
[d,Z,tr] = procrustes(X,Y);
The outputs of the function are d (a standardized dissimilarity measure), Z (a matrix of the transformed landmarks), and tr (a structure array describing the computed transformation, with fields T, b, and c, which correspond to the transformation equation Z = b*Y*T + c).
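A quick check (a sketch, assuming the call above has been run) confirms that the returned components reproduce Z; c is returned with identical rows, so it can be added directly:
max(max(abs(Z - (tr.b*Y*tr.T + tr.c)))) % close to zero, up to round-off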
Visualize the transformed shape, Z , using a dashed blue line.
Use two different numerical values, the dissimilarity measure d and the scaling measure b , to assess the similarity of the target shape and the transformed shape.
The dissimilarity measure d gives a number between 0 and 1 describing the difference between the target shape and the transformed shape. Values near 0 imply more similar shapes, while values near 1 imply dissimilarity.
d
d = 0.1502
The small value of d in this case shows that the two shapes are similar. procrustes calculates d as the sum of squared deviations between the target points and the transformed points, standardized by the sum of squared deviations of the target points from their column means.
The resulting measure d is independent of the scale of the shapes and takes into account only the similarity of the landmark data.
Examine the size similarity of the shapes.
tr.b
ans = 0.9291
The sizes of the target and comparison shapes in the previous figure appear similar. This visual impression is reinforced by the value of b = 0.93, which implies that the best transformation shrinks the comparison shape by a factor of 0.93 (only 7%).
Restrict the Form of the Transformations
Explore the effects of manually adjusting the scaling and reflection coefficients.
Force b to equal 1 (set 'Scaling' to false) to examine the amount of dissimilarity in size of the target and transformed figures.
ds = procrustes(X,Y,'Scaling',false)
ds = 0.1552
In this case, setting 'Scaling' to false increases the calculated value of d by only 0.0049, which further supports the similarity in the size of the two number threes. A larger increase in d would have indicated a greater size discrepancy.
This example requires only a rotation, not a reflection, to align the shapes. You can show this by observing that the determinant of the matrix T is 1 in this analysis.
det(tr.T)
ans = 1.0000
If you need a reflection in the transformation, the determinant of T is -1. You can force a reflection into the transformation as follows.
[dr,Zr,trr] = procrustes(X,Y,'Reflection',true); dr
dr = 0.8130
The d value increases dramatically, indicating that a forced reflection leads to a poor transformation of the landmark points. A plot of the transformed shape shows a similar result.
The landmark data points are now further away from their target counterparts. The transformed three is now an undesirable mirror image of the target three.
It appears that the shapes might be better matched if you flipped the transformed shape upside down. Flipping the shapes would make the transformation even worse, however, because the landmark data points would be further away from their target counterparts. From this example, it is clear that manually adjusting the scaling and reflection parameters is generally not optimal.
Procrustes shape analysis is a statistical method for analysing the distribution of sets of shapes (see [1]). Let's suppose we pick up a pebble from the beach and want to know how closely its shape matches the outline of a frisbee. Here is a plot of the frisbee and the beach pebble.
figure(1);
t=chebfun('x',[0,2*pi]);
f=3*(1.5*cos(t) + 1i*sin(t)); %frisbee
g=exp(1i*pi/3)*(1+cos(t)+1.5i*sin(t)+ .125*(1+1.5i)*sin(3*t).^2); %pebble
plot(f,'r','LineWidth',2), hold on, axis equal, plot(g,'k','LineWidth',2);
title('Frisbee and pebble','FontSize',16); hold off;
Two shapes are equivalent if one can be obtained from the other by translating, scaling and rotating. Before comparison we thus:
1. Translate the shapes so they have mean zero.
2. Scale so the shapes have Root Mean Squared Distance (RMSD) to the origin of one.
3. Rotate to align major axis.
Here is how the frisbee and the pebble compare after each stage.
function [f,g]=ShapeAnalysis(f,g)
% SHAPEANALYSIS(F,G) Plots the parameterised curves before and after
% each stage of translating, scaling and aligning. Outputs are
% parameterised curves ready for Procrustes shape analysis.
LW = 'LineWidth'; FS = 'FontSize';
% Plot original
subplot(2,2,1)
plot(f,'r',LW,2), hold on, axis equal, plot(g,'k',LW,2)
title('Original',FS,16)
% Translate mean to 0.
f = f-mean(f); g = g-mean(g);
subplot(2,2,2)
plot(f,'r',LW,2), hold on, axis equal, plot(g,'k',LW,2)
title('After translation',FS,16)
% Scale so RMSD is 1.
f = f/norm(f); g = g/norm(g);
subplot(2,2,3)
plot(f,'r',LW,2), hold on, axis equal, plot(g,'k',LW,2)
title('After scaling',FS,16)
% Align major axis.
subplot(2,2,4)
% Find argument of major axis.
[~,fxmax]=max(abs(f)); [~,gxmax]=max(abs(g));
rotf=angle(f(fxmax)); rotg=angle(g(gxmax));
% Rotate both so major axis lies on the +ve real axis.
x = chebfun('x',[0,2*pi]);
f = exp(-1i*rotf)*f(mod(x+fxmax,2*pi));
g = exp(-1i*rotg)*g(mod(x+gxmax,2*pi));
plot(f,'r',LW,2), hold on, axis equal, plot(g,'k',LW,2)
title('After aligning',FS,16), hold off
end
[f,g] = ShapeAnalysis(f,g);
To calculate the Procrustes distance we would measure the error between the two shapes at a finite number of reference points and compute the vector 2-norm. In Chebfun we calculate the continuous analogue:
norm(f-g)
ans =
0.072347575424997
A little warning
In the discrete version of Procrustes shape analysis statisticians choose reference points on the two shapes (to compare). They then work out the difference between corresponding reference points. The error computed depends on this correspondence. A different correspondence gives a different error. In the continuous case this correspondence becomes the parameterisation. A different parameterisation of the two curves gives a different error. This continuous version of Procrustes (as implemented in this example) is therefore more of an 'eye-ball' check than a robust statistical analysis.
A shape and its reflection
At the beach shapes reflect on the surface of the sea. An interesting question is: How close, in shape, is a pebble to its reflection? Here is a plot of a pebble and its reflection.
figure(2)
% pebble
f = exp(1i*pi/3)*(1+cos(t)+1.5i*sin(t)+.125*(1+1.5i)*sin(3*t).^2);
% reflection
g = exp(-1i*pi/3)*(1+cos(2*pi-t)-1.5i*sin(2*pi-t)+.125*(1-1.5i)*sin(3*(2*pi-t)).^2);
plot(f,'r','LineWidth',2), hold on, axis equal, plot(g,'k','LineWidth',2)
title('Pebble and its reflection','FontSize',16), hold off
Here is how the pebble and its reflection compare after each stage of translating, scaling and rotating.
[f,g]=ShapeAnalysis(f,g);
Now we calculate the continuous Procrustes distance.
norm(f-g)
ans =
0.097593759012228
Comparing this result to the Procrustes distance of the pebble and a frisbee shows that the pebble is closer in shape to a frisbee than its own reflection!
The sensitivity and specificity of a diagnostic test depend on more than just the "quality" of the test--they also depend on the definition of what constitutes an abnormal result. Consider an idealized graph showing the number of patients with and without a disease arranged according to the value of a diagnostic test. The two distributions overlap--the test (like most) does not distinguish normal from disease with 100% accuracy. The area of overlap indicates where the test cannot distinguish normal from disease. In practice, we choose a cutpoint (a vertical line through the distributions) above which we consider the test to be abnormal and below which we consider the test to be normal. The position of the cutpoint determines the number of true positives, true negatives, false positives and false negatives. We may wish to use different cutpoints for different clinical situations if we wish to minimize one of the erroneous types of test results.
We can use the hypothyroidism data from the likelihood ratio section to illustrate how sensitivity and specificity change depending on the choice of T4 level that defines hypothyroidism. Recall the data on patients with suspected hypothyroidism reported by Goldstein and Mushlin (J Gen Intern Med 1987;2:20-24). The data on T4 values in hypothyroid and euthyroid patients are summarized in the table below.
T4 value     Hypothyroid   Euthyroid
5 or less         18            1
5.1 - 7            7           17
7.1 - 9            4           36
9 or more          3           39
Totals:           32           93
Suppose that patients with T4 values of 5 or less are considered to be hypothyroid. The data display then reduces to:
T4 value     Hypothyroid   Euthyroid
5 or less         18            1
> 5               14           92
Totals:           32           93
You should be able to verify that the sensitivity is 18/32 = 0.56 and the specificity is 92/93 = 0.99.
Now, suppose we decide to make the definition of hypothyroidism less stringent and now consider patients with T4 values of 7 or less to be hypothyroid. The data display will now look like this:
T4 value     Hypothyroid   Euthyroid
7 or less         25           18
> 7                7           75
Totals:           32           93
You should be able to verify that the sensitivity is 25/32 = 0.78 and the specificity is 75/93 = 0.81.
Let's move the cutpoint for hypothyroidism one more time:
T4 value     Hypothyroid   Euthyroid
< 9               29           54
9 or more          3           39
Totals:           32           93
You should be able to verify that the sensitivity is 29/32 = 0.91 and the specificity is 39/93 = 0.42.
Now, take the sensitivity and specificity values above and put them into a table:
Cutpoint   Sensitivity   Specificity
5             0.56          0.99
7             0.78          0.81
9             0.91          0.42
Notice that you can improve the sensitivity by moving the cutpoint to a higher T4 value--that is, you can make the criterion for a positive test less strict. You can improve the specificity by moving the cutpoint to a lower T4 value--that is, you can make the criterion for a positive test more strict. Thus, there is a tradeoff between sensitivity and specificity. You can change the definition of a positive test to improve one, but the other will decline.
The next section covers how to use the numbers we just calculated to draw and interpret an ROC curve.
This section continues the hypothyroidism example started in the previous section. We showed that the table below can be summarized by the operating characteristics that follow it:
T4 value     Hypothyroid   Euthyroid
5 or less         18            1
5.1 - 7            7           17
7.1 - 9            4           36
9 or more          3           39
Totals:           32           93
Cutpoint   Sensitivity   Specificity
5             0.56          0.99
7             0.78          0.81
9             0.91          0.42
The operating characteristics above can be reformulated slightly in terms of true positive and false positive rates and then presented graphically:
Cutpoint   True Positives   False Positives
5               0.56             0.01
7               0.78             0.19
9               0.91             0.58
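As a side note, the three operating points can be reproduced and plotted directly from the counts in the original table; this is only a MATLAB sketch of the arithmetic, not part of the original example.
TP = [18 25 29]; FP = [1 18 54]; % hypothyroid / euthyroid patients called positive at cutpoints 5, 7, and 9
sens = TP/32 % true positive rate: 0.56, 0.78, 0.91
fpr = FP/93 % false positive rate: 0.01, 0.19, 0.58
plot([0 fpr 1],[0 sens 1],'-o'), xlabel('False positive rate (1 - specificity)'), ylabel('True positive rate (sensitivity)')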
This type of graph is called a Receiver Operating Characteristic curve (or ROC curve.) It is a plot of the true positive rate against the false positive rate for the different possible cutpoints of a diagnostic test.
An ROC curve demonstrates several things:
It shows the tradeoff between sensitivity and specificity (any increase in sensitivity will be accompanied by a decrease in specificity).
The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test.
The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test.
The slope of the tangent line at a cutpoint gives the likelihood ratio (LR) for that value of the test. You can check this out on the graph above. Recall that the LR for T4 < 5 is 52. This corresponds to the far left, steep portion of the curve. The LR for T4 > 9 is 0.2. This corresponds to the far right, nearly horizontal portion of the curve.
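As a check on those numbers: the likelihood ratio for a result in the lowest interval is its true positive fraction divided by its false positive fraction, (18/32)/(1/93) ≈ 52, and for the highest interval it is (3/32)/(39/93) ≈ 0.2, matching the values quoted above.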
The area under the curve is a measure of test accuracy. This is discussed further in the next section.
Creating Computer Vision and Machine Learning Algorithms That Can Analyze Works of Art
By Ahmed Elgammal, Rutgers University
When you study a painting, chances are that you can make several inferences about it. In addition to understanding the subject matter, for example, you may be able to classify it by period, style, and artist. Could a computer algorithm “understand” a painting well enough to perform these classification tasks as easily as a human being?
My colleagues and I at the Art and Artificial Intelligence Laboratory at Rutgers University explored this question using MATLAB®, Statistics and Machine Learning Toolbox™, and a database of thousands of paintings from the past six centuries. We also addressed two other intriguing questions about the capabilities and limitations of AI algorithms: whether they can identify which paintings have had the greatest influence on later artists, and whether they can measure a painting’s creativity using only its visual features.
Extracting Visual Features for Classifying Paintings
We wanted to develop algorithms capable of classifying large groups of paintings by style (for example, as Cubist, Impressionist, Abstract Expressionist, or Baroque), genre (for example, landscape, portrait, or still life), and artist. One requirement for this classification is the ability to recognize color, composition, texture, perspective, subject matter, and other visual features. A second is the ability to select those visual features that best indicate similarities between paintings.
Working with MATLAB and Image Processing Toolbox™, we developed algorithms to extract the visual features of a painting. The feature extraction algorithm is fairly common in computer vision, and straightforward to implement. The more challenging task was finding the best machine learning techniques. We began by testing support vector machines (SVMs) and other classification algorithms in Statistics and Machine Learning Toolbox to identify visual features that are useful in style classification. In MATLAB, we then applied distance metric learning techniques to weigh the features and thereby improve the algorithm’s ability to classify paintings.
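As a rough illustration of this kind of workflow (a sketch, not the pipeline described in this article), a multiclass SVM style classifier could be set up in MATLAB as follows, where X is a hypothetical matrix with one row of extracted visual features per painting and styles holds the corresponding style labels:
mdl = fitcecoc(X, styles, 'Learners', templateSVM('KernelFunction','linear')); % one-vs-one linear SVMs
cvmdl = crossval(mdl, 'KFold', 5); % 5-fold cross-validation
accuracy = 1 - kfoldLoss(cvmdl) % fraction of paintings whose style is predicted correctly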
The algorithms we developed classified the styles of paintings in our database with 60% accuracy, where chance performance would have been about 2%. While art historians can perform this task with much more than 60% accuracy, the algorithm outperforms typical non-expert humans.
Using Machine Learning to Uncover Artistic Influences
Once we had algorithms that could reliably identify similarities between pairs of paintings, we were ready to tackle our next challenge: using machine learning to reveal artistic influences. Our hypothesis was that visual features useful for style classification (a supervised learning problem) could also be used to determine influences (an unsupervised problem).
Art historians develop theories of artistic influence based on how the artists worked, traveled, or trained with contemporaries. Our MATLAB-based machine learning algorithms used only visual elements and dates of composition. We hypothesized that an algorithm that took into account objects and symbols in the painting would be more effective than one that relied on low-level features such as color and texture. With this in mind, we used classification algorithms that were trained on Google images to identify specific objects.
We tested the algorithms on more than 1700 paintings from 66 different artists working over a span of 550 years. The algorithm readily identified the influence of Diego Velazquez's “Portrait of Pope Innocent X” on Francis Bacon's “Study After Velazquez's Portrait of Pope Innocent X” (Figure 1).
Figure 1. Left: Diego Velázquez’s “Portrait of Pope Innocent X.” Right: Francis Bacon’s “Study After Velázquez’s Portrait of Pope Innocent X.”
The similarities in composition and subject matter between these two paintings are easy even for a layman to spot, but the algorithm also produced results that surprised the art historians we worked with. For example, our algorithm identified “Bazille’s Studio; 9 rue de la Condamine,” painted by French Impressionist Frederic Bazille in 1870, as a possible influence on Norman Rockwell’s “Shuffleton’s Barbershop,” completed 80 years later (Figure 2). Although the paintings might not look similar at first glance, a closer examination reveals similarities in composition and subject matter, including the heaters in the lower right of each work, the group of three men in the center, and the chairs and triangular spaces in the lower left.
Figure 2. Left: Frederic Bazille’s “Bazille’s Studio; 9 rue de la Condamine.” Right: Norman Rockwell’s “Shuffleton’s Barbershop.” Yellow circles indicate similar objects, red lines indicate similar composition, and the blue rectangle indicates a similar structural element.
In our data set, the algorithms correctly identified 60% of the 55 influences recognized by art historians, suggesting that visual similarity alone provides sufficient information for algorithms (and possibly for humans) to determine many influences.
Measuring Creativity by Solving a Network Centrality Problem
Recently, our research has focused on developing algorithms to measure creativity in art. We based this project on a widely used definition that identifies an object as creative if it is both novel and influential. In these terms, a creative painting will be unlike the paintings that came before it (novel), but similar to those that came after it (influential).
In addressing this problem, we once again saw an opportunity to apply our MATLAB algorithms for identifying similarities between paintings. In MATLAB, we created a network in which the vertices are paintings and each edge represents the similarity between the two paintings at its vertices. Through a series of transformations on this network we saw that making inferences about creativity from such a graph is a network centrality problem, which can be solved efficiently using MATLAB.
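A minimal sketch of that idea in MATLAB, assuming S is a hypothetical n-by-n symmetric matrix of pairwise painting similarities with zeros on the diagonal (the creativity scoring described here is a more elaborate, time-directed formulation):
G = graph(S, 'upper'); % vertices are paintings, weighted edges are similarities
scores = centrality(G, 'pagerank', 'Importance', G.Edges.Weight); % one centrality score per painting
[~, ranking] = sort(scores, 'descend'); % paintings ordered by centrality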
We tested our creativity algorithms on two data sets containing more than 62,000 paintings. The algorithm gave high scores to several works recognized by art historians as both novel and influential, including some of the works shown in Figure 3. Ranking even higher than Pablo Picasso’s “Young Ladies of Avignon” (1907) in the same period were several paintings by Kazimir Malevich. This result initially surprised me, as I knew little about Malevich’s work. I have since learned that he was the founder of the Suprematism movement, one of the earliest developments in abstract art.
Figure 3. Computed creativity scores (y-axis) for paintings from 1400 to 2000 (x-axis), showing selected highest-scoring paintings for individual periods.
To perform a basic validation of our algorithm, we changed the date on specific works of art, effectively shifting them backwards or forwards in time. In these “time machine” experiments, we saw significant creativity score increases for Impressionist art moved back to the 1600s and significant reductions for Baroque paintings moved forward to the 1900s. The algorithms correctly perceived that what was creative 300 years ago is not creative today, and that something that is creative now would have been much more creative if introduced far in the past.
A Scalable and Extensible Framework for Ongoing Research in the Arts
Humans have the innate perceptual skills to classify art, and they excel at identifying similarities in pairs of paintings, but they lack the time and patience to apply these skills objectively to thousands or millions of paintings. Handling tasks at this scale is where computers come into their own. By developing machine learning algorithms that have perceptual capabilities similar to humans, our goal is to provide art historians with tools to navigate vast databases of images.
The framework we developed in MATLAB for identifying similarities and measuring creativity is not confined to art. It could be applied to literature, music, or virtually any other creative domain, as long as the individual works can be encoded in a way that is accessible to the algorithms.
For now, however, our focus remains on the visual arts. We are interested not only in ensuring that machine learning algorithms produce good results but also in how they arrive at those results. In this area, too, MATLAB is a tremendous advantage because it provides many ways to quickly and easily visualize results. These visualizations enable us to understand the results and use them to inform ongoing AI research.
Dr. Ahmed Elgammal is an associate professor in the department of computer science at Rutgers, the State University of New Jersey. His research interests include computer vision, visual learning, data science, digital humanities, and human motion analysis.
%Example
t=0:0.01:10
y=sin(t)
plot(t,y)
%-------------------------
dy=diff(y)./diff(t)
k=220; % point number 220
tang=(t-t(k))*dy(k)+y(k)
hold on
plot(t,tang)
scatter(t(k),y(k))
hold off
gradient(y,t) is better than diff(y)./diff(t), because it applies a second-order method. At least this is true for numerical differentiation. Does this apply to symbolic operations as well? I cannot test this, because I do not have the Symbolic Math Toolbox.
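A minimal numerical comparison of the two (symbolic diff computes exact derivatives, so the order-of-accuracy question only arises for numerical data):
t = 0:0.01:10; y = sin(t);
d1 = diff(y)./diff(t); % n-1 forward differences, first-order accurate
d2 = gradient(y,t); % n estimates, second-order (central) away from the endpoints
max(abs(d1 - cos(t(1:end-1)))) % error on the order of 5e-3
max(abs(d2(2:end-1) - cos(t(2:end-1)))) % interior error on the order of 2e-5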
You could use MATLAB's built-in function to get the cosine distance:
pdist([u;v],'cosine')
which returns the "One minus the cosine of the included angle between points". You could then subtract the answer from one to get the 'cosine of the included angle' (similarity), like this:
Iris Recognition Algorithms Comparison between Daugman algorithm and Hough transform on Matlab.
DESCRIPTION:
The iris is one of the most important biometric traits and can support high-confidence recognition. The iris contains rich and random information. Most commercial iris recognition systems use the Daugman algorithm. The algorithms used here are adapted from open-source code with modifications; if you want to use the source code, please check the LICENSE.
Daugman algorithm:
The method applies the standard Daugman integro-differential operator, max over (r, x0, y0) of | G(r) * d/dr ∮_(r,x0,y0) I(x,y)/(2πr) ds |, where I(x,y) is the eye image, r is the radius searched over the image domain (x,y), and G(r) is a Gaussian smoothing function. The algorithm starts the search from the pupil, in order to detect the change in maximum pixel values (partial derivative).
Hough transform:
The Hough transform is a feature extraction technique used in image analysis, computer vision, and digital image processing. Circular boundaries are found by voting for the parameters of the circle equation (x - xi)^2 + (y - yi)^2 = r^2, where (xi, yi) are the center coordinates and r is the radius. Generally, an eye is modeled by two circles, the pupil and the limbus (iris region), and two parabolas, the upper and lower eyelids.
The method starts by detecting the eyelids in the horizontal direction, then detects the pupil and iris boundaries in the vertical direction.
NORMALIZATION AND FEATURE ENCODING:
The circular iris region is unwrapped into a rectangular (oblong) block and encoded using a 1D Log-Gabor filter. To extract a 9600-bit iris code, the upper and lower eyelids are processed as a 9600-bit mask during encoding.
MATCHING:
Matching uses the Hamming distance (HD): HD = ||(A XOR B) AND maskA AND maskB|| / ||maskA AND maskB||, where A and B are the two templates to compare, each containing 20x480 = 9600 template bits and 20x480 = 9600 mask bits; the distance is computed with XOR and AND boolean operators over the bits that are valid in both templates.
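A minimal MATLAB sketch of that masked Hamming distance, assuming codeA, codeB, maskA, and maskB are logical vectors of equal length with mask bits set where the iris code is usable:
validBits = maskA & maskB; % bits usable in both templates
HD = sum(xor(codeA, codeB) & validBits) / sum(validBits); % fraction of disagreeing valid bits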
These wiki pages are not necessarily related to the facerec framework, but serve as a place to collect interesting links, ideas, algorithms, and implementations. Feel free to extend the list, add new wiki pages, or request new features in the issue tracker.
Research
Eye Tracking, Image Alignment and Head Pose Estimation
F. Timm and E. Barth. Accurate eye centre localisation by means of gradients. In Proceedings of the Int. Conference on Computer Theory and Applications (VISAPP), volume 1, pages 125-130, Algarve, Portugal, 2011. INSTICC. (PDF Online available, C++ Code)
Feature Extraction
Lior Wolf, Tal Hassner and Yaniv Taigman, Descriptor Based Methods in the Wild, Faces in Real-Life Images workshop at the European Conference on Computer Vision (ECCV), Oct 2008. (PDF Online available), (Matlab Code)
Ojansivu V & Heikkilä J (2008) Blur insensitive texture classification using local phase quantization. Proc. Image and Signal Processing (ICISP 2008), 5099:236-243. (PDF Online available), (Matlab Code)
Free and open source face recognition with deep neural networks.
News
2016-01-19: OpenFace 0.2.0 released! See this blog post for more details.
OpenFace is a Python and Torch implementation of face recognition with deep neural networks and is based on the CVPR 2015 paper FaceNet: A Unified Embedding for Face Recognition and Clustering by Florian Schroff, Dmitry Kalenichenko, and James Philbin at Google. Torch allows the network to be executed on a CPU or with CUDA.
Crafted by Brandon Amos in Satya's research group at Carnegie Mellon University.
Development discussions and bug reports are on the issue tracker.
This research was supported by the National Science Foundation (NSF) under grant number CNS-1518865. Additional support was provided by the Intel Corporation, Google, Vodafone, NVIDIA, and the Conklin Kistler family fund. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and should not be attributed to their employers or funding sources.
Isn't face recognition a solved problem?
No! Accuracies from research papers have just begun to surpass human accuracies on some benchmarks. The accuracies of open source face recognition systems lag behind the state-of-the-art. See our accuracy comparisons on the famous LFW benchmark.
Please use responsibly!
We do not support the use of this project in applications that violate privacy and security. We are using this to help cognitively impaired users sense and understand the world around them.
Overview
The following overview shows the workflow for a single input image of Sylvester Stallone from the publicly available LFW dataset.
Detect faces with pre-trained models from dlib or OpenCV.
Transform the face for the neural network. This repository uses dlib's real-time pose estimation with OpenCV's affine transformation to try to make the eyes and bottom lip appear in the same location on each image.
Use a deep neural network to represent (or embed) the face on a 128-dimensional unit hypersphere. The embedding is a generic representation for anybody's face. Unlike other face representations, this embedding has the nice property that a larger distance between two face embeddings means that the faces are likely not of the same person. This property makes clustering, similarity detection, and classification tasks easier than other face recognition techniques where the Euclidean distance between features is not meaningful.
Apply your favorite clustering or classification techniques to the features to complete your recognition task. See below for our examples for classification and similarity detection, including an online web demo.
The following is a BibTeX and plaintext reference for the OpenFace GitHub repository. The reference may change in the future. The BibTeX entry requires the url LaTeX package.
@misc{amos2016openface,
title = {{OpenFace: Face Recognition with Deep Neural Networks}},
author = {Amos, Brandon and Ludwiczuk, Bartosz and Harkes, Jan and
Pillai, Padmanabhan and Elgazzar, Khalid and Satyanarayanan, Mahadev},
howpublished = {\url{http://github.com/cmusatyalab/openface}},
note = {Accessed: 2016-01-11}
}
Brandon Amos, Bartosz Ludwiczuk, Jan Harkes, Padmanabhan Pillai,
Khalid Elgazzar, and Mahadev Satyanarayanan.
OpenFace: Face Recognition with Deep Neural Networks.
http://github.com/cmusatyalab/openface.
Accessed: 2016-01-11
Thanks to Zhuo Chen, Kiryong Ha, Wenlu Hu, Rahul Sukthankar, and Junjue Wang for insightful discussions.
Licensing
Unless otherwise stated, the source code and trained Torch and Python model files are copyright Carnegie Mellon University and licensed under the Apache 2.0 License. Portions from the following third party sources have been modified and are included in this repository. These portions are noted in the source files and are copyright their respective authors with the licenses listed.
An approximate structured output learning approach is developed to learn the appearance model of a CLM over time on low-powered portable devices such as the iPad 2, iPhone 4S, and Galaxy S2.
Facial feature detection and tracking are very important in many applications such as face recognition and face animation.
What has been done
Existing facial feature detectors such as tree-structured SVMs and Constrained Local Models (CLMs) achieve state-of-the-art accuracy on many benchmarks (e.g. CMU Multi-PIE, BioID). However, for applications on low-powered devices, the trade-off among accuracy, speed, and memory cost becomes the main concern in facial feature detection.
How to make facial feature detection efficient in speed and memory
There are two ways to address the speed problem. One is to use GPU (e.g. CUDA) and parallel computing (e.g. OpenMP) techniques to speed up existing algorithms (e.g. AAM, CLM). The other is to improve the steps inside existing algorithms, or, in other words, to develop a new algorithm. In this paper, we explore how to speed up facial feature detection with an approach called approximate structured output learning for Constrained Local Models.
What we did
Within this paper we examine the learning of the appearance model in the Constrained Local Model (CLM) technique. We make two contributions. Firstly, we examine an approximate method for structured learning, which jointly learns all the appearances of the landmarks. Even though this method has no guarantee of optimality, we find it performs better than training the appearance models independently; it also allows efficient online learning of a particular instance of a face. Secondly, we use a binary approximation of our learnt model that, when combined with binary features, leads to efficient inference at runtime using bitwise AND operations. We quantify the generalization performance of our approximate SO-CLM by training the model parameters on a single dataset and testing on a total of five unseen benchmarks.
The speed at runtime is demonstrated on the iPad 2 platform. Our results clearly show that our proposed system runs in real time, yet still performs at state-of-the-art levels of accuracy.
An OpenCV based webcam gaze tracker based on a simple image gradient-based eye center algorithm by Fabian Timm.
DISCLAIMER
This does not track gaze yet. It is basically just a developer reference implementation of Fabian Timm's algorithm that shows some debugging windows with points on your pupils.
If you want cheap gaze tracking and don't mind hardware, check out The Eye Tribe. If you want webcam-based eye tracking, contact Xlabs or use their Chrome plugin and SDK. If you're looking for open source, your only real bet is Pupil, but that requires an expensive hardware headset.
Status
The eye center tracking works well but I don't have a reference point like eye corner yet so it can't actually track where the user is looking.
If anyone with more experience than me has ideas on how to effectively track a reference point or head pose so that the gaze point on the screen can be calculated contact me.
Building
CMake is required to build eyeLike.
OSX or Linux with Make
# do things in the build directory so that we don't clog up the main directory
mkdir build
cd build
cmake ../
make
./bin/eyeLike # the executable file
On OSX with XCode
mkdir build
./cmakeBuild.sh
then open the XCode project in the build folder and run from there.
On Windows
There is some way to use CMake on Windows but I am not familiar with it.
Timm and Barth. Accurate eye centre localisation by means of gradients. In Proceedings of the Int. Conference on Computer Theory and Applications (VISAPP), volume 1, pages 125-130, Algarve, Portugal, 2011. INSTICC.
This code is entirely based on the published code [1]. This document is a tutorial explaining how to use that code with the mixture of models in the original code replaced with your own.
[1] Xiangxin Zhu, Deva Ramanan. Face detection, pose estimation, and landmark localization in the wild. Computer Vision and Pattern Recognition (CVPR) Providence, Rhode Island, June 2012.
Design a mixture of models
We use a mixture of models for face detection. Let's first look at the original model implemented in the code. The original mixture consists of a total of 13 models. Models 1 to 3 share the same tree structure and are intended to detect faces turned to the left. Models 4 to 10 share another tree structure; these 7 are for detecting frontal faces. The remaining 3 (models 11 to 13) share a third tree structure and are for detecting faces turned to the right.
For better understanding, the structure of model 7 is illustrated below:
Note that there are two labeling (numbering) systems. One is annotation order and the other is tree order. Annotation order is the order in which the annotations (coordinates of landmark points) on the training images were made, while tree order reflects the actual tree structure of a model used at the score evaluation stage.
If you want to use a new facial model, follow the next steps.
Construct a mixture of models. Decide how many models the mixture consists of.
Design a tree structure that fits the human face for each model.
Give labels to nodes of trees. As mentioned earlier, each node should be labeled with two numbers, one for annotation ordering system, and the other for tree ordering system.
Annotation order: If you have annotations within the training data, then you have to follow the labeling order of those annotations.
Tree order: Be aware that the id number of a parent node should be smaller than those of its children.
For example, a simpler model might look like this:
The rest of this document is based on a mixture consisting of 3 models. The three models correspond to viewpoints of 30, 0, and -30 degrees, respectively, and all of them have the same tree structure as shown above.
Prepare dataset
For the training set, the following files need to be prepared:
Image files that include the human faces which we aim to detect.
Annotation files on images that include the coordinate values of landmark points (same as the center of parts in the models)
For each image file, there should be an annotation file named "[Image file name]_lb.mat" in the directory for annotation files. These are .mat files in which the coordinate values of the landmark points are stored in a matrix variable named "pts". For our simple model, the size of the matrix "pts" is 15-by-2, because we have 15 landmark points per image. The first column holds the x values of the landmark points, the second column holds the y values, and each of the 15 rows corresponds to one landmark point.
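For instance, one such annotation file could be written like this (a sketch for a hypothetical image "image001.jpg", with xs and ys being 15-by-1 vectors of landmark coordinates in annotation order; adjust the file name to your dataset's naming):
pts = [xs, ys]; % 15-by-2: first column x, second column y
save('image001_lb.mat', 'pts'); % "[Image file name]_lb.mat" in the annotation directory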
opts.viewpoint is a list of the viewpoint angles that the objects face towards. In the original setting above, for example, it means that we aim to detect human faces heading at 90 degrees, 75 degrees, ..., -90 degrees, respectively (zero degrees corresponds to the frontal view).
opts.partpoolsize is the sum of the number of parts over all the different models. In the original setting, there are a total of 3 different models for detecting the left, frontal, and right side of faces, composed of 39, 68, and 39 parts respectively. Thus the value is 39+68+39 = 146.
Change these two settings to match our model. The code above should then be modified as follows:
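For the three-model mixture used in this tutorial (viewpoints of 30, 0, and -30 degrees, with 15 parts per model), the settings would plausibly become:
opts.viewpoint = [30 0 -30]; % one viewpoint per model in the mixture
opts.partpoolsize = 15 + 15 + 15; % 15 parts in each of the three models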
Next, we specify the mixture of models concretely. We have three models in our mixture, so what we have to do in this section is define opts.mixture(1), opts.mixture(2), and opts.mixture(3), which correspond to our three models.
Let's define opts.mixture(1) first.
First, poolid should be a list of integers from 1 to 15, because every model in our mixture has 15 parts.
Next, the variables I and J define the transformation between annotation-order and tree-order labels. Let I be an array of integers from 1 to the number of parts. Then take a close look at J: the k'th number in array J, say nk, means that the node labeled k in tree order is labeled nk in annotation order.
S in the next line should be modified to be an array of ones whose length is the number of parts.
Using the variables defined above, we set anno2treeorder to represent the transformation matrix from annotation order to tree order. Just replace the 4th and 5th arguments of the sparse() function with the number of parts.
Finally, pa specifies the id number of the parent of each node. Note that you should follow the tree order when referring to a node here.
As a result, the original code might be changed as follows:
% Global mixture 1 to 3, left, frontal, and right face
opts.mixture(1).poolid = 1:15;
I = 1:15;
J = [9 10 11 8 7 3 2 1 6 5 4 12 13 14 15];
S = ones(1,15);
opts.mixture(1).anno2treeorder = full(sparse(I,J,S,15,15)); % label transformation
opts.mixture(1).pa = [0 1 1 1 4 5 6 7 5 9 10 1 12 12 12];
opts.mixture(2) = opts.mixture(1);
opts.mixture(3) = opts.mixture(1);
Edit multipie_data.m
This "multipie_data.m" file does the data preparation work.
First, define which images in our dataset will be used for the training set and which for the test set. Modify the lists trainlist and testlist accordingly, based on the prepared dataset.
Next, set the paths where the images and annotations in the dataset are located. multipiedir is the path for image files, and annodir is the path for annotation files.
Edit multipie.mat
This .mat file includes a struct variable named multipie that holds the names of the image files classified by the models in our mixture. You may consult the "make_multipie_info" script in the tools/ directory to build your own multipie variable more easily.
Run multipie_main.m
Run "compile.m" file first.
Run "multipie_main" file. This script trains the model using the prepared dataset, evaluate the trained model, and even shows the result graphically.