Date
|
Topics
|
Papers and links
|
Presenters
|
Items due
|
Aug 24
|
Course intro
|
|
[slides]
|
Topic preferences due via email by Monday August 29
|
I. Single-object recognition fundamentals: representation, matching, and classification
|
Aug 31
|
Recognizing specific objects:
Invariant local features, instance recognition, bag-of-words models
|
-
*Object Recognition from Local Scale-Invariant Features, Lowe, ICCV 1999. [pdf] [code] [other implementations of SIFT] [IJCV]
-
*Local Invariant Feature Detectors: A Survey, Tuytelaars and Mikolajczyk. Foundations and Trends in Computer Graphics and Vision, 2008. [pdf] [Oxford code] [Read pp. 178-188, 216-220, 254-255]
-
*Video Google: A Text Retrieval Approach to Object Matching in Videos, Sivic and Zisserman, ICCV 2003. [pdf] [demo]
-
For more background on feature extraction: Szeliski book: Sec 3.2 Linear filtering, 4.1 Points and patches, 4.2 Edges
-
Scalable Recognition with a Vocabulary Tree, D. Nister and H. Stewenius, CVPR 2006. [pdf]
-
SURF: Speeded Up Robust Features, Bay, Ess, Tuytelaars, and Van Gool, CVIU 2008. [pdf] [code]
-
Bundling Features for Large Scale Partial-Duplicate Web Image Search. Z. Wu, Q. Ke, M. Isard, and J. Sun. CVPR 2009. [pdf]
-
Robust Wide Baseline Stereo from Maximally Stable Extremal Regions, J. Matas, O. Chum, U. Martin, and T. Pajdla, BMVC 2002. [pdf]
-
City-Scale Location Recognition, G. Schindler, M. Brown, and R. Szeliski, CVPR 2007. [pdf]
-
Object Retrieval with Large Vocabularies and Fast Spatial Matching. J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, CVPR 2007. [pdf]
-
I Know What You Did Last Summer: Object-Level Auto-annotation of Holiday Snaps, S. Gammeter, L. Bossard, T.Quack, L. van Gool, ICCV 2009. [pdf]
-
Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval. O. Chum et al. CVPR 2007. [pdf]
-
A Performance Evaluation of Local Descriptors. K. Mikolajczyk and C. Schmid. CVPR 2003 [pdf]
|
[slides]
|
|
Sept 7
|
Recognition via classification and global models:
Global appearance models for category and scene recognition, sliding window detection, detection as a binary decision.
|
-
*A Discriminatively Trained, Multiscale, Deformable Part Model, by P. Felzenszwalb, D. McAllester and D. Ramanan. CVPR 2008. [pdf] [code]
-
*Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, Lazebnik, Schmid, and Ponce, CVPR 2006. [pdf] [15 scenes dataset] [libpmk] [Matlab]
-
*Rapid Object Detection Using a Boosted Cascade of Simple Features, Viola and Jones, CVPR 2001. [pdf] [code]
-
Histograms of Oriented Gradients for Human Detection, Dalal and Triggs, CVPR 2005. [pdf] [video] [code] [PASCAL datasets]
-
Modeling the Shape of the Scene: a Holistic Representation of the Spatial Envelope, Oliva and Torralba, IJCV 2001. [pdf] [Gist code]
-
Locality-Constrained Linear Coding for Image Classification. J. Wang, J. Yang, K. Yu, and T. Huang CVPR 2010. [pdf] [code]
-
Visual Categorization with Bags of Keypoints, C. Dance, J. Willamowski, L. Fan, C. Bray, and G. Csurka, ECCV International Workshop on Statistical Learning in Computer Vision, 2004. [pdf]
-
Pedestrian Detection in Crowded Scenes, Leibe, Seemann, and Schiele, CVPR 2005. [pdf]
-
Pyramids of Histograms of Oriented Gradients (pHOG), Bosch and Zisserman. [code]
-
Eigenfaces for Recognition, Turk and Pentland, 1991. [pdf]
-
Sampling Strategies for Bag-of-Features Image Classification. E. Nowak, F. Jurie, and B. Triggs. ECCV 2006. [pdf]
-
Beyond Sliding Windows: Object Localization by Efficient Subwindow Search. C. Lampert, M. Blaschko, and T. Hofmann. CVPR 2008. [pdf] [code]
-
A Trainable System for Object Detection, C. Papageorgiou and T. Poggio, IJCV 2000. [pdf]
-
Object Recognition with Features Inspired by Visual Cortex. T. Serre, L. Wolf and T. Poggio. CVPR 2005. [pdf]
|
[slides]
|
|
Sept 14
|
Regions and mid-level representations
Segmentation, grouping, surface estimation
|
-
*Constrained Parametric Min-Cuts for Automatic Object Segmentation. J. Carreira and C. Sminchisescu. CVPR 2010. [pdf] [code]
-
*Geometric Context from a Single Image, by D. Hoiem, A. Efros, and M. Hebert, ICCV 2005. [pdf] [web] [code]
-
*Contour Detection and Hierarchical Image Segmentation. P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. PAMI 2011. [pdf] [data and code]
-
From Contours to Regions: An Empirical Evaluation. P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. CVPR 2009. [pdf] [code]
-
Boundary-Preserving Dense Local Regions. J. Kim and K. Grauman. CVPR 2011. [pdf] [code]
-
Object Recognition as Ranking Holistic Figure-Ground Hypotheses. F. Li, J. Carreira, and C. Sminchisescu. CVPR 2010. [pdf]
-
Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman. CVPR 2006. [pdf] [code]
-
Combining Top-down and Bottom-up Segmentation. E. Borenstein, E. Sharon, and S. Ullman. CVPR workshop 2004. [pdf] [data]
-
Efficient Region Search for Object Detection. S. Vijayanarasimhan and K. Grauman. CVPR 2011. [pdf] [code] [data]
-
Extracting Subimages of an Unknown Category from a Set of Images, S. Todorovic and N. Ahuja, CVPR 2006. [pdf]
-
Learning Mid-level Features for Recognition. Y.-L. Boureau, F. Bach, Y. LeCun, and J. Ponce. CVPR, 2010.
-
Class-Specific, Top-Down Segmentation, E. Borenstein and S. Ullman, ECCV 2002. [pdf]
-
Object Recognition by Integrating Multiple Image Segmentations, C. Pantofaru, C. Schmid, and M. Hebert, ECCV 2008 [pdf]
-
Image Parsing: Unifying Segmentation, Detection, and Recognition. Tu, Z., Chen, Z., Yuille, A.L., Zhu, S.C. ICCV 2003 [pdf]
-
GrabCut -Interactive Foreground Extraction using Iterated Graph Cuts, by C. Rother, V. Kolmogorov, A. Blake, SIGGRAPH 2004. [pdf] [project page]
-
Recognition Using Regions. C. Gu, J. Lim, P. Arbelaez, J. Malik, CVPR 2009. [pdf] [code]
-
Robust Higher Order Potentials for Enforcing Label Consistency, P. Kohli, L. Ladicky, and P. Torr. CVPR 2008.
-
Co-segmentation of Image Pairs by Histogram Matching --Incorporating a Global Constraint into MRFs, C. Rother, V. Kolmogorov, T. Minka, and A. Blake. CVPR 2006. [pdf]
-
Collect-Cut: Segmentation with Top-Down Cues Discovered in Multi-Object Images. Y. J. Lee and K. Grauman. CVPR 2010. [pdf] [data]
-
An Efficient Algorithm for Co-segmentation, D. Hochbaum, V. Singh, ICCV 2009. [pdf]
-
Normalized Cuts and Image Segmentation, J. Shi and J. Malik. PAMI 2000. [pdf] [code]
- Greg Mori's superpixel code
- Berkeley Segmentation Dataset and code
- Pedro Felzenszwalb's graph-based segmentation code
- Michael Maire's segmentation code and paper
- Mean-shift: a Robust Approach Towards Feature Space Analysis [pdf] [code, Matlab interface by Shai Bagon]
- David Blei's Topic modeling code
|
[slides] Expts: Brian, Cho-Jui
|
Implementation assignment due Friday Sept 16, 5 PM |
II. Beyond single objects: scenes and properties |
Sept 21
|
Context and scenes
Multi-object scenes, inter-object relationships, understanding scenes' spatial layout, 3d context
|
-
*Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces. D. Lee, A. Gupta, M. Hebert, and T. Kanade. NIPS 2010. [pdf] [code]
-
*Multi-Class Segmentation with Relative Location Prior. S. Gould, J. Rodgers, D. Cohen, G. Elidan and D. Koller. IJCV 2008. [pdf] [code]
-
*Using the Forest to See the Trees: Exploiting Context for Visual Object Detection and Localization. Torralba, Murphy, and Freeman. CACM 2009. [pdf] [related code]
-
Contextual Priming for Object Detection, A. Torralba. IJCV 2003. [pdf] [web] [code]
-
TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation. J. Shotton, J. Winn, C. Rother, A. Criminisi. ECCV 2006. [pdf] [web] [data] [code]
-
Recognition Using Visual Phrases. M. Sadeghi and A. Farhadi. CVPR 2011. [pdf]
-
Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry. V. Hedau, D. Hoiem, and D. Forsyth. ECCV 2010 [pdf] [code and data]
-
Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics, A. Gupta, A. Efros, and M. Hebert. ECCV 2010. [pdf]
-
Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. CVPR 2010. [pdf] [code]
-
Geometric Reasoning for Single Image Structure Recovery. D. Lee, M. Hebert, and T. Kanade. CVPR 2009. [pdf] [web] [code]
-
Putting Objects in Perspective, by D. Hoiem, A. Efros, and M. Hebert, CVPR 2006. [pdf] [web]
-
Discriminative Models for Multi-Class Object Layout, C. Desai, D. Ramanan, C. Fowlkes. ICCV 2009. [pdf] [slides] [SVM struct code] [data]
-
Closing the Loop in Scene Interpretation. D. Hoiem, A. Efros, and M. Hebert. CVPR 2008. [pdf]
-
Decomposing a Scene into Geometric and Semantically Consistent Regions, S. Gould, R. Fulton, and D. Koller, ICCV 2009. [pdf] [slides]
-
Learning Spatial Context: Using Stuff to Find Things, by G. Heitz and D. Koller, ECCV 2008. [pdf] [code]
-
An Empirical Study of Context in Object Detection, S. Divvala, D. Hoiem, J. Hays, A. Efros, M. Hebert, CVPR 2009. [pdf] [web]
-
Object Categorization using Co-Occurrence, Location and Appearance, by C. Galleguillos, A. Rabinovich and S. Belongie, CVPR 2008.[ pdf]
-
Context Based Object Categorization: A Critical Survey. C. Galleguillos and S. Belongie. [pdf]
-
What, Where and Who? Classifying Events by Scene and Object Recognition, L.-J. Li and L. Fei-Fei, ICCV 2007. [pdf]
-
Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Unsupervised Framework, L-J. Li, R. Socher, L. Fei-Fei, CVPR 2009. [pdf]
|
Papers: Nishant, Jung Expts: Saurajit
|
|
Sept 28
|
Saliency and attention
Among all items in the scene, which deserve attention (first)?
|
-
*A Model of Saliency-based Visual Attention for Rapid Scene Analysis. L. Itti, C. Koch, and E. Niebur. PAMI 1998 [pdf]
-
*Learning to Detect a Salient Object. T. Liu et al. CVPR 2007. [pdf] [results] [data] [code by Vicente Ordonez]
-
*Figure-Ground Segmentation Improves Handled Object Recognition in Egocentric Video. X. Ren and C. Gu. CVPR 2010 [pdf] [videos] [data]
-
*What Do We Perceive in a Glance of a Real-World Scene? L. Fei-Fei, A. Iyer, C. Koch, and P. Perona. Journal of Vision, 2007. [pdf]
-
Interesting Objects are Visually Salient. L. Elazary and L. Itti. Journal of Vision, 8(3):1–15, 2008. [pdf]
-
Accounting for the Relative Importance of Objects in Image Retrieval. S. J. Hwang and K. Grauman. BMVC 2010. [pdf] [web] [data]
-
Some Objects are More Equal Than Others: Measuring and Predicting Importance, M. Spain and P. Perona. ECCV 2008. [pdf]
- What Makes an Image Memorable? P. Isola et al. CVPR 2011. [pdf]
-
The Discriminant Center-Surround Hypothesis for Bottom-Up Saliency. D. Gao, V.Mahadevan, and N. Vasconcelos. NIPS, 2007. [pdf]
-
Category-Independent Object Proposals. I. Endres and D. Hoiem. ECCV 2010. [pdf] [code]
-
What is an Object? B. Alexe, T. Deselaers, and V. Ferrari. CVPR 2010. [pdf] [code]
-
A Principled Approach to Detecting Surprising Events in Video. L. Itti and P. Baldi. CVPR 2005 [pdf]
-
Optimal Scanning for Faster Object Detection, N. Butko, J. Movellan. CVPR 2009. [pdf]
-
What Attributes Guide the Deployment of Visual Attention and How Do They Do It? J. Wolfe and T. Horowitz. Neuroscience, 5:495–501, 2004. [pdf]
-
Visual Correlates of Fixation Selection: Effects of Scale and Time. B. Tatler, R. Baddeley, and I. Gilchrist. Vision Research, 45:643, 2005. [pdf]
-
Objects Predict Fixations Better than Early Saliency. W. Einhauser, M. Spain, and P. Perona. Journal of Vision, 8(14):1–26, 2008. [pdf]
-
Reading Between the Lines: Object Localization Using Implicit Cues from Image Tags. S. J. Hwang and K. Grauman. CVPR 2010. [pdf] [data]
-
Peripheral-Foveal Vision for Real-time Object Recognition and Tracking in Video. S. Gould, J. Arfvidsson, A. Kaehler, B. Sapp, M. Messner, G. Bradski, P. Baumstrack,S. Chung, A. Ng. IJCAI 2007. [pdf]
-
Peekaboom: A Game for Locating Objects in Images, by L. von Ahn, R. Liu and M. Blum, CHI 2006. [pdf] [web]
-
Determining Patch Saliency Using Low-Level Context, D. Parikh, L. Zitnick, and T. Chen. ECCV 2008. [pdf]
-
Visual Recognition and Detection Under Bounded Computational Resources, S. Vijayanarasimhan and A. Kapoor. CVPR 2010.
-
Key-Segments for Video Object Segmentation. Y. J. Lee, J. Kim, and K. Grauman. ICCV 2011 [pdf]
-
Contextual Guidance of Eye Movements and Attention in Real-World Scenes: The Role of Global Features on Object Search. A. Torralba, A. Oliva, M. Castelhano, J. Henderson. [pdf] [web]
-
The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search, G. Zelinsky, W. Zhang, B. Yu, X. Chen, D. Samaras, NIPS 2005. [pdf] |
Papers: Lu Xia Expts: Larry
|
|
Oct 5
|
Attributes:
Visual properties, learning from natural language descriptions, intermediate representations
|
-
*Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, C. Lampert, H. Nickisch, and S. Harmeling, CVPR 2009 [pdf] [web] [data]
-
*Describing Objects by Their Attributes, A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth, CVPR 2009. [pdf] [web] [data]
-
*Attribute and Simile Classifiers for Face Verification, N. Kumar, A. Berg, P. Belhumeur, S. Nayar. ICCV 2009. [pdf] [web] [lfw data] [pubfig data]
-
Relative Attributes. D. Parikh and K. Grauman. ICCV 2011. [pdf] [data]
-
A Discriminative Latent Model of Object Classes and Attributes. Y. Wang and G. Mori. ECCV, 2010. [pdf]
-
Learning Visual Attributes, V. Ferrari and A. Zisserman, NIPS 2007. [pdf]
-
Learning Models for Object Recognition from Natural Language Descriptions, J. Wang, K. Markert, and M. Everingham, BMVC 2009.[pdf]
-
FaceTracer: A Search Engine for Large Collections of Images with Faces. N. Kumar, P. Belhumeur, and S. Nayar. ECCV 2008. [pdf]
-
Attribute-Centric Recognition for Cross-Category Generalization. A. Farhadi, I. Endres, D. Hoiem. CVPR 2010. [pdf]
-
Automatic Attribute Discovery and Characterization from Noisy Web Data. T. Berg et al. ECCV 2010. [pdf] [data]
-
Attributes-Based People Search in Surveillance Environments. D. Vaquero, R. Feris, D. Tran, L. Brown, A. Hampapur, and M. Turk. WACV 2009. [pdf] [project page]
-
Image Region Entropy: A Measure of "Visualness" of Web Images Associated with One Concept. K. Yanai and K. Barnard. ACM MM 2005. [pdf]
-
What Helps Where And Why? Semantic Relatedness for Knowledge Transfer. M. Rohrbach, M. Stark, G. Szarvas, I. Gurevych and B. Schiele. CVPR 2010. [pdf]
-
Recognizing Human Actions by Attributes. J. Liu, B. Kuipers, S. Savarese, CVPR 2011. [pdf]
-
Interactively Building a Discriminative Vocabulary of Nameable Attributes. D. Parikh and K. Grauman. CVPR 2011. [pdf] [web]
|
Papers: Saurajit Expts: Qiming, Harsh
|
Proposal abstracts due Friday Oct 7, 5 PM
|
III. External input in recognition
|
Oct 12
|
Language and description
Discovering the correspondence between words and other language constructs and images, generating descriptions
|
-
*Baby Talk: Understanding and Generating Image Descriptions. Kulkarni et al. CVPR 2011. [pdf]
-
*Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers, A. Gupta and L. Davis, ECCV 2008. [pdf]
-
*Learning Sign Language by Watching TV (using weakly aligned subtitles), P. Buehler, M. Everingham, and A. Zisserman. CVPR 2009. [pdf] [data] [web]
-
Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary, P. Duygulu, K. Barnard, N. de Freitas, D. Forsyth. ECCV 2002. [pdf] [data]
-
The Mathematics of Statistical Machine Translation: Parameter Estimation. P. Brown, S. Della Pietro, V. Della Pietra, R. Mercer. Association for Computational Linguistics, 1993. [pdf] (background for Duygulu et al paper)
- How Many Words is a Picture Worth? Automatic Caption Generation for News Images. Y. Feng and M. Lapata. ACL 2010. [pdf]
-
Matching words and pictures. K. Barnard, P. Duygulu, N. de Freitas, D. Forsyth, D. Blei, and M. Jordan. JMLR, 3:1107–1135, 2003. [pdf]
-
Who's Doing What: Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation. L. Jie, B. Caputo, and V. Ferrari. NIPS 2009. [pdf]
-
Watch, Listen & Learn: Co-training on Captioned Images and Videos. S. Gupta, J. Kim, K. Grauman, and R. Mooney. ECML 2008. [pdf]
- Systematic Evaluation of Machine Translation Methods for Image and Video Annotation, P. Virga, P. Duygulu, CIVR 2005. [pdf]
- Localizing Objects and Actions in Videos Using Accompanying Text. Johns Hopkins University Summer Workshop Report. J. Neumann et al. 2010. [pdf] [web]
|
Papers: Chris Expts: Jae, Naga
|
|
Oct 19
|
Interactive learning and recognition
Human-in-the-loop learning, active annotation collection, crowdsourcing
|
-
*Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. S. Vijayanarasimhan and K. Grauman. CVPR 2011. [pdf]
-
*Visual Recognition with Humans in the Loop. Branson S., Wah C., Babenko B., Schroff F., Welinder P., Perona P., Belongie S. ECCV 2010. [pdf] [Caltech/UCSD Visipedia project] [data]
-
*The Multidimensional Wisdom of Crowds. Welinder P., Branson S., Belongie S., Perona, P. NIPS 2010. [pdf] [code]
-
*What’s It Going to Cost You? : Predicting Effort vs. Informativeness for Multi-Label Image Annotations. S. Vijayanarasimhan and K. Grauman. CVPR 2009 [pdf] [data] [code]
-
iCoseg: Interactive Co-segmentation with Intelligent Scribble Guidance, D. Batra, A. Kowdle, D. Parikh, J. Luo and T. Chen. CVPR 2010. [pdf] [web]
-
Labeling Images with a Computer Game. L. von Ahn and L. Dabbish. CHI, 2004.
- Who's Vote Should Count More: Optimal Integration fo Labels from Labelers of Unknown Expertise. J. Whitehill et al. NIPS 2009. [pdf]
-
Utility Data Annotation with Amazon Mechanical Turk. A. Sorokin and D. Forsyth. Wkshp on Internet Vision, 2008.
-
Far-Sighted Active Learning on a Budget for Image and Video Recognition. S. Vijayanarasimhan, P. Jain, and K. Grauman. CVPR 2010. [pdf] [code]
-
Multiclass Recognition and Part Localization with Humans in the Loop. C. Wah et al. ICCV 2011. [pdf]
-
Multi-Level Active Prediction of Useful Image Annotations for Recognition. S. Vijayanarasimhan and K. Grauman. NIPS 2008. [pdf]
-
Active Learning from Crowds. Y. Yan, R. Rosales, G. Fung, J. Dy. ICML 2011. [pdf]
- Proactive Learning: Cost-Sensitive Active Learning with Multiple Imperfect Oracles. P. Donmez and J. Carbonell. CIKM 2008. [pdf]
-
Inactive Learning? Difficulties Employing Active Learning in Practice. J. Attenberg and F. Provost. SIGKDD 2011. [pdf]
-
Annotator Rationales for Visual Recognition. J. Donahue and K. Grauman. ICCV 2011. [pdf]
-
Interactively Building a Discriminative Vocabulary of Nameable Attributes. D. Parikh and K. Grauman. CVPR 2011. [pdf] [web]
-
Actively Selecting Annotations Among Objects and Attributes. A. Kovashka, S. Vijayanarasimhan, and K. Grauman. ICCV 2011 [pdf]
- Supervised Learning from Multiple Experts: Whom to Trust When Everyone Lies a Bit. V. Raykar et al. ICML 2009. [pdf]
-
Multi-class Active Learning for Image Classification. A. J. Joshi, F. Porikli, and N. Papanikolopoulos. CVPR 2009. [pdf]
-
GrabCut -Interactive Foreground Extraction using Iterated Graph Cuts, by C. Rother, V. Kolmogorov, A. Blake, SIGGRAPH 2004. [pdf] [project page]
-
Active Learning for Piecewise Planar 3D Reconstruction. A. Kowdle, Y.-J. Chang, A. Gallagher and T. Chen. CVPR 2011 [pdf] [web]
- Amazon Mechanical Turk
- Using Mechanical Turk with LabelMe
|
Papers: Brian, Harsh Expts: Yunsik
|
Proposal extended outline due Friday Oct 21, 5 PM
|
IV. Activity in images and video
|
Oct 26
|
Pictures of people
Finding people and their poses, automatic face tagging
|
-
*Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations, L. Bourdev and J. Malik. ICCV 2009 [pdf] [code]
-
*Understanding Images of Groups of People, A. Gallagher and T. Chen, CVPR 2009. [pdf] [web] [data]
-
*Real-Time Human Pose Recognition in Parts from a Single Depth Image. J. Shotton et al. CVPR 2011. [pdf] [video]
-
*"'Who are you?' - Learning Person Specific Classifiers from Video, J. Sivic, M. Everingham, and A. Zisserman, CVPR 2009. [pdf] [data] [KLT tracking code]
-
Contextual Identity Recognition in Personal Photo Albums. D. Anguelov, K.-C. Lee, S. Burak, Gokturk, and B. Sumengen. CVPR 2007. [pdf]
-
Fast Pose Estimation with Parameter Sensitive Hashing. G. Shakhnarovich, P. Viola, T. Darrell, ICCV 2003.[pdf]
-
Finding and Tracking People From the Bottom Up. D. Ramanan, D. A. Forsyth. CVPR 2003. [pdf]
-
Where’s Waldo: Matching People in Images of Crowds. R. Garg, D. Ramanan, S. Seitz, N. Snavely. CVPR 2011. [pdf]
-
Autotagging Facebook: Social Network Context Improves Photo Annotation, by Z. Stone, T. Zickler, and T. Darrell. CVPR Internet Vision Workshop 2008. [pdf]
-
Efficient Propagation for Face Annotation in Family Albums. L. Zhang, Y. Hu, M. Li, and H. Zhang. MM 2004. [pdf]
-
Progressive Search Space Reduction for Human Pose Estimation. Ferrari, V., Marin-Jimenez, M. and Zisserman, A. CVPR 2008. [pdf] [web] [code]
- Leveraging Archival Video for Building Face Datasets, by D. Ramanan, S. Baker, and S. Kakade. ICCV 2007. [pdf]
-
Names and Faces in the News, by T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth, CVPR 2004. [pdf] [web]
-
Face Discovery with Social Context. Y. J. Lee and K. Grauman. BMVC 2011. [pdf]
-
“Hello! My name is... Buffy” – Automatic Naming of Characters in TV Video, by M. Everingham, J. Sivic and A. Zisserman, BMVC 2006. [pdf] [web] [data]
-
Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities. Yao, B., Fei-Fei, L. CVPR 2010.
-
A Face Annotation Framework with Partial Clustering and Interactive Labeling. R. X. Y. Tian,W. Liu, F.Wen, and X. Tang. CVPR 2007. [pdf] [web]
-
From 3D Scene Geometry to Human Workspace. A. Gupta et al. CVPR 2011. [pdf] [web]
-
Pictorial Structures Revisited: People Detection and Articulated Pose Estimation. M. Andriluka et al. CVPR 2009. [pdf] [code]
|
Papers: Sunil, Larry Expts: Nishant, Jung
|
|
Nov 2
|
Activity recognition
Recognizing and localizing human actions in video
|
-
*Actions in Context, M. Marszalek, I. Laptev, C. Schmid. CVPR 2009. [pdf] [web] [data]
-
*A Hough Transform-Based Voting Framework for Action Recognition. A. Yao, J. Gall, L. Van Gool. CVPR 2010. [pdf] [code/data]
-
*Beyond Actions: Discriminative Models for Contextual Group Activities. T. Lian, Y. Wang, W. Yang, and G. Mori. NIPS 2010. [pdf] [data]
-
Objects in Action: An Approach for Combining Action Understanding and Object Perception. A. Gupta and L. Davis. CVPR, 2007. [pdf] [data]
-
Learning Realistic Human Actions from Movies. I. Laptev, M. Marszałek, C. Schmid and B. Rozenfeld. CVPR 2008. [pdf] [data]
-
Understanding Egocentric Activities. A. Fathi, A. Farhadi, J. Rehg. ICCV 2011. [pdf]
-
Exploiting Human Actions and Object Context for Recognition Tasks. D. Moore, I. Essa, and M. Hayes. ICCV 1999. [pdf]
-
A Scalable Approach to Activity Recognition Based on Object Use. J. Wu, A. Osuntogun, T. Choudhury, M. Philipose, and J. Rehg. ICCV 2007. [pdf]
-
Recognizing Actions at a Distance. A. Efros, G. Mori, J. Malik. ICCV 2003. [pdf] [web]
-
Activity Recognition from First Person Sensing. E. Taralova, F. De la Torre, M. Hebert CVPR 2009 Workshop on Egocentric Vision [pdf]
-
Action Recognition from a Distributed Representation of Pose and Appearance, S. Maji, L. Bourdev, J. Malik, CVPR 2011. [pdf] [code]
-
Learning a Hierarchy of Discriminative Space-Time Neighborhood Features for Human Action Recognition. A. Kovashka and K. Grauman. CVPR 2010. [pdf]
-
Temporal Causality for the Analysis of Visual Events. K. Prabhakar, S. Oh, P. Wang, G. Abowd, and J. Rehg. CVPR 2010. [pdf] [Georgia Tech Computational Behavior Science project]
-
Modeling Activity Global Temporal Dependencies using Time Delayed Probabilistic Graphical Model. Loy, Xiang & Gong ICCV 2009. [pdf]
-
What's Going on?: Discovering Spatio-Temporal Dependencies in Dynamic Scenes. D. Kuettel et al. CVPR 2010. [pdf]
-
Learning Actions From the Web. N. Ikizler-Cinbis, R. Gokberk Cinbis, S. Sclaroff. ICCV 2009. [pdf]
- Content-based Retrieval of Functional Objects in Video Using Scene Context. S. Oh, A. Hoogs, M. Turek, and R. Collins. ECCV 2010. [pdf]
|
Papers: Qiming, Yunsik Expts: Lu Xia
|
|
V. Dealing with lots of data/categories
|
Nov 9
|
Scaling with a large number of categories
Sharing features between classes, transfer, taxonomy, learning from few examples, exploiting class relationships
|
-
*Sharing Visual Features for Multiclass and Multiview Object Detection, A. Torralba, K. Murphy, W. Freeman, PAMI 2007. [pdf] [code]
-
*What Does Classifying More than 10,000 Image Categories Tell Us? J. Deng, A. Berg, K. Li and L. Fei-Fei. ECCV 2010. [pdf]
-
*Discriminative Learning of Relaxed Hierarchy for Large-scale Visual Recognition. T. Gao and Daphne Koller. ICCV 2011. [pdf] [code]
-
Comparative Object Similarity for Improved Recognition with Few or Zero Examples. G. Wang, D. Forsyth, and D. Hoeim. CVPR 2010. [pdf]
-
Learning and Using Taxonomies for Fast Visual Categorization, G. Griffin and P. Perona, CVPR 2008. [pdf] [data]
-
Cross-Generalization: Learning Novel Classes from a Single Example by Feature Replacement. CVPR 2005. [pdf]
-
80 Million Tiny Images: A Large Dataset for Non-Parametric Object and Scene Recognition, by A. Torralba, R. Fergus, and W. Freeman. PAMI 2008. [pdf] [web]
-
Constructing Category Hierarchies for Visual Recognition, M. Marszalek and C. Schmid. ECCV 2008. [pdf] [web] [Caltech256]
-
Learning Generative Visual Models from Few Training Examples: an Incremental Bayesian Approach Tested on 101 Object Categories. L. Fei-Fei, R. Fergus, and P. Perona. CVPR Workshop on Generative-Model Based Vision. 2004. [pdf] [Caltech101]
-
Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts. S. Fidler and A. Leonardis. CVPR 2007 [pdf]
-
Exploiting Object Hierarchy: Combining Models from Different Category Levels, A. Zweig and D. Weinshall, ICCV 2007 [pdf]
-
Incremental Learning of Object Detectors Using a Visual Shape Alphabet. Opelt, Pinz, and Zisserman, CVPR 2006. [pdf]
-
Sequential Learning of Reusable Parts for Object Detection. S. Krempp, D. Geman, and Y. Amit. 2002 [pdf]
-
ImageNet: A Large-Scale Hierarchical Image Database, J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, CVPR 2009 [pdf] [data]
-
Semantic Label Sharing for Learning with Many Categories. R. Fergus et al. ECCV 2010. [pdf]
-
Learning a Tree of Metrics with Disjoint Visual Features. S. J. Hwang, K. Grauman, F. Sha. NIPS 2011.
|
Papers: Cho-Jui, Si Si Expts: Lu Pan
|
|
Nov 16
|
Large-scale search and mining
Scalable retrieval algorithms for massive databases, mining for themes
|
-
*VisualRank: Applying PageRank to Large-Scale Image Search. Y. Jing and S. Baluja. PAMI 2008. [pdf]
-
*Kernelized Locality Sensitive Hashing for Scalable Image Search, by B. Kulis and K. Grauman, ICCV 2009 [pdf] [code]
-
*Video Mining with Frequent Itemset Configurations. T. Quack, V. Ferrari, and L. Van Gool. CIVR 2006. [pdf]
-
Learning Binary Projections for Large-Scale Image Search. K. Grauman and R. Fergus. Chapter (draft) to appear in Registration, Recognition, and Video Analysis, R. Cipolla, S. Battiato, and G. Farinella, Editors. [pdf]
-
World-scale Mining of Objects and Events from Community Photo Collections. T. Quack, B. Leibe, and L. Van Gool. CIVR 2008. [pdf]
-
Interest Seam Image. X. Zhang, G. Hua, L. Zhang, H. Shum. CVPR 2010. [pdf]
-
Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, C. Lampert, ICCV 2009. [pdf] [code]
-
Geometric Min-Hashing: Finding a (Thick) Needle in a Haystack, O. Chum, M. Perdoch, and J. Matas. CVPR 2009. [pdf]
-
FaceTracer: A Search Engine for Large Collections of Images with Faces. N. Kumar, P. Belhumeur, and S. Nayar. ECCV 2008. [pdf]
-
Efficiently Searching for Similar Images. K. Grauman. Communications of the ACM, 2009. [CACM link]
-
Fast Image Search for Learned Metrics, P. Jain, B. Kulis, and K. Grauman, CVPR 2008. [pdf]
-
Small Codes and Large Image Databases for Recognition, A. Torralba, R. Fergus, and Y. Weiss, CVPR 2008. [pdf]
-
Object Retrieval with Large Vocabularies and Fast Spatial Matching. J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, CVPR 2007. [pdf]
|
Papers: Naga, Jae Expts: Si Si
|
|
Nov 23
|
Summarization
Video synopsis, discovering repeated objects, visualization
|
-
*Webcam Synopsis: Peeking Around the World, by Y. Pritch, A. Rav-Acha, A. Gutman, and S. Peleg, ICCV 2007. [pdf] [web]
-
*Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman. CVPR 2006. [pdf] [code]
-
*Summarizing Visual Data Using Bi-Directional Similarity. D. Simakov, Y. Caspi, E. Shechtmann, M. Irani. CVPR 2008. [pdf] [video]
-
Fast Unsupervised Ego-Action Learning for First-Person Sports Video. K. Kitani, T. Okabe, Y. Sato, A. Sugimoto. CVPR 2011. [pdf]
-
Scene Summarization for Online Image Collections. I. Simon, N. Snavely, S. Seitz. ICCV 2007. [pdf] [web]
-
VideoCut: Removing Irrelevant Frames by Discovering the Object of Interest. D. Liu, G. Hua, T. Chen. ECCV 2010. [pdf]
-
Video Epitomes. V. Cheung, B. J. Frey, and N. Jojic. CVPR 2005. [pdf] [web] [code]
-
Making a Long Video Short. A. Rav-Acha, Y. Pritch, and S. Peleg. CVPR 2006. [pdf]
-
Structural Epitome: A Way to Summarize One's Visual Experience. N. Jojic, A. Perina, V. Murino. NIPS 2010. [pdf] [data]
-
Video Abstraction: A Systematic Review and Classification. B. Truong and S. Venkatesh. ACM 2007. [pdf]
- Shape Discovery from Unlabeled Image Collections. Y. J. Lee and K. Grauman. CVPR 2009. [pdf]
|
Papers: Lu Pan Expts: Sunil, Chris
|
Final paper drafts due Wed Nov 23
|
Nov 30
|
Final project presentations in class
|
|
|
Final papers due Tues Dec 6, 5 PM
|