Self-Tuning Spectral Clustering

http://www.vision.caltech.edu/lihi/Demos/SelfTuningClustering.html

 

I tried to run the ZPclustering code, but it produced the errors below. They occur because I'm using 64-bit MATLAB (on Windows 7).

------------------------------------------

>> test_segimage
Building affinity matrix took 0.063324 second

Error using dist2aff
Function "mxGetIr_700" is obsolete.
(64-bit mex files using sparse matrices must be rebuilt with the "-largeArrayDims" option.  See the R2006b release notes for more details.)

Error in segment_image (line 65)
    tic; W = dist2aff(D,SS); ttt = toc;

Error in test_segimage (line 11)
[mask] = segment_image(IM,R,G1,'SS','KM',0.1);

------------------------------------------

I modified the four MEX source files: dist2aff.cpp, evrot.cpp, scale_dist.cpp, and zero_diag.cpp. (With -largeArrayDims, the 64-bit sparse-matrix API expects index arrays typed as mwIndex/mwSize rather than int, which is the usual change such files need.)

Then I typed the following in the MATLAB command window:

>> mex -O -largeArrayDims -c dist2aff.cpp
>> mex -O -largeArrayDims -c scale_dist.cpp
>> mex -O -largeArrayDims -c zero_diag.cpp
>> mex -O -largeArrayDims -c evrot.cpp

 

>> mex -O -largeArrayDims dist2aff.obj
>> mex -O -largeArrayDims scale_dist.obj
>> mex -O -largeArrayDims zero_diag.obj
>> mex -O -largeArrayDims evrot.obj

 

After that, it operates correctly.
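Note that the compile and link steps can also be collapsed into one call per file, since mex both compiles and links when given a source file directly (this assumes a C++ compiler has already been configured via "mex -setup"):

>> mex -O -largeArrayDims dist2aff.cpp
>> mex -O -largeArrayDims scale_dist.cpp
>> mex -O -largeArrayDims zero_diag.cpp
>> mex -O -largeArrayDims evrot.cpp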

Posted by uniqueone
http://dogmas.tistory.com/trackback/141

A Brief Explanation of Logistic Regression

Linear regression is typically used when the dependent variable is a continuous quantity; when the dependent variable takes only the values 0 and 1, logistic regression is the better choice.

For example, suppose we survey graduates of a law school and record each student's GPA, wealth, age, and whether they passed the bar exam. GPA, wealth, and age are continuous quantities, but bar-exam passage is a binary variable: pass is coded as 1 and fail as 0.

 

Consider the following linear model:

$$Y = b_0 + b_1 x + e$$

Here Y is the dependent variable, which takes only the values 0 and 1, x is the independent variable, and e is the error term.

Suppose Y is a Bernoulli random variable with probabilities

$$P(Y = 1) = \pi, \qquad P(Y = 0) = 1 - \pi$$

In that case the error in the linear model above cannot be normally distributed, and its variance is not constant: it changes with the probability that Y equals 1. Moreover, since Y ranges only from 0 to 1, ordinary linear regression cannot be used.

 

Empirically, when Y is a binary variable the response curve is S-shaped, so we use the logit response function

$$E(Y) = \pi(x) = \frac{e^{b_0 + b_1 x}}{1 + e^{b_0 + b_1 x}}$$

or, equivalently,

$$1 - \pi(x) = \frac{1}{1 + e^{b_0 + b_1 x}}$$

This can be rewritten as

$$\frac{\pi(x)}{1 - \pi(x)} = e^{b_0 + b_1 x}$$

The quantity π(x)/(1 − π(x)) in this last equation is called the odds ratio.

If the odds ratio equals 2 at some value x = x1, then at x = x1 the probability that Y is 1 is twice the probability that Y is 0. Also, each time x increases by 1, the odds ratio is multiplied by exp(b1).
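To see why this effect is multiplicative, compare the odds at x + 1 with the odds at x:

$$\frac{\pi(x+1)/(1-\pi(x+1))}{\pi(x)/(1-\pi(x))} = \frac{e^{b_0 + b_1 (x+1)}}{e^{b_0 + b_1 x}} = e^{b_1}$$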

 

A logistic regression example

Suppose that performing a logistic regression on data of this kind (here, O-ring failure versus launch temperature) yields

  • Odds ratio = 0.84

 

Since the estimate b1 divided by its standard error follows an approximately standard normal distribution, we can test H0: b1 = 0; this gives p = 0.04, which is statistically significant. The odds ratio of 0.84 means that each one-degree increase in temperature multiplies the odds of O-ring failure (relative to success) by 0.84; equivalently, the odds of failure grow as the temperature drops.
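For reference, here is a minimal sketch of fitting such a model in MATLAB, assuming the Statistics Toolbox's glmfit is available. The data are simulated purely for illustration (they are not the actual O-ring measurements):

% Simulate binary outcomes from a known logistic model, then fit it.
rng(0);
x = linspace(50, 85, 200)';                      % predictor, e.g. temperature
ptrue = 1 ./ (1 + exp(-(10 - 0.17*x)));          % true P(Y = 1 | x)
y = binornd(1, ptrue);                           % Bernoulli responses
[b, dev, stats] = glmfit(x, y, 'binomial', 'link', 'logit');
oddsRatio = exp(b(2))                            % odds multiply by this per +1 in x
pValue = stats.p(2)                              % Wald test of H0: b1 = 0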

 

 

Posted by uniqueone
http://www.cs.utexas.edu/~grauman/courses/spring2010/schedule.html

CS395T: Special Topics in Computer Vision, Spring 2010

Object Recognition





Meets:
Wednesdays 3:30-6:30 pm
ACES 3.408
Unique # 54470
 
Instructor: Kristen Grauman 
Email: grauman@cs
Office: CSA 114
 
TA: Sudheendra Vijayanarasimhan
Email: svnaras@cs
Office: CSA 106

When emailing us, please put CS395 in the subject line.

Announcements:

See the schedule for current reading assignments. 

Project paper drafts due Friday April 30.

Course overview:


Topics: This is a graduate seminar course in computer vision.   We will survey and discuss current vision papers relating to object recognition, auto-annotation of images, and scene understanding.  The goals of the course will be to understand current approaches to some important problems, to actively analyze their strengths and weaknesses, and to identify interesting open questions and possible directions for future research.

See the syllabus for an outline of the main topics we'll be covering.

Requirements: Students will be responsible for writing paper reviews each week, participating in discussions, completing one programming assignment, presenting once or twice in class (depending on enrollment, and possibly done in teams), and completing a project (done in pairs). 

Note that presentations are due one week before the slot your presentation is scheduled.  This means you will need to read the papers, prepare experiments, make plans with your partner, create slides, etc. more than one week before the date you are signed up for.  The idea is to meet and discuss ahead of time, so that we can iterate as needed the week leading up to your presentation. 

More details on the requirements and grading breakdown are here.  Information on the projects and project proposals is here.

Prereqs:  Courses in computer vision and/or machine learning (378 Computer Vision and/or 391 Machine Learning, or similar); ability to understand and analyze conference papers in this area; programming required for experiment presentations and projects. 

Please talk to me if you are unsure if the course is a good match for your background.  I generally recommend scanning through a few papers on the syllabus to gauge what kind of background is expected.  I don't assume you are already familiar with every single algorithm/tool/image feature a given paper mentions, but you should feel comfortable following the key ideas.


Syllabus overview:

  1. Single-object recognition fundamentals: representation, matching, and classification
    1. Specific objects
    2. Classification and global models
    3. Objects composed of parts
    4. Region-based methods
  2. Beyond single objects: recognizing categories in context and learning their properties
    1. Context
    2. Attributes
    3. Actions and objects/scenes
  3. Scalability issues in category learning, detection, and search
    1. Too many pixels!
    2. Too many categories!
    3. Too many images!
  4. Recognition and "everyday" visual data
    1. Landmarks, locations, and tourists
    2. Alignment with text
    3. Pictures of people


Schedule and papers:


Note:  * = required reading. 
Additional papers are provided for reference, and as a starting point for background reading for projects.
Paper presentations: focus on starred papers (additionally mentioning ideas from others is ok but not necessary).
Experiment presentations: Pick from only among the starred papers.
Date
Topics
Papers and links
Presenters
Items due
Jan 20
Course intro  handout


Topic preferences due via email by Monday Jan 25
I. Single-object recognition fundamentals: representation, matching, and classification
Jan 27
Recognizing specific objects:

Invariant local features, instance recognition, bag-of-words models

  • *Object Recognition from Local Scale-Invariant Features, Lowe, ICCV 1999.  [pdf]  [code] [other implementations of SIFT] [IJCV]
  • *Local Invariant Feature Detectors: A Survey, Tuytelaars and Mikolajczyk.  Foundations and Trends in Computer Graphics and Vision, 2008. [pdf]  [Oxford code] [Read pp. 178-188, 216-220, 254-255]
  • *Video Google: A Text Retrieval Approach to Object Matching in Videos, Sivic and Zisserman, ICCV 2003.  [pdf]  [demo]
  • Scalable Recognition with a Vocabulary Tree, D. Nister and H. Stewenius, CVPR 2006. [pdf]
  • SURF: Speeded Up Robust Features, Bay, Ess, Tuytelaars, and Van Gool, CVIU 2008.  [pdf] [code]
  • Robust Wide Baseline Stereo from Maximally Stable Extremal Regions, J. Matas, O. Chum, U. Martin, and T. Pajdla, BMVC 2002.  [pdf]
  • A Performance Evaluation of Local Descriptors. K. Mikolajczyk and C. Schmid.  CVPR 2003 [pdf]
  • Oxford group interest point software
  • Andrea Vedaldi's code, including SIFT, MSER, hierarchical k-means.
  • INRIA LEAR team's software, including interest points, shape features
  • Semantic Robot Vision Challenge links
lecture slides [ppt] [pdf]

Feb 3
Recognition via classification and global models:

Global appearance models for category and scene recognition, sliding window detection, detection as a binary decision.

  • *Histograms of Oriented Gradients for Human Detection, Dalal and Triggs, CVPR 2005.  [pdf]  [video] [code] [PASCAL datasets]
  • *Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, Lazebnik, Schmid, and Ponce, CVPR 2006. [pdf]  [15 scenes dataset]  [libpmk] [Matlab]
  • *Rapid Object Detection Using a Boosted Cascade of Simple Features, Viola and Jones, CVPR 2001.  [pdf]  [code]
  • Modeling the Shape of the Scene: a Holistic Representation of the Spatial Envelope, Oliva and Torralba, IJCV 2001.  [pdf]  [Gist code]
  • Visual Categorization with Bags of Keypoints, C. Dance, J. Willamowski, L. Fan, C. Bray, and G. Csurka, ECCV International Workshop on Statistical Learning in Computer Vision, 2004.  [pdf]
  • Pedestrian Detection in Crowded Scenes, Leibe, Seemann, and Schiele, CVPR 2005.  [pdf]
  • Pyramids of Histograms of Oriented Gradients (pHOG), Bosch and Zisserman. [code]
  • Eigenfaces for Recognition, Turk and Pentland, 1991.  [pdf]
  • Sampling Strategies for Bag-of-Features Image Classification.  E. Nowak, F. Jurie, and B. Triggs.  ECCV 2006. [pdf]
  • A Trainable System for Object Detection, C. Papageorgiou and T. Poggio, IJCV 2000.  [pdf]
  • Object Recognition with Features Inspired by Visual Cortex. T. Serre, L. Wolf and T. Poggio. CVPR 2005.  [pdf]
  • LIBPMK feature extraction code, includes dense sampling
  • LIBSVM library for support vector machines
lecture slides [ppt] [pdf]

Feb 10
Class begins at 5 pm today.
Objects composed of parts:

Part-based models for category recognition, and local feature matching for correspondence-based recognition

  • *A Discriminatively Trained, Multiscale, Deformable Part Model, by P. Felzenszwalb, D. McAllester and D. Ramanan.  CVPR 2008.  [pdf]  [code]
  • *Combined Object Categorization and Segmentation with an Implicit Shape Model, by B. Leibe, A. Leonardis, and B. Schiele.   ECCV Workshop on Statistical Learning in Computer Vision, 2004.   [pdf]  [code]  [IJCV extended version]
  • *Learning a Dense Multi-View Representation for Detection, Viewpoint Classification and Synthesis of Object Categories, H. Su, M. Sun, L. Fei-Fei, S. Savarese.  ICCV 2009.  [pdf]
  • Shape Matching and Object Recognition with Low Distortion Correspondences, A. Berg, T. Berg, and J. Malik, CVPR 2005.  [pdf]  [web]
  • Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification, Frome, Singer, Sha, Malik.  ICCV 2007.  [pdf]
  • Matching Local Self-Similarities Across Images and Videos, Shechtman and Irani, CVPR 2007.  [pdf]
  • The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features, Grauman and Darrell.  ICCV 2005.  [pdf]  [web]  [code]
  • Shape Matching  and Object Recognition Using Shape Contexts.  S. Belongie, J. Malik, J. Puzicha.  PAMI 2002.  [pdf]
  • Multiple Component Learning for Object Detection, Dollar, Babenko, Belongie, Perona, and Tu, ECCV 2008.  [pdf]
  • Object Class Recognition by Unsupervised Scale Invariant Learning, by R. Fergus, P. Perona, and A. Zisserman.  CVPR 2003.  [pdf]  [datasets]
  • Efficient Matching of Pictorial Structures. P. Felzenszwalb and D. Huttenlocher. CVPR 2000.  [pdf] [related code]
  • A Boundary-Fragment-Model for Object Detection, Opelt, Pinz, and Zisserman, ECCV 2006.  [pdf]

Implementation assignment due Friday Feb 12, 5 PM
Feb 17
Region-based models:

Regions as parts, multi-label segmentation, integrated classification and segmentation

  • *Recognition Using Regions.  C. Gu, J. Lim, P. Arbelaez, J. Malik, CVPR 2009.  [pdf]  [slides] [seg code]
  • *Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman.  CVPR 2006.  [pdf] [code]
  • *Combining Top-down and Bottom-up Segmentation. E. Borenstein, E. Sharon, and S. Ullman.  CVPR  workshop 2004.  [pdf]  [data]
  • Extracting Subimages of an Unknown Category from a Set of Images, S. Todorovic and N. Ahuja, CVPR 2006.  [pdf]
  • Class-Specific, Top-Down Segmentation, E. Borenstein and S. Ullman, ECCV 2002.  [pdf]
  • Object Recognition by Integrating Multiple Image Segmentations, C. Pantofaru, C. Schmid, and M. Hebert, ECCV 2008  [pdf]
  • Image Parsing: Unifying Segmentation, Detection, and Recognition. Tu, Z., Chen, Z., Yuille, A.L., Zhu, S.C. ICCV 2003  [pdf]
  • Robust Higher Order Potentials for Enforcing Label Consistency, P. Kohli, L. Ladicky, and P. Torr. CVPR 2008.  
  • Co-segmentation of Image Pairs by Histogram Matching --Incorporating a Global Constraint into MRFs, C. Rother, V. Kolmogorov, T. Minka, and A. Blake.  CVPR 2006.  [pdf]
  • An Efficient Algorithm for Co-segmentation, D. Hochbaum, V. Singh, ICCV 2009.  [pdf]
  • Normalized Cuts and Image Segmentation, J. Shi and J. Malik.  PAMI 2000.  [pdf]  [code]
  • Greg Mori's superpixel code
  • Berkeley Segmentation Dataset and code
  • Pedro Felzenszwalb's graph-based segmentation code
  • Michael Maire's segmentation code and paper
  • Mean-shift: a Robust Approach Towards Feature Space Analysis [pdf]  [code, Matlab interface by Shai Bagon]
  • David Blei's Topic modeling code
papers: John [pdf]
demo: Sudheendra [ppt]

II. Beyond single objects: recognizing categories in context and learning their properties
Feb 24
Context:

Inter-object relationships, objects within scenes, geometric context, understanding scene layout

  • *Discriminative Models for Multi-Class Object Layout, C. Desai, D. Ramanan, C. Fowlkes. ICCV 2009.  [pdf]  [slides]  [SVM struct code] [data]
  • *TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation.  J. Shotton, J. Winn, C. Rother, A. Criminisi.  ECCV 2006.  [pdf] [web] [data]
  • *Geometric Context from a Single Image, by D. Hoiem, A. Efros, and M. Hebert, ICCV 2005. [pdf]  [web]  [code]
  • *Contextual Priming for Object Detection, A. Torralba.  IJCV 2003.  [pdf] [web] [code]
  • Putting Objects in Perspective, by D. Hoiem, A. Efros, and M. Hebert, CVPR 2006.  [pdf] [web]
  • Decomposing a Scene into Geometric and Semantically Consistent Regions, S. Gould, R. Fulton, and D. Koller, ICCV 2009.  [pdf]  [slides]
  • Learning Spatial Context: Using Stuff to Find Things, by G. Heitz and D. Koller, ECCV 2008.  [pdf] [code]
  • An Empirical Study of Context in Object Detection, S. Divvala, D. Hoiem, J. Hays, A. Efros, M. Hebert, CVPR 2009.  [pdf]  [web]
  • Object Categorization using Co-Occurrence, Location and Appearance, by C. Galleguillos, A. Rabinovich and S. Belongie, CVPR 2008.[ pdf]
  • Context Based Object Categorization: A Critical Survey, C. Galleguillos and S. Belongie.  [pdf]
  • What, Where and Who? Classifying Events by Scene and Object Recognition, L.-J. Li and L. Fei-Fei, ICCV 2007. [pdf]
  • Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Unsupervised Framework, L-J. Li, R. Socher, L. Fei-Fei, CVPR 2009.  [pdf]
Piyush [ppt]
Robert [pdf]

Mar 3
Attributes:

Visual properties, learning from natural language descriptions, intermediate representations

  • *Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, C. Lampert, H. Nickisch, and S. Harmeling, CVPR 2009  [pdf] [web] [data]
  • *Describing Objects by Their Attributes, A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth, CVPR 2009.  [pdf]  [web] [data]
  • *Attribute and Simile Classifiers for Face Verification, N. Kumar, A. Berg, P. Belhumeur, S. Nayar.  ICCV 2009.  [pdf] [web] [data]
  • Learning Visual Attributes, V. Ferrari and A. Zisserman, NIPS 2007.  [pdf]
  • Learning Color Names for Real-World Applications, J. van de Weijer, C. Schmid, J. Verbeek, and D. Larlus.  IEEE TIP 2009.  [pdf]  [web]
  • Learning Models for Object Recognition from Natural Language Descriptions, J. Wang, K. Markert, and M. Everingham, BMVC 2009.[pdf]
Brian [ppt]
Adam [pdf]

Friday
Mar 5
Prof. David Forsyth, UIUC
Forum for AI Talk
11 AM in ACES 2.302



Monday
Mar 8




Project proposal abstract due
Mar 10
Actions and objects/scenes:

Recognizing human actions and objects simultaneously, objects and scenes as context for the activity

  • *Actions in Context, M. Marszalek, I. Laptev, C. Schmid.  CVPR 2009.  [pdf] [web]
  • *Objects in Action: An Approach for Combining Action Understanding and Object Perception.   A. Gupta and L. Davis.  CVPR, 2007.  [pdf]  [data]
  • Exploiting Human Actions and Object Context for Recognition Tasks.  D. Moore, I. Essa, and M. Hayes.  ICCV 1999.  [pdf]
  • A Scalable Approach to Activity Recognition Based on Object Use. J. Wu, A. Osuntogun, T. Choudhury, M. Philipose, and J. Rehg.  ICCV 2007.  [pdf]
  • Towards Using Multiple Cues for Robust Object Recognition, S. Aboutalib and M. Veloso, AAMAS 2007.  [pdf]
Aibo [ppt]

Mar 17
Spring break (no class)



III. Scalability issues in category learning, detection, and search
Mar 24
Too many pixels!

Bottom-up and top-down saliency measures to prioritize features, object importance, saliency in visual search tasks

  • *A Model of Saliency-based Visual Attention for Rapid Scene Analysis.  L. Itti, C. Koch, and E. Niebur.  PAMI 1998  [pdf]
  • *Some Objects are More Equal Than Others: Measuring and Predicting Importance, M. Spain and P. Perona.  ECCV 2008.  [pdf]
  • *Optimal Scanning for Faster Object Detection,  N. Butko, J. Movellan.  CVPR 2009.  [pdf]
  • Reading Between the Lines: Object Localization Using Implicit Cues from Image Tags.  S. J. Hwang and K. Grauman.  CVPR 2010.  [pdf]
  • Beyond Sliding Windows: Object Localization by Efficient Subwindow Search, C. Lampert, M. Blaschko, T. Hofmann.  CVPR 2008.  [pdf]
  • Peripheral-Foveal Vision for Real-time Object Recognition and Tracking in Video.  S. Gould, J. Arfvidsson, A. Kaehler, B. Sapp, M. Messner, G. Bradski, P. Baumstrack,S. Chung, A. Ng.  IJCAI 2007.  [pdf]
  • Peekaboom: A Game for Locating Objects in Images, by L. von Ahn, R. Liu and M. Blum, CHI 2006. [pdf]  [web]
  • Determining Patch Saliency Using Low-Level Context, D. Parikh, L. Zitnick, and T. Chen. ECCV 2008.  [pdf]
  • Learning to Predict Where Humans Look, T. Judd, K. Ehinger, F. Durand, A. Torralba.  ICCV 2009.  [pdf] [web]
  • Visual Recognition and Detection Under Bounded Computational Resources, S. Vijayanarasimhan and A. Kapoor.  CVPR 2010.
  • Torralba Global Features and Attention
  • The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search, G. Zelinsky, W. Zhang, B. Yu, X. Chen, D. Samaras, NIPS 2005.  [pdf]

  • Amazon Mechanical Turk
  • Using Mechanical Turk with LabelMe
Anush [pdf]
Project update and extended outline due Friday Mar 26
Mar 31
Too many categories!

Scalable recognition with many object categories


  • *Sharing Visual Features for Multiclass and Multiview Object Detection, A. Torralba, K. Murphy, W. Freeman, PAMI 2007.  [pdf]  [code]
  • *Cross-Generalization: Learning Novel Classes from a Single Example by Feature Replacement.  E. Bart and S. Ullman.  CVPR 2005.  [pdf]
  • *Constructing Category Hierarchies for Visual Recognition, M. Marszalek and C. Schmid.  ECCV 2008.  [pdf]  [web] [Caltech256]
  • Learning Generative Visual Models from Few Training Examples: an Incremental Bayesian Approach Tested on 101 Object Categories. L. Fei-Fei, R. Fergus, and P. Perona. CVPR Workshop on Generative-Model Based Vision. 2004.  [pdf] [Caltech101]
  • Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts. S. Fidler and A. Leonardis.  CVPR 2007  [pdf]
  • Exploiting Object Hierarchy: Combining Models from Different Category Levels, A. Zweig and D. Weinshall, ICCV 2007 [pdf]
  • Learning and Using Taxonomies for Fast Visual Categorization, G. Griffin and P. Perona, CVPR 2008.  [pdf]
  • Incremental Learning of Object Detectors Using a Visual Shape Alphabet.  Opelt, Pinz, and Zisserman, CVPR 2006.  [pdf]
  • Sequential Learning of Reusable Parts for Object Detection.  S. Krempp, D. Geman, and Y. Amit.  2002  [pdf]
  • ImageNet: A Large-Scale Hierarchical Image Database, J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, CVPR 2009 [pdf]  [data]
Rui [ppt]
Patrick [ppt]
Week of Mar 29 - Apr 2:
Individual project update meetings (by appt)
Apr 7
Too many images!

Scalable image search with large databases


  • *Kernelized Locality Sensitive Hashing for Scalable Image Search, by B. Kulis and K. Grauman, ICCV 2009 [pdf]  [code]
  • *Geometric Min-Hashing: Finding a (Thick) Needle in a Haystack, O. Chum, M. Perdoch, and J. Matas.  CVPR 2009.  [pdf]
  • *Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, C. Lampert, ICCV 2009.  [pdf]  [code] [code]
  • 80 Million Tiny Images: A Large Dataset for Non-Parametric Object and Scene Recognition, by A. Torralba, R. Fergus, and W. Freeman.  PAMI 2008.  [pdf] [web]
  • Fast Image Search for Learned Metrics, P. Jain, B. Kulis, and K. Grauman, CVPR 2008.  [pdf]
  • Small Codes and Large Image Databases for Recognition, A. Torralba, R. Fergus, and Y. Weiss, CVPR 2008.  [pdf]
  • Object Retrieval with Large Vocabularies and Fast Spatial Matching.  J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, CVPR 2007.  [pdf]
  • LSH homepage
  • Nearest Neighbor Methods in Learning and Vision, Shakhnarovich, Darrell, and Indyk, editors.
Muhibur [ppt]

IV: Recognition and "everyday" visual data
Apr 14
Landmarks, locations, and tourist photographers:

Location recognition, cues from tourist photos, photographer biases, retrieval for landmarks, browsing and visualization

  • *Landmark Classification in Large-Scale Image Collections.  Y. Li, D. Crandall, D. Huttenlocher.  ICCV 2009.  [pdf]
  • *Image Sequence Geolocation with Human Travel Priors, E. Kalogerakis, O. Vesselova, J. Hays, A. Efros, A. Hertzmann.  ICCV 2009.  [pdf]  [web]
  • *Scene Summarization for Online Image Collections.  I. Simon, N. Snavely, S. Seitz.  ICCV 2007.  [pdf]  [web]
  • Mapping the World's Photos, D. Crandall, L. Backstrom, D. Huttenlocher, J. Kleinberg, WWW 2009.  [pdf]  [web]
  • Im2GPS: Estimating Geographic Information from a Single Image, J. Hays and A. Efros.  CVPR 2008.  [pdf]  [web]
  • Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval, Chum, Philbin, Sivic, Isard, and Zisserman, ICCV 2007.  [pdf]
  • Scene Segmentation Using the Wisdom of Crowds, by I. Simon and S. Seitz.  ECCV 2008.  [pdf]
  • Photo Tourism: Exploring Photo Collections in 3D, by N. Snavely, S. Seitz, and R. Szeliski, SIGGRAPH 2006.  [pdf] [web]
  • Modeling and Recognition of Landmark Image Collections Using Iconic Scene Graphs, by X. Li, C. Wu, C. Zach, S. Lazebnik, and J. Frahm, ECCV 2008.  [pdf]  [web]
  • City-Scale Location Recognition, G. Schindler, M. Brown, and R. Szeliski, CVPR 2007.  [pdf]
  • Parsing Images of Architectural Scenes, A. Berg, F. Grabler, J. Malik.  ICCV 2007.  [pdf]
  • I Know What You Did Last Summer: Object-Level Auto-annotation of Holiday Snaps, S. Gammeter, L. Bossard, T.Quack, L. van Gool, ICCV 2009.  [pdf]
  • CVPR 2009 Workshop on Visual Place Categorization
  • Code for downloading Flickr images, by James Hays
  • UW Community Photo Collections homepage
Sarah [ppt]
Suyog [pdf]

Apr 21
Alignment with text:

Discovering the correspondence between words (and other language constructs) to images or video, using captions or subtitles as weak labels.

  • *"'Who are you?' - Learning Person Specific Classifiers from Video, J. Sivic, M. Everingham, and A. Zisserman, CVPR 2009.  [pdf]
  • *Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers, A. Gupta and L. Davis, ECCV 2008.  [pdf]
  • *Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary, P. Duygulu, K. Barnard, N. de Freitas, D. Forsyth. ECCV 2002.  [pdf]  [data]
  • The Mathematics of Statistical Machine Translation: Parameter Estimation.  P. Brown, S. Della Pietro, V. Della Pietra, R. Mercer.  Association for Computational Linguistics, 1993.  [pdf]
  • Who's Doing What: Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation.  L. Jie, B. Caputo, and V. Ferrari.  NIPS 2009.  [pdf]
  • Names and Faces in the News, by T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth, CVPR 2004.  [pdf]  [web]
  • Learning Sign Language by Watching TV (using weakly aligned subtitles), P. Buehler, M. Everingham, and A. Zisserman. CVPR 2009.  [pdf]  [data]
  • “Hello! My name is... Buffy” – Automatic Naming of Characters in TV Video, by M. Everingham, J. Sivic and A. Zisserman, BMVC 2006.  [pdf]  [web]  [data]
  • Using Closed Captions to Train Activity Recognizers that Improve Video Retrieval, S. Gupta and R. Mooney. CVPR Visual and Contextual Learning Workshop, 2009.  [pdf]
  • Systematic Evaluation of Machine Translation Methods for Image and Video Annotation, P. Virga, P. Duygulu, CIVR 2005.  [pdf]
  • Subrip for subtitle extraction
  • Reuters captioned photos
  • Sonal Gupta's data for commentary+video
Anish [pdf]
Chao-Yeh [ppt]

Friday
April 23
Prof. Martial Hebert, CMU
Forum for AI Talk
11 AM, TAY 3.128



Apr 28
Pictures of people:

Faces, consumer photo collections, tagging

  • *Understanding Images of Groups of People, A. Gallagher and T. Chen, CVPR 2009.  [pdf]
  • *Contextual Identity Recognition in Personal Photo Albums. D. Anguelov, K.-C. Lee, S. Burak, Gokturk, and B. Sumengen. CVPR 2007.  [pdf]
  • *A Face Annotation Framework with Partial Clustering and Interactive Labeling.  Y. Tian, W. Liu, R. Xiao, F. Wen, and X. Tang.  CVPR 2007.  [pdf] [web]
  • Autotagging Facebook: Social Network Context Improves Photo Annotation, by  Z. Stone, T. Zickler, and T. Darrell.  CVPR Internet Vision Workshop 2008.   [pdf]
  • Efficient Propagation for Face Annotation in Family Albums. L. Zhang, Y. Hu, M. Li, and H. Zhang.  MM 2004.  [pdf]
  • Using Group Prior to Identify People in Consumer Images, A. Gallagher, T. Chen,  CVPR Workshop on Semantic Learning Applications in Multimedia, 2007.  [pdf]
  • Leveraging Archival Video for Building Face Datasets, by D. Ramanan, S. Baker, and S. Kakade.  ICCV 2007.  [pdf]
  • Names and Faces in the News, by T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth, CVPR 2004.  [pdf]  [web]
  • Face detection code in OpenCV
  • Gallagher's Person Dataset
MingJun

Friday
April 30




Final paper drafts due
May 5
Course wrap-up
Project presentations, part I


Presentations due
May 13



Final papers due


Posted by uniqueone
http://www.cs.utexas.edu/~grauman/courses/spring2011/index.html

CS 376: Computer Vision
Spring 2011

Mon/Wed 11:00 am - 12:15 pm
UTC 3.124



Instructor: Kristen Grauman
Office location: ACE 3.446
Office hours: Wed 5-6 pm, and by appointment.

TA: Shalini Sahoo shalini@cs.utexas.edu
Office location: PAI 5.33 TA station, desk 3
Office hours: Tues/Thurs 5-6 pm

TA (office hours only): Yong Jae Lee
Office location: PAI 5.33 TA station, desk 3
Office hours: Mon 5-6 pm
Please come to any of our office hours for questions about assignments or lectures.

Questions via email about an assignment should be sent to:
cv-spring2011@cs.utexas.edu, with "CS376" at the beginning of the subject line.
This will ensure the most timely response from the instructor or TA.



Announcements

The final exam slot has been confirmed by the registrar: Monday May 16, 2-5 pm, in JGB 2.102.  You may bring two sheets of notes on 8.5 x 11" paper.  The exam is comprehensive.

View all current grades and late days used on Blackboard.


Overview

Course description: Billions of images are hosted publicly on the web---how can you find one that “looks like” some image you are interested in?  Could we interact with a computer in richer ways than a keyboard and mouse, perhaps with natural gestures or simply facial expressions?  How can a robot identify objects in complex environments, or navigate uncharted territory?  How can a video camera in the operating room help a surgeon plan a procedure more safely, or assist a radiologist in more efficiently detecting a tumor?  Given some video sequence of a scene, can we synthesize new virtual views from arbitrary viewpoints that make a viewer feel as if they are in the movie?

In computer vision, the goal is to develop methods that enable a machine to “understand” or analyze images and videos.   In this introductory computer vision course, we will explore various fundamental topics in the area, including image formation, feature detection, segmentation, multiple view geometry, recognition and learning, and video processing.  This course is intended for upper-level undergraduate students. 

Textbook: The textbook is Computer Vision: Algorithms and Applications, by Rick Szeliski.  It is currently available for purchase, e.g. at Amazon for ~$65.  An electronic copy is also available free online here.  I will also select some background reading on object recognition from this short book on Visual Object Recognition that I prepared together with Bastian Leibe.

Syllabus: Details on prerequisites, course requirements, textbooks, and grading policy are posted here.  A high-level summary of the syllabus is here.

Problem set deadlines: Assignments are due about every two weeks.  The dates below are tentative and are provided to help your planning.  They are subject to minor shifts if the lecture plan needs to be adjusted slightly according to our pace in class. 
  • Pset 0 due Jan 28
  • Pset 1 due Feb 14 (tentative)
  • Pset 2 due Mar 2 (tentative)
  • Pset 3 due Mar 28 (tentative)
  • Pset 4 due April 18 (tentative)
  • Pset 5 due May 4 (tentative)

Schedule


Dates
Topic
Readings and links
Lectures
Assignments, exams

Wed Jan 19
Course intro
Sec 1.1-1.3
Intro
[pdf]

Pset 0 out Friday Jan 21
Mon Jan 24
Features and filters
Sec 3.1.1-2, 3.2
Linear filters
[ppt] [pdf] [outline]


Wed Jan 26

Sec 3.2.3, 4.2

Seam carving paper
Seam carving video
Gradients and edges
[ppt] [pdf] [outline]
Pset 0 due Friday Jan 28
Mon Jan 31

Sec 3.3.2-4
Binary image analysis
[ppt] [pdf] [outline]
Pset 1 out [class results]
Wed Feb 2

Sec 10.5

Texture Synthesis
Texture
[ppt] [pdf] [outline]

Mon Feb 7
Sec 2.3.2

Foundations of Color, B. Wandell

Lotto Lab illusions
Color
[ppt] [pdf] [outline]
Pset 0 grades and solutions returned in class

Wed Feb 9
Grouping and fitting
Sec 5.2-5.4

k-means demo

Segmentation and clustering
[ppt] [pdf] [outline]

Mon Feb 14

Sec 4.3.2

Hough Transform demo

Excerpt from Ballard & Brown

Hough transform
[ppt] [pdf] [outline]


Pset 1 due Monday Feb 14
 
Pset 2 out
Wed Feb 16
Mon Feb 21

Sec 5.1.1
Deformable contours
[ppt] [pdf] [outline]

Wed Feb 23

Sec 2.1.1, 2.1.2, 6.1.1
Alignment and 2d image transformations
[ppt] [pdf] [outline]
Pset 1 grades and solutions returned in class

Mon Feb 28
Multiple views and motion
Sec 3.6.1, 6.1.4
Homography and image warping
[ppt] [pdf] [outline]

Wed Mar 2

Sec 4.1
Local invariant features 1
[ppt] [pdf] [outline]
Pset 2 due Wednesday Mar 2
Mon Mar 7

(Sec 4.1) Local invariant features 2
[ppt] [pdf] [outline]

Wed Mar 9



Midterm exam

Pset 2 grades and solutions returned in class
Spring break



Pset 3 out  [class results]
Mon Mar 21

Sec 11.1.1, 11.2-11.5
Image formation (and local feature matching wrap-up)
[ppt] [pdf] [outline]

Wed Mar 23

Sec 11.1.1, 11.2-11.5

Epipolar geometry demo

Audio camera, O'Donovan et al.
Stereo 1: Epipolar geometry
[ppt] [pdf] [outline]


Mon Mar 28
Virtual viewpoint video, Zitnick et al.
Stereo 2: Correspondence and calibration
[ppt] [pdf] [outline]


Wed Mar 30
Recognition
Grauman & Leibe Ch 1-4 (3 is review)

Indexing local features
[ppt] [pdf] [outline]
Pset 3 due Wed March 30
Mon April 4

Grauman & Leibe Ch 5, 6

Szeliski 14.3

Video Google demo by Sivic et al., paper
Instance recognition
[ppt] [pdf] [outline]

Wed April 6

Grauman & Leibe Ch 7, 8.1, 9.1, 11.1

Szeliski 14.1
Intro to category recognition
[ppt] [pdf] [outline]
Pset 3 grades and solutions returned in class
Pset 4 out
Mon April 11

Grauman & Leibe Ch 7, 8.1, 9.1, 11.1

Szeliski 14.1

Viola-Jones face detection paper (for additional reference)
Face detection
[ppt] [pdf] [outline]

Wed April 13

Grauman & Leibe
11.3, 11.4

Szeliski 14.4
Discriminative classifiers for image recognition
[ppt] [pdf] [outline]

Mon April 18

Grauman & Leibe
11.3, 11.4

Szeliski 14.4
Part-based models
[ppt] [pdf] [outline]


Wed April 20
Video processing
8.4, 12.6.4
Motion
[ppt] [pdf] [outline]

Pset 4 due Wed April 20
Mon April 25

8.4, 12.6.4

Davis & Bobick paper: The Representation and Recognition of Action Using Temporal Templates

Stauffer & Grimson paper: Adaptive Background Mixture Models for Real-Time Tracking.


Background subtraction, Action recognition
[ppt] [pdf] [outline]

Pset 5 out
Wed April 27

5.1.2, 4.1.4
Tracking
[ppt] [pdf]

Pset 4 grades and solutions returned
Mon May 2


Course wrap-up and review

Wed May 4











Pset 5 due Sun May 8
Mon May 16
2-5 pm



Final exam in JGB 2.102




Posted by uniqueone
http://www.cs.utexas.edu/~grauman/courses/fall2011/schedule.html

CS395T: Visual Recognition, Fall 2011





Meets:
Wednesdays 4:00-7:00 pm
ACES 3.408

Instructor: Kristen Grauman 
Email: grauman@cs
Office: ACES 3.446 

Office hours: by appointment

When emailing me, please put CS395 in the subject line.

Announcements:

See the schedule for weekly reading assignments.

Project paper drafts due Nov 23.  Details on projects are here.

Course overview:


Topics: This is a graduate seminar course in computer vision.   We will survey and discuss current vision papers relating to object recognition, auto-annotation of images, and scene understanding.  The goals of the course will be to understand current approaches to some important problems, to actively analyze their strengths and weaknesses, and to identify interesting open questions and possible directions for future research.

See the syllabus for an outline of the main topics we'll be covering.

Requirements: Students will be responsible for writing paper reviews each week, participating in discussions, completing one programming assignment, presenting once or twice in class (depending on enrollment), and completing a project (done in pairs). 

Note that presentations are due one week before the slot your presentation is scheduled.  This means you will need to read the papers, prepare experiments, create slides, etc. more than one week before the date you are signed up for.  The idea is to meet and discuss ahead of time, so that we can iterate as needed the week leading up to your presentation. 

More details on the requirements and grading breakdown are here.

Prereqs:  Courses in computer vision and/or machine learning (378/376 Computer Vision and/or 391 Machine Learning, or similar); ability to understand and analyze conference papers in this area; programming required for experiment presentations and projects. 

Please talk to me if you are unsure if the course is a good match for your background.  I generally recommend scanning through a few papers on the syllabus to gauge what kind of background is expected.  I don't assume you are already familiar with every single algorithm/tool/image feature a given paper mentions, but you should feel comfortable following the key ideas.



Syllabus overview:

  1. Single-object recognition fundamentals: representation, matching, and classification
    1. Specific objects
    2. Classification and global models
    3. Regions and mid-level representations
  2. Beyond single objects: scenes and properties
    1. Context and scenes
    2. Saliency, importance, attention
    3. Attributes
  3. External input in recognition
    1. Language and text
    2. Interactive learning and recognition
  4. Activity in images and videos
    1. Pictures of people
    2. Activity recognition
  5. Dealing with lots of data/categories
    1. Scaling with a large number of categories
    2. Large-scale search and mining
    3. Automatic summarization

Important dates:
    • Monday, Aug 29: paper topic preferences due
    • Friday, Sept 16: implementation assignment due
    • Friday, Oct 7: project proposals due
    • Wednesday, Nov 23: final project paper drafts due
    • Tuesday, Dec 6: final papers due


    Schedule and papers:


    Note:  * = required reading. 
    Additional papers are provided for reference, and as a starting point for background reading for projects.
    Paper presentations: focus on starred papers (additionally mentioning ideas from others is ok but not necessary).
    Experiment presentations: Pick from only among the starred papers.
    Date
    Topics
    Papers and links
    Presenters
    Items due
    Aug 24
    Course intro 

    [slides]
    Topic preferences due via email by Monday August 29
    I. Single-object recognition fundamentals: representation, matching, and classification
    Aug 31
    Recognizing specific objects:

    Invariant local features, instance recognition, bag-of-words models

    • *Object Recognition from Local Scale-Invariant Features, Lowe, ICCV 1999.  [pdf]  [code] [other implementations of SIFT] [IJCV]

    • *Local Invariant Feature Detectors: A Survey, Tuytelaars and Mikolajczyk.  Foundations and Trends in Computer Graphics and Vision, 2008. [pdf]  [Oxford code] [Read pp. 178-188, 216-220, 254-255]

    • *Video Google: A Text Retrieval Approach to Object Matching in Videos, Sivic and Zisserman, ICCV 2003.  [pdf]  [demo]


    • For more background on feature extraction: Szeliski book: Sec 3.2 Linear filtering, 4.1 Points and patches, 4.2 Edges

    • Scalable Recognition with a Vocabulary Tree, D. Nister and H. Stewenius, CVPR 2006. [pdf]

    • SURF: Speeded Up Robust Features, Bay, Ess, Tuytelaars, and Van Gool, CVIU 2008.  [pdf] [code]

    • Bundling Features for Large Scale Partial-Duplicate Web Image Search.  Z. Wu, Q. Ke, M. Isard, and J. Sun.  CVPR 2009.  [pdf]

    • Robust Wide Baseline Stereo from Maximally Stable Extremal Regions, J. Matas, O. Chum, U. Martin, and T. Pajdla, BMVC 2002.  [pdf]

    • City-Scale Location Recognition, G. Schindler, M. Brown, and R. Szeliski, CVPR 2007.  [pdf]

    • Object Retrieval with Large Vocabularies and Fast Spatial Matching.  J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, CVPR 2007.  [pdf]

    • I Know What You Did Last Summer: Object-Level Auto-annotation of Holiday Snaps, S. Gammeter, L. Bossard, T.Quack, L. van Gool, ICCV 2009.  [pdf]

    • Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval.  O. Chum et al. CVPR 2007.  [pdf]

    • A Performance Evaluation of Local Descriptors. K. Mikolajczyk and C. Schmid.  CVPR 2003 [pdf]


    [slides]

    Sept 7
    Recognition via classification and global models:

    Global appearance models for category and scene recognition, sliding window detection, detection as a binary decision.

    • *A Discriminatively Trained, Multiscale, Deformable Part Model, by P. Felzenszwalb, D. McAllester and D. Ramanan.  CVPR 2008.  [pdf]  [code]

    • *Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, Lazebnik, Schmid, and Ponce, CVPR 2006. [pdf]  [15 scenes dataset]  [libpmk] [Matlab]

    • *Rapid Object Detection Using a Boosted Cascade of Simple Features, Viola and Jones, CVPR 2001.  [pdf]  [code]


    • Histograms of Oriented Gradients for Human Detection, Dalal and Triggs, CVPR 2005.  [pdf]  [video] [code] [PASCAL datasets]

    • Modeling the Shape of the Scene: a Holistic Representation of the Spatial Envelope, Oliva and Torralba, IJCV 2001.  [pdf]  [Gist code]

    • Locality-Constrained Linear Coding for Image Classification.  J. Wang, J. Yang, K. Yu,  and T. Huang  CVPR 2010. [pdf] [code]

    • Visual Categorization with Bags of Keypoints, C. Dance, J. Willamowski, L. Fan, C. Bray, and G. Csurka, ECCV International Workshop on Statistical Learning in Computer Vision, 2004.  [pdf]

    • Pedestrian Detection in Crowded Scenes, Leibe, Seemann, and Schiele, CVPR 2005.  [pdf]

    • Pyramids of Histograms of Oriented Gradients (pHOG), Bosch and Zisserman. [code]

    • Eigenfaces for Recognition, Turk and Pentland, 1991.  [pdf]

    • Sampling Strategies for Bag-of-Features Image Classification.  E. Nowak, F. Jurie, and B. Triggs.  ECCV 2006. [pdf]

    • Beyond Sliding Windows: Object Localization by Efficient Subwindow Search.  C. Lampert, M. Blaschko, and T. Hofmann.  CVPR 2008.  [pdf]  [code]

    • A Trainable System for Object Detection, C. Papageorgiou and T. Poggio, IJCV 2000.  [pdf]

    • Object Recognition with Features Inspired by Visual Cortex. T. Serre, L. Wolf and T. Poggio. CVPR 2005.  [pdf]


    [slides]

    Sept 14
    Regions and mid-level representations

    Segmentation, grouping, surface estimation

    • *Constrained Parametric Min-Cuts for Automatic Object Segmentation. J. Carreira and C. Sminchisescu. CVPR 2010.  [pdf] [code]

    • *Geometric Context from a Single Image, by D. Hoiem, A. Efros, and M. Hebert, ICCV 2005. [pdf]  [web]  [code]

    • *Contour Detection and Hierarchical Image Segmentation.  P. Arbelaez,  M. Maire, C. Fowlkes, and J. Malik. PAMI 2011.  [pdf] [data and code]


    • From Contours to Regions: An Empirical Evaluation.  P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik.  CVPR 2009.  [pdf] [code]

    • Boundary-Preserving Dense Local Regions.  J. Kim and K. Grauman.  CVPR 2011.  [pdf]  [code]

    • Object Recognition as Ranking Holistic Figure-Ground Hypotheses. F. Li, J. Carreira, and C. Sminchisescu. CVPR 2010. [pdf]

    • Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman.  CVPR 2006.  [pdf] [code]

    • Combining Top-down and Bottom-up Segmentation. E. Borenstein, E. Sharon, and S. Ullman.  CVPR  workshop 2004.  [pdf]  [data]

    • Efficient Region Search for Object Detection.  S. Vijayanarasimhan and K. Grauman. CVPR 2011.  [pdf] [code] [data]

    • Extracting Subimages of an Unknown Category from a Set of Images, S. Todorovic and N. Ahuja, CVPR 2006.  [pdf]

    • Learning Mid-level Features for Recognition. Y.-L. Boureau, F. Bach, Y. LeCun, and J. Ponce. CVPR, 2010. 

    • Class-Specific, Top-Down Segmentation, E. Borenstein and S. Ullman, ECCV 2002.  [pdf]

    • Object Recognition by Integrating Multiple Image Segmentations, C. Pantofaru, C. Schmid, and M. Hebert, ECCV 2008  [pdf]

    • Image Parsing: Unifying Segmentation, Detection, and Recognition. Tu, Z., Chen, Z., Yuille, A.L., Zhu, S.C. ICCV 2003  [pdf]

    • GrabCut: Interactive Foreground Extraction using Iterated Graph Cuts, by C. Rother, V. Kolmogorov, A. Blake, SIGGRAPH 2004.  [pdf]  [project page]

    • Recognition Using Regions.  C. Gu, J. Lim, P. Arbelaez, J. Malik, CVPR 2009.  [pdf] [code]

    • Robust Higher Order Potentials for Enforcing Label Consistency, P. Kohli, L. Ladicky, and P. Torr. CVPR 2008.  

    • Co-segmentation of Image Pairs by Histogram Matching --Incorporating a Global Constraint into MRFs, C. Rother, V. Kolmogorov, T. Minka, and A. Blake.  CVPR 2006.  [pdf]

    • Collect-Cut: Segmentation with Top-Down Cues Discovered in Multi-Object Images.  Y. J. Lee and K. Grauman. CVPR 2010.  [pdf] [data]

    • An Efficient Algorithm for Co-segmentation, D. Hochbaum, V. Singh, ICCV 2009.  [pdf]

    • Normalized Cuts and Image Segmentation, J. Shi and J. Malik.  PAMI 2000.  [pdf]  [code]


    • Greg Mori's superpixel code
    • Berkeley Segmentation Dataset and code
    • Pedro Felzenszwalb's graph-based segmentation code
    • Michael Maire's segmentation code and paper
    • Mean-shift: a Robust Approach Towards Feature Space Analysis [pdf]  [code, Matlab interface by Shai Bagon]
    • David Blei's Topic modeling code
    [slides]
    Expts: Brian, Cho-Jui
    Implementation assignment due Friday Sept 16, 5 PM
    II. Beyond single objects: scenes and properties
    Sept 21
    Context and scenes

    Multi-object scenes, inter-object relationships, understanding scenes' spatial layout, 3d context

    • *Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces.  D. Lee, A. Gupta, M. Hebert, and T. Kanade.  NIPS 2010.  [pdf] [code]

    • *Multi-Class Segmentation with Relative Location Prior.  S. Gould, J. Rodgers, D. Cohen, G. Elidan and D.  Koller.  IJCV 2008. [pdf] [code]

    • *Using the Forest to See the Trees: Exploiting Context for Visual Object Detection and Localization.  Torralba, Murphy, and Freeman.  CACM 2009.  [pdf] [related code]


    • Contextual Priming for Object Detection, A. Torralba.  IJCV 2003.  [pdf] [web] [code]

    • TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation.  J. Shotton, J. Winn, C. Rother, A. Criminisi.  ECCV 2006.  [pdf] [web] [data] [code]

    • Recognition Using Visual Phrases.  M. Sadeghi and A. Farhadi.  CVPR 2011.  [pdf]

    • Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry.  V. Hedau, D. Hoiem, and D. Forsyth.  ECCV 2010 [pdf] [code and data]

    • Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics, A. Gupta, A. Efros, and M. Hebert.  ECCV 2010. [pdf]

    • Object-Graphs for Context-Aware Category Discovery.  Y. J. Lee and K. Grauman.  CVPR 2010.  [pdf] [code]

    • Geometric Reasoning for Single Image Structure Recovery.  D. Lee, M. Hebert, and T. Kanade.  CVPR 2009.  [pdf]  [web] [code]

    • Putting Objects in Perspective, by D. Hoiem, A. Efros, and M. Hebert, CVPR 2006.  [pdf] [web]

    • Discriminative Models for Multi-Class Object Layout, C. Desai, D. Ramanan, C. Fowlkes. ICCV 2009.  [pdf]  [slides]  [SVM struct code] [data]

    • Closing the Loop in Scene Interpretation.  D. Hoiem, A. Efros, and M. Hebert.  CVPR 2008.  [pdf]

    • Decomposing a Scene into Geometric and Semantically Consistent Regions, S. Gould, R. Fulton, and D. Koller, ICCV 2009.  [pdf]  [slides]

    • Learning Spatial Context: Using Stuff to Find Things, by G. Heitz and D. Koller, ECCV 2008.  [pdf] [code]

    • An Empirical Study of Context in Object Detection, S. Divvala, D. Hoiem, J. Hays, A. Efros, M. Hebert, CVPR 2009.  [pdf]  [web]

    • Object Categorization using Co-Occurrence, Location and Appearance, by C. Galleguillos, A. Rabinovich and S. Belongie, CVPR 2008.[ pdf]

    • Context Based Object Categorization: A Critical Survey, C. Galleguillos and S. Belongie.  [pdf]

    • What, Where and Who? Classifying Events by Scene and Object Recognition, L.-J. Li and L. Fei-Fei, ICCV 2007. [pdf]

    • Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Unsupervised Framework, L-J. Li, R. Socher, L. Fei-Fei, CVPR 2009.  [pdf]

    Papers: Nishant, Jung
    Expts: Saurajit


    Sept 28
    Saliency and attention

    Among all items in the scene, which deserve attention (first)?

    • *A Model of Saliency-based Visual Attention for Rapid Scene Analysis.  L. Itti, C. Koch, and E. Niebur.  PAMI 1998  [pdf]

    • *Learning to Detect a Salient Object.  T. Liu et al. CVPR 2007.  [pdf]  [results]  [data]  [code by Vicente Ordonez]

    • *Figure-Ground Segmentation Improves Handled Object Recognition in Egocentric Video.  X. Ren and C. Gu.  CVPR 2010 [pdf] [videos] [data]

    • *What Do We Perceive in a Glance of a Real-World Scene?  L. Fei-Fei, A. Iyer, C. Koch, and P. Perona.  Journal of Vision, 2007.  [pdf]


    • Interesting Objects are Visually Salient.  L. Elazary and L. Itti.  Journal of Vision, 8(3):1–15, 2008.  [pdf]

    • Accounting for the Relative Importance of Objects in Image Retrieval.  S. J. Hwang and K. Grauman.  BMVC 2010.  [pdf] [web] [data]

    • Some Objects are More Equal Than Others: Measuring and Predicting Importance, M. Spain and P. Perona.  ECCV 2008.  [pdf]

    • What Makes an Image Memorable?  P. Isola et al. CVPR 2011. [pdf]
    • The Discriminant Center-Surround Hypothesis for Bottom-Up Saliency. D. Gao, V.Mahadevan, and N. Vasconcelos. NIPS, 2007.  [pdf]

    • Category-Independent Object Proposals.  I. Endres and D. Hoiem.  ECCV 2010.  [pdf]  [code]

    • What is an Object?  B. Alexe, T. Deselaers, and V. Ferrari.  CVPR 2010.  [pdf] [code]

    • A Principled Approach to Detecting Surprising Events in Video.  L. Itti and P. Baldi.  CVPR 2005  [pdf]

    • Optimal Scanning for Faster Object Detection,  N. Butko, J. Movellan.  CVPR 2009.  [pdf]

    • What Attributes Guide the Deployment of Visual Attention and How Do They Do It? J. Wolfe and T. Horowitz. Neuroscience, 5:495–501, 2004.  [pdf]

    • Visual Correlates of Fixation Selection: Effects of Scale and Time. B. Tatler, R. Baddeley, and I. Gilchrist. Vision Research, 45:643, 2005.  [pdf]

    • Objects Predict Fixations Better than Early Saliency.  W. Einhauser, M. Spain, and P. Perona. Journal of Vision, 8(14):1–26, 2008.  [pdf]

    • Reading Between the Lines: Object Localization Using Implicit Cues from Image Tags.  S. J. Hwang and K. Grauman.  CVPR 2010.  [pdf]  [data]

    • Peripheral-Foveal Vision for Real-time Object Recognition and Tracking in Video.  S. Gould, J. Arfvidsson, A. Kaehler, B. Sapp, M. Messner, G. Bradski, P. Baumstrack,S. Chung, A. Ng.  IJCAI 2007.  [pdf]

    • Peekaboom: A Game for Locating Objects in Images, by L. von Ahn, R. Liu and M. Blum, CHI 2006. [pdf]  [web]

    • Determining Patch Saliency Using Low-Level Context, D. Parikh, L. Zitnick, and T. Chen. ECCV 2008.  [pdf]

    • Visual Recognition and Detection Under Bounded Computational Resources, S. Vijayanarasimhan and A. Kapoor.  CVPR 2010.

    • Key-Segments for Video Object Segmentation.  Y. J. Lee, J. Kim, and K. Grauman.  ICCV 2011  [pdf]

    • Contextual Guidance of Eye Movements and Attention in Real-World Scenes: The Role of Global Features on Object Search.  A. Torralba, A. Oliva, M. Castelhano, J. Henderson.  [pdf] [web]

    • The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search, G. Zelinsky, W. Zhang, B. Yu, X. Chen, D. Samaras, NIPS 2005.  [pdf]

    Papers: Lu Xia
    Expts: Larry


    Oct 5
    Attributes:

    Visual properties, learning from natural language descriptions, intermediate representations

    • *Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, C. Lampert, H. Nickisch, and S. Harmeling, CVPR 2009  [pdf] [web] [data]

    • *Describing Objects by Their Attributes, A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth, CVPR 2009.  [pdf]  [web] [data]

    • *Attribute and Simile Classifiers for Face Verification, N. Kumar, A. Berg, P. Belhumeur, S. Nayar.  ICCV 2009.  [pdf] [web] [lfw data] [pubfig data]


    • Relative Attributes.  D. Parikh and K. Grauman.  ICCV 2011.  [pdf]  [data]

    • A Discriminative Latent Model of Object Classes and Attributes.  Y. Wang and G. Mori.  ECCV, 2010.  [pdf]

    • Learning Visual Attributes, V. Ferrari and A. Zisserman, NIPS 2007.  [pdf] 

    • Learning Models for Object Recognition from Natural Language Descriptions, J. Wang, K. Markert, and M. Everingham, BMVC 2009.[pdf]

    • FaceTracer: A Search Engine for Large Collections of Images with Faces.  N. Kumar, P. Belhumeur, and S. Nayar.  ECCV 2008.  [pdf]

    • Attribute-Centric Recognition for Cross-Category Generalization.  A. Farhadi, I. Endres, D. Hoiem.  CVPR 2010.  [pdf]

    • Automatic Attribute Discovery and Characterization from Noisy Web Data.  T. Berg et al.  ECCV 2010.  [pdf]  [data]

    • Attributes-Based People Search in Surveillance Environments.  D. Vaquero, R. Feris, D. Tran, L. Brown, A. Hampapur, and M. Turk.  WACV 2009.  [pdf] [project page]

    • Image Region Entropy: A Measure of "Visualness" of Web Images Associated with One Concept.  K. Yanai and K. Barnard.  ACM MM 2005.  [pdf]

    • What Helps Where And Why? Semantic Relatedness for Knowledge Transfer. M. Rohrbach, M. Stark, G. Szarvas, I. Gurevych and B. Schiele. CVPR 2010.  [pdf]

    • Recognizing Human Actions by Attributes.  J. Liu, B. Kuipers, S. Savarese, CVPR 2011.  [pdf]

    • Interactively Building a Discriminative Vocabulary of Nameable Attributes.  D. Parikh and K. Grauman.  CVPR 2011.  [pdf] [web]

      Papers: Saurajit
      Expts: Qiming, Harsh
      Proposal abstracts due Friday Oct 7, 5 PM
      III. External input in recognition
      Oct 12
      Language and description

      Discovering the correspondence between words and other language constructs and images, generating descriptions

      • *Baby Talk: Understanding and Generating Image Descriptions.  Kulkarni et al.  CVPR 2011.  [pdf]

      • *Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers, A. Gupta and L. Davis, ECCV 2008.  [pdf]

      • *Learning Sign Language by Watching TV (using weakly aligned subtitles), P. Buehler, M. Everingham, and A. Zisserman. CVPR 2009.  [pdf]  [data] [web]


      • Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary, P. Duygulu, K. Barnard, N. de Freitas, D. Forsyth. ECCV 2002.  [pdf]  [data]

      • The Mathematics of Statistical Machine Translation: Parameter Estimation.  P. Brown, S. Della Pietro, V. Della Pietra, R. Mercer.  Association for Computational Linguistics, 1993.  [pdf] (background for Duygulu et al paper)

      • How Many Words is a Picture Worth?  Automatic Caption Generation for News Images.  Y. Feng and M. Lapata.  ACL 2010.  [pdf]
      • Matching words and pictures. K. Barnard, P. Duygulu, N. de Freitas, D. Forsyth, D. Blei, and M. Jordan.  JMLR, 3:1107–1135, 2003.  [pdf]

      • Who's Doing What: Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation.  L. Jie, B. Caputo, and V. Ferrari.  NIPS 2009.  [pdf]

      • Watch, Listen & Learn: Co-training on Captioned Images and Videos.  S. Gupta, J. Kim, K. Grauman, and R. Mooney.  ECML 2008.  [pdf]

      • Systematic Evaluation of Machine Translation Methods for Image and Video Annotation, P. Virga, P. Duygulu, CIVR 2005.  [pdf]
      • Localizing Objects and Actions in Videos Using Accompanying Text.  Johns Hopkins University Summer Workshop Report.  J. Neumann et al.  2010.  [pdf]  [web]
      Papers: Chris
      Expts: Jae, Naga


      Oct 19
      Interactive learning and recognition

      Human-in-the-loop learning, active annotation collection, crowdsourcing



      • *Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds.  S. Vijayanarasimhan and K. Grauman.  CVPR 2011.  [pdf]

      • *Visual Recognition with Humans in the Loop.  Branson S., Wah C., Babenko B., Schroff F., Welinder P., Perona P., Belongie S.  ECCV 2010. [pdf]  [Caltech/UCSD Visipedia project]  [data]

      • *The Multidimensional Wisdom of Crowds.  Welinder P., Branson S., Belongie S., Perona, P. NIPS 2010. [pdf]  [code]

      • *What’s It Going to Cost You? : Predicting Effort vs. Informativeness for Multi-Label Image Annotations.  S. Vijayanarasimhan and K. Grauman.  CVPR 2009 [pdf] [data] [code]


      • iCoseg: Interactive Co-segmentation with Intelligent Scribble Guidance, D. Batra, A. Kowdle, D. Parikh, J. Luo and T. Chen. CVPR 2010.  [pdf] [web]

      • Labeling Images with a Computer Game. L. von Ahn and L. Dabbish. CHI, 2004.

      • Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise.  J. Whitehill et al.  NIPS 2009.  [pdf]
      • Utility Data Annotation with Amazon Mechanical Turk. A. Sorokin and D. Forsyth. Wkshp on Internet Vision, 2008.

      • Far-Sighted Active Learning on a Budget for Image and Video Recognition.  S. Vijayanarasimhan, P. Jain, and K. Grauman.  CVPR 2010.  [pdf]  [code]

      • Multiclass Recognition and Part Localization with Humans in the Loop.  C. Wah et al. ICCV 2011. [pdf]

      • Multi-Level Active Prediction of Useful Image Annotations for Recognition.  S. Vijayanarasimhan and K. Grauman.  NIPS 2008. [pdf] 

      • Active Learning from Crowds.  Y. Yan, R. Rosales, G. Fung, J. Dy.  ICML 2011.  [pdf]

      • Proactive Learning: Cost-Sensitive Active Learning with Multiple Imperfect Oracles.  P. Donmez and J. Carbonell.  CIKM 2008.  [pdf]
      • Inactive Learning?  Difficulties Employing Active Learning in Practice.  J. Attenberg and F. Provost.  SIGKDD 2011. [pdf]

      • Annotator Rationales for Visual Recognition.  J. Donahue and K. Grauman.  ICCV 2011. [pdf]

      • Interactively Building a Discriminative Vocabulary of Nameable Attributes.  D. Parikh and K. Grauman.  CVPR 2011.  [pdf] [web]

      • Actively Selecting Annotations Among Objects and Attributes.  A. Kovashka, S. Vijayanarasimhan, and K. Grauman.  ICCV 2011  [pdf]

      • Supervised Learning from Multiple Experts: Whom to Trust When Everyone Lies a Bit.  V. Raykar et al.  ICML 2009.  [pdf]
      • Multi-class Active Learning for Image Classification.  A. J. Joshi, F. Porikli, and N. Papanikolopoulos.  CVPR 2009.  [pdf]

      • GrabCut -Interactive Foreground Extraction using Iterated Graph Cuts, by C. Rother, V. Kolmogorov, A. Blake, SIGGRAPH 2004.  [pdf]  [project page]

      • Active Learning for Piecewise Planar 3D Reconstruction.  A. Kowdle, Y.-J. Chang, A. Gallagher and T. Chen. CVPR 2011 [pdf] [web]

      • Amazon Mechanical Turk
      • Using Mechanical Turk with LabelMe
        Papers: Brian, Harsh
        Expts: Yunsik

        Proposal extended outline due Friday Oct 21, 5 PM
        IV. Activity in images and video
        Oct 26
        Pictures of people

        Finding people and their poses, automatic face tagging

        pose

        • *Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations, L. Bourdev and J. Malik.  ICCV 2009  [pdf] [code]

        • *Understanding Images of Groups of People, A. Gallagher and T. Chen, CVPR 2009.  [pdf]  [web] [data]

        • *Real-Time Human Pose Recognition in Parts from a Single Depth Image.  J. Shotton et al.  CVPR 2011. [pdf] [video]

        • *"'Who are you?' - Learning Person Specific Classifiers from Video, J. Sivic, M. Everingham, and A. Zisserman, CVPR 2009.  [pdf] [data] [KLT tracking code]


        • Contextual Identity Recognition in Personal Photo Albums. D. Anguelov, K.-C. Lee, S. Burak, Gokturk, and B. Sumengen. CVPR 2007.  [pdf]

        • Fast Pose Estimation with Parameter Sensitive Hashing.  G. Shakhnarovich, P. Viola, T. Darrell, ICCV 2003.[pdf]

        • Finding and Tracking People From the Bottom Up.  D. Ramanan, D. A. Forsyth.  CVPR 2003.  [pdf]

        • Where’s Waldo: Matching People in Images of Crowds.  R. Garg, D. Ramanan, S. Seitz, N. Snavely. CVPR 2011.  [pdf]

        • Autotagging Facebook: Social Network Context Improves Photo Annotation, by  Z. Stone, T. Zickler, and T. Darrell.  CVPR Internet Vision Workshop 2008.   [pdf]

        • Efficient Propagation for Face Annotation in Family Albums. L. Zhang, Y. Hu, M. Li, and H. Zhang.  MM 2004.  [pdf]

        • Progressive Search Space Reduction for Human Pose Estimation.  Ferrari, V., Marin-Jimenez, M. and Zisserman, A.  CVPR 2008.  [pdf] [web] [code]

        • Leveraging Archival Video for Building Face Datasets, by D. Ramanan, S. Baker, and S. Kakade.  ICCV 2007.  [pdf]
        • Names and Faces in the News, by T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth, CVPR 2004.  [pdf]  [web]

        • Face Discovery with Social Context.  Y. J. Lee and K. Grauman.  BMVC 2011.  [pdf]

        • “Hello! My name is... Buffy” – Automatic Naming of Characters in TV Video, by M. Everingham, J. Sivic and A. Zisserman, BMVC 2006.  [pdf]  [web]  [data]

        • Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities. Yao, B., Fei-Fei, L.  CVPR 2010.

        • A Face Annotation Framework with Partial Clustering and Interactive Labeling.  Y. Tian, W. Liu, R. Xiao, F. Wen, and X. Tang.  CVPR 2007.  [pdf] [web]

        • From 3D Scene Geometry to Human Workspace.  A. Gupta et al.  CVPR 2011.  [pdf] [web]

        • Pictorial Structures Revisited: People Detection and Articulated Pose Estimation.  M. Andriluka et al. CVPR 2009.  [pdf]  [code]

        Papers: Sunil, Larry
        Expts: Nishant, Jung


        Nov 2
        Activity recognition

        Recognizing and localizing human actions in video

        actions
        • *Actions in Context, M. Marszalek, I. Laptev, C. Schmid.  CVPR 2009.  [pdf] [web] [data]

        • *A Hough Transform-Based Voting Framework for Action Recognition.  A. Yao, J. Gall, L. Van Gool.  CVPR 2010.  [pdf] [code/data]

        • *Beyond Actions: Discriminative Models for Contextual Group Activities.  T. Lan, Y. Wang, W. Yang, and G. Mori.  NIPS 2010.  [pdf] [data]


        • Objects in Action: An Approach for Combining Action Understanding and Object Perception.   A. Gupta and L. Davis.  CVPR, 2007.  [pdf]  [data]

        • Learning Realistic Human Actions from Movies.  I. Laptev, M. Marszałek, C. Schmid and B. Rozenfeld.  CVPR 2008.  [pdf]  [data]

        • Understanding Egocentric Activities.  A. Fathi, A. Farhadi, J. Rehg.  ICCV 2011. [pdf]

        • Exploiting Human Actions and Object Context for Recognition Tasks.  D. Moore, I. Essa, and M. Hayes.  ICCV 1999.  [pdf]

        • A Scalable Approach to Activity Recognition Based on Object Use. J. Wu, A. Osuntogun, T. Choudhury, M. Philipose, and J. Rehg.  ICCV 2007.  [pdf]

        • Recognizing Actions at a Distance.  A. Efros, G. Mori, J. Malik.  ICCV 2003.  [pdf] [web]

        • Activity Recognition from First Person Sensing.  E. Taralova, F. De la Torre, M. Hebert  CVPR 2009 Workshop on Egocentric Vision  [pdf]

        • Action Recognition from a Distributed Representation of Pose and Appearance, S. Maji, L. Bourdev, J.  Malik, CVPR 2011.  [pdf]  [code]

        • Learning a Hierarchy of Discriminative Space-Time Neighborhood Features for Human Action Recognition.  A. Kovashka and K. Grauman.  CVPR 2010.  [pdf]

        • Temporal Causality for the Analysis of Visual Events.  K. Prabhakar, S. Oh, P. Wang, G. Abowd, and J. Rehg.  CVPR 2010.  [pdf] [Georgia Tech Computational Behavior Science project]

        • Modeling Activity Global Temporal Dependencies using Time Delayed Probabilistic Graphical Model.  Loy, Xiang & Gong ICCV 2009.  [pdf]

        • What's Going on?: Discovering Spatio-Temporal Dependencies in Dynamic Scenes.  D. Kuettel et al.  CVPR 2010.  [pdf]

        • Learning Actions From the Web.  N. Ikizler-Cinbis, R. Gokberk Cinbis, S. Sclaroff.  ICCV 2009.  [pdf]

        • Content-based Retrieval of Functional Objects in Video Using Scene Context.  S. Oh, A. Hoogs, M. Turek, and R. Collins.  ECCV 2010.  [pdf]
        Papers: Qiming, Yunsik
        Expts: Lu Xia


        V. Dealing with lots of data/categories
        Nov 9
        Scaling with a large number of categories

        Sharing features between classes, transfer, taxonomy, learning from few examples, exploiting class relationships

        shared
        • *Sharing Visual Features for Multiclass and Multiview Object Detection, A. Torralba, K. Murphy, W. Freeman, PAMI 2007.  [pdf]  [code]

        • *What Does Classifying More than 10,000 Image Categories Tell Us? J. Deng, A. Berg, K. Li and L. Fei-Fei.  ECCV 2010.  [pdf]

        • *Discriminative Learning of Relaxed Hierarchy for Large-scale Visual Recognition.  T. Gao and D. Koller.  ICCV 2011.  [pdf] [code]


        • Comparative Object Similarity for Improved Recognition with Few or Zero Examples. G. Wang, D. Forsyth, and D. Hoiem. CVPR 2010. [pdf]

        • Learning and Using Taxonomies for Fast Visual Categorization, G. Griffin and P. Perona, CVPR 2008.  [pdf] [data]

        • Cross-Generalization: Learning Novel Classes from a Single Example by Feature Replacement.  E. Bart and S. Ullman.  CVPR 2005.  [pdf]

        • 80 Million Tiny Images: A Large Dataset for Non-Parametric Object and Scene Recognition, by A. Torralba, R. Fergus, and W. Freeman.  PAMI 2008.  [pdf] [web]

        • Constructing Category Hierarchies for Visual Recognition, M. Marszalek and C. Schmid.  ECCV 2008.  [pdf]  [web] [Caltech256]

        • Learning Generative Visual Models from Few Training Examples: an Incremental Bayesian Approach Tested on 101 Object Categories. L. Fei-Fei, R. Fergus, and P. Perona. CVPR Workshop on Generative-Model Based Vision. 2004.  [pdf] [Caltech101]

        • Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts. S. Fidler and A. Leonardis.  CVPR 2007  [pdf]

        • Exploiting Object Hierarchy: Combining Models from Different Category Levels, A. Zweig and D. Weinshall, ICCV 2007 [pdf]

        • Incremental Learning of Object Detectors Using a Visual Shape Alphabet.  Opelt, Pinz, and Zisserman, CVPR 2006.  [pdf]

        • Sequential Learning of Reusable Parts for Object Detection.  S. Krempp, D. Geman, and Y. Amit.  2002  [pdf]

        • ImageNet: A Large-Scale Hierarchical Image Database, J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, CVPR 2009 [pdf]  [data]

        • Semantic Label Sharing for Learning with Many Categories.  R. Fergus et al.  ECCV 2010.  [pdf]

        • Learning a Tree of Metrics with Disjoint Visual Features.  S. J. Hwang, K. Grauman, F. Sha.  NIPS 2011. 

        Papers: Cho-Jui, Si Si
        Expts: Lu Pan


        Nov 16
        Large-scale search and mining

        Scalable retrieval algorithms for massive databases, mining for themes

        hash
        • *VisualRank: Applying PageRank to Large-Scale Image Search.  Y. Jing and S. Baluja.  PAMI 2008.  [pdf]

        • *Kernelized Locality Sensitive Hashing for Scalable Image Search, by B. Kulis and K. Grauman, ICCV 2009 [pdf]  [code]

        • *Video Mining with Frequent Itemset Configurations.  T. Quack, V. Ferrari, and L. Van Gool.  CIVR 2006.  [pdf]


        • Learning Binary Projections for Large-Scale Image Search.  K. Grauman and R. Fergus.  Chapter (draft) to appear in Registration, Recognition, and Video Analysis, R. Cipolla, S. Battiato, and G. Farinella, Editors.  [pdf]

        • World-scale Mining of Objects and Events from Community Photo Collections.  T. Quack, B. Leibe, and L. Van Gool.  CIVR 2008.  [pdf]

        • Interest Seam Image.  X. Zhang, G. Hua, L. Zhang, H. Shum.  CVPR 2010.  [pdf]

        • Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, C. Lampert, ICCV 2009.  [pdf]  [code]

        • Geometric Min-Hashing: Finding a (Thick) Needle in a Haystack, O. Chum, M. Perdoch, and J. Matas.  CVPR 2009.  [pdf]

        • FaceTracer: A Search Engine for Large Collections of Images with Faces.  N. Kumar, P. Belhumeur, and S. Nayar.  ECCV 2008.  [pdf]

        • Efficiently Searching for Similar Images.  K. Grauman.  Communications of the ACM, 2009.  [CACM link]

        • Fast Image Search for Learned Metrics, P. Jain, B. Kulis, and K. Grauman, CVPR 2008.  [pdf]

        • Small Codes and Large Image Databases for Recognition, A. Torralba, R. Fergus, and Y. Weiss, CVPR 2008.  [pdf]

        • Object Retrieval with Large Vocabularies and Fast Spatial Matching.  J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, CVPR 2007.  [pdf]


        Papers: Naga, Jae
        Expts: Si Si


        Nov 23
        Summarization

        Video synopsis, discovering repeated objects, visualization

        synopsis
        • *Webcam Synopsis: Peeking Around the World, by Y. Pritch, A. Rav-Acha, A. Gutman, and S. Peleg, ICCV 2007.  [pdf] [web]

        • *Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman.  CVPR 2006.  [pdf] [code]

        • *Summarizing Visual Data Using Bi-Directional Similarity.  D. Simakov, Y. Caspi, E. Shechtman, M. Irani.  CVPR 2008.  [pdf] [video]


        • Fast Unsupervised Ego-Action Learning for First-Person Sports Video.  K. Kitani, T. Okabe, Y. Sato, A. Sugimoto.  CVPR 2011.  [pdf]

        • Scene Summarization for Online Image Collections.  I. Simon, N. Snavely, S. Seitz.  ICCV 2007.  [pdf]  [web]

        • VideoCut: Removing Irrelevant Frames by Discovering the Object of Interest. D. Liu, G. Hua, T. Chen.  ECCV 2010.  [pdf]

        • Video Epitomes. V. Cheung, B. J. Frey, and N. Jojic.  CVPR 2005. [pdf] [web] [code]

        • Making a Long Video Short. A. Rav-Acha, Y. Pritch, and S. Peleg.  CVPR 2006. [pdf]

        • Structural Epitome: A Way to Summarize One's Visual Experience.  N. Jojic, A. Perina, V. Murino.  NIPS 2010.  [pdf] [data]

        • Video Abstraction: A Systematic Review and Classification.  B. Truong and S. Venkatesh.  ACM 2007.  [pdf]

        • Shape Discovery from Unlabeled Image Collections.  Y. J. Lee and K. Grauman.  CVPR 2009.  [pdf]
        • Detecting and Sketching the Common.  S. Bagon, O. Brostovski, M. Galun, M. Irani.  CVPR 2010.  [pdf]
        • Object-Graphs for Context-Aware Category Discovery.  Y. J. Lee and K. Grauman.  CVPR 2010.  [pdf] [code]

        • Unsupervised Object Discovery: A Comparison.  T. Tuytelaars et al.  IJCV 2009.  [pdf]

        Papers: Lu Pan
        Expts: Sunil, Chris

        Final paper drafts due Wed Nov 23
        Nov 30
        Final project presentations in class


        Final papers due Tues Dec 6, 5 PM



        schedule_CS395T Visual Recognition (Fall 2012).htm

         

        http://www.cs.utexas.edu/~cv-fall2012/schedule.html 

         

        CS395T: Visual Recognition, Fall 2012



        Course overview        Useful links        Syllabus        Detailed schedule          Blackboard


        Meets:
        Fridays 1-4 pm in ACES 3.408

        Instructor: Kristen Grauman 
        Email: grauman@cs
        Office: ACES 3.446 
        Office hours: by appointment (send email)


        TA: Austin Waters
        Email: austin@cs
        Office hours: by appointment (send email)

        When emailing us, please put CS395 in the subject line.

        Announcements:

        See the schedule for weekly reading assignments. 

        Project extended outlines due Wed Oct 31.  See handout for guidelines.


        Course overview:


        Topics: This is a graduate seminar course in computer vision.   We will survey and discuss current vision papers relating to object and activity recognition, auto-annotation of images, and scene understanding.  The goals of the course will be to understand current approaches to some important problems, to actively analyze their strengths and weaknesses, and to identify interesting open questions and possible directions for future research.

        See the syllabus for an outline of the main topics we'll be covering.

        Requirements: Students will be responsible for writing paper reviews each week, participating in discussions, completing two programming assignments, presenting once or twice in class (depending on enrollment), and completing a project (done in pairs). 

        Note that presentations are due one week before the slot your presentation is scheduled.  This means you will need to read the papers, prepare experiments, create slides, etc. more than one week before the date you are signed up for.  The idea is to meet and discuss ahead of time, so that we can iterate as needed the week leading up to your presentation. 

        More details on the requirements and grading breakdown are here.

        Prereqs:  Courses in computer vision and/or machine learning (378/376 Computer Vision and/or 391 Machine Learning, or similar); ability to understand and analyze conference papers in this area; programming required for experiment presentations and projects. 

        Please talk to me if you are unsure if the course is a good match for your background.  I generally recommend scanning through a few papers on the syllabus to gauge what kind of background is expected.  I don't assume you are already familiar with every single algorithm/tool/image feature a given paper mentions, but you should feel comfortable following the key ideas.



        Syllabus overview:


        A. Object recognition fundamentals
        1. Local features and matching for object instances
        2. Large-scale image/object search and mining
        3. Classification and detection for object categories
        4. Mid-level representations
        B. Beyond modeling individual objects
        1. Context and scenes
        2. Dealing with many categories
        3. Describing objects with attributes
        4. Importance and saliency

        C. Human-centered recognition

        1. Pictures of people
        2. Activity recognition
        3. Egocentric cameras
        4. Human-in-the-loop interactive systems

          Important dates:
            • Wednesday, Sept 5: paper topic preferences due
            • Friday, Sept 21: first coding assignment due
            • Friday Oct 5: second coding assignment due
            • Wednesday, Oct 17: project proposal abstracts due
            • Wednesday, Oct 31: project extended outlines due
            • Friday Dec 7: final papers due


            Schedule and papers:


            Note:  * = required reading. 
            Additional papers are provided for reference, and as a starting point for background reading for projects.
            Paper presentations: Cover the starred papers.
            Experiment presentations: Pick one from among the starred papers.
            Date
            Topics
            Papers and links
            Presenters
            Items due
            Aug 31
            Course intro 

            [slides]
            Topic preferences due via email to Austin (austin@cs) by Wed Sept 5 at 5 pm
            A. Object recognition fundamentals
            Sept 7
            Local features and matching for object instances:

            Invariant local features, instance recognition, visual vocabularies and bag-of-words

            sift
            • *Object Recognition from Local Scale-Invariant Features, Lowe, ICCV 1999.  [pdf]  [code] [other implementations of SIFT] [IJCV]

            • *Selected pages from: Local Invariant Feature Detectors: A Survey, Tuytelaars and Mikolajczyk.  Foundations and Trends in Computer Graphics and Vision, 2008. [pdf]  [Oxford code] [Read pp. 178-188, 216-220, 254-255]

            • *Video Google: A Text Retrieval Approach to Object Matching in Videos, Sivic and Zisserman, ICCV 2003.  [pdf]  [demo]

            • For more background on feature extraction: Szeliski book: Sec 3.2 Linear filtering, 4.1 Points and patches, 4.2 Edges

            • Scalable Recognition with a Vocabulary Tree, D. Nister and H. Stewenius, CVPR 2006. [pdf]

            • SURF: Speeded Up Robust Features, Bay, Ess, Tuytelaars, and Van Gool, CVIU 2008.  [pdf] [code]

            • Robust Wide Baseline Stereo from Maximally Stable Extremal Regions, J. Matas, O. Chum, U. Martin, and T. Pajdla, BMVC 2002.  [pdf]

            • A Performance Evaluation of Local Descriptors. K. Mikolajczyk and C. Schmid.  CVPR 2003 [pdf]

            [outline]
            [filters]
            [local features]
            [matching and spatial verification]

            Sept 14
            Large-scale image/object search and mining:

            Scalable retrieval algorithms, mining for visual themes, particularly for object instances

            query expansion
            • *Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval.  O. Chum et al. CVPR 2007.  [pdf] [Oxford buildings dataset]

            • *Discovering Favorite Views of Popular Places with Iconoid Shift.  T. Weyand and B. Leibe.  ICCV 2011.  [pdf] [Paris 500K dataset]

            • *Supervised Hashing with Kernels.  W. Liu, J. Wang, R. Ji, Y. Jiang, S.-F. Chang.  CVPR 2012 [pdf]

            • Kernelized Locality Sensitive Hashing for Scalable Image Search, by B. Kulis and K. Grauman, ICCV 2009 [pdf]  [code] [80M Tiny Images data]

            • Image Webs: Computing and Exploiting Connectivity in Image Collections.  K. Heath, N. Gelfand, M. Ovsjanikov, M. Aanjaneya, and L. Guibas.  CVPR 2010.  [pdf]
            • World-scale Mining of Objects and Events from Community Photo Collections.  T. Quack, B. Leibe, and L. Van Gool.  CIVR 2008.  [pdf]

            • Total Recall II: Query Expansion Revisited.  O. Chum, A. Mikulik, M. Perdoch, and J. Matas.  CVPR 2011.  [pdf]

            • Geometric Min-Hashing: Finding a (Thick) Needle in a Haystack, O. Chum, M. Perdoch, and J. Matas.  CVPR 2009.  [pdf]

            • Three Things Everyone Should Know to Improve Object Retrieval.  R. Arandjelovic and A. Zisserman.  CVPR 2012.  [pdf]

            • Video Mining with Frequent Itemset Configurations.  T. Quack, V. Ferrari, and L. Van Gool.  CIVR 2006.  [pdf]

            • Bundling Features for Large Scale Partial-Duplicate Web Image Search.  Z. Wu, Q. Ke, M. Isard, and J. Sun.  CVPR 2009.  [pdf]

            • Improving Image-based Localization by Active Correspondence Search. T. Sattler, B. Leibe, L. Kobbelt.  ECCV 2012.  [pdf]

            • Learning Binary Projections for Large-Scale Image Search.  K. Grauman and R. Fergus.  Chapter to appear in Registration, Recognition, and Video Analysis, R. Cipolla, S. Battiato, and G. Farinella, Editors.  [pdf]

            • Learning Query-dependent Prefilters for Scalable Image Retrieval.  L. Torresani, M. Szummer, and A. Fitzgibbon.  CVPR 2009.  [pdf]

            • Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, C. Lampert, ICCV 2009.  [pdf]  [code]

            • Efficiently Searching for Similar Images.  K. Grauman.  Communications of the ACM, 2009.  [CACM link]

            • Fast Image Search for Learned Metrics, P. Jain, B. Kulis, and K. Grauman, CVPR 2008.  [pdf]

            • Small Codes and Large Image Databases for Recognition, A. Torralba, R. Fergus, and Y. Weiss, CVPR 2008.  [pdf]

            • Object Retrieval with Large Vocabularies and Fast Spatial Matching.  J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, CVPR 2007.  [pdf] [approx k-means code]

            • City-Scale Location Recognition, G. Schindler, M. Brown, and R. Szeliski, CVPR 2007.  [pdf]
            [outline]
            [wrap-up on instance recognition, large-scale search]

            Sept 21
            Classification and detection for object categories

            Global appearance models for category and scene recognition; sliding window detection, voting-based detection, detection as a binary decision problem.

            dpm
            • *A Discriminatively Trained, Multiscale, Deformable Part Model, by P. Felzenszwalb, D. McAllester and D. Ramanan.  CVPR 2008.  [pdf]  [code]

            • *Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, Lazebnik, Schmid, and Ponce, CVPR 2006. [pdf]  [15 scenes dataset]  [libpmk] [Matlab]

            • *Class-specific Hough Forests for Object Detection.  J. Gall and V. Lempitsky.  CVPR 2009.  [pdf] [slides] [code]

            • Robust Object Detection with Interleaved Categorization and Segmentation.  B. Leibe, A. Leonardis, and B. Schiele.  IJCV 2008.  [pdf]  [code]

            • The Devil is in the Details: an Evaluation of Recent Feature Encoding Methods.  K. Chatfield, V. Lempitsky, A. Vedaldi, A. Zisserman.  BMVC 2011.  [pdf] [code]

            • Rapid Object Detection Using a Boosted Cascade of Simple Features, Viola and Jones, CVPR 2001.  [pdf]  [code]

            • Histograms of Oriented Gradients for Human Detection, Dalal and Triggs, CVPR 2005.  [pdf]  [video] [code] [PASCAL datasets]

            • Modeling the Shape of the Scene: a Holistic Representation of the Spatial Envelope, Oliva and Torralba, IJCV 2001.  [pdf]  [Gist code]

            • Locality-Constrained Linear Coding for Image Classification.  J. Wang, J. Yang, K. Yu,  and T. Huang  CVPR 2010. [pdf] [code]

            • Visual Categorization with Bags of Keypoints, C. Dance, J. Willamowski, L. Fan, C. Bray, and G. Csurka, ECCV International Workshop on Statistical Learning in Computer Vision, 2004.  [pdf]

            • Pedestrian Detection in Crowded Scenes, Leibe, Seemann, and Schiele, CVPR 2005.  [pdf]

            • Pyramids of Histograms of Oriented Gradients (pHOG), Bosch and Zisserman. [code]

            • Sampling Strategies for Bag-of-Features Image Classification.  E. Nowak, F. Jurie, and B. Triggs.  ECCV 2006. [pdf]

            • Beyond Sliding Windows: Object Localization by Efficient Subwindow Search.  C. Lampert, M. Blaschko, and T. Hofmann.  CVPR 2008.  [pdf]  [code]

            • Diagnosing Error in Object Detectors.  D. Hoiem et al. ECCV 2012.  [pdf]
            [outline]
            [slides part 1]
            Heath-expt
            Nona-paper

            HW1 due Friday Sept 21, 11:59 pm
            Sept 28
            Mid-level representations

            Segmentation into regions, grouping, surface estimation

            surfaces


            • *Constrained Parametric Min-Cuts for Automatic Object Segmentation. J. Carreira and C. Sminchisescu. CVPR 2010.  [pdf] [code]

            • *From Contours to Regions: An Empirical Evaluation.  P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik.  CVPR 2009.  [pdf] [code and data] [journal paper]

            • *Indoor Segmentation and Support Inference from RGBD Images.  N. Silberman, D. Hoiem, P. Kohli, and R. Fergus.  ECCV 2012.  [pdf] [NYU depth dataset]
            • Geometric Context from a Single Image, D. Hoiem, A. Efros, and M. Hebert, ICCV 2005. [pdf]  [web]  [code]

            • Category Independent Object Proposals.  I. Endres and D. Hoiem.  ECCV 2010.  [pdf] [code/data]

            • Geometric reasoning for single image structure recovery.  D. Lee, M. Hebert, T. Kanade.  CVPR 2009.  [pdf]  [code]

            • Boundary-Preserving Dense Local Regions.  J. Kim and K. Grauman.  CVPR 2011.  [pdf]  [code]

            • Object Recognition as Ranking Holistic Figure-Ground Hypotheses. F. Li, J. Carreira, and C. Sminchisescu. CVPR 2010. [pdf]

            • People Watching: Human Actions as a Cue for Single View Geometry.  D. Fouhey et al. ECCV 2012.  [pdf]
            • Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman.  CVPR 2006.  [pdf] [code]

            • Combining Top-down and Bottom-up Segmentation. E. Borenstein, E. Sharon, and S. Ullman.  CVPR  workshop 2004.  [pdf]  [data]

            • Learning Mid-level Features for Recognition. Y.-L. Boureau, F. Bach, Y. LeCun, and J. Ponce. CVPR, 2010. 

            • Class-Specific, Top-Down Segmentation, E. Borenstein and S. Ullman, ECCV 2002.  [pdf]

            • GrabCut -Interactive Foreground Extraction using Iterated Graph Cuts, by C. Rother, V. Kolmogorov, A. Blake, SIGGRAPH 2004.  [pdf]  [project page]

            • Robust Higher Order Potentials for Enforcing Label Consistency, P. Kohli, L. Ladicky, and P. Torr. CVPR 2008.  

            • Collect-Cut: Segmentation with Top-Down Cues Discovered in Multi-Object Images.  Y. J. Lee and K. Grauman. CVPR 2010.  [pdf] [data]

            • Shape Sharing for Object Segmentation.  J. Kim and K. Grauman.  ECCV 2012.  [pdf]
            • Normalized Cuts and Image Segmentation, J. Shi and J. Malik.  PAMI 2000.  [pdf]  [code]

            [outline]
            [slides]
            Che-Chun-expt
            Elad-expt
            Sanmit-paper
            Islam-paper
            Chao-paper


            B. Beyond modeling individual objects
            Oct 5
            Context and scenes

            Multi-object scenes, inter-object relationships, understanding scenes' spatial layout

            context
            • *Scene Semantics from Long-term Observation of People.  V. Delaitre, D. Fouhey, I. Laptev, J. Sivic, A. Gupta, A. Efros.  ECCV 2012 [pdf] [web] [pose code]
            • *Multi-Class Segmentation with Relative Location Prior.  S. Gould, J. Rodgers, D. Cohen, G. Elidan and D.  Koller.  IJCV 2008. [pdf] [code]

            • *Using the Forest to See the Trees: Exploiting Context for Visual Object Detection and Localization.  Torralba, Murphy, and Freeman.  CACM 2009.  [pdf] [related code]

            • Object-Graphs for Context-Aware Category Discovery.  Y. J. Lee and K. Grauman.  CVPR 2010.  [pdf] [code]

            • Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces.  D. Lee, A. Gupta, M. Hebert, and T. Kanade.  NIPS 2010.  [pdf] [code]

            • Contextual Priming for Object Detection, A. Torralba.  IJCV 2003.  [pdf] [web] [code]

            • Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification.  L-J. Li, H. Su, E. Xing, L. Fei-Fei.  NIPS 2010.  [pdf]  [code]

            • RGB-D scene labeling: features and algorithms. X. Ren, L. Bo, and D. Fox.  CVPR 2012. [pdf] [code]

            • TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation.  J. Shotton, J. Winn, C. Rother, A. Criminisi.  ECCV 2006.  [pdf] [web] [data] [code]

            • Recognition Using Visual Phrases.  M. Sadeghi and A. Farhadi.  CVPR 2011.  [pdf]

            • Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry.  V. Hedau, D. Hoiem, and D. Forsyth.  ECCV 2010 [pdf] [code and data]

            • Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics, A. Gupta, A. Efros, and M. Hebert.  ECCV 2010. [pdf]  [code]

            • Geometric Reasoning for Single Image Structure Recovery.  D. Lee, M. Hebert, and T. Kanade.  CVPR 2009.  [pdf]  [web] [code]

            • Putting Objects in Perspective, by D. Hoiem, A. Efros, and M. Hebert, CVPR 2006.  [pdf] [web]

            • Discriminative Models for Multi-Class Object Layout, C. Desai, D. Ramanan, C. Fowlkes. ICCV 2009.  [pdf]  [slides]  [SVM struct code] [data]

            • Closing the Loop in Scene Interpretation.  D. Hoiem, A. Efros, and M. Hebert.  CVPR 2008.  [pdf]

            • Decomposing a Scene into Geometric and Semantically Consistent Regions, S. Gould, R. Fulton, and D. Koller, ICCV 2009.  [pdf]  [slides]

            • Learning Spatial Context: Using Stuff to Find Things, by G. Heitz and D. Koller, ECCV 2008.  [pdf] [code]

            • An Empirical Study of Context in Object Detection, S. Divvala, D. Hoiem, J. Hays, A. Efros, M. Hebert, CVPR 2009.  [pdf]  [web]

            • Object Categorization using Co-Occurrence, Location and Appearance, by C. Galleguillos, A. Rabinovich and S. Belongie, CVPR 2008.[ pdf]

            • Context Based Object Categorization: A Critical Survey.  C. Galleguillos and S. Belongie.  [pdf]

            • What, Where and Who? Classifying Events by Scene and Object Recognition, L.-J. Li and L. Fei-Fei, ICCV 2007. [pdf]

            • Simultaneous Visual Recognition of Manipulation Actions and Manipulated Objects.  H. Kjellstrom et al. ECCV 2008.  [pdf]

            • Modeling mutual context of object and human pose in human-object interaction activities.   B. Yao and L. Fei-Fei.  CVPR 2010.  [pdf]

            Jacob-paper
            Aron-paper
            Aashish-expt
            David-expt
            HW2 due, Friday Oct 5, 11:59 pm
            Oct 12
            Dealing with many categories

            Sharing features between classes, transfer, taxonomy, learning from few examples, exploiting class relationships

            shared features
            • *Sharing Visual Features for Multiclass and Multiview Object Detection, A. Torralba, K. Murphy, W. Freeman, PAMI 2007.  [pdf]  [code]

            • *Hedging Your Bets: Optimizing Accuracy-Specificity Trade-offs in Large Scale Visual Recognition.  J. Deng, J. Krause, A. Berg, L. Fei-Fei.  CVPR 2012 [pdf] [supp] [ILSVRC data]

            • *Tabula Rasa: Model Transfer for Object Category Detection. Y. Aytar and A. Zisserman.  ICCV 2011. [pdf] [HoG code]

            • What Does Classifying More than 10,000 Image Categories Tell Us? J. Deng, A. Berg, K. Li and L. Fei-Fei.  ECCV 2010.  [pdf]

            • Discriminative Learning of Relaxed Hierarchy for Large-scale Visual Recognition.  T. Gao and D. Koller.  ICCV 2011.  [pdf] [code]

            • Comparative Object Similarity for Improved Recognition with Few or Zero Examples. G. Wang, D. Forsyth, and D. Hoiem. CVPR 2010. [pdf]

            • Learning and Using Taxonomies for Fast Visual Categorization, G. Griffin and P. Perona, CVPR 2008.  [pdf] [data]

            • 80 Million Tiny Images: A Large Dataset for Non-Parametric Object and Scene Recognition, by A. Torralba, R. Fergus, and W. Freeman.  PAMI 2008.  [pdf] [web]

            • Constructing Category Hierarchies for Visual Recognition, M. Marszalek and C. Schmid.  ECCV 2008.  [pdf]  [web] [Caltech256]

            • Learning Generative Visual Models from Few Training Examples: an Incremental Bayesian Approach Tested on 101 Object Categories. L. Fei-Fei, R. Fergus, and P. Perona. CVPR Workshop on Generative-Model Based Vision. 2004.  [pdf] [Caltech101]

            • Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts. S. Fidler and A. Leonardis.  CVPR 2007  [pdf]

            • Exploiting Object Hierarchy: Combining Models from Different Category Levels, A. Zweig and D. Weinshall, ICCV 2007 [pdf]

            • Incremental Learning of Object Detectors Using a Visual Shape Alphabet.  Opelt, Pinz, and Zisserman, CVPR 2006.  [pdf]

            • ImageNet: A Large-Scale Hierarchical Image Database, J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, CVPR 2009 [pdf]  [data]

            • Semantic Label Sharing for Learning with Many Categories.  R. Fergus et al.  ECCV 2010.  [pdf]

            • Learning a Tree of Metrics with Disjoint Visual Features.  S. J. Hwang, K. Grauman, F. Sha.  NIPS 2011. 

            Elad-paper
            Gary-expt


            Wed Oct 17 project proposal abstract due
            Oct 19
            Describing objects with attributes

            Visual properties, learning from natural language descriptions, intermediate representations

            attributes
            • *Describing Objects by Their Attributes, A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth, CVPR 2009.  [pdf]  [web] [data]

            • *FaceTracer: A Search Engine for Large Collections of Images with Faces.  N. Kumar, P. Belhumeur, and S. Nayar.  ECCV 2008.  [pdf] [code, data, demo]
            • *Relative Attributes.  D. Parikh and K. Grauman.  ICCV 2011.  [pdf]  [code/data]

            • Attribute and Simile Classifiers for Face Verification, N. Kumar, A. Berg, P. Belhumeur, S. Nayar.  ICCV 2009.  [pdf] [web] [lfw data] [pubfig data]

            • Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, C. Lampert, H. Nickisch, and S. Harmeling, CVPR 2009  [pdf] [web] [data]

            • A Joint Learning Framework for Attribute Models and Object Descriptions.  D. Mahajan, S. Sellamanickam, V. Nair.  ICCV 2011.  [pdf]

            • WhittleSearch: Image Search with Relative Attribute Feedback.  A. Kovashka, D. Parikh, K. Grauman.  CVPR 2012.  [pdf] [data]
            • SUN Attribute Database: Discovering, Annotating, and Recognizing Scene Attributes.  G. Patterson and J. Hays.  CVPR 2012.  [pdf] [data]

            • Multi-Attribute Spaces: Calibration for Attribute Fusion and Similarity Search.  W. Scheirer, N. Kumar, P. Belhumeur, T. Boult.  CVPR 2012  [pdf]

            • A Discriminative Latent Model of Object Classes and Attributes.  Y. Wang and G. Mori.  ECCV, 2010.  [pdf]

            • Learning Visual Attributes, V. Ferrari and A. Zisserman, NIPS 2007.  [pdf] 

            • Learning Models for Object Recognition from Natural Language Descriptions, J. Wang, K. Markert, and M. Everingham, BMVC 2009.[pdf]

            • Attribute-Centric Recognition for Cross-Category Generalization.  A. Farhadi, I. Endres, D. Hoiem.  CVPR 2010.  [pdf]

            • Automatic Attribute Discovery and Characterization from Noisy Web Data.  T. Berg et al.  ECCV 2010.  [pdf]  [data]

            • Attributes-Based People Search in Surveillance Environments.  D. Vaquero, R. Feris, D. Tran, L. Brown, A. Hampapur, and M. Turk.  WACV 2009.  [pdf] [project page]

            • Image Region Entropy: A Measure of "Visualness" of Web Images Associated with One Concept.  K. Yanai and K. Barnard.  ACM MM 2005.  [pdf]

            • What Helps Where And Why? Semantic Relatedness for Knowledge Transfer. M. Rohrbach, M. Stark, G. Szarvas, I. Gurevych and B. Schiele. CVPR 2010.  [pdf]

            • Recognizing Human Actions by Attributes.  J. Liu, B. Kuipers, S. Savarese, CVPR 2011.  [pdf]

            • Interactively Building a Discriminative Vocabulary of Nameable Attributes.  D. Parikh and K. Grauman.  CVPR 2011.  [pdf] [web]

            Aashish-paper
            Girish-paper
            Sanmit-expt
            Nona-expt



            Oct 26
            Importance and saliency

            Among all items in the scene, which deserve attention (first)?  What makes images interesting or memorable?

            saliency
            • *Understanding and Predicting Importance in Images.  A. Berg et al.  CVPR 2012.  [pdf] [UIUC sentence dataset] [ImageClef dataset]
            • *Learning to Detect a Salient Object.  T. Liu et al. CVPR 2007.  [pdf]  [results]  [data]  [code]

            • *What Makes an Image Memorable?  P. Isola, J. Xiao, A. Torralba, A. Oliva. CVPR 2011. [pdf] [web] [code/data]

            • What Do We Perceive in a Glance of a Real-World Scene?  L. Fei-Fei, A. Iyer, C. Koch, and P. Perona.  Journal of Vision, 2007.  [pdf]

            • A Model of Saliency-based Visual Attention for Rapid Scene Analysis.  L. Itti, C. Koch, and E. Niebur.  PAMI 1998  [pdf]

            • Interesting Objects are Visually Salient.  L. Elazary and L. Itti.  Journal of Vision, 8(3):1–15, 2008.  [pdf]

            • Accounting for the Relative Importance of Objects in Image Retrieval.  S. J. Hwang and K. Grauman.  BMVC 2010.  [pdf] [web] [data]

            • Some Objects are More Equal Than Others: Measuring and Predicting Importance, M. Spain and P. Perona.  ECCV 2008.  [pdf]

            • The Discriminant Center-Surround Hypothesis for Bottom-Up Saliency. D. Gao, V.Mahadevan, and N. Vasconcelos. NIPS, 2007.  [pdf]

            • What is an Object?  B. Alexe, T. Deselaers, and V. Ferrari.  CVPR 2010.  [pdf] [code]

            • A Principled Approach to Detecting Surprising Events in Video.  L. Itti and P. Baldi.  CVPR 2005  [pdf]

            • What Attributes Guide the Deployment of Visual Attention and How Do They Do It? J. Wolfe and T. Horowitz. Neuroscience, 5:495–501, 2004.  [pdf]

            • Visual Correlates of Fixation Selection: Effects of Scale and Time. B. Tatler, R. Baddeley, and I. Gilchrist. Vision Research, 45:643, 2005.  [pdf]

            • Objects Predict Fixations Better than Early Saliency.  W. Einhauser, M. Spain, and P. Perona. Journal of Vision, 8(14):1–26, 2008.  [pdf]

            • Reading Between the Lines: Object Localization Using Implicit Cues from Image Tags.  S. J. Hwang and K. Grauman.  CVPR 2010.  [pdf]

            • Peripheral-Foveal Vision for Real-time Object Recognition and Tracking in Video.  S. Gould, J. Arfvidsson, A. Kaehler, B. Sapp, M. Messner, G. Bradski, P. Baumstrack,S. Chung, A. Ng.  IJCAI 2007.  [pdf]

            • Determining Patch Saliency Using Low-Level Context, D. Parikh, L. Zitnick, and T. Chen. ECCV 2008.  [pdf]

            • Key-Segments for Video Object Segmentation.  Y. J. Lee, J. Kim, and K. Grauman.  ICCV 2011  [pdf]

            • Contextual Guidance of Eye Movements and Attention in Real-World Scenes: The Role of Global Features on Object Search.  A. Torralba, A. Oliva, M. Castelhano, J. Henderson.  [pdf] [web]

            • The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search, G. Zelinsky, W. Zhang, B. Yu, X. Chen, D. Samaras, NIPS 2005.  [pdf]

            Islam-expt
            Chao-expt
            Che-Chun-paper
            Niveda-paper
            Wed Oct 31: project extended outlines due
            C. Human-centered recognition
            Nov 2
            Pictures of people

            Finding people, predicting their poses and attributes, automatic face tagging

            poselets
            • *Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations, L. Bourdev and J. Malik.  ICCV 2009  [pdf] [code]

            • *Real-Time Human Pose Recognition in Parts from a Single Depth Image.  J. Shotton et al.  CVPR 2011. [pdf] [video] [web]

            • *Where’s Waldo: Matching People in Images of Crowds.  R. Garg, D. Ramanan, S. Seitz, N. Snavely. CVPR 2011.  [pdf] [web]

            • *Understanding Images of Groups of People, A. Gallagher and T. Chen, CVPR 2009.  [pdf]  [web] [data]

            • Parsing Clothing in Fashion Photographs.  K. Yamaguchi et al. CVPR 2012.  [pdf] [data]
            • Contextual Identity Recognition in Personal Photo Albums. D. Anguelov, K.-C. Lee, S. Burak, Gokturk, and B. Sumengen. CVPR 2007.  [pdf]

            • Recognizing Proxemics in Personal Photos.  Y. Yang, S. Baker, A. Kannan, D. Ramanan.  CVPR 2012.  [pdf]
            • Who are you? - Learning Person Specific Classifiers from Video, J. Sivic, M. Everingham, and A. Zisserman, CVPR 2009.  [pdf] [data] [KLT tracking code]

            • Describing Clothing by Semantic Attributes.  A. Gallagher et al.  ECCV 2012.  [pdf]

            • Describing People: A Poselet-Based Approach to Attribute Classification.  L. Bourdev, S. Maji, J. Malik.  ICCV 2011.  [pdf]

            • Weakly Supervised Learning of Interactions between Humans and Objects.  Prest et al. PAMI 2012. [pdf]
            • Finding and Tracking People From the Bottom Up.  D. Ramanan, D. A. Forsyth.  CVPR 2003.  [pdf]

            • Autotagging Facebook: Social Network Context Improves Photo Annotation, by  Z. Stone, T. Zickler, and T. Darrell.  CVPR Internet Vision Workshop 2008.   [pdf]

            • Efficient Propagation for Face Annotation in Family Albums. L. Zhang, Y. Hu, M. Li, and H. Zhang.  MM 2004.  [pdf]

            • Progressive Search Space Reduction for Human Pose Estimation.  Ferrari, V., Marin-Jimenez, M. and Zisserman, A.  CVPR 2008.  [pdf] [web] [code]

            • Names and Faces in the News, by T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth, CVPR 2004.  [pdf]  [web]

            • Face Discovery with Social Context.  Y. J. Lee and K. Grauman.  BMVC 2011.  [pdf]

            • “Hello! My name is... Buffy” – Automatic Naming of Characters in TV Video, by M. Everingham, J. Sivic and A. Zisserman, BMVC 2006.  [pdf]  [web]  [data]

            • Pictorial Structures Revisited: People Detection and Articulated Pose Estimation.  M. Andriluka et al. CVPR 2009.  [pdf]  [code]

            • Exploring Photobios.  I. Kemelmacher-Shlizerman, E. Shechtman, R. Garg, S. Seitz.  SIGGRAPH 2011.  [pdf]
            Deepti-paper
            Randall-expt
            Aron-expt
            Dinesh-paper

            Nov 9
            Activity recognition

            Recognizing and localizing human actions in video or static images

            actions
            • *Learning Realistic Human Actions from Movies.  I. Laptev, M. Marszałek, C. Schmid and B. Rozenfeld.  CVPR 2008.  [pdf]  [data] [code]

            • *A Unified Framework for Multi-Target Tracking and Collective Activity Recognition.  W. Choi and S. Savarese.  ECCV 2012.  [pdf] [web] [video] [data]

            • *Detecting Actions, Poses, and Objects with Relational Phraselets.  C. Desai and D. Ramanan.  ECCV 2012.  [pdf] [data] [code]

            • Beyond Actions: Discriminative Models for Contextual Group Activities.  T. Lan, Y. Wang, W. Yang, and G. Mori.  NIPS 2010.  [pdf] [data]

            • Efficient Activity Detection with Max-Subgraph Search.  C.-Y. Chen and K. Grauman. CVPR 2012.  [pdf] [project page]  [code]

            • Action Bank: a High-Level Representation of Activity in Video.  S. Sadanand and J. Corso.  CVPR 2012 [pdf]  [code/data]

            • A Hough Transform-Based Voting Framework for Action Recognition.  A. Yao, J. Gall, L. Van Gool.  CVPR 2010.  [pdf] [code/data]

            • Actions in Context, M. Marszalek, I. Laptev, C. Schmid.  CVPR 2009.  [pdf] [web] [data]

            • Objects in Action: An Approach for Combining Action Understanding and Object Perception.   A. Gupta and L. Davis.  CVPR, 2007.  [pdf]  [data]

            • Exemplar-based Action Recognition in Video. G. Willems, J. Becker, T. Tuytelaars, and L. V. Gool. BMVC, 2009.
            • A Scalable Approach to Activity Recognition Based on Object Use. J. Wu, A. Osuntogun, T. Choudhury, M. Philipose, and J. Rehg.  ICCV 2007.  [pdf]

            • Recognizing Actions at a Distance.  A. Efros, G. Mori, J. Malik.  ICCV 2003.  [pdf] [web]

            • Action Recognition from a Distributed Representation of Pose and Appearance, S. Maji, L. Bourdev, J.  Malik, CVPR 2011.  [pdf]  [code]

            • Learning a Hierarchy of Discriminative Space-Time Neighborhood Features for Human Action Recognition.  A. Kovashka and K. Grauman.  CVPR 2010.  [pdf]

            • Temporal Causality for the Analysis of Visual Events.  K. Prabhakar, S. Oh, P. Wang, G. Abowd, and J. Rehg.  CVPR 2010.  [pdf] [Georgia Tech Computational Behavior Science project]

            • What's Going on?: Discovering Spatio-Temporal Dependencies in Dynamic Scenes.  D. Kuettel et al.  CVPR 2010.  [pdf]

            • Learning Actions From the Web.  N. Ikizler-Cinbis, R. Gokberk Cinbis, S. Sclaroff.  ICCV 2009.  [pdf]

            Girish-expt
            Gary-paper
            David-paper


            Nov 16
            Egocentric cameras

            Analyzing data from wearable, mobile cameras; "first person" vision

            camera
            • *Social Interactions: A First-Person Perspective. A. Fathi, J. Hodgins, J. Rehg.  CVPR 2012  [pdf] [data]
            • *Recognizing Activities of Daily Living in First-Person Camera Views.  H. Pirsiavash and D. Ramanan.  CVPR 2012.  [pdf] [data/code]

            • *Novelty Detection from an Egocentric Perspective. O. Aghazadeh, J. Sullivan, and S. Carlsson. CVPR 2011 [pdf] [web/data]

            • Discovering Important People and Objects for Egocentric Video Summarization.  Y. J. Lee, J. Ghosh, and K. Grauman.  CVPR 2012.  [pdf]  [web]
            • Understanding Egocentric Activities.  A. Fathi, A. Farhadi, J. Rehg.  ICCV 2011. [pdf] [data]

            • Learning to Recognize Objects in Egocentric Activities.  A. Fathi, X. Ren, J. Rehg.  CVPR 2011.  [pdf]
            • Figure-Ground Segmentation Improves Handled Object Recognition in Egocentric Video.  X. Ren and C. Gu.  CVPR 2010 [pdf] [videos] [data]

            • Egocentric Recognition of Handled Objects: Benchmark and Analysis.  X. Ren and M. Philipose.  Egovision Workshop 2009.  [pdf] [data]

            • Activity Recognition from First Person Sensing.  E. Taralova, F. De la Torre, M. Hebert  CVPR 2009 Workshop on Egocentric Vision  [pdf]

            • Close-Range Human Detection for Head-Mounted Cameras.  D. Mitzel and B. Leibe.  BMVC 2012.  [pdf]

            • Structural Epitome: A Way to Summarize One’s Visual Experience. N. Jojic, A. Perina, and V. Murino. NIPS 2010.  [pdf]

            • Fast Unsupervised Ego-Action Learning for First-Person Sports Video. K. Kitani, T. Okabe, Y. Sato, and A. Sugimoto. CVPR 2011. [pdf]

            • Wearable Hand Activity Recognition for Event Summarization. W. Mayol and D. Murray. International Symposium on Wearable Computers. IEEE, 2005.  [pdf]

            • Illumination-free Gaze Estimation Method for First-Person Vision Wearable Device.  A. Tsukada, M. Shino, M. Devyver, T. Kanade.  ICCV Workshop 2011.  [pdf]

            • Egovision workshop at CVPR 2012
            Jake-expt
            Randall-paper
            Dinesh-expt



            Nov 30
            Human-in-the-loop interactive systems

            Human-in-the-loop learning, active annotation collection, crowdsourcing

            bird



            • *Multiclass Recognition and Part Localization with Humans in the Loop.  C. Wah et al. ICCV 2011. [pdf] [Caltech/UCSD Visipedia project]  [data]

            • *What’s It Going to Cost You? : Predicting Effort vs. Informativeness for Multi-Label Image Annotations.  S. Vijayanarasimhan and K. Grauman.  CVPR 2009 [pdf] [data] [code]

            • *The Multidimensional Wisdom of Crowds.  Welinder P., Branson S., Belongie S., Perona, P. NIPS 2010. [pdf]  [code]

            • Visual Recognition with Humans in the Loop.  Branson S., Wah C., Babenko B., Schroff F., Welinder P., Perona P., Belongie S.  ECCV 2010. [pdf]  

            • Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds.  S. Vijayanarasimhan and K. Grauman.  CVPR 2011.  [pdf]

            • WhittleSearch: Image Search with Relative Attribute Feedback.  A. Kovashka, D. Parikh, K. Grauman.  CVPR 2012.  [pdf] [data]

            • Crowdclustering.  R. Gomes, P. Welinder, A. Krause, P. Perona.  NIPS 2011.  [pdf]

            • Adaptively Learning the Crowd Kernel.  O. Tamuz, C. Liu, S. Belongie, O. Shamir, A. Kalai.  ICML 2011 [pdf]

            • LeafSnap: A Computer Vision System for Automatic Plant Species Identification.  N. Kumar et al.  ECCV 2012.  [pdf]

            • Interactive Object Detection.  A. Yao, J. Gall, C. Leistner, L. Van Gool. CVPR 2012.  [pdf]
            • Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces.  C. Vondrick, D. Ramanan, D. Patterson.  ECCV 2010.  [pdf] [data/code]

            • Video Annotation and Tracking with Active Learning.  C. Vondrick, D. Patterson, D. Ramanan.  NIPS 2011.  [pdf]  [code]

            • Active Frame Selection for Label Propagation in Videos.  S. Vijayanarasimhan and K. Grauman.  ECCV 2012.  [pdf]

            • Annotator Rationales for Visual Recognition.  J. Donahue and K. Grauman.  ICCV 2011. [pdf]

            • Attributes for Classifier Feedback.  A. Parkash and D. Parikh.  ECCV 2012.  [pdf]
            • Combining Self Training and Active Learning for Video Segmentation.  A. Fathi, M. Balcan, X. Ren, J. Rehg.  BMVC 2011.  [pdf]

            • Labeling Images with a Computer Game. L. von Ahn and L. Dabbish. CHI, 2004.

            • Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise.  J. Whitehill et al.  NIPS 2009.  [pdf]
            • Utility Data Annotation with Amazon Mechanical Turk. A. Sorokin and D. Forsyth. Wkshp on Internet Vision, 2008.

            • Far-Sighted Active Learning on a Budget for Image and Video Recognition.  S. Vijayanarasimhan, P. Jain, and K. Grauman.  CVPR 2010.  [pdf]  [code]

            • Active Learning from Crowds.  Y. Yan, R. Rosales, G. Fung, J. Dy.  ICML 2011.  [pdf]

            • Proactive Learning: Cost-Sensitive Active Learning with Multiple Imperfect Oracles.  P. Donmez and J. Carbonell.  CIKM 2008.  [pdf]
            • Inactive Learning?  Difficulties Employing Active Learning in Practice.  J. Attenberg and F. Provost.  SIGKDD 2011. [pdf]

            • Actively Selecting Annotations Among Objects and Attributes.  A. Kovashka, S. Vijayanarasimhan, and K. Grauman.  ICCV 2011  [pdf]

            • Supervised Learning from Multiple Experts: Whom to Trust When Everyone Lies a Bit.  V. Raykar et al.  ICML 2009.  [pdf]
            • Multi-class Active Learning for Image Classification.  A. J. Joshi, F. Porikli, and N. Papanikolopoulos.  CVPR 2009.  [pdf]

            • GrabCut -Interactive Foreground Extraction using Iterated Graph Cuts, by C. Rother, V. Kolmogorov, A. Blake, SIGGRAPH 2004.  [pdf]  [project page]

            • Peekaboom: A Game for Locating Objects in Images, by L. von Ahn, R. Liu and M. Blum, CHI 2006. [pdf]  [web]
              Deepti-expt
              Heath-paper
              Niveda-expt

              Dec 7
              Final project presentations in class


              Final papers due



              http://www.aishack.in/2010/05/sift-scale-invariant-feature-transform/4/

              Up till now, we have generated a scale space and used it to calculate the Difference of Gaussians. Those are then used to calculate Laplacian of Gaussian approximations that are scale invariant. I told you that they produce great key points. Here’s how it’s done!

              Finding key points is a two part process

              1. Locate maxima/minima in DoG images
              2. Find subpixel maxima/minima

              Locate maxima/minima in DoG images

              The first step is to coarsely locate the maxima and minima. This is simple. You iterate through each pixel and check all of its neighbours. The check is done within the current image, and also the one above and below it. Something like this:

              X marks the current pixel. The green circles mark the neighbours. This way, a total of 26 checks are made. X is marked as a “key point” if it is the greatest or least of all 26 neighbours.

              Usually, a non-maximum or non-minimum position won’t have to go through all 26 checks. A few initial checks are usually sufficient to discard it.

              Note that keypoints are not detected in the lowermost and topmost scales. There simply aren’t enough neighbours to do the comparison. So simply skip them!
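
              To make the check concrete, here is a minimal NumPy sketch (my own illustration, not the SIFT author's code; it assumes dog is a list of at least three same-sized 2D DoG arrays for one octave, and it skips the early-exit optimization for clarity):

              import numpy as np

              def find_scale_space_extrema(dog):
                  # dog: list of >= 3 same-sized 2D DoG images for one octave
                  keypoints = []
                  for s in range(1, len(dog) - 1):          # skip the lowermost and topmost scales
                      below, cur, above = dog[s - 1], dog[s], dog[s + 1]
                      for y in range(1, cur.shape[0] - 1):
                          for x in range(1, cur.shape[1] - 1):
                              v = cur[y, x]
                              # 3x3x3 neighbourhood: the 26 neighbours plus the pixel itself
                              cube = np.stack([img[y - 1:y + 2, x - 1:x + 2]
                                               for img in (below, cur, above)])
                              if v == cube.max() or v == cube.min():   # ties accepted, for simplicity
                                  keypoints.append((x, y, s))
                  return keypoints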

              Once this is done, the marked points are the approximate maxima and minima. They are “approximate” because a maximum or minimum almost never lies exactly on a pixel. It lies somewhere between the pixels. But we simply cannot access data “between” pixels. So, we must mathematically locate the subpixel location.

              Here’s what I mean:

              The red crosses mark pixels in the image. But the actual extreme point is the green one.

              Find subpixel maxima/minima

              Using the available pixel data, subpixel values are generated. This is done by the Taylor expansion of the image around the approximate key point.

              Mathematically, it’s like this:
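
              With x = (x, y, σ)ᵀ the offset from the candidate point (notation as in Lowe's paper):

              D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^{T} \mathbf{x} + \frac{1}{2}\, \mathbf{x}^{T}\, \frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\, \mathbf{x}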

              We can easily find the extreme points of this equation (differentiate and equate to zero). On solving, we’ll get subpixel key point locations. These subpixel values increase chances of matching and stability of the algorithm.
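
              Explicitly, setting the derivative of the expansion to zero gives the offset

              \hat{\mathbf{x}} = -\left( \frac{\partial^{2} D}{\partial \mathbf{x}^{2}} \right)^{-1} \frac{\partial D}{\partial \mathbf{x}}

              which is added to the candidate's integer location (and scale) to get the refined key point.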

              Example

              Here’s a result I got from the example image I’ve been using till now:

              The author of SIFT recommends generating two such extrema images. So, you need exactly 4 DoG images. To generate 4 DoG images, you need 5 Gaussian blurred images. Hence the 5 levels of blur in each octave.

              In the image, I’ve shown just one octave. This is done for all octaves. Also, this image just shows the first part of keypoint detection. The Taylor series part has been skipped.

              Summary

              Here, we detected the maxima and minima in the DoG images generated in the previous step. This is done by comparing neighbouring pixels in the current scale, the scale “above” and the scale “below”.

              Next, we’ll reject some keypoints detected here. This is because they either don’t have enough contrast or they lie on an edge.

              Got questions or suggestions? Leave a comment below!


              http://www.aishack.in/2010/05/sift-scale-invariant-feature-transform/3/

              In the previous step, we created the scale space of the image. The idea was to blur an image progressively, shrink it, blur the small image progressively, and so on. Now we use those blurred images to generate another set of images, the Difference of Gaussians (DoG). These DoG images are great for finding out interesting key points in the image.

              Laplacian of Gaussian

              The Laplacian of Gaussian (LoG) operation goes like this. You take an image, and blur it a little. And then, you calculate second order derivatives on it (or, the “laplacian”). This locates edges and corners on the image. These edges and corners are good for finding keypoints.

              But the second order derivative is extremely sensitive to noise. The blur smoothes out the noise and stabilizes the second order derivative.

              The problem is, calculating all those second order derivatives is computationally intensive. So we cheat a bit.

              The Con

              To generate Laplacian of Gaussian images quickly, we use the scale space. We calculate the difference between two consecutive scales. Or, the Difference of Gaussians. Here’s how:

              These Difference of Gaussian images are approximately equivalent to the Laplacian of Gaussian. And we’ve replaced a computationally intensive process with a simple subtraction (fast and efficient). Awesome!
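
              In code, the whole step is one subtraction per adjacent pair of blur levels. A minimal sketch, assuming octave is a list of progressively blurred images stored as NumPy float arrays (least blurred first):

              def difference_of_gaussians(octave):
                  # one DoG image per adjacent pair of blur levels in the octave
                  return [octave[i + 1] - octave[i] for i in range(len(octave) - 1)]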

              These DoG images come with another little goodie. These approximations are also “scale invariant”. What does that mean?

              The Benefits

Just the Laplacian of Gaussian images aren’t great. They are not scale invariant. That is, they depend on the amount of blur you do. This is because of the Gaussian expression. (Don’t panic ;) )

G(x, y, σ) = (1 / (2πσ²)) · e^(−(x² + y²) / (2σ²))

See the σ² in the denominator? That’s the scale. If we somehow get rid of it, we’ll have true scale independence. So, if the Laplacian of a Gaussian is represented like this:

∇²G

Then the scale invariant Laplacian of Gaussian would look like this:

σ²∇²G

But all these complexities are taken care of by the Difference of Gaussian operation. The resultant images after the DoG operation are already multiplied by the σ². Great eh!

              Oh! And it has also been proved that this scale invariant thingy produces much better trackable points! Even better!

              Side effects

              You can’t have benefits without side effects >.<

You know the DoG result is multiplied by σ². But it’s also multiplied by another number. That number is (k−1). This is the k we discussed in the previous step.

              But we’ll just be looking for the location of the maximums and minimums in the images. We’ll never check the actual values at those locations. So, this additional factor won’t be a problem to us. (Even if you multiply throughout by some constant, the maxima and minima stay at the same location)
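If you’re curious where both factors come from, the standard argument uses the heat-equation property of the Gaussian, ∂G/∂σ = σ∇²G, so G(x, y, kσ) − G(x, y, σ) ≈ (k−1)σ²∇²G. Here’s a quick numerical sanity check of that relationship (my own check, assuming NumPy and SciPy are available; not part of the original article):

import numpy as np
from scipy.ndimage import gaussian_filter, laplace

# Check:  G(k*sigma) - G(sigma)  ≈  (k - 1) * sigma^2 * (Laplacian of G(sigma))
rng = np.random.default_rng(0)
img = rng.random((128, 128))

sigma, k = 2.0, np.sqrt(2)
dog = gaussian_filter(img, k * sigma) - gaussian_filter(img, sigma)
approx = (k - 1) * sigma ** 2 * laplace(gaussian_filter(img, sigma))

# The two should be nearly proportional point-by-point; the correlation
# coefficient should come out close to 1.
print(np.corrcoef(dog.ravel(), approx.ravel())[0, 1])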

              Example

              Here’s a gigantic image to demonstrate how this difference of Gaussians works.

              In the image, I’ve done the subtraction for just one octave. The same thing is done for all octaves. This generates DoG images of multiple sizes.

              Summary

              Two consecutive images in an octave are picked and one is subtracted from the other. Then the next consecutive pair is taken, and the process repeats. This is done for all octaves. The resulting images are an approximation of scale invariant laplacian of gaussian (which is good for detecting keypoints). There are a few “drawbacks” due to the approximation, but they won’t affect the algorithm.

              Next, we’ll actually find some interesting keypoints. Maxima and Minima. Or, Maximums and Minimums of the image.

              Got any questions or suggestions? Leave a comment below!

               

Posted by uniqueone
              http://www.aishack.in/2010/05/sift-scale-invariant-feature-transform/2/

Real world objects are meaningful only at a certain scale. You might see a sugar cube perfectly on a table. But if you’re looking at the entire Milky Way, it simply does not exist. This multi-scale nature of objects is quite common in nature. And a scale space attempts to replicate this concept on digital images.

              Scale spaces

              Do you want to look at a leaf or the entire tree? If it’s a tree, get rid of some detail from the image (like the leaves, twigs, etc) intentionally.

              While getting rid of these details, you must ensure that you do not introduce new false details. The only way to do that is with the Gaussian Blur (it was proved mathematically, under several reasonable assumptions).

              So to create a scale space, you take the original image and generate progressively blurred out images. Here’s an example:

Look at how the cat’s helmet loses detail. So do its whiskers.

              Scale spaces in SIFT

              SIFT takes scale spaces to the next level. You take the original image, and generate progressively blurred out images. Then, you resize the original image to half size. And you generate blurred out images again. And you keep repeating.

              Here’s what it would look like in SIFT:

              Images of the same size (vertical) form an octave. Above are four octaves. Each octave has 5 images. The individual images are formed because of the increasing “scale” (the amount of blur).

              The technical details

              Now that you know things the intuitive way, I’ll get into a few technical details.

              Octaves and Scales

The number of octaves and scales depends on the size of the original image. While programming SIFT, you’ll have to decide for yourself how many octaves and scales you want. However, the creator of SIFT suggests that 4 octaves and 5 blur levels are ideal for the algorithm.
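A minimal sketch of that construction in Python with OpenCV, using the parameters suggested above (4 octaves, 5 blur levels) plus two values I’m assuming for illustration: a base blur of 1.6 and k = √2. Note that real implementations downsample an already-blurred image from the previous octave rather than re-blurring a raw downsampled base; this sketch skips that for simplicity:

import cv2
import numpy as np

def build_scale_space(image, n_octaves=4, n_scales=5, sigma0=1.6, k=np.sqrt(2)):
    # Returns a list of octaves; each octave is a list of progressively
    # blurred images, and each octave starts at half the previous size.
    octaves = []
    base = image.astype(np.float32)
    for _ in range(n_octaves):
        octave = []
        for i in range(n_scales):
            sigma = sigma0 * (k ** i)
            # ksize=(0, 0) lets OpenCV derive the kernel size from sigma.
            octave.append(cv2.GaussianBlur(base, (0, 0), sigma))
        octaves.append(octave)
        # Downsample to half size for the next octave.
        base = cv2.resize(base, (base.shape[1] // 2, base.shape[0] // 2),
                          interpolation=cv2.INTER_NEAREST)
    return octaves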

              The first octave

If the original image is doubled in size and antialiased a bit (by blurring it), then the algorithm produces four times more keypoints. The more the keypoints, the better!

              Blurring

Mathematically, “blurring” is referred to as the convolution of the Gaussian operator and the image. Gaussian blur has a particular expression or “operator” that is applied to each pixel. What results is the blurred image.

L(x, y, σ) = G(x, y, σ) * I(x, y)

The symbols:

              • L is a blurred image
              • G is the Gaussian Blur operator
              • I is an image
              • x,y are the location coordinates
              • σ is the “scale” parameter. Think of it as the amount of blur. Greater the value, greater the blur.
              • The * is the convolution operation in x and y. It “applies” gaussian blur G onto the image I.

This is the actual Gaussian Blur operator:

G(x, y, σ) = (1 / (2πσ²)) · e^(−(x² + y²) / (2σ²))

              Amount of blurring

              The amount of blurring in each image is important. It goes like this. Assume the amount of blur in a particular image is σ. Then, the amount of blur in the next image will be k*σ. Here k is whatever constant you choose.

This is a table of σ’s for my current example. See how each σ differs by a factor of √2 from the previous one.
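If you want to generate such a table yourself, here’s a tiny sketch. The starting value σ0 = 1/√2 is only an assumption for illustration; the point is the factor-of-√2 progression within an octave and the doubling between octaves:

import numpy as np

def sigma_table(n_octaves=4, n_scales=5, sigma0=1 / np.sqrt(2), k=np.sqrt(2)):
    # Within an octave each sigma is k times the previous one; each octave
    # starts at double the previous octave's base sigma (half-size image).
    for octave in range(n_octaves):
        base = sigma0 * (2 ** octave)
        print(["%.6f" % (base * k ** i) for i in range(n_scales)])

sigma_table()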

              Summary

              In the first step of SIFT, you generate several octaves of the original image. Each octave’s image size is half the previous one. Within an octave, images are progressively blurred using the Gaussian Blur operator.

              In the next step, we’ll use all these octaves to generate Difference of Gaussian images.

               

Posted by uniqueone
              http://www.aishack.in/2010/05/sift-scale-invariant-feature-transform/

Matching features across different images is a common problem in computer vision. When all images are similar in nature (same scale, orientation, etc.), simple corner detectors can work. But when you have images of different scales and rotations, you need to use the Scale Invariant Feature Transform.

              Why care about SIFT

              SIFT isn’t just scale invariant. You can change the following, and still get good results:

              • Scale (duh)
              • Rotation
              • Illumination
              • Viewpoint

              Here’s an example. We’re looking for these:

              And we want to find these objects in this scene:

              Here’s the result:

              Now that’s some real robust image matching going on. The big rectangles mark matched images. The smaller squares are for individual features in those regions. Note how the big rectangles are skewed. They follow the orientation and perspective of the object in the scene.

              The algorithm

SIFT is quite an involved algorithm. It has a lot going on and can become confusing, so I’ve split up the entire algorithm into multiple parts. Here’s an outline of what happens in SIFT.

              1. Constructing a scale space
                This is the initial preparation. You create internal representations of the original image to ensure scale invariance. This is done by generating a “scale space”.
              2. LoG Approximation
                The Laplacian of Gaussian is great for finding interesting points (or key points) in an image. But it’s computationally expensive. So we cheat and approximate it using the representation created earlier.
              3. Finding keypoints
With the super fast approximation, we now try to find key points. These are maxima and minima in the Difference of Gaussian images we calculated in step 2.
              4. Get rid of bad key points
                Edges and low contrast regions are bad keypoints. Eliminating these makes the algorithm efficient and robust. A technique similar to the Harris Corner Detector is used here.
              5. Assigning an orientation to the keypoints
                An orientation is calculated for each key point. Any further calculations are done relative to this orientation. This effectively cancels out the effect of orientation, making it rotation invariant.
              6. Generate SIFT features
Finally, with scale and rotation invariance in place, one more representation is generated. This helps uniquely identify features. Let’s say you have 50,000 features. With this representation, you can easily identify the feature you’re looking for (say, a particular eye, or a sign board).

              That was an overview of the entire algorithm. Over the next few days, I’ll go through each step in detail. Finally, I’ll show you how to implement SIFT in OpenCV!

              What do I do with SIFT features?

              After you run through the algorithm, you’ll have SIFT features for your image. Once you have these, you can do whatever you want.

              Track images, detect and identify objects (which can be partly hidden as well), or whatever you can think of. We’ll get into this later as well.
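As a taste of what that looks like in practice, here’s a minimal matching sketch using OpenCV’s built-in SIFT (available as cv2.SIFT_create() in recent OpenCV builds; the image file names are placeholders, and the 0.75 ratio test is the usual heuristic, not something specific to this article):

import cv2

# Placeholder file names; substitute your own object and scene images.
obj = cv2.imread('object.png', cv2.IMREAD_GRAYSCALE)
scene = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(obj, None)
kp2, des2 = sift.detectAndCompute(scene, None)

# Brute-force matching plus Lowe's ratio test to drop ambiguous matches.
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

out = cv2.drawMatches(obj, kp1, scene, kp2, good, None)
cv2.imwrite('matches.png', out)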

              But the catch is, this algorithm is patented.

              >.<

              So, it’s good enough for academic purposes. But if you’re looking to make something commercial, look for something else! [Thanks to aLu for pointing out SURF is patented too]

Posted by uniqueone