Computer Vision (CS-GY 6643)

Fall 2025

An important goal of signal processing and artificial intelligence (AI) is to equip computers with the capability of interpreting visual inputs. Computer vision is an area that deals with the construction of explicit, meaningful measurements and descriptions of physical objects from images. It includes many techniques from image processing, pattern recognition, geometric modeling, cognitive processing, and machine and deep learning. This course explores the core concepts and practical techniques of image processing and computer vision at an intermediate level of depth and application.

This is a graduate-level course requiring working knowledge of linear algebra and data structures, and proficiency in programming (Python). Advanced undergraduates may enroll with permission from the instructor.

Course information:

Time: Thursdays 11am-1:30pm

Place: Room 202, 370 Jay St

Slack channel: nyucomputervision.slack.com

Course team:

Instructor: Prof. Erdem Varol

Email: ev2240@nyu.edu

Office Hours: On Zoom, 15-minute appointments.

TA: Visweswar Sirish Parupudi

Email: vsp7230@nyu.edu

Office Hours: Thursday 1-5pm in person; 4-5pm on Zoom.

TA: Subhrajit Dey

Email: sd5963@nyu.edu

Office Hours: Wednesday 1-2pm on Zoom.

TA: Nalini Ramanathan

Email: nar8991@nyu.edu

Office Hours: Mondays 2-3pm on Zoom.

Grading breakdown: Individual homework 15%, In-class midterm 25%, Programming Projects 60%
Note: 20% of the Programming Project grade will be competition-based.
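
As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes each category has already been reduced to a single percentage; the syllabus does not say how individual homeworks or projects are averaged within a category, and the function name `final_grade` is hypothetical.

```python
# Category weights in percent, taken from the grading breakdown above.
WEIGHTS = {"homework": 15, "midterm": 25, "projects": 60}

def final_grade(homework: float, midterm: float, projects: float) -> float:
    """Weighted course grade (0-100) from per-category percentages.

    Assumption: each argument is already the student's overall
    percentage for that category.
    """
    total = (
        WEIGHTS["homework"] * homework
        + WEIGHTS["midterm"] * midterm
        + WEIGHTS["projects"] * projects
    )
    return total / 100

print(final_grade(homework=90, midterm=80, projects=85))  # 84.5
```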

Online Discussion: Preferred course communication will be via Slack, so please join our workspace using this link: nyucomputervision.slack.com. All questions should be posted to Slack (not sent by email). We prefer that lecture and homework questions be asked publicly, since the answers will often help your classmates. Slack also supports private questions through direct messages for things relevant only to you.

Python and Jupyter: Demos and labs in this class use Python, run through Jupyter notebooks. Jupyter lets you create and edit documents with live Python code and rich comments and images. We suggest that students run their Jupyter notebooks via Google Colaboratory, and we will share them via Colab.

Assignments: Individual homework and programming projects must be turned in to Brightspace by the specified deadline (11:59pm on the due date). Programming projects should be turned in as evaluated Jupyter notebooks; do not clear the output before turning them in. Some projects may include a competitive component or involve ranking based on performance; we will guide you on how to submit for those. For written problem sets, we encourage using LaTeX or Markdown (with math support). You can use this template for LaTeX. While there is a learning curve, these tools typically save students time in the end! If you do write problems by hand, scan them and upload as a PDF. Discussion is allowed on homework, but solutions and code must be written independently. See the syllabus for policies. We have a zero-tolerance policy for copied code or solutions: any students with duplicate or very similar material will receive a zero on the offending assignment.

Late policy: Each full hour that a project or homework is late (rounded down) incurs a 1% penalty on the total allotted grade. For example, a project or homework that is 11 hours 45 minutes late will have a maximum possible score of 89%.
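
The late-policy arithmetic can be sketched as follows. This is only an illustration, not the official grading script: the function name `max_score_percent` is hypothetical, and flooring the result at 0% for submissions more than 100 hours late is an assumption the policy does not state.

```python
from datetime import datetime, timedelta

def max_score_percent(deadline: datetime, submitted: datetime) -> int:
    """Maximum attainable score (in percent) under the 1%-per-hour late policy."""
    if submitted <= deadline:
        return 100
    # Floor division of two timedeltas counts whole hours late (rounded down).
    full_hours_late = (submitted - deadline) // timedelta(hours=1)
    # Assumption: the score cannot drop below 0%.
    return max(0, 100 - full_hours_late)

# The example from the policy: 11 hours 45 minutes late.
deadline = datetime(2025, 9, 25, 23, 59)
submitted = deadline + timedelta(hours=11, minutes=45)
print(max_score_percent(deadline, submitted))  # 89
```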

Textbooks: Computer Vision (2nd edition) by Szeliski will accompany the lectures, and specific chapters from this book are listed under reading materials in the schedule below. The textbook is freely available digitally at https://szeliski.org/Book/download.php.

Tutorials: Linear algebra, Google Colab, Python

Schedule


Date	Topic	Material	Homework/Projects
September 4, 2025	Intro and survey of topics, Image Formation and Filtering	Lecture 1 Slides, Python tutorial, Google Colab Setup, Code Lab	Homework 0 out on Brightspace: Join Slack at nyucomputervision.slack.com (Due September 11, 11:59pm)
September 11, 2025	Non-linear filtering and edge detection	Lecture 2 Slides, Code Lab, Szeliski 2.2, 2.3, 3.1, 3.2	Homework 1 out; Project 1 out; Kaggle Invitation Link: Join Here
September 18, 2025	Image recognition, Feature detection and matching	Lecture 3 Slides, Code Lab	Homework 1 due - Solutions
September 25, 2025	Contour tracking and Hough transforms	Lecture 4 Slides, Code Lab, Szeliski 6, 7	Project 1 due
October 2, 2025	Image alignment	Lecture 5 Slides, Code Lab	Homework 2 out
October 9, 2025	Segmentation	Lecture 6 Slides, Code Lab	Homework 2 due
October 16, 2025	Midterm exam (in class)		Project 2 out
October 23, 2025	Machine Learning, Backprop with MLP, Neural Networks	Lecture 7 Slides, Code Lab
October 30, 2025	Convolutional Neural Networks; YOLO	Lecture 8 Slides, Colab for Lecture 8	Homework 3 out; Project 2 due
November 6, 2025	Motion models, depth estimation and optical flow	Lecture 9 Slides, Colab for Lecture 9	Homework 3 due; Project 3 out
November 13, 2025	Self Supervised Learning, CLIP Embeddings	Lecture 10 Slides, Code Lab
November 20, 2025	Attention Mechanism, Vision Transformer	Lecture 11 Slides, Code Lab	Project 3 due; Project 4 out
November 27, 2025	THANKSGIVING BREAK (No class)
December 4, 2025	Vision Language Models	Lecture 12 Slides, Code Lab
December 11, 2025	Additional Topics	Wrap Up Slides
December 18, 2025	No Class		Project 4 due

Essential reads

Textbooks:

Szeliski, R. (2022). Computer Vision: Algorithms and Applications (2nd ed.). Springer Nature.
Sonka, M., Hlavac, V., and Boyle, R. (2015). Image Processing, Analysis, and Machine Vision (4th ed.).
Forsyth, D. A., and Ponce, J. (2012). Computer Vision: A Modern Approach.

Papers

(Under construction)

Courses