COMP 5112: Algorithms for Data Science (Fall 2020)

Weekly Schedule



Instructor: Anil Maheshwari
Office: Herzberg Building 5125B

Lectures: Online lectures twice a week. I am planning to hold lectures on Wednesdays and Fridays at 2:30PM. First one will be on Friday September 11th, and the link will be provided via cuLearn. The schedule may change once I know your time zones. The recorded videos will likely be posted in cuLearn.  The recordings are not to be shared with anybody.

Office hours: During the breaks and immediately after the online lectures.         

Course objectives: 

To learn some of the algorithmic techniques to handle data science problems.
Topics may include:

These topics may be adjusted based on the background and interest of the students in the class.

Required Background:

We will cover a spectrum of techniques from the design and analysis of algorithms. It is assumed that you have a very good grasp on:
Note that there will not be sufficient time to review the background material to a satisfactory level during the course. (In nutshell you must have a background that is equivalent to the following Carleton Courses:  COMP 1805, COMP 2402, COMP 3804, and a course in Linear Algebra.)

Reference Material:

Useful References related to various topics. This will get modified as we go along the course.

Grading Scheme: (Tentative)

- Assignments/Exercises: 30% (including a Take Home Exam)

Assignment 1 is posted. Please see cuLearn
Assignment 2: Will likely be posted in 1st week of December and will be due around December 15.

- Scribe Notes: 15% (you will be responsible to type in content of (part of) one of the lectures in a LaTeX template and add some relevant exercises). Please have a look at the folder
where I have posted the cmu-scribe.sty tailored for us, and a sample (l1.tex and l1.pdf). If you have issues using this style file, please let me know.   (Here is a course from CMU with good examples of scribe notes: ) 

- Project: 55% (initial proposal 10% + short report 30% + presentation 15%). Outline is as follows:

  1. Pick a topic. (Look at references under "Reference Material"  and conferences in related areas and also use Google Scholar) to see who refers to those papers etc.)
  2. Initial Proposal: Submit a 1-2 page draft.  What is the topic you chose? Why? What problem(s) you will look at? What you plan to do? Outline of sections of your report? Main References. Due (in pdf format) by October 2nd.
  3. Report (Initial Draft): Due by November 9. The report format will likely be a research article. Its best to use LaTeX (e.g. see Overleaf). The sections will include:
    1. Introduction (Motivation, Problem Statement, Related Work, Short Summary of what you did).
    2. Preliminaries (In case you need to discuss some notation, definitions, etc. as background)
    3. Main Section - How did you solve the problem; State Algorithm; State its Analysis; State its Correctness. (NOTE: Ideally this can be expressed in terms of multiple questions and answers - similar to the Question on Hypercube routing in Assignment 1)
    4. Experiments (in case you performed any simulation etc.)
    5. Conclusions  (Summary + What did you learn? + What do you think can be done in future?)
    6. References
  1. Recorded Presentation and Final Report: Due on November 30. Presentation is a pre-recorded Video of approximately 10 minutes duration describing your project (this will be posted on cuLearn under course web-page). Report will be approximately 6 page long and will also be posted on cuLearn.  (Take Home Exam will have questions from these reports/presentations).
  2. Note: As some of you have asked, you may exceed 6-page limit if you think it is necessary. But please don't make it 20 pages. You may use a double column format - for example the style file from Canadian Conference in Computational Geometry Style File from here:

Weekly Schedule (Tentative):

Week #
Raw Slides
Week 1 Lecture 1 (Sept. 11)
What is this course about? 
Course Logistics
Random Sampling

Lecture 2 (Sept. 16)
Count-Min Sketch

Course Introduction

Randomized Sampling

Count-Min Sketch

References are in slides.

Topics on Algorithm Design
(Section 10.1)+
References in the slides

Lecture 1

Week 2
Lecture 3 (Sept. 18)
Probability Basics
Balls and Bins Experiments

Lecture 4 (Sept. 23)
Bloom Filters
Frequency Moments (Introduction)


Bloom Filters


References in Slide
(Mitzenmacher+Upfal, Section 5.5)

Topics on Algorithm Design
(Section 10.2) + References in Slides

Week 3+4
Lectures 5 + 6  (Sept 25 & 30 )
Estimating Frequency Moments in a Data Stream

Lecture 7 (Oct 2)
Counting 1s in Sliding Window



Slides + Reference in Slide + Topics in Algorithm Design (Section 10.3)

Slides + Topics in Algorithm Design (Section 10.4)

Week 4+5
Lecture 8+9 (Oct. 7+9)
Locality-Sensitive Hashing
LSH Families


Week 5-6
Lecture 10:
Fingerprint via LSH
Polynomial Identity Testing and
its applications to bipartite matching.

Geometric Algorithms
- Locality-Sensitive Ordering








Week 7
Locality Sensitive Ordering +  Dimensionality Reduction d-reduction
(Isometric Embeddings - Universality of L_\infty;
Distortion: Embedding in L_\infty with distortion D



Week 8+9
J-L Lemma +

Online Algorithms
- Classical Examples
- Matching
- Analysis via Primal-Dual LP
J-L Lemma Proof

Introduction to Online Algorithms via Online Bipartite Matching


Dimensionality Reduction-I

Week 9
Online Bipartite Matching, Fractional Matching
Greedy Algorithm - Primal Dual Analysis; Fractional Matching - Waterlevel Algorithm



Dimensionality Reduction- II


Week 10-11
Randomized Algorithm for Bipartite Matching
Balance Algorithm - Adwords 
Randomized + Balance


Online -L18+19
Week 11-12
Regret Minimization
- Multiplicative Weights Algorithm
Multiplicative-Weight Update Method



Week 13
Project Presentations

Important Considerations:

Late assignments/reports/projects are not accepted. All submissions are handled electronically (i.e., through cuLearn) and there is no "grace period" with respect to a deadline. Technical problems do not exempt you from this requirement. You are advised to:

University Policies

We follow all the rules and regulations set by Carleton University, Dean of Science, and the School of Computer Science regarding accommodating students with any kind of need(s). Please consult with the appropriate authorities to see how you can be accommodated and please follow their procedures. For information about Carleton's academic year, including registration and withdrawal dates, see Carleton's Academic Calendar. Following is a standard list of recommendations that we have been advised to provide you with respect to whom to contact for the appropriate action(s):

Pregnancy Obligation. 
Please contact your instructor with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need for accommodation is known to exist. For more details, visit Equity Services.

Religious Obligation. 
Please contact your instructor with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need for accommodation is known to exist. For more details, visit Equity Services.

Academic Accommodations for Students with Disabilities
If you have a documented disability requiring academic accommodations in this course, please contact the Paul Menton Centre for Students with Disabilities (PMC) at 613-520-6608 or for a formal evaluation or contact your PMC coordinator to send your instructor your Letter of Accommodation at the beginning of the term. You must also contact the PMC no later than two weeks before the first in-class scheduled test or exam requiring accommodation (if applicable). After requesting accommodation from PMC, meet with your instructor as soon as possible to ensure accommodation arrangements are made. For more details, visit the Paul Menton Centre website.

Survivors of Sexual Violence. 
As a community, Carleton University is committed to maintaining a positive learning, working and living environment where sexual violence will not be tolerated, and survivors are supported through academic accommodations as per Carleton's Sexual Violence Policy. For more information about the services available at the university and to obtain information about sexual violence and/or support, visit:

Accommodation for Student Activities.
 Carleton University recognizes the substantial benefits, both to the individual student and for the university, that result from a student participating in activities beyond the classroom experience. Reasonable accommodation must be provided to students who compete or perform at the national or international level. Please contact your instructor with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need for accommodation is known to exist. For more details, see the policy.

Student Academic Integrity Policy. 
Every student should be familiar with the Carleton University student academic integrity policy. A student found in violation of academic integrity standards may be awarded penalties which range from a reprimand to receiving a grade of F in the course or even being expelled from the program or University. Examples of punishable offences include: plagiarism and unauthorized co-operation or collaboration.

Plagiarism. As defined by Senate, "plagiarism is presenting, whether intentional or not, the ideas, expression of ideas or work of others as one's own". Such reported offences will be reviewed by the office of the Dean of Science.

Unauthorized Co-operation or Collaboration. 
Senate policy states that "to ensure fairness and equity in assessment of term work, students shall not co-operate or collaborate in the completion of an academic assignment, in whole or in part, when the instructor has indicated that the assignment is to be completed on an individual basis". For this course, the following holds:

  1. This is the first offering of this course! I will seek your advise in terms of any particular algorithms that you will like to see in this course.
  2. All communications will be done via cuLearn forums. If you have a question that may benefit the whole class,  please post it in the forum or ask during the class/office hour. (The discussion that are of private nature can be done by sending me an e-mail.)
  3. I am planning to take online lectures twice a week. Once I know the time zones of all the students, we will draw up a lecture schedule. Online lectures will be recorded (subject to all the technological issues involved). I am planning  to make the recordings available on cuLearn. They won't be made available on any other outlets. You need a  cuLearn account to access them. Please do not post or resend the links to the recorded video to anybody (thanks).
  4. I am planning to hold my office hours immediately before/after/during the online lectures. 
  5. I am new to online lectures and there will be several bumps! I am still hoping that you will appreciate the beautiful algorithmic techniques.
  6. For uOttawa students: Please try to figure out how to obtain the cuLearn account. (I can't help you with the administrative aspect - please contact graduate and departmental administrators.)
  7. For your benefit I will be posting some of the videos from COMP 3801 class. Those are gentle introduction to a subset of the topics in this course. The reason for posting them is that you may review them beforehand to get somewhat familiar with the contents and the style of this course. This course will be at a much faster pace. We will cover significantly more and advanced material. Think of the COMP 3801 videos as the background preparation.
  8. Learn (La)TeX as soon as possible - will help you to write your thesis later!
  9. Assignment 1 is posted. Please see cuLearn
  10. Assignment 2 will be posted on December 4th evening on cuLearn. It will be due in about 10 days.
  11. Project reports and videos have been posted on cuLearn.
  12. Friday December 4th will likely be the last lecture for this course.
  13. Thanks a lot for taking this course.