COMP 5112/COMP4900G: Algorithms for Data Science (Fall 2024 Term)

Weekly Schedule      

Grading Scheme

Seminars


Instructor: Anil Maheshwari
Office: Herzberg Building 5125B
E-mail: anil@scs.carleton.ca


Lectures: Lectures on Mondays and Wednesdays 10:05 - 11:25 AM. See public class schedule/Brightspace for the room location in Tory.
Office hours  Mondays 08:30-09:45 AM (HP 5125b). All general announcements will be made during the class and/or via the Brightspace system.


Teaching Assistant: Office hours in HP4125 on Thursdays from 13:00-14:00

Course objectives:   

To learn some of the algorithmic techniques to handle data science problems. Emphasis is on providing correctness proofs, establishing competitive ratios, and analyzing computational complexity for each of the algorithms discussed during the course.

Topics may include:

These topics may be adjusted based on the background, interests of the students, and the amount of lecture time available.


Required Background:
 

We will cover a spectrum of techniques from the design and analysis of algorithms. It is assumed that you have an excellent grasp on:
Note that there will not be sufficient time to review the background material to a satisfactory level during the course. (In nutshell you must have a background that is equivalent to the following Carleton Courses:  COMP 1805, COMP 2402, COMP 3804, and a course in Linear Algebra.)

Reference Material:

Useful references related to various topics. This will get modified as we go along in the course.


Grading Scheme: (Tentative)

There are three components:

Outline is as follows:

  1. Pick a topic. (Look at references under "Reference Material"  and conferences in related areas. Also, use Google Scholar to see who refers to those papers etc.) You may look for papers/citations in recent proceedings of the ACM-SIAM Symposium on Discrete Algorithms conference and SIGKDD Conference for relevant topics.
  2. Initial Proposal: Submit one page draft.  What is the topic you chose? Why? What problem(s) you will look at? What you plan to do? Outline of sections of your report? Main References. Due (in pdf format) by September 29. Your 3-minute presentation is scheduled on October 1 during the class time slot.
  3. Final Project Presentation:  Scheduled in last 3-4 weeks of the class. Presentation is for approximately 15 minutes duration. 
  4. Project Report: Due couple of days  before your presentation. The report format will likely be a research article. Its best to use LaTeX (e.g. see Overleaf). The sections will include:
      1. Introduction (Motivation, Problem Statement, Related Work, Short Summary of what you did).
      2. Preliminaries (In case you need to discuss some notation, definitions, etc. as background)
      3. Main Section - How did you solve the problem; State Algorithm; State its Analysis; State its Correctness.
      4. Experiments (in case you performed any simulation etc.)
      5. Conclusions  (Summary + What did you learn? + What do you think can be done in future?)
      6. References
      7. Report will be approximately 6 pages long and will be posted on the course web-page. Final Exam will have some questions from these reports.  
      8. You may use a double column format - for example the style file from Canadian Conference in Computational Geometry Style File from here:  http://vga.usask.ca/cccg2020/CCCG2020-tex-template.zip
      9. It will also help the community if you update/create the relevant Wikipedia page relevant to your project. You will be suitably rewarded with bonus marks. 


Fall 2025 Schedule

Sep 03: Course Logistics + MWU Method
Sep 08: MWU (Cont.)  Randomized Schemes

Sep 10: Greedy Algorithms
Sep 15: Proofs of MWIS and Greedy Peeling for Densest Subgraphs

Sep 17: Locality-Sensitive Hashing
Sep 22: LSH Continued (Metric Space, Sensitive Family of Functions)

Sep 24:
Local Search - k-Median and Weighted MaxCut Sep 29: Project Proposal Due, Geometric Hitting Sets, Online Algorithms for Matching
Oct 01: 3-Minute Proposal Presentation 

Oct 06:
Geometric Hitting Sets + Online Algorithms for Matching
Oct 08: Online Algorithms for Matching

Oct 15: Separator Theorems in Graphs (Bobby Miraftab)
Oct 27: In Class Test

Oct 29: Online Algorithms for Matching (Contd.)

Nov 03: Online Algorithms for Matching (Contd.)

Nov 05:
Approximation Algorithms (using Metric LPs) 
Nov 10: Approximation Algorithms (using Metric LPs)

Nov 12:
Seminar Talks + Clustering
Nov 17: Seminar Talks + Clustering

Nov 19:
Seminar Talks  + Clustering
Nov 24: Seminar Talks + Dimensionality Reduction

Nov 26: Seminar Talks + Dimensionality Reduction
 
Dec 01: Dimensionality Reduction (to L_\infty norms)

Dec 03: Dimensionality Reduction (JL Lemma)



Partial List of Seminar From Previous Terms

Research Article
Talk Slides
M. Belkin and P. Niyogi.
Towards a theoretical foundation for Laplacian-based manifold methods.
Proceedings of the Conference on Learning Theory (COLT), 2005.
Laplacian Eigenmaps for Dimensionality Reduction
O. Durmaz and H. Sakir Bilge,
Fast image similarity search by distributed locality sensitive hashing,
Pattern Recognition Letters, Volume 128, 2019, Pages 361-369. https://doi.org/10.1016/j.patrec.2019.09.025
Fast image similarity search using distributed LSH
Muniz, M., & Flamand, T. (2022).
A weighted network clustering approach in the NBA.
Journal of Sports Analytics, 8(4), 251-275. https://doi.org/10.3233/JSA-220584
NBA Analytics - Clustering Analysis
Deng, S., Li, J., & Rabani, Y. (2023).
Generalized Unrelated Machine Scheduling Problem.
Society for Industrial  and Applied Mathematics, 2898-2916.
Approximation Algorithm for a Generalized Load-Balancing Problem
Chaplick, S., De, M., Ravsky, A., & J. Spoerhase,
Approximation schemes for geometric coverage problems.
26th Annual European Symposium on Algorithms (ESA 2018), August 20-22, 2018, Helsinki, Finland.
PTAS for the Maximum Coverage problem in geometric and graph settings
N. Shahbazi, S. Sintos, and A. Asudeh.
Fair-count-min: Frequency estimation under equal group-wise approximation factor.
arXiv preprint arXiv:2505.18919, 2025.
Fair-Count-Min Sketch
M. G. Herold, E. Kipouridis, and J. Spoerhase.
Clustering to minimize cluster-aware norm objectives,
2024.
Clustering to Minimize Cluster-Aware Norm Objectives
From Notes
Online Algorithms for Bipartite Matching

Twin Width
R. A. Moser.
A constructive proof of the Lovasz local lemma.
Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), pages 343-350, Bethesda, MD, USA, 2009. ACM.
LLL Talk
Alon, N., Yuster, R., Zwick, U. (1995).
Color-coding.
Journal of the ACM, 42(4), 844-856.
Color Coding
C. Chekuri, K. Quanrud, and M. R. Torres. 2022.
Densest subgraph: supermodularity,
iterative peeling, and flow.
Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 1531-1555
Densest Subgraph Problem
D. Dorfman et al.,
Minimum-cost paths for electric cars,
Symposium on Simplicity in Algorithms (SOSA). SIAM. 2024, pp. 100-101.
Min Cost Paths
J. Fakcharoenphol, S. Rao, and K. Talwar.
A tight bound on approximating arbitrary metrics by tree metrics.
Journal of Computer And System Sciences, 69:485-497, 2004.
Tree Metrics
M. Dorigo and T. Stutzle.
Ant Colony Optimization.
The MIT Press, 2004.
TSP Ant Colony
S. Chakraborty, N. V. Vinodchandran, and K. S. Meel.
Distinct Elements in Streams: An Algorithm for the (Text) Book. 
30th Annual European Symposium on Algorithms (ESA 2022), volume 244 of Leibniz International Proceedings in Informatics (LIPIcs), pages 34:1-34:6,
CVM
M. Cygan, F.V. Fomin, L. Kowalik, D. Lokshtanov, D. Marx, M. Pilipczuk, M. Pilipczuk, and S. Saurabh.
Parameterized Algorithms.
Springer International Publishing, 2015.
D. Lokshtanov, P. Misra, M. S. Ramanujan, S. Saurabh, and M. Zehavi.
FPT-approximation for FPT Problems, pages 199-218.
FPT-Approximations
S. Baswana and A. Pandey.
Sensitivity oracles for all-pairs mincuts.
Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 581-609. SIAM, 2022.
Sensitivity-Oracle
K. G. Larsen and J. Nelson.
Optimality of the Johnson-Lindenstrauss lemma.
58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017
Lower-Bound








Contents From Previous Offerings that weren't covered in Fall 2025 offering.

Locality-Sensitive Orderings

FPT Algorithm


Undergraduate & Graduate Academic Advisor:

The Undergraduate Advisor for the School of Computer Science is available in Room 5302 HP; or by email at scs.ug.advisor@scs.carleton.ca.  The undergraduate advisor can assist with information about prerequisites and preclusions, course substitutions/equivalencies, understanding your academic audit and the remaining requirements for graduation. The undergraduate advisor will also refer students to appropriate resources such as the Science Student Success Centre, Learning Support Services and Writing Tutorial Services.

Similarly, we have Graduate Advisors. Depending on your program of study and the university, you may contact the relevant graduate advisors.


University Policies

We follow all the rules and regulations set by Carleton University, Dean of Science, and the School of Computer Science regarding accommodating students with any kind of need(s). Please consult with the appropriate authorities to see how you can be accommodated and please follow their procedures. For information about Carleton's academic year, including registration and withdrawal dates, see Carleton's Academic Calendar. Following is a standard list of recommendations that we have been advised to provide you with respect to whom to contact for the appropriate action(s). It is possible that some of these are out of date, please consult the latest recommendations from the Science Faculty.

Pregnancy Obligation. 
Please contact your instructor with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need for accommodation is known to exist. For more details, visit Equity Services.

Religious Obligation. 
Please contact your instructor with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need for accommodation is known to exist. For more details, visit Equity Services. 

Academic Accommodations for Students with Disabilities
If you have a documented disability requiring academic accommodations in this course, please contact the Paul Menton Centre for Students with Disabilities (PMC) at 613-520-6608 or pmc@carleton.ca for a formal evaluation or contact your PMC coordinator to send your instructor your Letter of Accommodation at the beginning of the term. You must also contact the PMC no later than two weeks before the first in-class scheduled test or exam requiring accommodation (if applicable). After requesting accommodation from PMC, meet with your instructor as soon as possible to ensure accommodation arrangements are made. For more details, visit the Paul Menton Centre website.
 
Survivors of Sexual Violence. 
As a community, Carleton University is committed to maintaining a positive learning, working and living environment where sexual violence will not be tolerated, and survivors are supported through academic accommodations as per Carleton's Sexual Violence Policy. For more information about the services available at the university and to obtain information about sexual violence and/or support, visit: carleton.ca/sexual-violence-support.

Accommodation for Student Activities.
 Carleton University recognizes the substantial benefits, both to the individual student and for the university, that result from a student participating in activities beyond the classroom experience. Reasonable accommodation must be provided to students who compete or perform at the national or international level. Please contact your instructor with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need for accommodation is known to exist. For more details, see the policy.

Student Academic Integrity Policy. 
Every student should be familiar with the Carleton University student academic integrity policy. A student found in violation of academic integrity standards may be awarded penalties which range from a reprimand to receiving a grade of F in the course or even being expelled from the program or University. Examples of punishable offences include: plagiarism and unauthorized co-operation or collaboration.

Plagiarism. As defined by Carleton University Senate, "plagiarism is presenting, whether intentional or not, the ideas, expression of ideas or work of others as one's own". Such reported offences will be reviewed by the office of the Dean of Science.  More information and standard sanction guidelines can be found here: https://science.carleton.ca/students/academic-integrity/.

Unauthorized Co-operation or Collaboration. 
Senate policy states that "to ensure fairness and equity in assessment of term work, students shall not co-operate or collaborate in the completion of an academic assignment, in whole or in part, when the instructor has indicated that the assignment is to be completed on an individual basis". 
For this course, the following also holds:

Important Considerations:

Late submissions are not accepted. Submissions are handled electronically using the Brightspace system and there is no "grace period" with respect to a deadline. Technical problems do not exempt you from this requirement. You are advised to: