This is a previous offering of this course. For the Winter 2022 offering, visit here.

General Information

The course will be lecture-based and will also offer some hands-on tutorials. The project component will be flexible and will involve data collection, manipulation, and analysis. For further details on the course content, please refer to the course outline (pdf). This course is offered by the School of Computer Science at Carleton University.

Seminars are held every Thursday from 11:35 AM to 2:25 PM over Zoom.

Instructors:

  1. Majid Komeili. Online class: Zoom. Email: majid.komeili@carleton.ca. Office hours: by appointment.
  2. Elio Velazquez. Online class: Zoom. Email: Elio.Velazquez@carleton.ca. Office hours: by appointment.
  3. Michael Genkin. Online class: Zoom. Email: Michael.Genkin@carleton.ca. Office hours: by appointment.

Announcements

Content Overview

The course covers topics relevant to data science: working with data, exploratory data analysis, data mining, and machine learning. The concepts are illustrated using the R language. Students also receive hands-on tutorials (e.g., Tableau, IBM Cognos Analytics). Students will be evaluated on their course projects.

Tentative Schedule

This schedule is tentative and may change depending on how the class progresses.

Course Information

Evaluation

Method of Delivery

Blended delivery: students are expected to participate during the synchronous meeting time, including lectures and other presentations. Additional activities, such as the project, are to be completed outside of class time. Classes will be recorded, barring technical issues. Presentations by guest speakers will be recorded subject to their consent. Students are expected to have high-speed internet access and a computer with a microphone.

Paper Presentations

Each group needs to choose a conference publication on the topic of data science to present in class (15-minute talk). The paper should be 6-12 pages long, come from a conference proceedings (e.g., IEEE International Conference on Data Science, SIGKDD/KDD Conference), and be approved by the instructor. Presentations will be scheduled throughout the term during class time. Paper selection is due January 31, 2021. Late submissions will be penalized 10% per day.

Project Proposal

The project forms an integral part of this course. The project is to be completed in groups of two students.

You have two options: you can choose to mine and analyze one of the provided datasets or come up with an idea of your own that relates to the course material. In either case, the project topic will require the instructor's approval.

Before you undertake your project, you will need to submit a proposal for approval. The proposal should be short (a PDF of at most 2 pages). You may use the ACM format. The proposal should include a problem statement, the motivation for the project, a description of the data you will be working on, and a set of objectives you aim to accomplish. It is due on January 31, 2021 by 11:59 PM via email. Late submissions will be penalized 10% per day.

Presentation Outlines

This has two parts: 1) A one-page abstract (in PDF format) to be submitted to the Data Day poster competition. Submitted abstracts will be reviewed by a committee. 2) A first draft of your poster that shows its structure and preliminary content. This will be submitted to your instructor via email. The deadline is March 11. Note that the Data Day committee will not consider late submissions.

Poster Presentation

You will present your project's poster during the poster presentation on Data Day on March 30. An independent jury will evaluate posters and select winners. Groups that place in the top three will receive five bonus marks. The poster should be submitted to the Data Day competition by March 23. Note that the Data Day committee will not consider late submissions.

Project Presentation

Each group will have the opportunity to present their poster in class on April 8. This presentation should take the form of a 15-minute (hard maximum) conference-style talk and describe the motivation for your work, what you did, and what you found. If a demo is the best way to describe what you did, feel free to include one in the middle of the talk.

The proposed structure of your presentation:

  1. Introduction (describe the problem and motivation)
  2. Research questions
  3. Methodology: data collection, data cleanup, data mining, data analysis (statistics, machine learning), etc.
  4. Results (achieved, preliminary, or anticipated)
  5. Implications (why does this study matter? how can your findings be used?)
  6. Conclusion (summary, main contributions)

Project Report

The written report should be 8-10 pages in double-column format; the exact length varies from project to project. All reports must be formatted according to the ACM format and submitted as a PDF. The report is due on April 15 by 11:59 PM via email. Late submissions will not be considered.

Datasets

Resources

The following books are suggested but not required:

The following books are good references for data mining and machine learning algorithms:

The following are good references for R (just to name a few):

University Policies

Academic Integrity

Academic integrity is everyone’s business because academic dishonesty affects the quality of every Carleton degree. Each year, students are caught in violation of academic integrity and found guilty of plagiarism and cheating. In many instances they could have avoided failing an assignment or a course simply by learning the proper rules of citation. See the academic integrity page for more information.

Academic Accommodations for Students with Disabilities

The Paul Menton Centre for Students with Disabilities (PMC) provides services to students with Learning Disabilities (LD), psychiatric/mental health disabilities, Attention Deficit Hyperactivity Disorder (ADHD), Autism Spectrum Disorders (ASD), chronic medical conditions, and impairments in mobility, hearing, and vision. If you have a disability requiring academic accommodations in this course, please contact PMC at 613-520-6608 or pmc@carleton.ca for a formal evaluation. If you are already registered with the PMC, contact your PMC coordinator to send me your Letter of Accommodation at the beginning of the term, and no later than two weeks before the first in-class scheduled test or exam requiring accommodation (if applicable). After requesting accommodation from PMC, meet with me to ensure accommodation arrangements are made. Please consult the PMC website for the deadline to request accommodations for the formally-scheduled exam (if applicable).

Religious Obligation

Write to the instructor with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need for accommodation is known to exist. For more details visit the Equity Services website.