DATA 5000

An Introduction to the World of Data Science

Content Overview

This course covers essential topics in data science: working with data, exploratory data analysis, data mining, and machine learning. Concepts are illustrated using Python, supplemented by hands-on tutorials in tools such as Tableau and IBM Cognos. Student learning is evaluated primarily through a semester-long course project.

The course is lecture-based with a flexible project component involving data collection, manipulation, and analysis. This course is offered by the School of Computer Science at Carleton University.

Instructor

Ahmed El-Roby
ahmed.elroby@carleton.ca

Logistics

Seminars: Thursdays, 11:35 AM - 2:25 PM
Location: via Zoom (link on Discord)

Tentative Schedule

  • Sep 11: Lecture 1: What is Data Science?
  • Sep 18: Lecture 2: Working with Data.
  • Sep 25: Lecture 3: Visualization and Exploration.
  • Oct 2: Lecture 4: Machine Learning I.
  • Oct 9: Lecture 5: Machine Learning II.
  • Oct 16: Project Proposal Presentations.
  • Oct 23: No Class - Fall Break.
  • Oct 30: IBM Cognos Analytics Tutorial by Matthew Denham.
  • Nov 6: Paper Presentations.
  • Nov 13: Exploring Next-Gen Cloud Data Analytics: Analytics, Trends & Innovation by Mohamed Sharaf.
  • Nov 20: Tableau Tutorial by Josh Gillmore (TBC).
  • Nov 27: Guest Lectures by X and Y (TBC).
  • Dec 4: Final Project Presentations.
*This schedule is tentative. Any changes will be shared over Discord.*

Evaluation

  • Project Proposal Presentation (15%): Oct 16
  • Paper Presentation (15%): Nov 6
  • Project Presentation (10%): Dec 4
  • Presentation Discussion (10%): Oct 16 and Dec 4
  • Final Project Report (50%): Dec 11

Course Project

For the paper presentation, each group will choose a data science conference paper and present it in a 15-minute talk. Paper selections are due by Oct 17.

The project proposal presentation should include at least:
  • Overview of the Problem: What is the problem? Why is it important? Has it been solved before? If yes, why are the existing approaches not sufficient?
  • Overview of the Solution: What is the proposed solution?
  • Datasets Used: Describe the datasets, including their source, how they were constructed, their statistics, and how you are using them.

Present your findings in a 20-minute conference-style talk on Dec 4. The talk should cover the same points as the project proposal presentation above, in updated form, plus your experiments and findings.

Submit a final 8-10 page report in ACM Conference Proceedings format by 11:59 PM on Dec 11. Email the report to the instructor (the address is at the top of this page).

University Policies

Academic Integrity

Every student must be familiar with Carleton's Academic Integrity policy. Plagiarism or unauthorized collaboration will be sanctioned. The use of any AI system (e.g., ChatGPT) for projects is considered academic misconduct, with an exception for grammar-checking tools. Please review the official policies on the university website.

View Accommodation Policies
Academic Integrity at Carleton University