Welcome to the web page of COMP 5118 - Trends in Big Data Management. This is a grad-level course for students at Carleton University and the University of Ottawa. Each year we focus on some research topics in the general field of data management. These research topics change from one course offering to another depending on what's new and hot. This term, we focus on the following topics: Question Answering, Knowledge Graphs, Data Cleaning, Data Integration, Graph Processing, Data Lake Management, Crowdsourcing, Data Exploration, and Training via Weak Supervision. Check the schedule below to see the list of papers that we will discuss this term. Most of the papers we will cover during the term are very recent and were published in top-tier conferences. This should give us a rough idea of what the data management research community is currently working on. Psst, this will also (hopefully) give you ideas for the course project, which you should take very seriously.
The class is on Mondays from 2:35 pm to 5:25 pm. The class will take place via Zoom. Links for each class will be posted on this page in the schedule table below.
Herzberg Laboratories 5433
1125 Colonel By Dr
Ottawa, Ontario K1S 5B6
613-520-2600 ext. 4254
myFirstName.myLastNameWithoutHyphen@carleton.ca
In this course, students will read and review papers for each class. During the class, some students will present the papers for the week; they and the rest of the class (including me) will then discuss these papers and our take on them. There is also a term-long project, which is worth the biggest chunk of your grade. The marks breakdown is as follows:
The research project could be any of the following:
The project can be done individually or in groups. However, the assessment will take into consideration how many students are in the group. E.g., if one student demonstrates contributions in her/his project that are equal to the contributions of a team of three students, students should expect a high variance in grades.
The project deliverables will be:
There will be 16 presentations throughout the term. This workload may not be evenly distributed among the students taking this class. Therefore, a student who gives one more presentation than the average will get a bonus. Each presentation should be 30 to 45 minutes long, followed by 30 to 45 minutes of discussion of the paper. The presenter should not only present the details of the paper, but also suggest discussion points at the end of the presentation.
The paper reviews are due at 1:00 PM on the day of the class. The format for the review is fixed: a summary of the paper, three or more strong points, three or more weak points, and any additional comments you may have on the paper. The number of fields required is small, but you are expected to elaborate. As a rough guide, if your review is written in a Word document, it should be at least one page long in a 12 pt font. Your two worst reviews will not count towards your grade.
Here are a few comments to consider when you write your reviews:
This is a seminar-based class, meaning that your participation in the class is essential. You are encouraged to ask questions, answer other students' questions, comment on the papers we discuss, etc.
Date | Topics | Papers | Speakers |
---|---|---|---|
September 14th | Course Introduction & Recent Game Changers in Data Management | N/A | Ahmed El-Roby |
September 21st | Graph Processing; Question Answering | | 1. Tina Yazdizadeh 2. Olusegun Odufuwa |
September 28th | Web Tables | | 1. Di Wu |
October 5th | Natural Language Interfaces | | 1. Zixun Xiang 2. Ben Deng |
October 12th | NO CLASS (Thanksgiving) | | |
October 19th | Data Cleaning; Crowdsourcing | | 1. Marzieyh Zamiri 2. Hiba El Tahan |
October 26th | NO CLASS (Fall Break) | | |
November 2nd | Data Integration; Knowledge Graph Mining | | 1. Keerthana Muthu Subash 2. Hanping Zhang |
November 9th | Data Lakes Management; Knowledge Graphs | | 1. Johnny Ma 2. Anthony Tasca |
November 16th | Data Exploration and Preparation | | 1. Rahatara Ferdousi 2. Ishtiaque Hossain |
November 23rd | Training Data with Weak Supervision | | 1. Keerthana Muthu Subash 2. Zixun Xiang |
November 30th | Web Tables | | 1. David Chiumera |
December 7th | Guest Speaker | Distributed Evaluation of Subgraph Queries Using Worst-case Optimal Low-Memory Dataflows | Khaled Ammar (Borealis AI) |