Welcome to the web page of COMP 5118 - Trends in Big Data Management. This is a grad-level course for students at Carleton University and the University of Ottawa. Each year we focus on a set of research topics in the general field of data management. These topics change from one course offering to the next, depending on what's new and hot. This term, we focus on the following topics: Question Answering, Knowledge Graphs, Data Cleaning, Data Integration, Graph Processing, Data Lake Management, Crowdsourcing, Data Exploration, and Training via Weak Supervision. Check the schedule below for the list of papers that we will discuss this term. Most of these papers are very recent and were published in top-tier conferences. This should give us a rough idea of what the data management research community is currently working on. Psst, this will also (hopefully) give you ideas for the course project, which you should take very seriously.
The class meets on Tuesdays from 11:35 am to 2:25 pm via Zoom. Links for each class will be posted on this page in the schedule table below.
Herzberg Laboratories 5433
1125 Colonel By Dr
Ottawa, Ontario K1S 5B6
613-520-2600 ext. 4254
myFirstName.myLastNameWithoutHyphen@carleton.ca
In this course, students will read and review papers for each class. During the class, some students will present the papers for the week, and then they and the rest of the class (including me) will discuss these papers and our take on them. There is also a term-long project, which is worth the biggest chunk of your grade. Following is the grade breakdown:
The research project could be any of the following:
The project can be done individually or in groups. However, the assessment will take into consideration how many students are in the group. For example, if one student demonstrates contributions in his/her project that are equal to the contributions of a team of three students, students should expect a high variance in grades.
The project deliverables will be:
There will be 22 presentations throughout the term. This workload may not be evenly distributed among the students taking this class. Therefore, any student who gives one more presentation than the average will get a bonus. Each presentation should be 30 to 45 minutes long, followed by 30 to 45 minutes of discussion of the paper. The presenter should not only present the details of the paper, but also suggest discussion points at the end of his/her presentation.
The paper reviews are due at 11:00 AM on the day of the class. The format for the review is fixed: a summary of the paper, three or more strong points, three or more weak points, and any additional comments you may have on the paper. The number of required fields is small, but you are expected to be thorough. As a rough guide, a review written in a Word document should be at least one page long in 12 pt font. Your two worst reviews will not count towards your grade.
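The drop-two-worst rule can be illustrated with a short sketch. The numeric scores, the plain average, and the function name `review_average` are assumptions made purely for illustration; this page does not specify how reviews are scored or combined.

```python
def review_average(scores, dropped=2):
    """Average the review scores after discarding the `dropped` lowest ones.

    Hypothetical helper: the scoring scale and the use of a plain mean
    are assumptions, not the course's stated grading formula.
    """
    if len(scores) <= dropped:
        raise ValueError("need more reviews than the number dropped")
    # Sort ascending and skip the lowest `dropped` scores.
    kept = sorted(scores)[dropped:]
    return sum(kept) / len(kept)


# Six hypothetical review scores; the two lowest (4 and 6) are ignored.
scores = [7, 9, 4, 8, 10, 6]
print(review_average(scores))  # 8.5
```

Under these assumptions, one or two weak reviews do not lower the review component at all.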
Here are a few comments to consider when you write your reviews:
This is a seminar-based class, meaning that your participation in the class is essential. You are encouraged to ask questions, answer other students' questions, comment on the papers we discuss, etc.
Date | Topics | Papers | Speakers |
---|---|---|---|
January 11 | Course Introduction & Recent Game Changers in Data Management | N/A | Ahmed El-Roby |
January 18 | Graph Processing, Internet of Things | | 1. Vivek Thaker 2. Yanan Mao |
January 25 | Question Answering, Data Integration | | 1. Evelyn Yang 2. Jiahe Geng |
February 1 | Question Answering | | 1. Taoseef Ishtiak 2. Samin Azhan |
February 8 | Blockchains, Text-to-SQL | | 1. Yaqing Zhu 2. Elmira Adeeb |
February 15 | Web Tables, Data Cleaning | | 1. Mohammad Zarei 2. Raha Rashid |
February 22 | NO CLASS (Winter Break) | | |
March 1 | Sentiment Analysis, Data Discovery | | 1. Amirali Madani 2. Elmira Adeeb |
March 8 | Resource Allocation, Natural Language Processing | | 1. Masoumeh Haghighi 2. Robin Redhu |
March 15 | AI in Geospatial Applications | | 1. Zoya Shahcheraghi 2. Aagyapal Kaur |
March 22 | AI in Healthcare | | 1. Evelyn Yang 2. Yanan Mao |
April 5 | AI in Football | | 1. Satyadev Abhiram Pandravada 2. Satyadev Abhiram Pandravada |
April 12 | Data Discovery, NL to SQL | | 1. Vivek Thaker 2. Taoseef Ishtiak |