General Information
Module Description: Big Data
Target Audience: Master 2 IAA
Instructor: Dr. Brahim Benabderrahmane
Course Overview
Welcome to the Big Data module. This course is designed to provide you with a deep theoretical understanding and practical mastery of the technologies required to store, process, and analyze massive datasets. Moving beyond traditional relational databases, we will explore the distributed computing paradigms that power modern data architectures.
The educational content on this platform is organized into chapters. For each chapter, you will find:
📂 Lecture Slides (Cours): The theoretical foundations.
📝 Directed Works (TD): Exercises and problem-solving scenarios with solutions.
💻 Practical Works (TP): Hands-on labs and coding guides.
Course Content Structure
The module follows a progressive learning path covering four main pillars:
Chapter 1: Introduction to Big Data: Understanding the "5 Vs," the limitations of vertical scaling, and the transition to distributed storage and computing.
Chapter 2: Hadoop Systems: Deep dive into the Hadoop ecosystem, including the HDFS architecture for storage and the MapReduce paradigm for batch processing.
Chapter 3: Apache Spark: Mastering in-memory processing. We cover the Spark architecture, RDDs, DataFrames, and the Machine Learning library (MLlib) for high-performance analytics.
Chapter 4: NoSQL & MongoDB: Transitioning from rigid schemas to flexible document stores. We will cover data modeling, CRUD operations, and advanced aggregation pipelines.
Evaluation & Grading Scheme
The final module grade is calculated based on a weighted average of three components:
Component | Weight | Description
Final Exam | 60% | Comprehensive written examination.
Directed Works (TD) | 20% | Continuous assessment of theoretical understanding.
Practical Works (TP) | 20% | Assessment of technical implementation skills.
Detailed Breakdown of Continuous Assessment
1. Directed Works (TD) / 20:
10 points: Written Interrogation (Quiz/Test).
06 points: Student Presentation (Exposé).
04 points: Participation & Attendance.
2. Practical Works (TP) / 20:
12 points: Mini-Project (Implementation of a complete Big Data pipeline).
08 points: Continuous Laboratory Evaluation (Weekly lab performance).
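As an illustration of how the weighted average works, the short sketch below combines the component marks listed above into a module grade. It assumes the final exam is also marked out of 20, which this page does not state. The function names and the sample marks are hypothetical and are not part of any official grading tool.

```python
# Weights (in percent) taken from the grading scheme on this page.
WEIGHTS = {"exam": 60, "td": 20, "tp": 20}


def check_range(value, maximum, label):
    """Reject marks outside the allowed range for a sub-component."""
    if not 0 <= value <= maximum:
        raise ValueError(f"{label} must be between 0 and {maximum}")
    return value


def td_mark(quiz, presentation, participation):
    """TD mark out of 20: quiz /10, presentation /6, participation /4."""
    return (check_range(quiz, 10, "quiz")
            + check_range(presentation, 6, "presentation")
            + check_range(participation, 4, "participation"))


def tp_mark(mini_project, lab_evaluation):
    """TP mark out of 20: mini-project /12, continuous lab evaluation /8."""
    return (check_range(mini_project, 12, "mini_project")
            + check_range(lab_evaluation, 8, "lab_evaluation"))


def module_grade(exam, td, tp):
    """Weighted average; ASSUMPTION: the exam is marked out of 20 like TD and TP."""
    check_range(exam, 20, "exam")
    total = WEIGHTS["exam"] * exam + WEIGHTS["td"] * td + WEIGHTS["tp"] * tp
    return total / 100


# Hypothetical student: exam 12/20, TD 7 + 5 + 4, TP 10 + 6.
td = td_mark(quiz=7, presentation=5, participation=4)
tp = tp_mark(mini_project=10, lab_evaluation=6)
grade = module_grade(exam=12, td=td, tp=tp)
print(f"TD = {td}/20, TP = {tp}/20, module grade = {grade}/20")
```

With these sample marks, the exam contributes 7.2 points and TD and TP contribute 3.2 points each, so continuous assessment can move the module grade by up to 8 of the 20 points.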
Please ensure you regularly check this page for updated slides and educational resources. Good luck with your semester!
