2023/2024  KAN-CINTO1011U  Big Data Management

English Title
Big Data Management

Course information

Language English
Course ECTS 7.5 ECTS
Type Mandatory (also offered as elective)
Level Full Degree Master
Duration One Semester
Start time of the course Autumn
Timetable Course schedule will be posted at calendar.cbs.dk
Study board
Study Board for BSc/MSc in Business Administration and Information Systems, MSc
Course coordinator
  • Jason Burton - Department of Digitalisation (DIGI)
Main academic disciplines
  • Information technology
  • Statistics and quantitative methods
Teaching methods
  • Face-to-face teaching
Last updated on 31-01-2023

Learning objectives
  • Explain the business value of big data and be able to deploy machine learning techniques to analyze it for broad applications, such as classification, regression, and clustering.
  • Evaluate methods for the testing and assessment of machine learning models and critically reflect on the meaning of findings.
  • Recognize the practical and ethical boundaries of machine learning and big data management.
  • Create a business case by identifying a valuable data set and working collaboratively to apply and justify an appropriate machine learning technique.
Course prerequisites
Students should have a basic understanding of quantitative data analysis and a willingness to work with computational methods.
Examination
Big Data Management:
Exam ECTS 7.5
Examination form Oral exam based on written product

To participate in the oral exam, students must hand in the written product by the set deadline, which falls before the oral exam. The grade is based on an overall assessment of the written product and the individual oral performance; see also the rules about examination forms in the programme regulations.
Individual or group exam Oral group exam based on written group product
Number of people in the group 2-4
Size of written product Max. 15 pages
Assignment type Project
Release of assignment Subject chosen by students themselves, see guidelines if any
Duration
Written product to be submitted on specified date and time.
Oral exam: 20 min. per student, including the examiners' discussion of the grade and the time taken to announce and explain the grade
Grading scale 7-point grading scale
Examiner(s) Internal examiner and second internal examiner
Exam period Autumn
Make-up exam/re-exam
Same examination form as the ordinary exam
Students can submit the same project or they can choose to submit a revised project.
Course content, structure and pedagogical approach

This course equips students with conceptual and technical knowledge of tools and techniques for analyzing large data sets, in particular machine learning models. Students will study how organizations leverage big data for innovation and value creation, learn how to implement and evaluate different types of machine learning models, and gain an understanding of the practical and ethical boundaries that accompany big data applications.

The course is planned to run in person and will consist of a weekly lecture and a weekly exercise session using Python and Jupyter Notebooks. Students with no prior programming experience will be encouraged to complete a basic online tutorial at the start of the course (e.g., https://pandas.pydata.org/pandas-docs/version/0.15/10min.html).

The course also includes an independently chosen project on big data management to be completed in groups of 2-4 students. Students will develop a business case, select an appropriate data set, implement machine learning models, and assess them from business and data science perspectives.
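As an illustration of the "implement and assess" step of such a project, the following is a minimal sketch in plain Python. It is not course material: the toy data set, the nearest-centroid classifier, and all function names (make_toy_data, split_holdout, fit_centroids, predict, accuracy) are hypothetical choices made for this example. The exercise sessions will presumably rely on established Python libraries rather than hand-written code like this.

```python
import random


def make_toy_data(n_per_class=50, seed=0):
    """Generate a hypothetical two-class data set of (point, label) pairs."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((point, label))
    return data


def split_holdout(data, test_fraction=0.3, seed=0):
    """Shuffle the data and hold out a fraction of it for testing."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def fit_centroids(train):
    """'Train' the model: compute the mean point (centroid) of each class."""
    centroids = {}
    for label in {lab for _, lab in train}:
        points = [p for p, lab in train if lab == label]
        centroids[label] = tuple(sum(c) / len(points) for c in zip(*points))
    return centroids


def predict(centroids, point):
    """Assign a point to the class whose centroid is closest."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lab: squared_distance(centroids[lab], point))


def accuracy(centroids, test):
    """Share of held-out points that the model classifies correctly."""
    correct = sum(predict(centroids, p) == lab for p, lab in test)
    return correct / len(test)


train, test = split_holdout(make_toy_data())
model = fit_centroids(train)
print(f"Hold-out accuracy: {accuracy(model, test):.2f}")
```

The accuracy is measured only on the held-out points, so it estimates how the model would perform on data it has not seen. A project report would go on to interpret such a figure from both a data science and a business perspective.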

 

The course will cover the following main topic areas:

  • Value creation through big data management
  • Machine learning tools and techniques, including classification, regression, and clustering
  • Model evaluation, ethical considerations, and decision analytic thinking
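To give a flavour of the regression and model evaluation topics listed above, here is a small sketch in plain Python. It assumes a simple one-variable least-squares fit; the function names (fit_line, mean_squared_error) and the numbers are invented for illustration and are not taken from the course.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between observed and predicted values."""
    residuals = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return sum(residuals) / len(residuals)


# Hypothetical numbers: fit on four points, evaluate on two unseen points.
train_x, train_y = [1, 2, 3, 4], [3.1, 4.9, 7.2, 8.8]
test_x, test_y = [5, 6], [11.0, 13.1]

slope, intercept = fit_line(train_x, train_y)
print("slope:", slope, "intercept:", intercept)
print("held-out MSE:", mean_squared_error(test_x, test_y, slope, intercept))
```

The error is computed on points that were not used to fit the line, which is the same idea behind the testing and assessment of models named in the learning objectives.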
Description of the teaching methods
A combination of in-person lectures and in-person, hands-on exercise sessions.
Feedback during the teaching period
Students will receive feedback in three ways throughout the course. (1) The lectures will incorporate anonymous polls that let students test their understanding of previously covered concepts and then ask questions in class. (2) During the exercise sessions, students will work in groups, receive peer-to-peer feedback, and have the opportunity to get specialised feedback from the professor on the practical aspects of the course. (3) Students may submit a brief project plan midway through the course, on which the professor will provide written comments.
Student workload
Lectures 20 hours
Exercises 20 hours
Class Preparation 106 hours
Exam and Preparation for Exam 60 hours
Total 206 hours
Expected literature

The literature may change before the semester starts. Students are advised to check the final literature list on Canvas before buying any material.

  • Provost, F., & Fawcett, T. (2013). Data Science for Business: What you need to know about data mining and data-analytic thinking. O'Reilly Media, Inc.
  • Müller, A. C., & Guido, S. (2016). Introduction to machine learning with Python: a guide for data scientists. O'Reilly Media, Inc.
  • Bollier, D., & Firestone, C. M. (2010). The promise and peril of big data (pp. 1-66). Washington, DC: Aspen Institute, Communications and Society Program.