2023/2024  KAN-CDSCV1008U  Applied Machine Learning and Data Engineering in Business Context

English Title
Applied Machine Learning and Data Engineering in Business Context

Course information

Language English
Course ECTS 7.5 ECTS
Type Elective
Level Full Degree Master
Duration One Semester
Start time of the course Autumn
Timetable Course schedule will be posted at calendar.cbs.dk
Max. participants 60
Study board
Master of Science (MSc) in Business Administration and Data Science
Course coordinator
  • Raghava Rao Mukkamala - Department of Digitalisation (DIGI)
Main academic disciplines
  • Information technology
  • Statistics and quantitative methods
Teaching methods
  • Blended learning
Last updated on 01-02-2023

Learning objectives
To achieve the grade 12, students should meet the following learning objectives with no or only minor mistakes or errors:
  • Demonstrate a clear understanding of the mathematical concepts behind machine-learning algorithms and the ability to explain them to a non-technical business audience
  • Critically evaluate various cloud platform capabilities/services and recommend their applicability for a given business area.
  • Explain clearly to a business audience, in a logical and non-technical manner, how machine learning can improve profit or optimize costs.
  • Demonstrate excellent presentation skills in data storytelling and advise on how algorithms work.
  • Show how the algorithms scale on a cloud platform such as AWS or Azure
  • Summarize and conceptualize fundamental machine learning concepts, methods and applications
Course prerequisites
Students are expected to be familiar with, and have a good understanding of, Data Mining and Machine Learning concepts. They should have completed a proper machine learning course before taking this course.
Prerequisites for registering for the exam (activities during the teaching period)
Number of compulsory activities which must be approved (see section 13 of the Programme Regulations): 2
Compulsory home assignments
Each student must have 2 out of 3 home assignments approved in order to participate in the ordinary exam. All 3 mandatory assignments are completed in groups.

In the first mandatory assignment, the students will download a dataset, choose a suitable algorithm/data analytical method to analyse the dataset, and argue in a business-friendly manner why the chosen algorithm/method is suitable for the purpose (max 5 pages). In the second mandatory assignment, the students will develop an end-to-end machine-learning architecture in the cloud for their data analytics project (from the first mandatory assignment), using either the Azure or the AWS cloud infrastructure (max 5 pages). The third mandatory assignment focuses on developing a 10-slide executive PowerPoint presentation that is mainly targeted at communicating the machine-learning approach and data engineering architectures to a business audience and management, using the key principles of business communication.

There will not be any extra attempts provided to the students before the ordinary exam. If a student cannot participate due to documented illness, or if a student does not get the activities approved in spite of making a real attempt, then the student will be given one extra attempt before the re-exam: one individual home assignment (10 pages) which will make up for two mandatory activities.
Examination
Applied Machine Learning and Data Engineering in Business Context:
Exam ECTS 7.5
Examination form Oral exam based on written product

In order to participate in the oral exam, the written product must be handed in before the oral exam, by the set deadline. The grade is based on an overall assessment of the written product and the individual oral performance; see also the rules about examination forms in the programme regulations.
Individual or group exam Individual oral exam based on written group product
Number of people in the group 2-4
Size of written product Max. 15 pages
Assignment type Project
Release of assignment An assigned subject is released in class
Duration
Written product to be submitted on specified date and time.
20 min. per student, including examiners' discussion of grade, and informing plus explaining the grade
Grading scale 7-point grading scale
Examiner(s) Internal examiner and second internal examiner
Exam period Autumn
Make-up exam/re-exam
Same examination form as the ordinary exam
Course content, structure and pedagogical approach

This course aims to make some of the complex mathematical, cloud and data science concepts tangible to a business audience. The primary focus of the course content is to explain data mining, machine learning theories, and cloud and data engineering concepts in an animated and business-friendly fashion, so that students are motivated to gain a deep understanding of the models' assumptions and to evaluate their applicability in a given business context.

The course provides hands-on experience in using machine-learning methods to solve real-world problems in an organizational context, using suitable cloud technologies and business communication practices. It simulates the real-world processes that data scientists experience in companies, giving students hands-on experience of data-driven decision-making in an organizational setting. This course is ideal for students who have a strong technical background in machine learning and who want to enrich their skills in data engineering, cloud technologies, and business communication in order to make a smooth transition into data scientist or data engineer careers later on. The course is offered in collaboration with Capgemini and other Danish companies, so students will have an opportunity to interact with domain experts from various industry sectors such as Finance, Marketing, Supply chain and Energy.

The course is structured in three parts, giving the students a full overview of the methods, techniques, and practices currently used in industry for data-driven decision-making in an organizational setting, as follows.

 1. Machine Learning in Business

  • Applied linear regression
  • Applied PCA
  • Applied predictive and classification algorithms such as SVM, Random Forest and XGBoost
  • Applied deep learning algorithms, in particular LSTM

 2. Data Engineering and Cloud Technologies

   (using Amazon Web Services and Microsoft Azure cloud platforms)

  • Data management including compliance  
  • Ingestion process
  • Storage
  • Extract, Transform, Load (ETL) end-to-end processes
  • ML Computation
  • Visualization

 

 3. Business Communication Practices

  • How to introduce complex business questions
  • How to design the approach, state the assumptions and apply machine learning concepts in a business-friendly manner
  • How to design PowerPoint slides
  • How to clearly demonstrate the business benefits   
Description of the teaching methods
This course is a blended-learning course. Some of the lectures and exercises will be delivered online, while other activities, in particular a few lectures and the hands-on exercise workshops, will be conducted on campus. The hands-on exercises will be offered in the Python/R programming languages. Cloud platforms such as Amazon Web Services and Microsoft Azure will be used to provide hands-on experience with end-to-end cloud architectures and the automation of machine-learning processes. In addition, there will be several presentations by domain experts, data scientists and data engineers working in Danish industry.
Feedback during the teaching period
In-class and hands-on exercises will be used systematically to test students’ understanding of the course content and increase their ability to reproduce acquired knowledge and skills autonomously. Students will receive continuous in-class feedback on them.

As part of the course, the students will have to complete 3 mandatory assignments, one of which will be a business presentation exercise. The students will receive feedback on these mandatory activities. Moreover, feedback on the hands-on exercises will also be provided in the classroom.
Student workload
Lectures and Exercises 30 hours
Self study 50 hours
Prepare for the class 30 hours
E-learning 20 hours
Project work and report 50 hours
Presentations 20 hours
Exam and prepare 6 hours
Total 206 hours
Expected literature

Due to the rapidly evolving nature of the field, the reading list will be updated and has to be consulted at the start of the semester. Students are advised to check the syllabus on Canvas before buying any material.

 

The teaching material will include

  • Scientific Articles
  • Lecture slides
  • Computational Notebooks
  • E-Learning Resources
  • Readings
  • Hands-on coding and analytics exercises

 

Some suggested textbooks for reference:

  • Brink, H., Richards, J. W., & Fetherolf, M. (2019). Real-world machine learning. New York, United States: Manning Publications.

 
