Module Learning Outcomes
1. DISCUSS CRITICALLY WHAT IS MEANT BY KNOWLEDGE DISCOVERY AND THE RELATIONSHIP BETWEEN MACHINE LEARNING, DATA MINING, ORGANISATIONAL DECISION MAKING AND DATA SCIENCE
Knowledge and Understanding
2. SHOW CLEAR UNDERSTANDING, AND BE ABLE TO EXPLAIN, APPLY AND CRITICALLY EVALUATE THE RESULTS OF MACHINE LEARNING APPLIED TO BUSINESS DATA TO SUPPORT DECISION MAKING
Analysis, Problem Solving, Application
3. DEMONSTRATE, EXPLAIN, APPLY AND CRITICALLY EVALUATE THE USE OF BIG DATA ANALYTICS TO SUPPORT BUSINESS DECISION MAKING
Analysis, Problem Solving
4. DISCUSS WHAT IS MEANT BY THE FAMILY OF DATABASE TECHNOLOGIES USUALLY REFERRED TO AS NOSQL; AND BE ABLE TO EXPLAIN THE CHARACTERISTICS, APPLICATIONS, STRENGTHS AND LIMITATIONS OF THIS FAMILY OF DATABASE TECHNOLOGIES
Knowledge and Understanding, Communication
5. DEMONSTRATE UNDERSTANDING OF THE PRINCIPLES OF CORPORATE DATA GOVERNANCE AND BE ABLE TO PROPOSE STRATEGIES FOR THE DESIGN OF CORPORATE DATA SYSTEMS.
Learning, Analysis, Problem Solving, Application
Module Additional Assessment Details
Assignment 1 -
A case study to model a business problem. Covers Learning Outcomes 1 to 3.
Assignment 2 -
A supporting presentation to accompany Assignment 1. Covers Learning Outcomes 1 to 3.
Assignment 3 -
Build and demonstrate a NoSQL application (provide a corporate data solution for a business problem). Covers Learning Outcomes 4 to 5.
Assignment 4 -
A presentation to support Assignment 3. Covers Learning Outcomes 4 to 5.
Module Indicative Content
Data Science
This module looks at Machine Learning Algorithms, Data Mining and Big Data Analytics in the context of organisational decision making.
The content includes:
The nature of Knowledge Discovery, and the role and contribution of Machine Learning, Data Mining, Organisational Decision Making and Data Science.
Data Quality and ethics in machine learning and Big Data Analytics
The nature of Big Data and Big Data Analytics and the selection of analysis strategies
Professional issues and obligations in relation to data analysis
Machine Learning and Data Mining algorithms using the Weka environment or optionally the R statistical software package, including
o Classification
o Clustering
o Association Rules
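As a small illustration of the association rules topic listed above, the sketch below computes the two standard measures, support and confidence, for a single rule. It is written in plain Python rather than Weka or R, and the transactions and the function names `support` and `confidence` are hypothetical, chosen only for this example.

```python
# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the combined itemset divided by support of the antecedent."""
    combined = set(antecedent) | set(consequent)
    return support(combined, transactions) / support(antecedent, transactions)


# Rule under test: {bread} -> {milk}
print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds; this sketch only evaluates one rule by hand.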
Visualisation and communication of the results of analysis
Big Data technologies such as Hadoop and MapReduce, or their successors as these become available
Advanced Data Management
You will develop practical skills in NoSQL technologies and also gain an understanding of corporate data governance and the way in which NoSQL and Relational technologies can be used to develop the data strategy for the organisation.
The content includes:
Design of a NoSQL database. You will work with MongoDB (a document-oriented datastore) and will look at other NoSQL technologies, including CouchDB.
Comparison of NoSQL and relational design; understanding of the design challenge of unstructured data
You will develop practical skills in the use of MongoDB. The module does not assume any prior knowledge of MongoDB or NoSQL but does require an understanding of relational database development. Programming skills in JavaScript and Python would be useful but are not required as the elements needed to use MongoDB are taught from scratch. The elements covered in MongoDB will include
o Use of JavaScript, Python and MongoDB interfaces
o Building collections and defining documents, importing data, referencing and embedding
o Data structures and inheritance
o Querying, manipulating and exporting data
o Working with structured and unstructured data
o Configuration and security
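The difference between referencing and embedding, listed above, can be sketched with plain Python dictionaries standing in for documents. This is a minimal illustration that does not use a MongoDB driver; the field names, the sample data and the helper `resolve_customer` are all hypothetical.

```python
import json

# Embedded design: the customer details are stored inside the order document.
order_embedded = {
    "_id": 101,
    "customer": {"name": "Example Customer", "email": "customer@example.com"},
    "lines": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Referenced design: customers live in their own collection (here, a dict
# keyed by _id) and the order stores only the customer's identifier.
customers = {
    1: {"_id": 1, "name": "Example Customer", "email": "customer@example.com"},
}
order_referenced = {
    "_id": 101,
    "customer_id": 1,
    "lines": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}


def resolve_customer(order, customers):
    """Return a copy of order with the referenced customer looked up and embedded."""
    resolved = {k: v for k, v in order.items() if k != "customer_id"}
    customer = dict(customers[order["customer_id"]])  # copy, so the source is untouched
    customer.pop("_id")
    resolved["customer"] = customer
    return resolved


print(json.dumps(resolve_customer(order_referenced, customers), indent=2))
```

Embedding returns everything in one read but duplicates customer details across orders; referencing avoids the duplication at the cost of a second lookup, which is the trade-off the design sessions explore.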
Exploring what is meant by NoSQL and how this relates to and contrasts with relational database technologies
NoSQL and relational use cases
An examination of what is meant by corporate data governance; looking at organisational data strategies and the development of integrated data management solutions.
Professional issues relating to corporate data governance
Module Learning Strategies
One 2-hour practical session per week. Theoretical elements will be integrated into the practical sessions.
Module Texts
Cox, I. (2016) Visual Six Sigma: Making Data Analysis Lean 2nd Edition John Wiley and Sons ISBN: 1118905687; 9781118905685
Du, H., (2013) Data Mining Techniques and Applications: an introduction Cengage ISBN 978-84480-891-5
Marr, B. (2016) Big data in practice: how 45 successful companies used big data analytics to deliver extraordinary results Wiley ISBN: 1119231388; 9781119231387
Bengfort, B. and Kim, J. (2016) Data analytics with Hadoop: an introduction for data scientists O'Reilly ISBN: 1491913703; 9781491913703
Data Protection Act 2018 and GDPR 2018
Ladley J. (2012) Data Governance: How to Design, Deploy and Sustain an Effective Data Governance Program Morgan Kaufmann ISBN-13: 978-0124158290 0124158293
MongoDB.com https://docs.mongodb.com/ - the authoritative source for MongoDB documentation.
Module Resources
R statistical package - latest stable version
Microsoft Office
Internet access
Access to ISO standards ISO 8000-8:2015 Data Quality
Access to Lab
MongoDB
CouchDB
SQL Server Enterprise
NoSQL podcasts and forums, including MongoDB.com and http://nosql-database.org/
Module Special Admissions Requirements
None