Module Descriptors
DATA ANALYTICS
COMP63039
Key Facts
Digital, Technology, Innovation and Business
Level 6
20 credits
Contact
Leader: Tharaka Ilayperuma
Hours of Study
Scheduled Learning and Teaching Activities: 20
Independent Study Hours: 180
Total Learning Hours: 200
Pattern of Delivery
  • Occurrence A, Stoke Campus, UG Semester 1
  • Occurrence B, Stoke Campus, UG Semester 1
  • Occurrence C, Stoke Campus, UG Semester 1 to UG Semester 2
  • Occurrence D, Stoke Campus, UG Semester 1 to UG Semester 2
Sites
  • Stoke Campus
Assessment
  • A Technical Report 3000 words weighted at 100%
Module Details
INDICATIVE CONTENT
Data Quality
Overview – sources of data, traditional/structured vs. non-traditional/unstructured, storage paradigms
DAMA definitions of data quality, the Data Quality Lifecycle
Consequences of poor data quality
Decision-making processes within organisations
Using data to support decisions

Cleansing and Preparation of Data
Data cleansing, formatting, and filtering techniques in Microsoft Excel
Data cleansing, Extract-Transform-Load / Extract-Load-Transform processes in SQL Server Integration Services
The scientific method; formulating and testing hypotheses
Reproducibility and falsifiability in scientific experiments
Quantitative statistical testing methodologies – Student’s t-test and p-value calculation
Introduction to WEKA
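As an illustration of the hypothesis-testing topics above, the following is a minimal sketch of a two-sample Student's t-test with a two-sided p-value. It is written in Python using only the standard library, which is an assumption made for illustration and not one of the module's tools (the module itself uses Excel, WEKA and R). The function names and the sample values are hypothetical.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=10_000):
    """Two-sided p-value, integrating the density from 0 to |t| (Simpson's rule)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central_area)


def students_t_test(sample_a, sample_b):
    """Two-sample Student's t-test, assuming equal variances in both groups."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t_stat, df, two_sided_p_value(t_stat, df)


# Hypothetical measurements for two groups (made-up illustration data).
group_a = [12.1, 11.8, 12.6, 12.3, 11.9]
group_b = [12.9, 13.1, 12.7, 13.4, 12.8]

t_stat, df, p_value = students_t_test(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}")

# Null hypothesis: both groups share the same mean. Reject it at the 5% level
# only if the p-value falls below 0.05.
print("Reject null hypothesis" if p_value < 0.05 else "Fail to reject null hypothesis")
```

The pooled-variance form shown here is the classic Student's test. If the two groups may have unequal variances, Welch's variant would be the more cautious choice.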

Core Techniques
Linear regression using WEKA
Association Rule Mining using Apriori in WEKA
Descriptive statistics, grouping and aggregation using Microsoft Excel
Introduction to the R language, using RStudio
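To make the linear regression topic above concrete, here is a small sketch of an ordinary least squares fit for a single predictor. Python is used here purely for illustration and is an assumption, since the module teaches this technique with WEKA. The function names and data values are hypothetical.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Proportion of the variance in ys explained by the fitted line."""
    mean_y = statistics.mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: advertising spend (x) against units sold (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(spend, units)
print(f"units = {intercept:.2f} + {slope:.2f} * spend")
print(f"R^2 = {r_squared(spend, units, slope, intercept):.3f}")
```

The slope estimates the change in the response for a one-unit change in the predictor, and R^2 gives a rough indication of how well the line describes the data.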

Data Visualisation
Design practices in data visualisation
Data analysis and visualisation techniques in R
Connecting to external data sets in Excel
Graphs and charts in Excel; Power Query, Power View
Preparing dynamic reporting for Power BI
Using Shiny for custom reporting visualisations
ADDITIONAL ASSESSMENT DETAILS
1. Technical Report – Given a case study and data set: prepare the data set, identify some hypotheses (questions or suppositions), apply two or more data analysis techniques learned in this module against the data set to test these hypotheses and prepare one or more data visualisations to present the findings.
The report will contain a narrative description of the steps taken, including pre-processing the data, formal statements of the hypotheses and null hypotheses, a description of the techniques used to test the hypotheses, the outcomes of testing, screenshots/graphics depicting the outcomes of the analysis and a critical reflection on the process as a whole.
The report should include screenshots and code where appropriate.
There is no specific document format required. The total word count should be 3,000 words +/- 10%.

Appendices may be used for data listings and so forth and do not contribute to the word limit.
LEARNING STRATEGIES
Face-to-face/Online class-based sessions (19 hours)
There are 19 hours of class-based teaching delivery presented face-to-face or online, which will include lectures, practical demonstrations and group work where appropriate.

Assessment Clinic (1 hour)
There is 1 hour of face-to-face/online teaching aimed at helping you complete your assignment. This will include both classroom-led guidance and an allocation of time for one-to-one support with your tutor.

Self-led Learning (180 hours)
You are expected to spend 180 hours in self-led learning. This includes working through supplied tutorials, tools practice and background/guided reading.
LEARNING OUTCOMES
1. Understand the importance and context of data analytics within commercial and non-commercial organisations.

Knowledge and Understanding

2. Solve problems in cleansing and preparing data for analysis.

Learning, Problem Solving

3. Apply several primary data analysis techniques to a variety of data sets.

Analysis

4. Use dynamic tooling and self-service BI and MI to provide rich, interactive data exploration environments for users.

Application
RESOURCES
1. Blackboard VLE for module information and learning materials

2. LinkedIn Learning (formerly Lynda.com), user/setup guide available via the Library: https://libguides.staffs.ac.uk/ld.php?content_id=33214004

3. Microsoft Teams for module communication

4. Staffordshire University library access (physical or digital) for access to recommended texts

5. It is strongly recommended that you have your own laptop, PC (Windows 10) or Mac (OS X / Big Sur) with Internet access and ability to install and configure software (administrative access).
If administrative access is unavailable on your computer, e.g. it is a loan computer or does not belong to you, then access to VMWare with a Windows image is recommended so that software can be installed.
VMWare download for students: https://staffsuniversity.sharepoint.com/sites/software/SitePages/VMWare.aspx
Windows 10 for Education available via Staffs OnTheHub (register and search for ‘Windows 10’): https://staffs.onthehub.com

Software List (all software is free of charge):

Microsoft Office (Excel)
Microsoft Office 365 Pro Plus available free to students: https://staffsuniversity.sharepoint.com/sites/software/SitePages/Home.aspx (Windows and Mac)

R and RStudio (Desktop edition)
https://cran.r-project.org/bin/windows/base/ (Windows)
https://cran.r-project.org/bin/macosx/ (Mac)
https://www.rstudio.com/products/rstudio/download/ (Windows and Mac)
You will install Shiny from R.

Microsoft Power BI Desktop
https://www.microsoft.com/en-gb/download/details.aspx?id=58494 (Windows)
Note: At the time of writing, Power BI Desktop is not available on Mac.
If you are a Mac user, you may wish to use a virtual machine to host and run this application (see above).
TEXTS
Recommended (not essential):

Labbe, B. (2019), Hands-On Business Intelligence with Qlik Sense: Implement self-service data analytics with insights and guidance from Qlik Sense experts, Packt Publishing, ISBN-10: 1789800943
Marr, B. (2020), The Intelligence Revolution: Transforming Your Business with AI, Kogan Page Publishing, ISBN-10: 1789664349
Dietrich, D. et al. (2015), Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, John Wiley & Sons, ISBN-13: 978-1118876138
Holmes, D. E. (2017), Big Data: A Very Short Introduction (Very Short Introductions), OUP Oxford, ISBN-10: 0198779577
Sharda, R. (2017), Business Intelligence, Analytics, and Data Science: A Managerial Perspective (4th Edition), Pearson, ISBN-10: 0134633288
Salcedo, J. (2019), Machine Learning for Data Mining: Improve your data mining capabilities with advanced predictive modelling, Packt Publishing, ISBN-10: 1838828974
Paul et al. (2020), Business Analysis, BCS Learning, ISBN 78-1-780-78017-277-4
Ahlemeyer-Stubbe, A. et al. (2014), A Practical Guide to Data Mining for Business and Industry, John Wiley & Sons, ISBN 1119977134
Park, A. (2020), Data Science for Beginners: This Book Includes: Python Programming, Data Analysis, Machine Learning. A Complete Overview to Master the Art of Data Science from Scratch Using Python, Independent Publishing, ISBN-13: 979-8645845551

Physical copy/copies are available in the Thompson Library (Stoke).

Other reading/papers etc. will be indicated in the module contents.
WEB DESCRIPTOR
Making sense of new data is vital to allow organisations to carry out business, respond to changes and take advantage of new opportunities. In this module, you will practise data analysis using formal methods of investigation and learn how to extract information and insights using algorithmic approaches. Using tools that include WEKA and R, this module provides a hands-on introduction to professional data analysis strategies and techniques that can be applied in many different environments.