This course will explore advances in large-scale data repositories. Students will be exposed to advanced topics in analytical approaches to handling the five Vs of big data: volume, velocity, variability, veracity, and value. Parallel programming based on the MapReduce paradigm within the Hadoop ecosystem is used to address these challenges.
Data Science Track Curriculum
The courses you will take in the Master of Computer Science Online - Data Science Track are listed below. Based on your previous education, you may need to take some foundation courses before beginning these courses. The foundation courses are listed in the brochure, and an enrollment advisor can help determine whether they are necessary.
This course focuses on advanced topics in cloud environments (AWS, Google, Azure), including their economics, history, and differences, and on the importance of architecture decisions, such as those that drive analytics, data lakes, data marts, and data warehouses. The core concepts of virtual private clouds, instance types, microservices, and storage services will be addressed, in addition to deeper architecture concepts.
An introduction to the concepts, techniques, and applications of data warehousing and data mining. Topics include the design and implementation of data warehouses and OLAP operations; data mining concepts and methods, such as association rule mining, pattern mining, classification, and clustering; and the application of data mining techniques to complex types of data in various fields. Advanced topics in machine learning and statistics will also be covered.
Visualization of High Dimensional Data
Visualization as a tool for data analysis, recall, inference, and decision-making. This advanced class will explore high-dimensional data spaces and tools for visual description and presentation. Topics include the principles of effective visualization: data-visual mapping, interaction techniques, color theory, cognitive and perceptual psychology, and the human factors of visual depictions of data.