Prerequisite: CIS 321
The course will discuss data management techniques for storing and analyzing very large amounts of data. The emphasis will be on columnar databases and on MapReduce as a tool for creating parallel algorithms that can process data at this scale. In addition, discussions will focus on applications of Big Data in internet advertising, healthcare, and social network analysis. Topics include: introduction to the Big Data problem; current challenges, trends, and applications; columnar stores; distributed databases; the MapReduce paradigm and the Hadoop ecosystem; Locality Sensitive Hashing (LSH); dimensionality reduction; data streams; unstructured data processing; NoSQL; and NewSQL.