Oracle Big Data Fundamentals

Introduction

·       Questions About You

·       Course Objectives

·       Course Road Map

·       Oracle Big Data Lite (BDLite) Virtual Machine (VM) Home Page

·       Starting the Oracle BDLite VM and Accessing the Practice Files

·       Reviewing the Available Big Data Documentation, Tutorials, and Other Resources

Introducing Oracle Big Data Strategy

·       Characteristics of Big Data

·       Importance of Big Data

·       Big Data Opportunities: Some Examples

·       Big Data Challenges

·       Big Data Implementation Examples

·       Oracle Strategy for Big Data: Combining Big Data Processing Engines (Hadoop / NoSQL / RDBMS)

Using Oracle Big Data Lite Virtual Machine and Movieplex Application

·       Oracle Big Data Lite VM Used in this Course

·       Oracle Big Data Lite VM Home Page Sections

·       Reviewing the Deployment Guide

·       Downloading and Installing Oracle VM VirtualBox and its Extension Pack

·       Downloading and Running 7-Zip Files to Create the VirtualBox Appliance File

·       Importing the Appliance File

·       Starting the Big Data Lite VM and Starting and Stopping Services

·       Introducing the Oracle Movieplex Case Study

Introduction to the Big Data Ecosystem

·       Computer Clusters and Distributed Computing

·       Apache Hadoop

·       Types of Analysis That Use Hadoop

·       Types of Data Generated

·       Apache Hadoop Core Components: HDFS, MapReduce (MR1), and YARN (MR2)

·       Apache Hadoop Ecosystem

·       Cloudera’s Distribution Including Apache Hadoop (CDH)

·       CDH Architecture and Components

Introduction to the Hadoop Distributed File System

·       Hadoop Distributed Filesystem (HDFS) Design Principles, Characteristics, and Key Definitions

·       Sample Hadoop High Availability (HA) Cluster

·       HDFS Files and Blocks

·       Functions of the Active and Standby Daemons (Services)

·       Functions of the DataNode (DN) Daemons

·       Writing a File to HDFS: Example

·       Interacting With Data Stored in HDFS: Hue, Hadoop Client, WebHDFS, and HttpFS

Acquire Data Using the CLI, Fuse, Flume, and Kafka

·       Reviewing the Command Line Interface (CLI)

·       Viewing File System Contents Using the CLI

·       FS Shell Commands

·       Loading Data Using the CLI

·       Overview of FuseDFS

·       What is Flume?

·       Kafka Topics

·       Additional Resources

Acquire and Access Data Using Oracle NoSQL Database

·       What is a NoSQL Database?

·       RDBMS Compared to NoSQL

·       HDFS Compared to NoSQL

·       Define Oracle NoSQL Database

·       Oracle NoSQL Models: Key-Value and Table

·       Acquiring and Accessing Data in a NoSQL DB

·       Accessing the CLIs (Data, Admin, SQL)

·       Accessing the KVStore

Introduction to MapReduce and YARN Processing Frameworks

·       MapReduce Framework Features, Benefits, and Jobs

·       Parallel Processing with MapReduce

·       Word Count Examples

·       Data Locality Optimization in Hadoop

·       Submitting and Monitoring a MapReduce Job

·       YARN Architecture, Features, and Daemons

·       YARN Application Workflow

·       Hadoop Basic Cluster: MapReduce 1 Versus YARN (MR 2)

Resource Management Using YARN

·       Job Scheduling in YARN

·       First In, First Out (FIFO) Scheduler, Capacity Scheduler, and Fair Scheduler

·       Cloudera Manager Resource Management Features

·       Static Service Pools

·       Working with the Fair Scheduler

·       Cloudera Manager Dynamic Resource Management: Example

·       Submitting and Monitoring a MapReduce Job Using YARN

·       Using the YARN application Command

Overview of Apache Spark

·       Benefits of Using Spark

·       Spark Architecture

·       Spark Application Components: Driver, Master, Cluster Manager, and Executors

·       Running a Spark Application on YARN (yarn-cluster Mode)

·       Resilient Distributed Dataset (RDD)

·       Spark Interactive Shells: spark-shell and pyspark

·       Word Count Example by Using Interactive Scala

·       Monitoring Spark Jobs Using YARN's ResourceManager Web UI

Overview of Apache Hive

·       What is Hive?

·       Use Case: Storing Clickstream Data

·       Hadoop Architecture

·       How is Data Stored in HDFS?

·       Organizing and Describing Data With Hive

·       Big Data SQL on Top of Hive Data

·       Defining Tables Over HDFS

·       Hive Queries

Overview of Cloudera Impala

·       Overview of Cloudera Impala

·       Hadoop: Some Data Access/Processing Options

·       Cloudera Impala

·       Cloudera Impala: Key Features

·       Cloudera Impala: Supported Data Formats

·       Cloudera Impala: Programming Interfaces

·       How Impala Fits Into the Hadoop Ecosystem

·       How Impala Works with Hive

Using Oracle XQuery for Hadoop

·       XML Review

·       Oracle XQuery for Hadoop (OXH)

·       OXH Features

·       OXH Data Flow

·       Using OXH: Installation, Functions, Adapters, and Configuration Properties

·       Running an OXH Query

·       XQuery Transformation and Basic Filtering

·       Viewing the Completed Query in YARN's ResourceManager

Overview of Solr

·       Overview of Solr

·       Apache Solr (Cloudera Search)

·       Cloudera Search: Key Capabilities

·       Cloudera Search: Features

·       Cloudera Search Tasks

·       Indexing in Cloudera Search

·       Types of Indexing

·       The solrctl Command

Integrating Your Big Data

·       Unifying Data: A Typical Requirement

·       Comparing Big Data Processing Engines

·       Introducing Data Unification Options

·       When To Use These Options?

Batch Loading Options

·       Apache Sqoop

·       Oracle Loader for Hadoop

·       Oracle Copy to Hadoop

Using Oracle SQL Connector for HDFS

·       Batch and Dynamic Loading: Oracle SQL Connector for HDFS

·       OSCH Architecture

·       Using OSCH Features

·       Parallelism and Performance

·       Performance Tuning

·       Key Benefits

·       Loading: Choosing a Connector

Using Oracle Data Integrator and Oracle GoldenGate for Big Data

·       ETL and Synchronization: Oracle Data Integrator

·       ODI’s Declarative Design

·       ODI Knowledge Modules (KMs): Simpler Physical Design / Shorter Implementation Time

·       Using ODI with Big Data: Heterogeneous Integration with Hadoop Environments

·       Using ODI Studio

·       ODI Studio Components: Overview

·       ODI Studio: Big Data Knowledge Modules

·       Oracle GoldenGate for Big Data

Using Oracle Big Data SQL

·       Barriers to Effective Big Data Adoption

·       Overcoming Big Data Barriers

·       Oracle Big Data SQL: The Hybrid Solution

·       Benefits: Virtualizes data access across Oracle Database, Hadoop and NoSQL stores

·       Using Oracle Big Data SQL

·       Query Performance Overview

·       Deployment Options

Using Oracle Big Data Spatial and Graph

·       Graph and Spatial Analysis: All About Relationships

·       What is Oracle Big Data Spatial and Graph (BDSG)?

·       Strategy (Supported Platforms, etc.)

·       BDSG: Graph Analysis

·       Oracle BDSG: Spatial Analysis

·       Multimedia Analytics Framework

·       Deployment Options for Oracle BDSG

·       Additional Resources

Using Oracle Advanced Analytics

·       Oracle Advanced Analytics (OAA)

·       OAA: Oracle Data Mining

·       OAA: Oracle R Enterprise

Oracle Big Data Deployment Options

·       Introduction to the Oracle Big Data Appliance

·       Running the Oracle BDA Configuration Generation Utility

·       Oracle BDA Mammoth Software Deployment Bundle

·       Using the Oracle BDA mammoth Utility

·       BDA Hardware and Integrated and Optional Software

·       Administering and Securing the Oracle BDA

·       Introduction to the Oracle Big Data Cloud Service

·       Introduction to the Oracle Big Data Cloud Service – Compute Edition