Oracle Big Data Fundamentals Ed 2

Course ID: P4368 5 Days

Overview

In the Oracle Big Data Fundamentals course, you learn about big data, the technologies used to process it, and Oracle’s solution for handling it. You also learn to use Oracle Big Data Appliance to process big data and gain hands-on experience with Oracle Big Data Lite VM. You identify how to acquire raw data from a variety of sources and learn to use HDFS and Oracle NoSQL Database to store the data. You learn about the data integration options available in Oracle Big Data. These include Oracle Big Data Connectors, which move data to and from Oracle Database; Oracle Data Integrator and Oracle GoldenGate for Big Data, which provide integration and synchronization capabilities for unifying relational and Hadoop data; and Oracle Big Data SQL, which enables dynamic, integrated access to all of your data, whether it is stored in HDFS, NoSQL, or Oracle Database. Finally, you learn how to analyze your big data using Oracle Big Data SQL, Oracle Advanced Analytics, and Oracle Big Data Spatial and Graph.

Learn To:

  • Define Big Data.
  • Describe Oracle’s Integrated Big Data Solution and its components.
  • Define Cloudera’s distribution of Hadoop and its core components and the Hadoop ecosystem.
  • Use the Hadoop Distributed File System (HDFS).
  • Acquire big data using the Command Line Interface, Flume, and Oracle NoSQL Database.
  • Process big data using MapReduce, YARN, Hive, Oracle XQuery for Hadoop, Solr, and Spark.
  • Integrate big data and warehouse data using Sqoop, Oracle Big Data Connectors, Copy to Hadoop, Oracle Data Integrator, Oracle GoldenGate for Big Data, and Oracle Big Data SQL.
  • Analyze big data using Oracle Big Data SQL, Oracle Big Data Spatial and Graph, and Oracle Advanced Analytics technologies.
  • Use and manage Oracle Big Data Appliance.
  • Identify the key features and benefits of Oracle Big Data Cloud Service.
  • Identify the key features and benefits of Oracle Big Data Cloud Service – Compute Edition.

Benefits To You

In this course, you define the term big data and discuss Oracle’s Big Data solution and its use cases. You learn about Apache Hadoop and its core components (HDFS, YARN, and MapReduce), as well as some of the major projects in the Hadoop ecosystem. You learn how to acquire data into HDFS and Oracle NoSQL Database by using the CLI, Flume, and Kafka. To process the data stored in HDFS, you run MapReduce and Spark jobs.

You also explore a range of analysis options, including Oracle Advanced Analytics (OAA), which comprises Oracle Data Mining and Oracle R Enterprise, and Oracle Big Data Spatial and Graph.

You learn about the Oracle Big Data Appliance, Oracle Big Data Cloud Service, and Oracle Big Data Cloud Service – Compute Edition. You also study case scenarios for which Oracle Big Data is a good fit.

Description

AUDIENCE

  • Application Developers
  • Database Administrators
  • Database Developers

INVESTMENT

Instructor-led / Virtual Instructor-led

Singapore: Upon Request

PREREQUISITES

Suggested Prerequisites

  • Database Basics and Administration
  • Exposure to Big Data

COURSE CONTENT

Module 1: Introduction

  • Reviewing the Available Big Data Documentation, Tutorials, and Other Resources
  • Course Road Map
  • Course Objectives
  • Starting the Oracle BDLite VM and Accessing the Practice Files
  • Questions About You
  • Oracle Big Data Lite (BDLite) Virtual Machine (VM) Home Page

Module 2: Introducing Oracle Big Data Strategy

  • Big Data implementation examples
  • Importance of Big Data
  • Oracle strategy for Big Data: combining Big Data Processing Engines: Hadoop / NoSQL / RDBMS
  • Characteristics of Big Data
  • Big Data Opportunities: Some Examples
  • Big Data Challenges

Module 3: Using Oracle Big Data Lite Virtual Machine and Movieplex Application

  • Reviewing the Deployment Guide
  • Oracle Big Data Lite VM Home Page Sections
  • Introducing the Oracle Movieplex Case Study
  • Oracle Big Data Lite VM Used in this Course
  • Importing the Appliance File
  • Downloading and Running 7-Zip Files to Create the VirtualBox Appliance File
  • Downloading and Installing Oracle VM VirtualBox and its Extension Pack
  • Starting the Big Data Lite VM and Starting and Stopping Services

Module 4: Introduction to the Big Data Ecosystem

  • Cloudera’s Distribution Including Apache Hadoop (CDH)
  • Apache Hadoop
  • Types of Analysis That Use Hadoop
  • CDH Architecture and Components
  • Apache Hadoop Ecosystem
  • Computer Clusters and Distributed Computing
  • Types of Data Generated

Module 5: Introduction to the Hadoop Distributed File System

  • Sample Hadoop High Availability (HA) Cluster
  • HDFS Files and Blocks
  • Hadoop Distributed Filesystem (HDFS) Design Principles, Characteristics, and Key Definitions
  • Interacting With Data Stored in HDFS: Hue, Hadoop Client, WebHDFS, and HttpFS
  • DataNodes (DN) Daemons Functions
  • Writing a File to HDFS: Example

Module 6: Acquire Data using CLI, Fuse, Flume, and Kafka

  • Kafka topics
  • Additional Resources
  • Viewing File System Contents Using the CLI
  • What is Flume?
  • Overview of FuseDFS
  • Loading Data Using the CLI
  • Reviewing the Command Line Interface (CLI)

Module 7: Acquire and Access Data Using Oracle NoSQL Database

  • Oracle NoSQL models: Key-Value and Table
  • Accessing the KVStore
  • What is a NoSQL Database?
  • Accessing the CLIs (Data, Admin, SQL)
  • Acquiring and Accessing Data in a NoSQL DB
  • HDFS Compared to NoSQL
  • Define Oracle NoSQL Database

Module 8: Introduction to MapReduce and YARN Processing Frameworks

  • Data Locality Optimization in Hadoop
  • Parallel Processing with MapReduce
  • YARN Architecture, Features, and Daemons
  • Hadoop Basic Cluster: MapReduce 1 Versus YARN (MR 2)
  • MapReduce Framework Features, Benefits, and Jobs
  • YARN Application Workflow
  • Word Count Examples

Module 9: Resource Management Using YARN

  • Static Service Pools
  • Cloudera Manager Dynamic Resource Management: Example
  • Working with the Fair Scheduler
  • Cloudera Manager Resource Management Features
  • First In, First Out (FIFO) Scheduler, Capacity Scheduler, and Fair Scheduler
  • Submitting and Monitoring a MapReduce Job Using YARN
  • Job Scheduling in YARN

Module 10: Overview of Apache Spark

  • Benefits of Using Spark
  • Running a Spark Application on YARN (yarn-cluster Mode)
  • Spark Interactive Shells: spark-shell and pyspark
  • Spark Application Components: Driver, Master, Cluster Manager, and Executors
  • Monitoring Spark Jobs Using YARN’s ResourceManager Web UI
  • Word Count Example by Using Interactive Scala
  • Spark Architecture

Module 11: Overview of Apache Hive

  • What is Hive?
  • How is Data Stored in HDFS?
  • Big Data SQL on Top of Hive Data
  • Organizing and Describing Data With Hive
  • Defining Tables Over HDFS
  • Use Case: Storing Clickstream Data
  • Hive Queries
  • Hadoop Architecture

Module 12: Overview of Cloudera Impala

  • Hadoop: Some Data Access/Processing Options
  • Cloudera Impala: Programming Interfaces
  • How Impala Works with Hive
  • Cloudera Impala
  • How Impala Fits Into the Hadoop Ecosystem
  • Overview of Cloudera Impala
  • Cloudera Impala: Supported Data Formats

Module 13: Using Oracle XQuery for Hadoop

  • XQuery Transformation and Basic Filtering
  • XML Review
  • Viewing the Completed Query in YARN’s ResourceManager
  • Running an OXH Query
  • OXH Features
  • Oracle XQuery for Hadoop (OXH)
  • Using OXH: Installation, Functions, Adapters, and Configuration Properties

Module 14: Overview of Solr

  • Cloudera Search: Features
  • Overview of Solr
  • Apache Solr (Cloudera Search)
  • Cloudera Search Tasks
  • Indexing in Cloudera Search
  • Types of Indexing
  • The solrctl Command

Module 15: Integrating Your Big Data

  • Comparing Big Data Processing Engines
  • Unifying Data: A Typical Requirement
  • Introducing Data Unification Options

Module 16: Batch Loading Options

  • Oracle Copy to Hadoop
  • Oracle Loader for Hadoop

Module 17: Using Oracle SQL Connector for HDFS

  • Using OSCH
  • Performance Tuning
  • Loading: Choosing a Connector
  • Parallelism and Performance
  • Batch and Dynamic Loading: Oracle SQL Connector for HDFS
  • OSCH Architecture
  • Features

Module 18: Using Oracle Data Integrator and Oracle GoldenGate for Big Data

  • Oracle GoldenGate for Big Data
  • ODI’s Declarative Design
  • Using ODI with Big Data: Heterogeneous Integration with Hadoop Environments
  • Using ODI Studio
  • ODI Studio: Big Data Knowledge Modules
  • ETL and Synchronization: Oracle Data Integrator
  • ODI Knowledge Modules (KMs): Simpler Physical Design / Shorter Implementation Time
  • ODI Studio Components: Overview

Module 19: Using Oracle Big Data SQL

  • Query Performance Overview
  • Benefits: Virtualizes data access across Oracle Database, Hadoop and NoSQL stores
  • Overcoming Big Data Barriers
  • Barriers to Effective Big Data Adoption
  • Oracle Big Data SQL: The Hybrid Solution
  • Deployment Options

Module 20: Using Oracle Big Data Spatial and Graph

  • BDSG: Graph Analysis
  • Multimedia Analytics Framework
  • Deployment Options for Oracle BDSG
  • Oracle BDSG: Spatial Analysis
  • Graph and Spatial Analysis: All About Relationships
  • Additional Resources
  • Strategy (supported platforms, etc.)

Module 21: Using Oracle Advanced Analytics

  • OAA: Oracle Data Mining
  • OAA: Oracle R Enterprise

Module 22: Oracle Big Data Deployment Options

  • BDA Hardware and Integrated and Optional Software
  • Introduction to the Oracle Big Data Cloud Service – Compute Edition
  • Running the Oracle BDA Configuration Generation Utility
  • Administering and Securing the Oracle BDA
  • Introduction to the Oracle Big Data Appliance
  • Oracle BDA Mammoth Software Deployment Bundle
  • Introduction to the Oracle Big Data Cloud Service
