Microsoft MOC 20775 SQL Server Training - Dynamics Edge

Microsoft Official Course 20775 SQL Server Training: Performing Data Engineering on Microsoft HDInsight

Dynamics Edge: 4.67 out of 5 stars, based on 80 reviews.*

Dynamics Edge offers MOC Class 20775 SQL Server training on many dates, so you can choose the one that is most convenient. Customized training may also be possible.

About this course

The main purpose of the course is to give students the ability to plan and implement big data workflows on HDInsight.

Who should attend:

The primary audience for this course is data engineers, data architects, data scientists, and data developers who plan to implement big data engineering workflows on HDInsight.

After completing this course, students will be able to:

Course Outline

Module 1: Getting Started with HDInsight

This module introduces Hadoop, the MapReduce paradigm, and HDInsight.
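As a rough illustration of the MapReduce paradigm mentioned above, the following plain-Python sketch counts words in three phases: map, shuffle, and reduce. It does not use Hadoop or HDInsight; the function names and the sample input are assumptions made for this example, and a real job would distribute each phase across cluster nodes.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


# Hypothetical sample input, used only for illustration.
sample = ["big data on HDInsight", "big data workflows"]
counts = word_count(sample)
```

Keeping the phases as separate functions mirrors how the paradigm separates per-record work (map) from per-key aggregation (reduce), which is what allows a framework to run them in parallel.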

Lessons

Lab : Working with HDInsight

Module 2: Deploying HDInsight Clusters

This module provides an overview of the Microsoft Azure HDInsight cluster types, in addition to the creation and maintenance of the HDInsight clusters. The module also demonstrates how to customize clusters by using script actions through the Azure Portal, Azure PowerShell, and the Azure command-line interface (CLI). This module includes labs that provide the steps to deploy and manage the clusters.

Lessons

Lab : Managing HDInsight clusters with the Azure Portal

Module 3: Authorizing Users to Access Resources

This module provides an overview of non-domain and domain-joined Microsoft HDInsight clusters, in addition to the creation and configuration of domain-joined HDInsight clusters. The module also demonstrates how to manage domain-joined clusters using the Ambari management UI and the Ranger Admin UI. This module includes the labs that will provide the steps to create and manage domain-joined clusters.

Lessons

Lab : Authorizing Users to Access Resources

Module 4: Loading data into HDInsight

This module provides an introduction to loading data into Microsoft Azure Blob storage and Microsoft Azure Data Lake storage. At the end of this lesson, you will know how to use multiple tools to transfer data to an HDInsight cluster. You will also learn how to load and transform data to decrease your query run time.

Lessons

Lab : Loading Data into your Azure account

Module 5: Troubleshooting HDInsight

In this module, you will learn how to interpret logs associated with the various services of a Microsoft Azure HDInsight cluster to troubleshoot any issues you might have with these services. You will also learn about Operations Management Suite (OMS) and its capabilities.

Lessons

Lab : Troubleshooting HDInsight

Module 6: Implementing Batch Solutions

In this module, you will look at implementing batch solutions in Microsoft Azure HDInsight by using Hive and Pig. You will also discuss the approaches for data pipeline operationalization that are available for big data workloads on an HDInsight stack.

Lessons

Lab : Implement Batch Solutions

Module 7: Design Batch ETL solutions for big data with Spark

This module provides an overview of Apache Spark, describing its main characteristics and key features. Before you start, it’s helpful to understand the basic architecture of Apache Spark and the different components that are available. The module also explains how to design batch Extract, Transform, Load (ETL) solutions for big data with Spark on HDInsight. The final lesson includes some guidelines to improve Spark performance.

Lessons

Lab : Design Batch ETL solutions for big data with Spark

Module 8: Analyze Data with Spark SQL

This module describes how to analyze data by using Spark SQL. You will learn to explain the differences between RDDs, Datasets, and DataFrames, identify the use cases for iterative and interactive queries, and describe best practices for caching, partitioning, and persistence. You will also look at how to use Apache Zeppelin and Jupyter notebooks, carry out exploratory data analysis, and then submit Spark jobs remotely to a Spark cluster.

Lessons

Lab : Performing exploratory data analysis by using iterative and interactive queries

Module 9: Analyze Data with Hive and Phoenix

In this module, you will learn about running interactive queries using Interactive Hive (also known as Hive LLAP or Live Long and Process) and Apache Phoenix. You will also learn about the various aspects of running interactive queries using Apache Phoenix with HBase as the underlying query engine.

Lessons

Lab : Analyze data with Hive and Phoenix

Module 10: Stream Analytics

The Microsoft Azure Stream Analytics service has built-in features and capabilities that make it an easy-to-use, flexible stream processing service in the cloud. This module discusses the advantages of using Stream Analytics for your streaming solutions in more detail. You will also compare features of Stream Analytics to other services available within the Microsoft Azure HDInsight stack, such as Apache Storm. You will learn how to deploy a Stream Analytics job, connect it to the Microsoft Azure Event Hub to ingest real-time data, and execute a Stream Analytics query to gain low-latency insights. After that, you will learn how Stream Analytics jobs can be monitored when deployed and used in production settings.

Lessons

Lab : Implement Stream Analytics

Module 11: Implementing Streaming Solutions with Kafka and HBase

In this module, you will learn how to use Kafka to build streaming solutions. You will also see how to use Kafka to persist data to HDFS by using Apache HBase, and then query this data.

Lessons

Lab : Implementing Streaming Solutions with Kafka and HBase

Module 12: Develop big data real-time processing solutions with Apache Storm

This module explains how to develop big data real-time processing solutions with Apache Storm.

Lessons

Lab : Developing big data real-time processing solutions with Apache Storm

Module 13: Create Spark Streaming Applications

This module describes Spark Streaming; explains how to use discretized streams (DStreams); and explains how to apply the concepts to develop Spark Streaming applications.
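To give a feel for the idea behind discretized streams, the sketch below cuts a sequence of timestamped events into fixed-length micro-batches and processes each batch on its own. This is plain Python, not the Spark Streaming API; the function names, the one-second batch interval, and the sample events are all assumptions made for illustration.

```python
from collections import Counter


def discretize(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-length batches.

    Batch i holds the values whose timestamps fall in
    [i * batch_interval, (i + 1) * batch_interval). Intervals with no
    events still produce an empty batch.
    """
    batches = {}
    for timestamp, value in events:
        batch_index = int(timestamp // batch_interval)
        batches.setdefault(batch_index, []).append(value)
    if not batches:
        return []
    return [batches.get(i, []) for i in range(max(batches) + 1)]


def count_per_batch(batches):
    """Apply the same aggregation (a value count) to every micro-batch."""
    return [Counter(batch) for batch in batches]


# Hypothetical events: (seconds since start, log level).
events = [(0.2, "error"), (0.9, "ok"), (1.1, "ok"), (3.5, "error")]
batches = discretize(events, batch_interval=1.0)
per_batch = count_per_batch(batches)
```

Treating the stream as a series of small batches lets batch-style operations be reused for streaming data, at the cost of latency that is bounded below by the batch interval.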

Lessons

Lab : Building a Spark Streaming Application

Prerequisites

This course requires that you meet the following prerequisites:


*NOTE: if an average rating and rating count are shown on this page, they are based on all reviews associated with Dynamics Edge that are shown on the review page, and are not restricted to reviews only for the particular courses offered on this page.