Exploring MapReduce Runs
Explore how changing the number of reducers influences MapReduce job output. Learn to run jobs with multiple reducers, examine output files, and navigate the Job History Server UI to track job execution on a Hadoop cluster.
In this lesson, we vary the number of reducers and observe how that affects the job's output. We'll increase the number of reducers to three.
Connect to the terminal below and execute the commands. Each command is explained later in the lesson. You can read the explanation first and then execute the commands in the terminal.
Start up the Hadoop cluster by running the command below:

```shell
/DataJek/startHadoop.sh
```

Next, upload the data file to HDFS:

```shell
hdfs dfs -copyFromLocal /DataJek/cars.data /
```
Now run the job with three reducers using the command below. The driver program takes the number of reducers as its third argument.

```shell
hadoop jar JarDependencies/MapReduceJarDependencies/MapReduce-1.0-SNAPSHOT.jar io.datajek.mapreduce.Driver /cars.data /MultipleReducers 3
```
If you run the program multiple times, change the output directory /MultipleReducers each time, because Hadoop will not overwrite an existing HDFS directory.
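Alternatively, if you prefer to reuse the same output path across runs, you can delete the previous run's output first with the standard HDFS removal command (a sketch, assuming the directory exists from an earlier run):

```shell
# Remove the previous run's output so the same output path can be reused.
hdfs dfs -rm -r /MultipleReducers
```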
Next, examine the output directory using the command below:

```shell
hdfs dfs -ls /MultipleReducers
```

You should observe the outcome shown in the screenshot.
The directory contains three files named according to the pattern part-r-0000*. The 'r' in the name indicates that the file is the output of a reduce task; the output of a map task would have an 'm' in place of the 'r'. We find three files because we requested three reducers.
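To see why every record for a given key lands in exactly one of these files, consider how Hadoop's default HashPartitioner assigns keys to reducers: `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`. The sketch below simulates that formula in plain Java. Note this is an approximation: Hadoop's `Text` type hashes the key's UTF-8 bytes, not `String.hashCode()`, so the actual reducer chosen for a key may differ from this simulation, and the sample keys besides ford are hypothetical.

```java
// Simplified simulation of Hadoop's default HashPartitioner.
// Real Hadoop uses Text.hashCode() (a hash over UTF-8 bytes), so the
// exact assignments may differ, but the mechanism is the same.
public class PartitionDemo {

    // Mirrors the HashPartitioner formula:
    // (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] keys = {"ford", "toyota", "honda", "bmw"};
        for (String key : keys) {
            // Every record with this key maps to the same partition number,
            // which is why each part-r-0000* file holds complete keys.
            System.out.println(key + " -> part-r-0000" + partition(key, 3));
        }
    }
}
```

Because the partition is a pure function of the key, two mappers emitting the same key always route their records to the same reducer.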
Now execute the following commands one by one in the terminal to examine the contents of the output produced by each reducer:

```shell
hdfs dfs -text /MultipleReducers/part-r-00000
hdfs dfs -text /MultipleReducers/part-r-00001
hdfs dfs -text /MultipleReducers/part-r-00002
```

You should see three files in the result directory, one for each reducer.
The results should be similar to the following screenshots. Note how each reducer receives all the values for the keys assigned to it; a key's values are never split across reducers. In our run, the third reducer processed the key ford.

The first reducer produces:

The second reducer produces:

The third reducer processes a single key:
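The guarantee that a key's values are never split across reducers comes from the shuffle phase: map output is grouped by key, and each key's entire group is sent to the one reducer its partition number selects. The sketch below simulates this in plain Java with hypothetical (key, value) pairs; it is an illustration of the idea, not Hadoop's actual implementation.

```java
import java.util.*;

// Sketch of the shuffle phase: map output is partitioned by key, and all
// values for a key are grouped together for exactly one reducer.
public class ShuffleDemo {

    static Map<Integer, Map<String, List<Integer>>> shuffle(
            List<Map.Entry<String, Integer>> mapOutput, int numReducers) {
        Map<Integer, Map<String, List<Integer>>> reducers = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : mapOutput) {
            // Same formula as the default HashPartitioner (simplified).
            int r = (kv.getKey().hashCode() & Integer.MAX_VALUE) % numReducers;
            reducers.computeIfAbsent(r, x -> new TreeMap<>())
                    .computeIfAbsent(kv.getKey(), x -> new ArrayList<>())
                    .add(kv.getValue());
        }
        return reducers;
    }

    public static void main(String[] args) {
        // Hypothetical (key, value) pairs emitted by the map tasks.
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("ford", 1), Map.entry("toyota", 1),
                Map.entry("ford", 1), Map.entry("honda", 1));
        shuffle(mapOutput, 3).forEach((r, groups) ->
                System.out.println("reducer " + r + " gets " + groups));
    }
}
```

Running this shows that both ford records end up in the same reducer's group, mirroring what we observed in the part-r-0000* files.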
If you click the button in the widget below, you can interact with the UI of the Job History Server (JHS). Although we don't cover it in depth, the JHS keeps an archive of all the jobs run on the Hadoop cluster. The above job was run only once. The first time you click the widget, it may take some time for the job to complete and appear in the JHS web UI, which runs on port 19888. The job also shows up in the YARN web UI, but here we want to demonstrate the JHS UI.
Note that the UI will not load in the widget below. Click the URL link beside the message "Your app can be found at", or wait for the Firefox message "Open Site in New Window" and click that. The Job History Server UI may be slow to load, so be patient.