
You’re upgrading a Hadoop cluster from HDFS and MapReduce version 1 (MRv1) to one running HDFS and MapReduce version 2 (MRv2) on YARN. You want to set and enforce a block size of 128MB for all new files written to the cluster after the upgrade. What should you do?

A.

You cannot enforce this, since client code can always override this value

B.

Set dfs.block.size to 128M on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final

C.

Set dfs.block.size to 128M on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode

D.

Set dfs.block.size to 134217728 on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final

E.

Set dfs.block.size to 134217728 on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode
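Worth noting for this question: 134217728 bytes is 128 × 1024 × 1024, i.e. exactly 128MB, and the block size is applied by the writing client at file-creation time, which is why the client-side configuration matters. A minimal hdfs-site.xml sketch of the byte-valued form (in Hadoop 2 the canonical property name is dfs.blocksize, with dfs.block.size kept as a deprecated alias):

    <property>
      <name>dfs.block.size</name>
      <value>134217728</value>   <!-- 128 * 1024 * 1024 bytes -->
      <final>true</final>        <!-- disallow overrides from job or client configs -->
    </property>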

Your cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. What is the result when you execute: hadoop jar SampleJar MyClass on a client machine?

A.

SampleJar.jar is sent to the ApplicationMaster, which allocates a container for SampleJar.jar

B.

SampleJar.jar is placed in a temporary directory in HDFS

C.

SampleJar.jar is sent directly to the ResourceManager

D.

SampleJar.jar is serialized into an XML file which is submitted to the ApplicationMaster
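As background on what hadoop jar does under MRv2, the observable flow is sketched below; the staging location is configuration-dependent, so the path described in the comments is illustrative only:

    # Submit the job from a client machine
    hadoop jar SampleJar MyClass
    # 1. The client obtains a new application ID from the ResourceManager
    # 2. Job resources (the JAR, the job configuration, input split metadata)
    #    are copied into a temporary staging directory in HDFS
    # 3. The application is then submitted to the ResourceManager, which
    #    allocates a container for the ApplicationMaster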

You have recently converted your Hadoop cluster from a MapReduce version 1 (MRv1) architecture to a MapReduce version 2 (MRv2) on YARN architecture. Your developers are accustomed to specifying the number of map and reduce tasks (resource allocation) when they run jobs. A developer wants to know how to specify the number of reduce tasks when a specific job runs. Which method should you tell that developer to implement?

A.

MapReduce version 2 (MRv2) on YARN abstracts resource allocation away from the idea of “tasks” into memory and virtual cores, thus eliminating the need for a developer to specify the number of reduce tasks, and indeed preventing the developer from specifying the number of reduce tasks.

B.

In YARN, resource allocation is a function of megabytes of memory in multiples of 1024MB. Thus, they should specify the amount of memory they need by executing -D mapreduce.reduce.memory.mb=2048

C.

In YARN, the ApplicationMaster is responsible for requesting the resources required for a specific job. Thus, executing -D yarn.applicationmaster.reduce.tasks=2 will specify that the ApplicationMaster launch two task containers on the worker nodes.

D.

Developers specify reduce tasks in the exact same way for both MapReduce version 1 (MRv1) and MapReduce version 2 (MRv2) on YARN. Thus, executing -D mapreduce.job.reduces=2 will specify two reduce tasks.

E.

In YARN, resource allocation is a function of virtual cores specified by the ApplicationMaster making requests to the NodeManager, where a reduce task is handled by a single container (and thus a single virtual core). Thus, the developer needs to specify the number of virtual cores to the NodeManager by executing -D yarn.nodemanager.cpu-vcores=2
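For reference on the syntax appearing in these choices: the generic -D option is handled by GenericOptionsParser, so it only takes effect when the driver runs through ToolRunner. A minimal sketch, where MyJob.jar, MyDriver, and the paths are hypothetical placeholders:

    # Run a job with two reduce tasks; the space after -D follows the
    # hadoop CLI convention for generic options
    hadoop jar MyJob.jar MyDriver -D mapreduce.job.reduces=2 /data/input /data/output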

Table schemas in Hive are:

A.

Stored as metadata on the NameNode

B.

Stored along with the data in HDFS

C.

Stored in the Metastore

D.

Stored in ZooKeeper
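Background for this question: Hive keeps table and partition schemas in its metastore, a relational database behind the metastore service, while the table data itself lives in HDFS. A hive-site.xml sketch assuming a MySQL-backed metastore (the hostname and database name are hypothetical):

    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <!-- schemas are rows in this RDBMS, not files on the NameNode,
           in HDFS, or in ZooKeeper -->
      <value>jdbc:mysql://metastore-db.example.com:3306/hive_metastore</value>
    </property>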

For each YARN job, the Hadoop framework generates task log files. Where are Hadoop task log files stored?

A.

Cached by the NodeManager managing the job containers, then written to a log directory on the NameNode

B.

Cached in the YARN container running the task, then copied into HDFS on job completion

C.

In HDFS, in the directory of the user who generates the job

D.

On the local disk of the slave node running the task
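For context: while a container runs, its logs live on the local disk of the worker under the NodeManager's log directories; if log aggregation is enabled, they are uploaded into HDFS once the application finishes. A yarn-site.xml sketch using the stock Hadoop 2 property names (the local path shown is a common packaging default, not a universal one):

    <property>
      <name>yarn.nodemanager.log-dirs</name>
      <value>/var/log/hadoop-yarn/containers</value>  <!-- local disk on the worker -->
    </property>
    <property>
      <name>yarn.log-aggregation-enable</name>
      <value>true</value>  <!-- copy task logs into HDFS on application completion -->
    </property>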

A user comes to you, complaining that when she attempts to submit a Hadoop job, it fails. There is a directory in HDFS named /data/input. The JAR is named j.jar, and the driver class is named DriverClass.

She runs the command:

hadoop jar j.jar DriverClass /data/input /data/output

The error message returned includes the line:

PriviledgedActionException as:training (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException:

Input path does not exist: file:/data/input

What is the cause of the error?

A.

The user is not authorized to run the job on the cluster

B.

The output directory already exists

C.

The name of the driver has been spelled incorrectly on the command line

D.

The directory name is misspelled in HDFS

E.

The Hadoop configuration files on the client do not point to the cluster
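The diagnostic detail in the message is the file: scheme in "Input path does not exist: file:/data/input": the client resolved the path against the local filesystem rather than HDFS, which is what happens when the client's configuration files do not point at the cluster. A minimal client-side core-site.xml sketch (the hostname is hypothetical; 8020 is a common NameNode RPC port):

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://namenode.example.com:8020</value>
      <!-- without this, a path like /data/input resolves to file:/data/input -->
    </property>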

Your Hadoop cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. Can you configure a worker node to run a NodeManager daemon but not a DataNode daemon and still have a functional cluster?

A.

Yes. The daemon will receive data from the NameNode to run Map tasks

B.

Yes. The daemon will get data from another (non-local) DataNode to run Map tasks

C.

Yes. The daemon will receive Map tasks only

D.

Yes. The daemon will receive Reducer tasks only
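Such a compute-only worker is workable: the NodeManager registers with the ResourceManager and runs containers, but with no local DataNode, map tasks scheduled there lose data locality and stream their input blocks from remote DataNodes. A hedged sketch of bringing the node up with the standard Hadoop 2 daemon scripts (script locations vary by distribution):

    # On the compute-only worker: start only the NodeManager
    yarn-daemon.sh start nodemanager
    # Deliberately do NOT start a DataNode here:
    #   hadoop-daemon.sh start datanode    <-- skipped on this host
    # Map tasks that land here read their input over the network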

Given:

You want to clean up this list by removing jobs where the State is KILLED. Which command do you enter?

A.

yarn application -refreshJobHistory

B.

yarn application -kill application_1374638600275_0109

C.

yarn rmadmin -refreshQueue

D.

yarn rmadmin -kill application_1374638600275_0109
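For reference, the Hadoop 2 yarn CLI supports listing applications by state as well as killing them by ID; a short sketch using the application ID shown in the choices:

    # Show applications currently in the KILLED state
    yarn application -list -appStates KILLED
    # Kill a submitted or running application by its ID
    yarn application -kill application_1374638600275_0109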