Friday, February 7, 2014

Hadoop Developer Certification resources

Cut and paste of some resources on Hadoop Developer Certification.

http://www.fromdev.com/2010/12/interview-questions-hadoop-mapreduce.html

"To clear this test you need to have a very good understanding of the flow of data in Hadoop, i.e. how the files are stored and read. You should be able to visualize on how the MapReduce programs interact with data and how they process them as key-value pairs."

QUESTION NO: 1
When is the earliest point at which the reduce method of a given Reducer can be called?
A. As soon as at least one mapper has finished processing its input split.
B. As soon as a mapper has emitted at least one record.
C. Not until all mappers have finished processing all records.
D. It depends on the InputFormat used for the job.
Answer: C
Explanation: In a MapReduce job, reducers do not start executing the reduce method until all map tasks have completed. Reducers start copying intermediate key-value pairs from the mappers as soon as they are available, but the programmer-defined reduce method is called only after all the mappers have finished.

QUESTION NO: 2
Which describes how a client reads a file from HDFS?
A. The client queries the NameNode for the block location(s). The NameNode returns the block location(s) to the client. The client reads the data directly off the DataNode(s).
B. The client queries all DataNodes in parallel. The DataNode that contains the requested data responds directly to the client. The client reads the data directly off the DataNode.
C. The client contacts the NameNode for the block location(s). The NameNode then queries the DataNodes for block locations. The DataNodes respond to the NameNode, and the NameNode redirects the client to the DataNode that holds the requested data block(s). The client then reads the data directly off the DataNode.
D. The client contacts the NameNode for the block location(s). The NameNode contacts the DataNode that holds the requested data block. Data is transferred from the DataNode to the NameNode, and then from the NameNode to the client.
Answer: A
Explanation: Client communication with HDFS happens through the Hadoop HDFS API. Client applications talk to the NameNode whenever they wish to locate a file, or when they want to add/copy/move/delete a file on HDFS. The NameNode responds to successful requests by returning a list of relevant DataNode servers where the data lives. Client applications can talk directly to a DataNode once the NameNode has provided the location of the data.
Reference: 24 Interview Questions & Answers for Hadoop MapReduce developers, How the Client communicates with HDFS?
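
The read path described in the explanation can be sketched as a toy model in plain Java. This is only an illustration: ToyNameNode, ToyDataNode, ToyLocatedBlock and readFile are hypothetical names, not Hadoop classes, and the block ids and file path are made up. The point it shows is that the NameNode hands back locations only, and the bytes travel from the DataNodes straight to the client.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the HDFS read path. None of these types are Hadoop APIs.
class HdfsReadSketch {
    // Stand-in for a DataNode: stores block contents keyed by block id.
    static class ToyDataNode {
        private final Map<String, String> blocks = new HashMap<>();
        void store(String blockId, String data) { blocks.put(blockId, data); }
        String readBlock(String blockId) { return blocks.get(blockId); }
    }

    // A block id paired with a DataNode that holds a replica of it.
    static class ToyLocatedBlock {
        final String blockId;
        final ToyDataNode holder;
        ToyLocatedBlock(String blockId, ToyDataNode holder) {
            this.blockId = blockId;
            this.holder = holder;
        }
    }

    // Stand-in for the NameNode: keeps metadata only, never block contents.
    static class ToyNameNode {
        private final Map<String, List<ToyLocatedBlock>> metadata = new HashMap<>();
        void register(String path, List<ToyLocatedBlock> blocks) { metadata.put(path, blocks); }
        // Step 1: the client asks where the blocks of a file live.
        List<ToyLocatedBlock> getBlockLocations(String path) { return metadata.get(path); }
    }

    // Step 2: the client reads each block directly from the DataNode holding it.
    static String readFile(ToyNameNode nameNode, String path) {
        StringBuilder contents = new StringBuilder();
        for (ToyLocatedBlock block : nameNode.getBlockLocations(path)) {
            contents.append(block.holder.readBlock(block.blockId));
        }
        return contents.toString();
    }

    static String demo() {
        ToyDataNode dn1 = new ToyDataNode();
        ToyDataNode dn2 = new ToyDataNode();
        dn1.store("blk_1", "hello ");
        dn2.store("blk_2", "hdfs");
        ToyNameNode nameNode = new ToyNameNode();
        nameNode.register("/demo.txt", Arrays.asList(
                new ToyLocatedBlock("blk_1", dn1),
                new ToyLocatedBlock("blk_2", dn2)));
        return readFile(nameNode, "/demo.txt");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hello hdfs"
    }
}
```

In the model, file data never passes through ToyNameNode, which is what rules out the option where data is relayed through the NameNode.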

QUESTION NO: 3
You are developing a combiner that takes as input Text keys, IntWritable values, and emits Text keys, IntWritable values. Which interface should your class implement?
A. Combiner <Text, IntWritable, Text, IntWritable>
B. Mapper <Text, IntWritable, Text, IntWritable>
C. Reducer <Text, Text, IntWritable, IntWritable>
D. Reducer <Text, IntWritable, Text, IntWritable>
E. Combiner <Text, Text, IntWritable, IntWritable>
Answer: D
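
Hadoop has no Combiner interface: a combiner is a Reducer whose type parameters are ordered <input key, input value, output key, output value>, and it is registered on the job with setCombinerClass. The sketch below is a dependency-free illustration of that parameter order, not real Hadoop code. ToyReducer and ToyCollector are hypothetical stand-ins for Hadoop's Reducer and its output collector, and String and Integer stand in for Text and IntWritable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class CombinerSketch {
    // Hypothetical stand-in for Hadoop's output collector.
    interface ToyCollector<K, V> {
        void collect(K key, V value);
    }

    // Hypothetical stand-in that mirrors only the type-parameter order of
    // Hadoop's Reducer: <input key, input value, output key, output value>.
    interface ToyReducer<KI, VI, KO, VO> {
        void reduce(KI key, Iterator<VI> values, ToyCollector<KO, VO> out);
    }

    // A summing combiner: takes (String, Integer) and emits (String, Integer),
    // the analogue of Reducer<Text, IntWritable, Text, IntWritable>.
    static class SumCombiner implements ToyReducer<String, Integer, String, Integer> {
        @Override
        public void reduce(String key, Iterator<Integer> values, ToyCollector<String, Integer> out) {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }
            out.collect(key, sum);
        }
    }

    // Runs the combiner on one key and returns the emitted "key<TAB>value" records.
    static List<String> demo() {
        List<String> emitted = new ArrayList<>();
        new SumCombiner().reduce("hadoop", Arrays.asList(1, 1, 1).iterator(),
                (k, v) -> emitted.add(k + "\t" + v));
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Option C fails on the parameter order: Reducer <Text, Text, IntWritable, IntWritable> would mean Text values in and IntWritable keys out. Because a combiner's output is fed to the reducer, its input and output types both have to match the map output types.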

QUESTION NO: 4
Identify the utility that allows you to create and run MapReduce jobs with any executable or script as the mapper and/or the reducer.
A. Oozie
B. Sqoop
C. Flume
D. Hadoop Streaming
E. mapred
Answer: D
Explanation: Hadoop streaming is a utility that comes with the Hadoop distribution. The utility allows you to create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer.
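
As a minimal sketch of what "any executable" means here, the program below could serve as a streaming mapper: it reads input lines from standard input and writes key<TAB>value records to standard output, which is the default contract between Hadoop Streaming and the mapper process. The class name StreamingWordMapper and the word-count logic are assumptions chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Word-count style mapper for Hadoop Streaming (illustrative sketch).
class StreamingWordMapper {
    // Turns one input line into "word<TAB>1" records, skipping empty tokens.
    static List<String> mapLine(String line) {
        List<String> records = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                records.add(word + "\t1");
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Streaming feeds the input split to the process line by line on stdin.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            for (String record : mapLine(line)) {
                System.out.println(record);
            }
        }
    }
}
```

The command that launches such a program is what gets passed to the streaming jar's -mapper option. A shell, Python or Perl script that follows the same stdin/stdout convention works the same way.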
Reference: http://hadoop.apache.org/common/docs/r0.20.1/streaming.html (Hadoop Streaming, second sentence)

QUESTION NO: 5
How are keys and values presented and passed to the reducers during a standard sort and shuffle phase of MapReduce?
A. Keys are presented to reducer in sorted order; values for a given key are not sorted.
B. Keys are presented to reducer in sorted order; values for a given key are sorted in ascending order.
C. Keys are presented to a reducer in random order; values for a given key are not sorted.
D. Keys are presented to a reducer in random order; values for a given key are sorted in ascending order.
Answer: A
Explanation: Reducer has 3 primary phases:
1. Shuffle
The Reducer copies the sorted output from each Mapper using HTTP across the network.
2. Sort
The framework merge sorts Reducer inputs by keys (since different Mappers may have output the same key). The shuffle and sort phases occur simultaneously, i.e. while outputs are being fetched they are merged.
SecondarySort
To achieve a secondary sort on the values returned by the value iterator, the application should extend the key with the secondary key and define a grouping comparator. The keys will be sorted using the entire key, but will be grouped using the grouping comparator to decide which keys and values are sent in the same call to reduce.
3. Reduce
In this phase the reduce(Object, Iterable, Context) method is called for each <key, (collection of values)> in the sorted inputs.
The output of the reduce task is typically written to a RecordWriter via TaskInputOutputContext.write(Object, Object).
The output of the Reducer is not re-sorted.
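
The three phases above can be imitated in plain Java to see why keys arrive sorted while values do not. This is a simulation under stated assumptions, not Hadoop code: ShuffleSortSketch, its method names and the sample data are all hypothetical, a TreeMap stands in for the framework's merge sort, and the secondary sort is modelled by sorting on a composite (key, value) pair and then grouping on the key alone.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ShuffleSortSketch {
    static Map.Entry<String, Integer> kv(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Two mappers, each with output already sorted by key (made-up data).
    static List<List<Map.Entry<String, Integer>>> sampleMapOutputs() {
        List<Map.Entry<String, Integer>> mapper1 = Arrays.asList(kv("apple", 9), kv("pear", 2));
        List<Map.Entry<String, Integer>> mapper2 = Arrays.asList(kv("apple", 3), kv("fig", 7));
        return Arrays.asList(mapper1, mapper2);
    }

    // Default behaviour: keys end up sorted, values stay in arrival order.
    static Map<String, List<Integer>> shuffleAndSort(List<List<Map.Entry<String, Integer>>> mapOutputs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (List<Map.Entry<String, Integer>> oneMapper : mapOutputs) {
            for (Map.Entry<String, Integer> record : oneMapper) {
                grouped.computeIfAbsent(record.getKey(), k -> new ArrayList<>()).add(record.getValue());
            }
        }
        return grouped;
    }

    // Secondary sort idea: sort on the whole composite key, group on the natural key.
    static Map<String, List<Integer>> secondarySort(List<List<Map.Entry<String, Integer>>> mapOutputs) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> oneMapper : mapOutputs) {
            all.addAll(oneMapper);
        }
        // Sort comparator: natural key first, then the value as secondary key.
        all.sort((a, b) -> {
            int byKey = a.getKey().compareTo(b.getKey());
            return byKey != 0 ? byKey : a.getValue().compareTo(b.getValue());
        });
        // Grouping step: records with the same natural key go to one reduce call.
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> record : all) {
            grouped.computeIfAbsent(record.getKey(), k -> new ArrayList<>()).add(record.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        System.out.println(shuffleAndSort(sampleMapOutputs())); // {apple=[9, 3], fig=[7], pear=[2]}
        System.out.println(secondarySort(sampleMapOutputs()));  // {apple=[3, 9], fig=[7], pear=[2]}
    }
}
```

In the first result the values for apple come out as 9 then 3, which matches answer A: the framework guarantees key order only. Value order appears only once the value is made part of the sort key.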
Reference: org.apache.hadoop.mapreduce, Class Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT>












