  1. frameworks - Simple explanation of MapReduce? - Stack Overflow

    Aug 26, 2008 · MapReduce is a method to process vast amounts of data in parallel without requiring the developer to write any code other than the map and reduce functions. The map function takes …

  2. Good MapReduce examples - Stack Overflow

    Sep 12, 2012 · MapReduce is a framework originally developed at Google that allows for easy large scale distributed computing across a number of domains. Apache Hadoop is an open source …

  3. mapreduce - hadoop map reduce secondary sorting - Stack Overflow

    Aug 23, 2013 · Can anyone explain how secondary sorting works in Hadoop? Why must one use GroupingComparator, and how does it work in Hadoop? I was going through the link given below and …

  4. Difference between combiner and partitioner - Stack Overflow

    Apr 11, 2019 · I am a newbie to MapReduce and I just can't figure out the difference between the partitioner and the combiner. I know both run in the intermediate step between the map and reduce tasks and both …

  5. mapreduce - How to optimize shuffling/sorting phase in a hadoop job ...

    Dec 10, 2015 · mapreduce.shuffle.max.threads: Number of worker threads for copying the map outputs to reducers. mapreduce.reduce.shuffle.input.buffer.percent: How much of heap should be used for …

  6. How does the MapReduce sort algorithm work? - Stack Overflow

    MapReduce's use of input files and lack of schema support prevents the performance improvements enabled by common database system features such as B-trees and hash partitioning, though …

  7. Setting the number of map tasks and reduce tasks

    Jul 31, 2011 · A map task is spawned for each input split, so over the lifetime of a MapReduce job the number of map tasks equals the number of input splits. mapred.map.tasks is just a hint to the …

  8. mapreduce - How does Hadoop perform input splits? - Stack Overflow

    Difference between block size and input split size. An input split is a logical split of your data, used during data processing in a MapReduce program or other processing techniques. Input split size is …

  9. c# - Map and Reduce in .NET - Stack Overflow

    Jan 9, 2009 · What scenarios would warrant the use of the "Map and Reduce" algorithm? Is there a .NET implementation of this algorithm?

  10. What is the purpose of shuffling and sorting phase in the reducer in ...

    Mar 3, 2014 · Then, the MapReduce job stops at the map phase, and the map phase does not include any kind of sorting (so even the map phase is faster). Tom White has been an Apache Hadoop …
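
The phases these results keep referring to (map, partition, shuffle/sort, reduce) can be illustrated with a small single-process sketch. This is not Hadoop's API or implementation: the names `mapper`, `partition`, `shuffle`, `reducer`, and `run_job` are hypothetical, the partitioner is a toy byte-sum rule chosen only to be deterministic, and word count is used simply because it is the customary example.

```python
from collections import defaultdict
from itertools import chain


def mapper(line):
    # Map phase: emit one (word, 1) pair per word in the input line.
    for word in line.lower().split():
        yield word, 1


def partition(key, num_reducers):
    # Toy partitioner (an assumption for this sketch): decides which
    # reducer receives a key. Byte sum keeps it deterministic across runs.
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(pairs, num_reducers):
    # Shuffle/sort phase: route each pair to its partition, group the
    # values by key, and sort the keys within each partition.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[partition(key, num_reducers)][key].append(value)
    return [sorted(p.items()) for p in partitions]


def reducer(key, values):
    # Reduce phase: collapse all values seen for one key.
    return key, sum(values)


def run_job(lines, num_reducers=2):
    # One "map task" per input line stands in for one task per input split.
    mapped = chain.from_iterable(mapper(line) for line in lines)
    result = {}
    for part in shuffle(mapped, num_reducers):
        for key, values in part:
            word, total = reducer(key, values)
            result[word] = total
    return result


if __name__ == "__main__":
    print(run_job(["a b a", "b c"]))
```

In a real framework the map and reduce tasks run on different machines and the shuffle moves data over the network; here everything runs in one process so the data flow is easy to follow.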