How to get MapReduce output in a single file instead of multiple files in a Hadoop cluster on Google Cloud?
When I run my jar on a local multi-node Hadoop cluster, each job produces a single reducer output file.
When I run the same jar on Google Cloud, the output is split across multiple files (part-r-0000*). I need the output written to a single file instead. How can I do that?
Well, one simple solution is to configure the job to run with exactly one reducer. It seems the default setting on Google Cloud is different. See here for how to do that: setting number of reducers in mapreduce job in oozie workflow
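A minimal sketch of forcing a single reducer at submit time. The jar name, driver class, and input/output paths below are placeholders; this assumes your driver uses ToolRunner/GenericOptionsParser so that -D options are picked up:

```shell
# Submit the job with a single reduce task, so only one
# part-r-00000 file is produced (paths/classes are hypothetical):
hadoop jar myjob.jar com.example.MyDriver \
    -D mapreduce.job.reduces=1 \
    /input/dir /output/dir
```

Equivalently, the driver code can call job.setNumReduceTasks(1) before submitting. Note that one reducer means all reduce work runs on a single node, which can be slow for large outputs.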
Another way to deal with it is to have a concatenation script run at the end of your MapReduce job that stitches the part-r files together, i.e.
cat part-r-* >> alloutput
This may be a bit more complex if the files have headers, and you need to copy them to the local filesystem first.