hadoop - Accessing a file in the Mapper through the Distributed Cache


I want to access the contents of a file that is distributed to my mapper through the distributed cache. The code below only gets the name of the cached file. Please help me get access to the contents of the file.

    import java.io.IOException;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.util.StringUtils;

    public class DistCacheExampleMap extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        Text a = new Text();
        Path[] dates = new Path[0];

        public void configure(JobConf conf) {
            try {
                // Local paths of the files placed in the distributed cache
                dates = DistributedCache.getLocalCacheFiles(conf);
                String astr = dates.toString();
                a = new Text(astr);
            } catch (IOException ioe) {
                System.err.println("Caught exception while getting cached files: "
                        + StringUtils.stringifyException(ioe));
            }
        }

        @Override
        public void map(LongWritable key, Text value,
                OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
            String line = value.toString();
            // Emits only the cached file's name, not its contents
            for (Path cacheFile : dates) {
                output.collect(new Text(line), new Text(cacheFile.getName()));
            }
        }
    }

Try this in your configure() method instead:

    List<String[]> lines;
    Path[] files = new Path[0];

    public void configure(JobConf conf) {
        lines = new ArrayList<>();
        BufferedReader SW;
        try {
            files = DistributedCache.getLocalCacheFiles(conf);
            // Open the first cached file from its local path
            SW = new BufferedReader(new FileReader(files[0].toString()));
            String line;
            while ((line = SW.readLine()) != null) {
                // Each entry is a String array in which each element is a column
                lines.add(line.split(","));
            }
            SW.close();
        } catch (IOException ioe) {
            System.err.println("Caught exception while getting cached files: "
                    + StringUtils.stringifyException(ioe));
        }
    }

This way, the contents of the file in the distributed cache (the first file, in this case) end up in the variable lines. Each entry in lines represents one line of the file as a String array, split on ','. So the first column of the first row is lines.get(0)[0], the third column of the second row is lines.get(1)[2], and so on.
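To make the indexing concrete, here is a minimal standalone sketch of the same parsing step without any Hadoop dependency. The class name `CachedFileLines`, the helper `readLines`, and the sample data are hypothetical and only for illustration. In the mapper, the reader would wrap the local path returned by the distributed cache rather than a string.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class CachedFileLines {

    // Reads comma-separated text and returns one String[] per line,
    // where each array element is one column of that line.
    static List<String[]> readLines(Reader source) throws IOException {
        List<String[]> lines = new ArrayList<>();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line.split(","));
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample standing in for the cached file's contents
        String sample = "a,b,c\nd,e,f\n";
        List<String[]> lines = readLines(new StringReader(sample));

        System.out.println(lines.get(0)[0]); // first column of the first row: a
        System.out.println(lines.get(1)[2]); // third column of the second row: f
    }
}
```

Note that `split(",")` does not handle quoted fields that contain commas, so this only suits simple delimited files.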

