Assignment #3

1. Write the Java versions of mapper.py and reducer.py and execute them on Hadoop.
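The original mapper.py and reducer.py are not included here; assuming they implement the standard word-count example, a minimal sketch of the equivalent Java logic follows. It is written without Hadoop dependencies (in a real job the two methods would live in `Mapper`/`Reducer` subclasses using `org.apache.hadoop.io` types such as `Text` and `IntWritable`), and the class and method names are illustrative:

```java
import java.util.*;

// Standalone sketch of word-count map/reduce logic, no Hadoop classes.
// main() simulates the shuffle phase by grouping mapper output by key.
public class WordCount {

    // Map step: emit a (word, 1) pair for every word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce step: sum all counts emitted for one key.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) sum += c;
        return sum;
    }

    public static void main(String[] args) {
        // Simulated shuffle: group mapper output by key, then reduce each group.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : map("the quick brown fox the fox")) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            System.out.println(e.getKey() + "\t" + reduce(e.getKey(), e.getValue()));
        }
    }
}
```

On a cluster the same two methods would be wired into a Hadoop `Job` and run with `hadoop jar`; the shuffle simulated above is then performed by the framework between the map and reduce phases.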

2. Build an index of the data gathered by the web crawler you developed in your first assignment.
Put a file in HDFS containing all the data gathered by the crawler, in the form of a key
(weblink) and a value (all the text). Given a set of words, your MapReduce job should return all
the links found for each word.
Use your own judgment in organizing your input file so that keys and values can be parsed
easily as input to the MapReduce job.
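Task 2 is a classic inverted index. One possible input organization (an assumption, since the assignment leaves the format open) is one crawler record per line, `weblink<TAB>all the page text`, which makes key/value parsing a single split. A dependency-free sketch of the map and reduce logic under that assumption, with illustrative names:

```java
import java.util.*;

// Sketch of the inverted-index job, assuming one record per HDFS line in
// the form "weblink<TAB>page text". Plain Java, no Hadoop classes;
// main() simulates the shuffle by grouping mapper output by word.
public class InvertedIndex {

    // Map step: for a record "link<TAB>text", emit (word, link) for each word.
    static List<Map.Entry<String, String>> map(String record) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        String[] kv = record.split("\t", 2);
        if (kv.length < 2) return pairs;              // skip malformed records
        String link = kv[0];
        for (String word : kv[1].toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, link));
            }
        }
        return pairs;
    }

    // Reduce step: deduplicate the links collected for one word.
    static Set<String> reduce(String word, List<String> links) {
        return new TreeSet<>(links);
    }

    public static void main(String[] args) {
        String[] records = {
            "http://a.example\thadoop index tutorial",
            "http://b.example\thadoop streaming guide"
        };
        // Simulated shuffle: group (word, link) pairs by word, then reduce.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String r : records) {
            for (Map.Entry<String, String> p : map(r)) {
                grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
            }
        }
        for (String word : grouped.keySet()) {
            System.out.println(word + "\t" + reduce(word, grouped.get(word)));
        }
    }
}
```

Querying a set of words is then a lookup of each word's posting list in the job's output. Any unambiguous record format works; the tab separator is just one convenient choice because `TextInputFormat` already delivers the file line by line.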