How to read the .doc or .docx file
How to read a .doc file with Apache Pig Latin, running on MapReduce
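-- TextLoader is a load function: it reads each input line as one chararray.
-- It expects plain text, and a .docx is a zipped XML package, so convert the document to text first.
A = load './pig/test.docx' using TextLoader() as (line:chararray);
-- TOKENIZE splits a line into a bag of words; flatten emits one word per record.
B = foreach A generate flatten(TOKENIZE(line)) as word;
C = group B by word;
-- Emit each word's occurrence count followed by the word itself.
D = foreach C generate COUNT(B), group;
store D into './wordcountone';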
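Pig's text loaders read plain lines, while a .docx file is a ZIP package whose body lives in word/document.xml. One possible way to prepare the input is to extract the paragraph text before loading it. The following is a minimal sketch in Python using only the standard library. The function names docx_to_text and convert are hypothetical helpers, not part of Pig, and the sketch assumes a .docx (not the older binary .doc) with ordinary paragraphs.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx package.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return the paragraphs of a .docx as newline-separated plain text.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper for illustration.)
    """
    with zipfile.ZipFile(source) as package:
        xml_bytes = package.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    lines = []
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's text is spread over one or more <w:t> runs.
        runs = [node.text or "" for node in paragraph.iter(W_NS + "t")]
        lines.append("".join(runs))
    return "\n".join(lines)


def convert(docx_path, text_path):
    """Write the extracted text to `text_path` so Pig can load it line by line."""
    with open(text_path, "w", encoding="utf-8") as out:
        out.write(docx_to_text(docx_path))
```

Calling convert('./pig/test.docx', './pig/test.txt') would produce a text file (the output name is only an example) that the load statement can then point at instead of the original document.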