Nutch OpenOffice parser does not pass along metadata.

Description

The Nutch OpenOffice parser does not pass along the metadata set by Nutch(WAX). As a result, the metadata is missing by the time we reach the indexing stage, so there is no document title, segment, digest, etc. to use for indexing. Worse, the indexer code fails outright, because it assumes that at least the segment and digest fields are set to non-null values. Without the metadata those fields are null, which triggers a NullPointerException in Lucene.
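The failure mode described above can be sketched roughly as follows. This is a minimal illustration, not the NutchWAX indexer: `IndexerSketch`, `buildDocument`, the plain `Map` used for metadata, and the sample field values are all hypothetical, and `Objects.requireNonNull` merely stands in for the null check that fails inside Lucene.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the indexing step; not the actual NutchWAX indexer.
class IndexerSketch {

    // Copies the fields the indexer relies on into a flat "document".
    // Like the code path described in this ticket, it assumes that
    // segment and digest are present and non-null.
    static Map<String, String> buildDocument(Map<String, String> parseMetadata) {
        Map<String, String> doc = new HashMap<>();
        for (String field : new String[] {"segment", "digest"}) {
            String value = parseMetadata.get(field);
            // Stand-in for the null check that fails inside Lucene.
            doc.put(field, Objects.requireNonNull(value, field + " must not be null"));
        }
        return doc;
    }

    public static void main(String[] args) {
        // Metadata as the caller would have set it (sample values are made up).
        Map<String, String> fromCaller = new HashMap<>();
        fromCaller.put("segment", "sample-segment");
        fromCaller.put("digest", "sample-digest");
        System.out.println("fields indexed: " + buildDocument(fromCaller).size());

        try {
            // Metadata as it arrives when the parser has dropped it.
            buildDocument(new HashMap<>());
        } catch (NullPointerException e) {
            System.out.println("indexing failed: " + e.getMessage());
        }
    }
}
```

With the metadata intact both fields are indexed. With an empty metadata object the first required field is already null, and the call fails before anything is written.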

Environment

None

Activity

Aaron Binns
October 26, 2009, 10:58 PM

Fixed in SVN 2832. One-line change to include the metadata object passed in from the caller.
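The actual diff is not reproduced in this ticket, so the following is only a sketch of what a one-line fix of this kind might look like. `OpenOfficeParserSketch`, `ParseResultSketch`, `parseBefore`, and `parseAfter` are hypothetical names, a plain `Map` stands in for the Nutch metadata object, and the "before" version assumes the parser built its result with a fresh, empty metadata object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the parser; not the real Nutch OpenOffice parser.
class OpenOfficeParserSketch {

    // Before (assumed): the result is built with a fresh, empty metadata
    // object, so nothing the caller set reaches the indexing stage.
    static ParseResultSketch parseBefore(String text, Map<String, String> callerMetadata) {
        return new ParseResultSketch(text, new HashMap<>());
    }

    // After: the one-line change, handing the caller's metadata to the result.
    static ParseResultSketch parseAfter(String text, Map<String, String> callerMetadata) {
        return new ParseResultSketch(text, callerMetadata);
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        Map<String, String> callerMetadata = new HashMap<>();
        callerMetadata.put("segment", "sample-segment");
        callerMetadata.put("digest", "sample-digest");

        System.out.println("before: " + parseBefore("body", callerMetadata).metadata);
        System.out.println("after:  " + parseAfter("body", callerMetadata).metadata);
    }
}

// Hypothetical holder for parsed text plus the metadata that travels with it.
class ParseResultSketch {
    final String text;
    final Map<String, String> metadata;

    ParseResultSketch(String text, Map<String, String> metadata) {
        this.text = text;
        this.metadata = metadata;
    }
}
```

In this sketch the only difference between the two methods is which metadata object is handed to the result, which matches the ticket's description of the fix as a one-line change.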

Fixed

Assignee

Aaron Binns

Reporter

Aaron Binns

Labels

None

Issue Category

None

Group Assignee

None

ZendeskID

None

Estimated Difficulty

None

Actual Difficulty

None

Fix versions

Priority

Major