HTML noindex and nofollow enforced in HTMLParser?

Description

Reading through the NutchWAX HTML parser code, it looks like the 'noindex' and 'nofollow' HTML meta-tag directives are enforced in the parser itself.

See the getParse() method in nutch-1.1/src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HtmlParser.java.

More investigation needed.
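
For reference while investigating, here is a rough sketch of what "enforcing" these directives at parse time could look like. It is not the Nutch code: the class name RobotsMetaSketch, its fields, and the regex-based matching are all hypothetical, and it only handles the simple case where the name attribute comes before content.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not taken from HtmlParser.java.
class RobotsMetaSketch {
    // Matches <meta name="robots" content="..."> (name attribute first).
    private static final Pattern ROBOTS_META = Pattern.compile(
        "<meta\\s+name\\s*=\\s*[\"']robots[\"']\\s+content\\s*=\\s*[\"']([^\"']*)[\"']",
        Pattern.CASE_INSENSITIVE);

    final boolean noIndex;   // true: the page text should not be indexed
    final boolean noFollow;  // true: the page's outlinks should not be followed

    RobotsMetaSketch(String html) {
        boolean ni = false;
        boolean nf = false;
        Matcher m = ROBOTS_META.matcher(html);
        while (m.find()) {
            // The content attribute is a comma-separated list of directives.
            for (String raw : m.group(1).toLowerCase(Locale.ROOT).split(",")) {
                String directive = raw.trim();
                // "none" is shorthand for "noindex, nofollow".
                if (directive.equals("noindex") || directive.equals("none")) ni = true;
                if (directive.equals("nofollow") || directive.equals("none")) nf = true;
            }
        }
        noIndex = ni;
        noFollow = nf;
    }

    public static void main(String[] args) {
        String html = "<html><head><meta name=\"robots\" content=\"noindex, nofollow\">"
                    + "</head><body><a href=\"/x\">x</a></body></html>";
        RobotsMetaSketch meta = new RobotsMetaSketch(html);
        // A parser enforcing the directives would skip text extraction when
        // noIndex is set and skip outlink extraction when noFollow is set.
        System.out.println("extract text: " + !meta.noIndex);
        System.out.println("extract outlinks: " + !meta.noFollow);
    }
}
```

If getParse() does something along these lines, pages carrying these directives would lose their text and/or outlinks before indexing ever sees them, which is the behaviour this ticket needs to confirm.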

Environment

None
Obsolete

Assignee

Aaron Binns

Reporter

Aaron Binns

Labels

None

Issue Category

None

Group Assignee

None

ZendeskID

None

Estimated Difficulty

None

Actual Difficulty

None

Priority

Major