Done issues

Sensible output for requesting page of results past the end.
WAX-27
Add option to omit storing of content in segment
WAX-34
Add strict/loose option to DateAdder for revisit lines with extra data on end
WAX-19
Observe content size limit on importing.
WAX-3
Add DFS read/write support to DateAdder
WAX-13
Digest differs between ARCReader and Wayback index-arc.
WAX-5
Add URL canonicalization to pageranker
WAX-33
Option to skip ARC record import based on HTTP status code of content
WAX-16
Investigate why so many PDFs fail to parse.
WAX-8
Add a "field setter" filter to set a field to a static value in the Lucene document during indexing.
WAX-23
Add XML elements containing all search URL params for self-link generation
WAX-26
500 error - java.lang.NegativeArraySizeException
WAX-32
nutchwax home page issue tracker still points to sf.net
WAX-29
Allow for blank lines and comment lines in manifest file.
WAX-21
Entire file not imported
WAX-9
Various code clean-ups based on code review using PMD tool.
WAX-22
Nutchwax requires very long timeouts on remotely hosted arc files
WAX-30
contrib/archive/README.txt needs clarifications
WAX-31
Investigate malformed URL report during date-adder
WAX-28
Investigate why reading content from archive file uses such small chunks
WAX-14
Add reading of archive files from DFS
WAX-18
Implementor/user-provided XSLT for OpenSearch results
WAX-4
Change config so that URL filters are not applied during link inversion
WAX-7
testing, please ignore
WAX-1
Change metadata field name in search results from "arcname" to "filename"
WAX-11
Add metadata field "fileoffset"
WAX-12
Date queries cause TooManyClauses exceptions
WAX-2
Add "exacturl" metadata field to indexing so it can be searched as-is, not parsed/tokenized like the "url" field.
WAX-10
Add utility/tool to dump unique values of a field in an index.
WAX-25
More aggressive collapsing by site in search results
WAX-17
bug in exacturl query
WAX-20
DateAdder fails due to uncaught exception in URL canonicalization
WAX-24
Change DateAdder to allow for implementation of URLCanonicalizer to be defined in property.
WAX-6

Sensible output for requesting page of results past the end.

Description

If you query the search server with a starting position past the last result, you get an HTTP 500 error page containing an exception stack trace of the form:
java.lang.NegativeArraySizeException
org.apache.nutch.searcher.Hits.getHits(Hits.java:61)
org.apache.nutch.searcher.OpenSearchServlet.doGet(OpenSearchServlet.java:154)
org.archive.access.nutch.NutchwaxOpenSearchServlet.doGet(NutchwaxOpenSearchServlet.java:76)
javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

This is stupid. We should return the same results page as normal, but with either no results, the same as the last page of results, or with a useful message.
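
The stack trace suggests that a negative length reaches an array allocation when the requested start position is larger than the number of hits. A minimal sketch of one possible guard, assuming the servlet knows the total hit count before it asks for a window of results. The class PageBounds and its clamp method are hypothetical names for illustration, not part of Nutch or NutchWAX:

```java
// Hypothetical helper: clamps a requested result window to the hits that exist,
// so the computed length can never be negative.
final class PageBounds {
    final int start;   // index of the first hit to return, within [0, total]
    final int length;  // number of hits to return, never negative

    private PageBounds(int start, int length) {
        this.start = start;
        this.length = length;
    }

    static PageBounds clamp(int requestedStart, int hitsPerPage, int total) {
        int safeTotal = Math.max(0, total);
        // Pull the start position back into [0, total].
        int start = Math.min(Math.max(0, requestedStart), safeTotal);
        int perPage = Math.max(0, hitsPerPage);
        // Use long arithmetic so start + perPage cannot overflow an int.
        int end = (int) Math.min((long) safeTotal, (long) start + perPage);
        return new PageBounds(start, end - start);
    }
}
```

With bounds like these, a request past the end yields a window of length 0, so the normal results page can be rendered with no results, which is one of the outcomes proposed above.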

Environment: None
Status:
Assignee: Aaron Binns
Reporter: Aaron Binns
Labels: None
Group Assignee: None
ZendeskID: None
Estimated Difficulty: None
Actual Difficulty: None
Fix versions:
Priority: Major