NutchWAX
Classic software project
| Key | Summary | Assignee | Reporter | Status | Created |
| --- | --- | --- | --- | --- | --- |
| WAX-81 | Corrupt script tag at end of page causes HTML parser infinite loop. | Aaron Binns | Aaron Binns | Open | Apr 19, 2011 |
| WAX-80 | Mime-type detection infinite loop due to control character in DOCTYPE declaration. | Aaron Binns | Aaron Binns | In Progress | Apr 18, 2011 |
| WAX-79 | Extract HTML meta tags for 'description' and 'keywords' and add to segment. | Aaron Binns | Aaron Binns | In Progress | Apr 15, 2011 |
| WAX-76 | Slow parsing | Aaron Binns | Aaron Binns | In Progress | Sep 1, 2010 |
| WAX-15 | Option to skip an ARC record based on size or other filtering policy | Aaron Binns | Aaron Binns | Open | Jul 18, 2008 |
| WAX-83 | nutchwax-0.13/src/java/org/archive/nutchwax/imagesearch/DocIndexer.java:309: error: method filter in class IndexingFilters cannot be applied to given types | Aaron Binns | Sam | Open | Mar 20, 2012 |
| WAX-64 | research sorting feature for NutchWAX | Hunter Stern | Hunter Stern | In Progress | Sep 21, 2009 |
| WAX-75 | Hacks to use with Hadoop-0.20 from Cloudera | Aaron Binns | Aaron Binns | Open | Jul 10, 2010 |
| WAX-82 | Nutch HTML parser infinite loop. | Aaron Binns | Aaron Binns | In Progress | Apr 21, 2011 |
| WAX-78 | HTML noindex and nofollow enforced in HTMLParser? | Aaron Binns | Aaron Binns | Open | Apr 15, 2011 |
| WAX-60 | DateAdder should have an option to determine if norms should be used. | Aaron Binns | Aaron Binns | Open | Jul 22, 2009 |
| WAX-54 | In IndexSearcher.translateHits(), when de-duping use a FieldSelector when loading the document to only load the site field. | Aaron Binns | Aaron Binns | Open | Jul 13, 2009 |
| WAX-44 | Add record to index for non-text documents | Aaron Binns | Aaron Binns | In Progress | May 20, 2009 |
| WAX-5 | Digest differs between ARCReader and Wayback index-arc. | Aaron Binns | Aaron Binns | Resolved | Jun 30, 2008 |
| WAX-17 | More aggressive collapsing by site in search results | Aaron Binns | Aaron Binns | Resolved | Jul 24, 2008 |
| WAX-30 | Nutchwax requires very long timeouts on remotely hosted arc files | Aaron Binns | Erik Hetzner | Resolved | Jan 27, 2009 |
| WAX-33 | Add URL canonicalization to pageranker | Aaron Binns | Aaron Binns | Resolved | Feb 18, 2009 |
| WAX-31 | contrib/archive/README.txt needs clarifications | Aaron Binns | Paul Baclace | Resolved | Feb 9, 2009 |
| WAX-29 | nutchwax home page issue tracker still points to sf.net | Aaron Binns | Erik Hetzner | Resolved | Jan 27, 2009 |
| WAX-28 | Investigate malformed URL report during date-adder | Aaron Binns | Aaron Binns | Resolved | Oct 20, 2008 |
| WAX-14 | Investigate why reading content from archive file uses such small chunks | Aaron Binns | Aaron Binns | Resolved | Jul 9, 2008 |
| WAX-13 | Add DFS read/write support to DateAdder | Aaron Binns | Aaron Binns | Resolved | Jul 8, 2008 |
| WAX-18 | Add reading of archive files from DFS | Aaron Binns | Aaron Binns | Resolved | Jul 28, 2008 |
| WAX-40 | Integrate nutchwax with Access Control Oracle | Aaron Binns | Lewis Crawford | Open | Mar 25, 2009 |
| WAX-70 | Cannot use rsync URLs, no handler for rsync protocol. | Aaron Binns | Aaron Binns | Open | Jan 12, 2010 |
| WAX-77 | JDK6u23 breaks GzippedInputStream & W/ARCReaders with different GZIP handling | Aaron Binns | Aaron Binns | In Progress | Apr 8, 2011 |
| WAX-74 | Add support for storing fields in compressed form. | Aaron Binns | Aaron Binns | Open | Mar 18, 2010 |
| WAX-71 | NutchWAX-required libraries not included in nutch-1.0.job | Aaron Binns | Aaron Binns | Open | Feb 20, 2010 |
| WAX-72 | Simply build system to copy NW files into Nutch dirs and use Nutch build.xml | Aaron Binns | Aaron Binns | Open | Feb 20, 2010 |
| WAX-73 | Change default value of searcher.fieldcache in nutch-site.xml to 'false' | Aaron Binns | Aaron Binns | In Progress | Feb 20, 2010 |
| WAX-68 | Compatibility with {index+segment}s created by NutchWAX 0.10. | Aaron Binns | Aaron Binns | Open | Oct 29, 2009 |
| WAX-65 | Some odd-ball characters display as '?' in search results. | Aaron Binns | Aaron Binns | In Progress | Oct 22, 2009 |
| WAX-69 | Class not found when importing within a Hadoop MR job. | Aaron Binns | Aaron Binns | Open | Jan 12, 2010 |
| WAX-66 | Index documents without crawldb nor linkdb. | Aaron Binns | Aaron Binns | In Progress | Oct 26, 2009 |
| WAX-67 | Nutch OpenOffice parser does not pass along metadata. | Aaron Binns | Aaron Binns | In Progress | Oct 26, 2009 |
| WAX-63 | LengthNormUpdater returning error code if no fields in index have norms is inconvenient. | Aaron Binns | Aaron Binns | Open | Sep 19, 2009 |
| WAX-62 | Add ability to configure HTTP headers to support cacheing. | Aaron Binns | Aaron Binns | In Progress | Aug 20, 2009 |
| WAX-61 | Change mime-type of OpenSearch XML response from text/xml to application/xml. | Aaron Binns | Aaron Binns | In Progress | Aug 20, 2009 |
| WAX-20 | bug in exacturl query | Aaron Binns | Aaron Binns | Resolved | Sep 8, 2008 |
| WAX-55 | NutchWaxBean's command-line searching should emit title along with other document metadata. | Aaron Binns | Aaron Binns | Open | Jul 14, 2009 |
| WAX-56 | Date-adder allows for duplicate dates to be added to a record. | Aaron Binns | Aaron Binns | Open | Jul 14, 2009 |
| WAX-58 | Need tool to update an existing index's norms based on pagerank information. | Aaron Binns | Aaron Binns | Open | Jul 22, 2009 |
| WAX-57 | nutchwax command-driver doesn't properly enclose arguments in quotes. | Aaron Binns | Aaron Binns | Open | Jul 16, 2009 |
| WAX-59 | Wrong log() function used in PageRankScoringFilter. | Aaron Binns | Aaron Binns | In Progress | Jul 22, 2009 |
| WAX-53 | IndexMerging parallel indexes fails when index is empty. | Aaron Binns | Aaron Binns | Open | Jul 7, 2009 |
| WAX-52 | Add option to NutchWaxBean to specify directory where index+segments are to be found. | Aaron Binns | Aaron Binns | Open | Jul 7, 2009 |
| WAX-51 | Enhance index merging to combine parallel indexes. | Aaron Binns | Aaron Binns | In Progress | Jul 7, 2009 |
| WAX-47 | Stop storing document key in "orig" field in index, synthesize it as needed from the "url" and "digest" fields. | Aaron Binns | Aaron Binns | In Progress | Jun 23, 2009 |
| WAX-48 | Use NutchWAX configurable query filter for site and url fields. | Aaron Binns | Aaron Binns | In Progress | Jun 23, 2009 |
| WAX-45 | Add ability to store but not index a field via ConfigurableIndexingFilter | Aaron Binns | Aaron Binns | In Progress | Jun 2, 2009 |
Showing 1-50 of 83