Issues

Task
Ambiguity between srcset urls and data:image base64 encoded image
Unassigned
Adam Miller
Major
Fixed
Mar 8, 2019
Apr 24, 2019
Improvement
Crawl M3U8 files and capture resources they describe
Unassigned
Barbara Miller
Major
Not a Bug
Oct 28, 2016
Oct 28, 2016
Improvement
Add support for extracting URLs from img srcset attribute
Unassigned
Adam Miller
Major
Fixed
Oct 20, 2016
Mar 8, 2019
Question
Are URLs including 'Japanese Full Space' supported?
Unassigned
Masahiro Shimada
Minor
Fixed
Oct 28, 2015
May 26, 2016
Bug
JVM terminated without running Heritrix.
Unassigned
programmer
Critical
Duplicate
Sep 16, 2015
Sep 23, 2015
Bug
java 8 keytool issue
Unassigned
Luck Colors
Blocker
Fixed
Aug 13, 2015
May 6, 2016
Improvement
HostsReport issues
Unassigned
Kristinn Sigurðsson
Major
Fixed
Jul 13, 2015
Jan 15, 2016
Bug
WARCWriterProcessor no longer prints hop path and link context for outlinks in meta data records
Unassigned
Adam Miller
Major
Fixed
Mar 26, 2015
Mar 27, 2015
Bug
spam
Unassigned
c
Major
Cool Story, Bro
Mar 23, 2015
Mar 23, 2015
Bug
ServerNotModified WARC revisit records incorrectly record WARC-Payload-Digest
Unassigned
Kristinn Sigurðsson
Major
Fixed
Mar 11, 2015
Aug 20, 2015
New Feature
Allow submission of non-login GET forms
Unassigned
Hunter Stern
Minor
Obsolete
Dec 9, 2014
Mar 29, 2016
Dec 25, 2014
Bug
Seeds Report missing redirect URLs for 301 / 302 responses
Unassigned
Adam Miller
Major
Fixed
Oct 22, 2014
Nov 12, 2014
Bug
IllegalStateException "got suspicious value" in IpAddressSetDecideRule when
Unassigned
Kristinn Sigurðsson
Major
Fixed
Oct 3, 2014
Oct 4, 2014
Bug
WarcWriterProcessor writes full body of revisited items
Unassigned
Kristinn Sigurðsson
Major
Fixed
Jul 25, 2014
Jul 25, 2014
Bug
NullPointerException when getting cookies
Unassigned
Kristinn Sigurðsson
Major
Fixed
Jul 16, 2014
Oct 3, 2014
Bug
ExtractorHTML shouldn't treat codebase contents as embeds
Unassigned
Kristinn Sigurðsson
Major
Fixed
Jun 4, 2014
Jun 4, 2014
Bug
don't use dns search domains on name resolution
Unassigned
samuel stoller
Major
Duplicate
Apr 22, 2014
Apr 24, 2014
Bug
deadlock in frontier
Unassigned
Noah Levitt
Critical
Fixed
Apr 2, 2014
Apr 25, 2014
Bug
Flash extractor not parsing initactions section of swf for possible links
Unassigned
Hunter Stern
Major
Fixed
Feb 13, 2014
Mar 1, 2014
Feb 20, 2014
Bug
Heritrix adding port to Host header
Unassigned
Hunter Stern
Major
Fixed
Feb 3, 2014
Jul 17, 2014
Feb 13, 2014
Bug
WorkQueueFrontier.deleteURIs mishandles deletions from retired queues
Unassigned
Kristinn Sigurðsson
Minor
Fixed
Jan 15, 2014
Jan 16, 2014
Bug
support url with two consecutive question marks "??"
Unassigned
Noah Levitt
Major
Fixed
Dec 7, 2013
Dec 7, 2013
Bug
"Failed to start bean 'bdb'" when trying to build and launch a job which was stopped or to build and launch a job from a checkpoint.
Unassigned
Arkiver
Critical
Duplicate
Dec 3, 2013
Dec 12, 2013
Improvement
on checkpoint w/arcs are closed and new ones started; add option not to do that
Unassigned
Noah Levitt
Major
Fixed
Oct 11, 2013
Oct 11, 2013
New Feature
option to forget all but latest checkpoint
Unassigned
Noah Levitt
Major
Done
Sep 11, 2013
Sep 11, 2013
Bug
checkpoint-resumed crawl job stats are inconsistent-- some start from 0, some resume from checkpoint numbers
Unassigned
Noah Levitt
Major
Fixed
Sep 7, 2013
Sep 11, 2013
Bug
url agnostic duplicate urls don't appear as such in stats if revisit record not written
Unassigned
Noah Levitt
Major
Fixed
Jun 8, 2013
Jun 8, 2013
New Feature
Add link-extraction support for JSON files.
Unassigned
Dominic Dela Cruz
Minor
Fixed
May 16, 2013
Jun 3, 2013
May 20, 2013
Improvement
Class Link is unnecessary
Unassigned
Kristinn Sigurðsson
Major
Fixed
May 15, 2013
Jul 10, 2014
Bug
final crawl reports written after crawl is "FINISHED"
Unassigned
Noah Levitt
Major
Fixed
Apr 23, 2013
Apr 23, 2013
Bug
xml representation of job url can be invalid
Unassigned
Noah Levitt
Major
Fixed
Mar 29, 2013
Jan 7, 2014
Bug
ARCReaderFactory.get(String, InputStream, boolean) doesn't support uncompressed arcs
Unassigned
Noah Levitt
Major
Fixed
Feb 6, 2013
Feb 6, 2013
New Feature
improved form-login support: improved detection/triggering, dynamic fields
Unassigned
Gordon Mohr
Major
Done
Jan 10, 2013
Jan 7, 2014
Bug
Checkpoints recover error on windows
Unassigned
Andres Aguilar
Minor
Fixed
Jan 2, 2013
Dec 12, 2013
Bug
refactor to use archive-commons
Unassigned
Noah Levitt
Major
Done
Dec 20, 2012
Dec 21, 2012
New Feature
custom extractor that constructs outlinks from strings found in content
Unassigned
Noah Levitt
Major
Fixed
Nov 17, 2012
Jan 13, 2013
New Feature
url-agnostic content digest revisit deduplication
Unassigned
Noah Levitt
Major
Fixed
Sep 26, 2012
Jan 13, 2013
Bug
usedBaseForVia annotation repeated many times
Unassigned
Noah Levitt
Major
Fixed
Sep 5, 2012
Sep 5, 2012
Bug
fetchHTTP httpBindAddress setting doesn't take effect
Unassigned
Noah Levitt
Major
Fixed
Aug 28, 2012
Aug 28, 2012
Bug
XML representation for /engine/job/<jobName>/beans returns incorrect url for named beans
Unassigned
Adam Miller
Major
Fixed
Aug 21, 2012
Sep 5, 2012
Bug
XML representation for /engine/job/<jobName>/beans uses root node of type "script" instead of "beans"
Unassigned
Adam Miller
Major
Fixed
Aug 21, 2012
Aug 31, 2012
Bug
FetchHTTP tries every url twice without credentials before sending credentials
Unassigned
Noah Levitt
Major
Fixed
Jul 19, 2012
Jan 7, 2014
Bug
credentials cached to server never used
Unassigned
Noah Levitt
Major
Fixed
Jul 17, 2012
Jul 17, 2012
Improvement
Make ArchiveRecord.getPosition() public instead of protected.
Unassigned
Aaron Binns
Major
Fixed
Jul 9, 2012
Jul 16, 2012
Jul 9, 2012
Bug
schema-less relative URI not resolved correctly
Unassigned
Kenji Nagahashi
Major
Fixed
Jun 21, 2012
Jun 28, 2012
Bug
removing interface MultiReporter
Unassigned
Travis Wellman
Major
Fixed
Jun 5, 2012
Jun 5, 2012
New Feature
option to set bdb cache size as an absolute number instead of a percentage
Unassigned
Noah Levitt
Major
Fixed
Apr 20, 2012
May 2, 2012
Improvement
existing content length decide rule is not flexible enough
Unassigned
Noah Levitt
Major
Fixed
Apr 18, 2012
May 2, 2012
Bug
SEVERE error in FetchWhois on some government domains
Unassigned
Kenji Nagahashi
Major
Fixed
Apr 17, 2012
May 2, 2012
New Feature
define scope in terms of ip addresses
Unassigned
Travis Wellman
Major
Fixed
Apr 3, 2012
Apr 13, 2012