| | Ambiguity between srcset urls and data:image base64 encoded image | | | | | Fixed | Mar 8, 2019 | Apr 24, 2019 | | |
| | Crawl M3U8 files and capture resources they describe | | | | | Not a Bug | Oct 28, 2016 | Oct 28, 2016 | | |
| | Add support for extracting URLs from img srcset attribute | | | | | Fixed | Oct 20, 2016 | Mar 8, 2019 | | |
| | Are URLs including 'Japanese Full Space' supported? | | | | | Fixed | Oct 28, 2015 | May 26, 2016 | | |
| | JVM terminated without running Heritrix. | | | | | Duplicate | Sep 16, 2015 | Sep 23, 2015 | | |
| | java 8 keytool issue | | | | | Fixed | Aug 13, 2015 | May 6, 2016 | | |
| | HostsReport issues | | | | | Fixed | Jul 13, 2015 | Jan 15, 2016 | | |
| | WARCWriterProcessor no longer prints hop path and link context for outlinks in meta data records | | | | | Fixed | Mar 26, 2015 | Mar 27, 2015 | | |
| | spam | | | | | Cool Story, Bro | Mar 23, 2015 | Mar 23, 2015 | | |
| | ServerNotModified WARC revisit records incorrectly record WARC-Payload-Digest | | | | | Fixed | Mar 11, 2015 | Aug 20, 2015 | | |
| | Allow submission of non-login GET forms | | | | | Obsolete | Dec 9, 2014 | Mar 29, 2016 | Dec 25, 2014 | |
| | Seeds Report missing redirect URLs for 301 / 302 responses | | | | | Fixed | Oct 22, 2014 | Nov 12, 2014 | | |
| | IllegalStateException "got suspicious value" in IpAddressSetDecideRule when | | | | | Fixed | Oct 3, 2014 | Oct 4, 2014 | | |
| | WarcWriterProcessor writes full body of revisited items | | | | | Fixed | Jul 25, 2014 | Jul 25, 2014 | | |
| | NullPointerException when getting cookies | | | | | Fixed | Jul 16, 2014 | Oct 3, 2014 | | |
| | ExtractorHTML shouldn't treat codebase contents as embeds | | | | | Fixed | Jun 4, 2014 | Jun 4, 2014 | | |
| | don't use dns search domains on name resolution | | | | | Duplicate | Apr 22, 2014 | Apr 24, 2014 | | |
| | deadlock in frontier | | | | | Fixed | Apr 2, 2014 | Apr 25, 2014 | | |
| | Flash extractor not parsing initactions section of swf for possible links | | | | | Fixed | Feb 13, 2014 | Mar 1, 2014 | Feb 20, 2014 | |
| | Heritrix adding port to Host header | | | | | Fixed | Feb 3, 2014 | Jul 17, 2014 | Feb 13, 2014 | |
| | WorkQueueFrontier.deleteURIs mishandles deletions from retired queues | | | | | Fixed | Jan 15, 2014 | Jan 16, 2014 | | |
| | support url with two consecutive question marks "??" | | | | | Fixed | Dec 7, 2013 | Dec 7, 2013 | | |
| | "Failed to start bean 'bdb'" when trying to build and launch a job which was stopped or to build and launch a job from a checkpoint. | | | | | Duplicate | Dec 3, 2013 | Dec 12, 2013 | | |
| | on checkpoint w/arcs are closed and new ones started; add option not to do that | | | | | Fixed | Oct 11, 2013 | Oct 11, 2013 | | |
| | option to forget all but latest checkpoint | | | | | Done | Sep 11, 2013 | Sep 11, 2013 | | |
| | checkpoint-resumed crawl job stats are inconsistent-- some start from 0, some resume from checkpoint numbers | | | | | Fixed | Sep 7, 2013 | Sep 11, 2013 | | |
| | url agnostic duplicate urls don't appear as such in stats if revisit record not written | | | | | Fixed | Jun 8, 2013 | Jun 8, 2013 | | |
| | Add link-extraction support for JSON files. | | | | | Fixed | May 16, 2013 | Jun 3, 2013 | May 20, 2013 | |
| | Class Link is unnecessary | | | | | Fixed | May 15, 2013 | Jul 10, 2014 | | |
| | final crawl reports written after crawl is "FINISHED" | | | | | Fixed | Apr 23, 2013 | Apr 23, 2013 | | |
| | xml representation of job url can be invalid | | | | | Fixed | Mar 29, 2013 | Jan 7, 2014 | | |
| | ARCReaderFactory.get(String, InputStream, boolean) doesn't support uncompressed arcs | | | | | Fixed | Feb 6, 2013 | Feb 6, 2013 | | |
| | improved form-login support: improved detection/triggering, dynamic fields | | | | | Done | Jan 10, 2013 | Jan 7, 2014 | | |
| | Checkpoints recover error on Windows | | | | | Fixed | Jan 2, 2013 | Dec 12, 2013 | | |
| | refactor to use archive-commons | | | | | Done | Dec 20, 2012 | Dec 21, 2012 | | |
| | custom extractor that constructs outlinks from strings found in content | | | | | Fixed | Nov 17, 2012 | Jan 13, 2013 | | |
| | url-agnostic content digest revisit deduplication | | | | | Fixed | Sep 26, 2012 | Jan 13, 2013 | | |
| | usedBaseForVia annotation repeated many times | | | | | Fixed | Sep 5, 2012 | Sep 5, 2012 | | |
| | fetchHTTP httpBindAddress setting doesn't take effect | | | | | Fixed | Aug 28, 2012 | Aug 28, 2012 | | |
| | XML representation for /engine/job/<jobName>/beans returns incorrect url for named beans | | | | | Fixed | Aug 21, 2012 | Sep 5, 2012 | | |
| | XML representation for /engine/job/<jobName>/beans uses root node of type "script" instead of "beans" | | | | | Fixed | Aug 21, 2012 | Aug 31, 2012 | | |
| | FetchHTTP tries every url twice without credentials before sending credentials | | | | | Fixed | Jul 19, 2012 | Jan 7, 2014 | | |
| | credentials cached to server never used | | | | | Fixed | Jul 17, 2012 | Jul 17, 2012 | | |
| | Make ArchiveRecord.getPosition() public instead of protected. | | | | | Fixed | Jul 9, 2012 | Jul 16, 2012 | Jul 9, 2012 | |
| | scheme-less relative URI not resolved correctly | | | | | Fixed | Jun 21, 2012 | Jun 28, 2012 | | |
| | removing interface MultiReporter | | | | | Fixed | Jun 5, 2012 | Jun 5, 2012 | | |
| | option to set bdb cache size as an absolute number instead of a percentage | | | | | Fixed | Apr 20, 2012 | May 2, 2012 | | |
| | existing content length decide rule is not flexible enough | | | | | Fixed | Apr 18, 2012 | May 2, 2012 | | |
| | SEVERE error in FetchWhois on some government domains | | | | | Fixed | Apr 17, 2012 | May 2, 2012 | | |
| | define scope in terms of ip addresses | | | | | Fixed | Apr 3, 2012 | Apr 13, 2012 | | |