Issues

Files uploaded to action directory via HTTP have .bin extension added, causing Heritrix to ignore them
HER-1907
Ambiguity between srcset URLs and data:image base64-encoded images
HER-2097
Add support for extracting URLs from img srcset attribute
HER-2094
ToeThread Fatal Exception: "kryo.SerializationException: Buffer limit exceeded" in BdbMultipleWorkQueues.get
HER-1996
Why do we write the header "WARC-Truncated: length" in WARC revisit records?
HER-1701
NPE in BdbMultipleWorkQueues.delete() -- queue stuck?
HER-507
Heritrix hitting nonexistent URLs in wix.com/app-market
HER-2096
Webserver response 307 to 302 causes infinite redirect
HER-1560
Crawl M3U8 files and capture resources they describe
HER-2095
Heritrix ignores robots.txt
HER-2092
appCtx.getBean() no longer works in scripting console
HER-2093
Are URLs including 'Japanese Full Space' supported?
HER-2089
Java 8 keytool issue
HER-2085
CoderMalfunctionError: java.nio.BufferOverflowException
HER-527
Allow submission of non-login GET forms
HER-2078
Heritrix is missing a facility to shut down from the console
HER-2090
checkpointing gives error on Windows
HER-1906
Improve feedback after specifying erroneous command line arguments
HER-2091
HostsReport issues
HER-2084
RuntimeException in AMQPUrlReceiver kills StarterRestarter?
HER-2088
JVM terminated without running Heritrix.
HER-2087
ServerNotModified WARC revisit records incorrectly record WARC-Payload-Digest
HER-2080
HTML extractor fails to extract CSS from a link tag
HER-2086
Expand hosts-report.txt with novel bytes, novel urls counts
HER-1500
Duplicate user-agent records in robots.txt cause overwriting of rules
HER-2083