Way to re-enqueue a failed URL somewhere other than its original spot at the head of the queue

Description

See ARI-3013. A site simply closes the connection when it should return a 404, which puts the affected URL in the slow-retry cycle. Since that URL stays at the head of the queue, other URLs that could likely be crawled successfully are starved. Quoting Gordon, "Perhaps a general policy that, faced with this sort of retryable error, round-robins among all URIs rather than maintains original ordering would be better as the default, or as an option."
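To make the proposal concrete, here is a minimal sketch of the two policies. It is not Heritrix code: `HostQueue`, `rotate_on_retry`, `crawl`, and the example.com URLs are all hypothetical names invented for illustration, and retry delays are ignored so that only the ordering effect is visible.

```python
from collections import deque


class HostQueue:
    """Hypothetical per-host URL queue (illustration only, not a Heritrix class)."""

    def __init__(self, urls, rotate_on_retry=True):
        self._queue = deque(urls)
        # Assumed option: True = round-robin on retryable failure,
        # False = keep the failed URL at the head (the behaviour reported here).
        self.rotate_on_retry = rotate_on_retry

    def peek(self):
        """Return the URL at the head of the queue, or None if it is empty."""
        return self._queue[0] if self._queue else None

    def finish(self, success):
        """Record the outcome of the attempt on the head URL."""
        if not self._queue:
            return
        if success:
            self._queue.popleft()      # done with this URL
        elif self.rotate_on_retry:
            self._queue.rotate(-1)     # move the failed head to the tail
        # otherwise the failed URL stays at the head and blocks the rest


def crawl(queue, fetch, max_attempts):
    """Try up to max_attempts fetches; return the URLs fetched successfully."""
    crawled = []
    for _ in range(max_attempts):
        url = queue.peek()
        if url is None:
            break
        ok = fetch(url)
        if ok:
            crawled.append(url)
        queue.finish(ok)
    return crawled
```

With one permanently failing URL at the head, the round-robin variant still reaches the URLs behind it within the same number of attempts, while the keep-at-head variant spends every attempt on the failing URL. A real implementation would also need to honour the retry delay and the maximum retry count, which this sketch leaves out.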

Environment

None

Status

Assignee

Unassigned

Reporter

Noah Levitt

Labels

None

Group Assignee

None

ZendeskID

None

Estimated Difficulty

None

Actual Difficulty

None

Affects versions

Heritrix 3.1.0

Priority

Major