The cause appears to be the changes to CandidatesProcessor, CrawlURI, and SeedRecord in https://github.com/internetarchive/heritrix3/pull/76
The CandidatesProcessor clears the curi outlinks list when it is done; it used to add all of the outlinks to an outCandidates list first. The outCandidates list was then read by SeedRecord to determine the redirect location. That list no longer exists, and all calls to curi.getOutCandidates() were replaced with curi.getOutLinks(). This looks to be fine everywhere except SeedRecord, which tries to pull from the outLinks list after it has been cleared.
Removing the curi.getOutLinks().clear() call from CandidatesProcessor should fix the problem, but I'm not sure whether there is a reason to have the outlinks cleared.
This may be a better solution: Allow the CandidatesProcessor to continue clearing the outlinks list, and instead store the redirect in the data attributes for seed CrawlURIs only.
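A rough sketch of that second idea, to make the ordering concrete. This is not Heritrix code: `SeedRedirectSketch`, `Uri`, `processCandidates`, `redirectFor`, and the `seed-redirect` key are all hypothetical stand-ins, and it assumes the first outlink of a redirected seed is the redirect target.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only; none of these are Heritrix classes or APIs.
class SeedRedirectSketch {

    /** Simplified model of a crawl URI: an outlinks list plus a data map. */
    static class Uri {
        final String url;
        final boolean seed;
        final List<String> outLinks = new ArrayList<>();
        final Map<String, Object> data = new HashMap<>();

        Uri(String url, boolean seed) {
            this.url = url;
            this.seed = seed;
        }
    }

    // Assumed key name, chosen for this sketch only.
    static final String SEED_REDIRECT_KEY = "seed-redirect";

    /** Models the candidates step: remember a seed's redirect, then clear outlinks. */
    static void processCandidates(Uri curi) {
        if (curi.seed && !curi.outLinks.isEmpty()) {
            // Assumption: the first outlink of a redirected seed is the redirect target.
            curi.data.put(SEED_REDIRECT_KEY, curi.outLinks.get(0));
        }
        curi.outLinks.clear();
    }

    /** Models the seed-record step: read the redirect from data, not from outlinks. */
    static String redirectFor(Uri curi) {
        return (String) curi.data.get(SEED_REDIRECT_KEY);
    }

    public static void main(String[] args) {
        Uri seed = new Uri("http://example.com/", true);
        seed.outLinks.add("http://www.example.com/");
        processCandidates(seed);
        // The outlinks are gone, but the redirect is still recoverable.
        System.out.println(redirectFor(seed));
    }
}
```

With this shape the clear() can stay where it is, and only seed URIs carry the extra entry; non-seed URIs get nothing stored.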
Gordon, can you take a look?
We (Noah and I) merged https://github.com/internetarchive/heritrix3/pull/103 to address a related issue for https://webarchive.jira.com/browse/ARI-4084. Scoping was affected by clearing outlinks before the redirect of a seed was added to the list of seeds.