Possible race condition in org.archive.util.FileUtils.ensureWriteableDirectory() leading to data loss

Description

There is a possible race condition when creating a writable directory. In the example below, multiple threads attempt to write to the same not-yet-existent directory. This results in a NullPointerException, and the data is not written. In this example, the DNS information seems to have been cached and used, but not written. The WARC file numbers also seem to increment even though the write failed, leading to non-contiguous numbers.

2012-05-09 17:55:04.657 SEVERE thread-17 org.archive.crawler.framework.ToeThread.recoverableProblem() Problem java.lang.NullPointerException occured when trying to process 'dns:www.newt.org' at step ABOUT_TO_BEGIN_PROCESSOR in
Which have stack trace:
java.lang.NullPointerException
       at org.archive.modules.writer.WARCWriterProcessor.write(WARCWriterProcessor.java:284)
       at org.archive.modules.writer.WARCWriterProcessor.innerProcessResult(WARCWriterProcessor.java:209)
       at org.archive.modules.Processor.process(Processor.java:142)
       at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
       at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)
java.io.IOException: Failed to create directory: /3/crawldata/e12/20120509175424
       at org.archive.util.FileUtils.ensureWriteableDirectory(FileUtils.java:677)
       at org.archive.modules.writer.WriterPoolProcessor.calcOutputDirs(WriterPoolProcessor.java:446)
       at org.archive.io.WriterPoolMember.createFile(WriterPoolMember.java:204)
       at org.archive.io.WriterPoolMember.checkSize(WriterPoolMember.java:181)
       at org.archive.modules.writer.WARCWriterProcessor.write(WARCWriterProcessor.java:231)
       at org.archive.modules.writer.WARCWriterProcessor.innerProcessResult(WARCWriterProcessor.java:209)
       at org.archive.modules.Processor.process(Processor.java:142)
       at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
       at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)
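
One way the suspected race could arise is a check-then-create sequence: two threads both see that the directory is missing, one creates it, and the other's mkdirs() call returns false because the directory now exists. The sketch below is a hypothetical illustration of a creation routine that tolerates this, not the actual FileUtils source; the class name DirectoryRaceSketch and the method body are assumptions, and only the method name and the error message text are taken from the trace above.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; this is not the Heritrix FileUtils code.
final class DirectoryRaceSketch {

    /** Creates dir if needed and tolerates concurrent creation by another thread. */
    static File ensureWriteableDirectory(File dir) throws IOException {
        if (!dir.isDirectory()) {
            // mkdirs() returns false if the directory already exists, which
            // includes the case where another thread created it a moment ago.
            // Re-check before treating a false return value as a failure.
            if (!dir.mkdirs() && !dir.isDirectory()) {
                throw new IOException("Failed to create directory: " + dir);
            }
        }
        if (!dir.canWrite()) {
            throw new IOException("Directory not writeable: " + dir);
        }
        return dir;
    }

    public static void main(String[] args) throws Exception {
        File base = Files.createTempDirectory("ensure-dir").toFile();
        final File target = new File(base, "e12/20120509175424");

        // Several threads request the same not-yet-existing directory at once.
        Callable<File> task = () -> ensureWriteableDirectory(target);
        ExecutorService pool = Executors.newFixedThreadPool(8);
        List<Future<File>> results = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            results.add(pool.submit(task));
        }
        for (Future<File> result : results) {
            result.get(); // rethrows any IOException raised in a worker thread
        }
        pool.shutdown();
        System.out.println("Directory exists: " + target.isDirectory());
    }
}
```

The key design choice is the second isDirectory() check after mkdirs() fails: it separates "someone else created it first" from a genuine failure such as missing permissions. java.nio.file.Files.createDirectories() would be an alternative, since it does not throw when the directory already exists.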

Environment

None

Status

Assignee

Unassigned

Reporter

Adam Miller

Labels

None

Group Assignee

None

ZendeskID

None

Estimated Difficulty

None

Actual Difficulty

None

Priority

Major