
HTML extractor fails to extract CSS from a link tag

Description

I've observed this on several sites that seem to use the same content management system.

The CSS files are declared like this (presumably for compatibility with older browsers):

<!--[if (lt IE 9)&(!IEMobile) ]><link href="/skin/basic9k/main-oldie.css?v7" rel="stylesheet" /><![endif]-->
<!--[if gte IE 9]><!--> <link href="/skin/basic9k/main.css?v7" rel="stylesheet" /><!--<![endif]-->

Heritrix does extract the "main-oldie.css" but fails to get the CSS file for modern browsers, main.css.

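This is almost certainly due to the unusual conditional-comment markup around the link tag. The second link sits between two complete comments ("<!--[if gte IE 9]><!-->" and "<!--<![endif]-->"), so to any parser other than old IE it is ordinary markup outside any comment. Since browsers handle this, Heritrix should as well.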
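
For illustration, here is a minimal sketch of comment-aware extraction that picks up both stylesheets from the snippet above. It uses Python's standard html.parser module, not Heritrix's extractor, and the names StylesheetLinkParser and extract_stylesheets are hypothetical. It assumes that a comment whose body starts with "[if" is an IE conditional comment and re-parses that body for links.

```python
from html.parser import HTMLParser


class StylesheetLinkParser(HTMLParser):
    """Collects href values of <link rel="stylesheet"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here.
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "stylesheet" in rel and attr.get("href"):
            self.hrefs.append(attr["href"])

    def handle_comment(self, data):
        # Assumption: a comment body starting with "[if" is an IE conditional
        # comment, so any markup hidden inside it is re-parsed for links.
        if data.lstrip().startswith("[if"):
            inner = StylesheetLinkParser()
            inner.feed(data)
            inner.close()
            self.hrefs.extend(inner.hrefs)


def extract_stylesheets(html):
    """Return stylesheet hrefs in document order, including commented-out ones."""
    parser = StylesheetLinkParser()
    parser.feed(html)
    parser.close()
    return parser.hrefs


SAMPLE = (
    '<!--[if (lt IE 9)&(!IEMobile) ]>'
    '<link href="/skin/basic9k/main-oldie.css?v7" rel="stylesheet" />'
    '<![endif]-->\n'
    '<!--[if gte IE 9]><!--> '
    '<link href="/skin/basic9k/main.css?v7" rel="stylesheet" />'
    '<!--<![endif]-->'
)

if __name__ == "__main__":
    for href in extract_stylesheets(SAMPLE):
        print(href)
```

The first link is found by re-parsing the comment body, and the second is found as ordinary markup because "<!--[if gte IE 9]><!-->" already ends the comment at its first "-->".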

Environment

None

Status

Assignee

Unassigned

Reporter

Kristinn Sigurðsson

Priority

Major