What is Proxy Mode and Why Should I Use It?
Proxy mode is an "offline browsing" mode that lets you see which page elements were captured and which are still being pulled from live sites, making it an excellent way to check the quality and completeness of your archived content. In proxy mode you browse only the archived documents within a single Archive-It collection. When you browse your collection without a proxy, embedded files in your archived websites can inadvertently redirect to their counterparts on the live web. This means that when you are looking at an archived web page, you could actually be seeing the live version of an embedded document rather than the archived version, and it becomes difficult to tell whether your documents were archived successfully. Proxy mode prevents these redirects. For QA purposes, it is important to browse your documents in proxy mode to be certain that your harvests are complete.
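As a rough complement to proxy mode, the sketch below scans the HTML of an archived page (viewed without a proxy) and lists embedded resources whose host is not wayback.archive-it.org. It is only a static heuristic under stated assumptions: the tag/attribute list and the helper names are illustrative, relative URLs are assumed to be served by the archive, and it cannot detect redirects or requests made by scripts, which is what proxy mode is for.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

ARCHIVE_HOST = "wayback.archive-it.org"


class EmbeddedResourceCollector(HTMLParser):
    """Collects URLs of embedded resources (images, scripts, stylesheets, frames)."""

    # Illustrative, not exhaustive: (tag, attribute) pairs that embed a file.
    EMBED_ATTRS = {
        ("img", "src"), ("script", "src"), ("link", "href"),
        ("iframe", "src"), ("embed", "src"), ("source", "src"),
    }

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in self.EMBED_ATTRS:
                self.urls.append(value)


def live_web_references(html):
    """Return embedded URLs that name a host other than the archive host."""
    collector = EmbeddedResourceCollector()
    collector.feed(html)
    leaks = []
    for url in collector.urls:
        host = urlsplit(url).netloc
        # Relative URLs have no host; assume they resolve against the archive.
        if host and host != ARCHIVE_HOST:
            leaks.append(url)
    return leaks
```

Any URL this returns is a candidate to check by hand in proxy mode; an empty list does not prove the capture is complete.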
When you use proxy mode to browse your collection, you will only see the most recent archived version of each document. This means that for the best results it is important to check your archived collection for completeness and accuracy after a crawl has completed and before the next crawl begins.
Currently it is not possible to browse https URLs in proxy mode. In many cases you can change an https URL to http (by removing the "s"); however, some sites and browsers force the use of https in certain situations and are therefore not browsable in proxy mode (e.g., Facebook.com). Our engineers are currently working on allowing these types of sites to be browsed in proxy mode.
For more information, see the blog post "Zombies in the Archive" from the Web Science and Digital Libraries Research Group at Old Dominion University, which explains how the live web finds its way into web archives.
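If you are working from a long list of seeds, the https-to-http adjustment can be scripted. The following is a minimal sketch; the function name to_http is just an illustration, and it only rewrites the URL text, so sites that force https will still not be browsable.

```python
from urllib.parse import urlsplit, urlunsplit


def to_http(url):
    """Return the URL with an https scheme changed to http; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


print(to_http("https://www.sdhistory.org/"))  # http://www.sdhistory.org/
```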
How Do I Set Up Proxy Mode?
To use proxy mode you need to make a manual adjustment to your web browser's settings. Once you have adjusted your browser, you will only be able to view material from your archived collection, not content from the live web. To view sites on the live web again, you will need to change your browser back to its original settings. Below you can find complete setup instructions and some tools that make using proxy mode very easy.
If you would like a partner specialist to help walk you through the process of setting up and browsing in proxy mode, please email archive-itsupport at archive.org.
Instructions for Setting up Proxy Mode in All Web Browsers
Internet Explorer 9
1) From the top of your browser go to Tools > Internet Options
2) Choose the "Connections" tab
3) Click the button that says "LAN Settings", towards the bottom
4) Choose the "Use Automatic configuration script" option, and paste the following URL into the space provided: "http://wayback.archive-it.org/proxy.pac"
5) Click "OK", and you're ready to go!
Firefox
1) From the top of your browser go to Tools > Options
2) Under the "Advanced" section, choose the "Network" tab, then click the "Settings..." button next to the text "Configure how Firefox connects to the Internet"
3) In this next screen, choose the option "Automatic Proxy Configuration URL:", and paste the following URL into the space provided: "http://wayback.archive-it.org/proxy.pac"
4) Click "OK" twice, then you are ready to go!
Chrome
1) From the top of your browser go to Chrome > Preferences. This should open a new page in your browser
2) Click "Show Advanced Settings" at the bottom of that page
3) Under "Network", click "Change Proxy Settings". This should open your computer's system preferences
4) In System Preferences, click "Automatic Proxy Configuration" and enter the following URL: "http://wayback.archive-it.org/proxy.pac"
5) Follow the directions for applying that rule (on a Mac, click "OK", then "Apply")
Browsing in Proxy Mode
Once you have followed the proxy mode setup instructions, you will be able to click through archived links as you normally would in the Wayback Machine's regular viewing mode.
The easiest way to browse your archives in proxy mode is to work from a list of your seeds and/or specific URLs that are important for you to check for accuracy. Once you have a proxy enabled, you can paste the seed or exact URL you are looking for into the browser's address bar (you don't need to include wayback.archive-it.org, the collection ID, or the capture date). For example, if the archival URL is http://wayback.archive-it.org/193/20080508191419/http://www.sdhistory.org/, remove everything from wayback.archive-it.org through the date code, so that the URL is simply http://www.sdhistory.org/. At this point, if proxy mode has been set up correctly, you will be prompted for a username and password.
When presented with this screen, for the "User Name" enter the collection ID that corresponds to the URL you are attempting to view (click here to learn more about collection numbers). You can leave the "Password" field empty.
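If you have a list of archival URLs, a short script can pull out both the address to paste into the browser and the collection ID to enter as the user name. This is a minimal sketch that assumes archival URLs follow the pattern shown in the example above (host, numeric collection ID, numeric date code, original URL); the function name split_archival_url is illustrative.

```python
import re

# Assumed layout, based on the example archival URL in this article:
#   http://wayback.archive-it.org/<collection ID>/<date code>/<original URL>
ARCHIVAL_URL = re.compile(
    r"^https?://wayback\.archive-it\.org/"
    r"(?P<collection>\d+)/"
    r"(?P<datecode>\d+)/"
    r"(?P<original>.+)$"
)


def split_archival_url(url):
    """Return (collection ID, date code, original URL), or None if the URL does not match."""
    match = ARCHIVAL_URL.match(url)
    if match is None:
        return None
    return match.group("collection"), match.group("datecode"), match.group("original")


collection, datecode, original = split_archival_url(
    "http://wayback.archive-it.org/193/20080508191419/http://www.sdhistory.org/"
)
print(original)    # http://www.sdhistory.org/  (paste into the address bar)
print(collection)  # 193  (enter as the user name)
```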
If successful, you can be confident you are looking at the archived version: your proxy settings have been changed, and you will see the archived website disclaimer at the top of the screen. You may need to reload the page.
Please note that if you are viewing a webpage from one collection and then enter the URL for an archived page from a different collection, you will likely encounter a "Not in Archive" screen.
To view content in a different collection, click the "Switch Collections" link in the yellow archival banner area (see image below). This will bring up the username / password box again (pictured above) that allows you to enter a new Collection ID.
Proxy mode will always display the most recent capture date of the website you are browsing.
Once the archived web page is loaded, click links and check to make sure the documents you intended to archive are there.
1) Are the images and menus archived and displayed accurately?
2) Click links to make sure pages and documents you intended to archive are in your collection.
3) Media files: Check media files to make sure they work.
4) Special files: Open PDFs and other special file types to make sure they function properly.
5) Vital documents: Make sure vital documents were archived properly.