Site search: Includes both URL search and keyword search. Keywords are derived from the anchor text of all webpages linking to a host. Site search is currently available in the new Wayback Machine at https://web.archive.org.
Media search: Media search takes an archived web media resource (such as an image) and “tokenizes” its URL by splitting the filename into individual words, which then become the text for a search index. An example of URL tokenization search can be seen in GifCities, where the search engine is powered by the words in the .gif filenames. Tokenization makes it possible to search resources that may themselves contain no text.
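As a rough illustration of the URL tokenization idea, the following Python sketch splits a media filename into lowercase words and builds a tiny in-memory index from them. It is not the Internet Archive's actual code: the function names `tokenize_media_url` and `build_index`, the splitting rules (underscores, hyphens, digits, and camelCase boundaries), and the example.com URLs are all assumptions made for this example.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def tokenize_media_url(url: str) -> list[str]:
    """Turn the filename of a media URL into lowercase search words.

    Hypothetical helper: the splitting rules are an assumption,
    not the rules used by any production system.
    """
    # Take only the path, decode %-escapes, and drop the extension.
    path = unquote(urlsplit(url).path)
    stem = PurePosixPath(path).stem
    # Insert a space at camelCase boundaries, e.g. "SpinningGlobe".
    stem = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", stem)
    # Keep runs of letters or digits; everything else is a separator.
    return [word.lower() for word in re.findall(r"[A-Za-z]+|\d+", stem)]


def build_index(urls: list[str]) -> dict[str, set[str]]:
    """Map each token to the set of URLs whose filename contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for url in urls:
        for token in tokenize_media_url(url):
            index[token].add(url)
    return index


# Hypothetical URLs used purely for illustration.
urls = [
    "http://example.com/pics/dancing_baby-01.gif",
    "http://example.com/pics/SpinningGlobe.GIF",
]
index = build_index(urls)
print(tokenize_media_url(urls[0]))
print(sorted(index["globe"]))
```

In a real deployment the tokens would be sent to a search engine rather than held in a dictionary; the dictionary here only shows that a file with no text of its own becomes findable by the words in its name.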
All search indexing at the Internet Archive is done using ElasticSearch, an open-source and widely used search tool. ElasticSearch is used across the Internet Archive for both web and non-web search and includes a monitored and maintained search cluster for high performance and easy addition of multiple indices.
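The keyword derivation described for site search (anchor text of pages linking to a host) can be sketched in plain Python before any search engine is involved. This is a minimal sketch under stated assumptions: the input format (pairs of target URL and anchor text), the function names `host_keywords` and `search_hosts`, the simple term-count scoring, and the example hosts are all hypothetical and do not describe the Wayback Machine's actual pipeline.

```python
import re
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def _terms(text: str) -> list[str]:
    """Lowercase a string and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def host_keywords(links):
    """Aggregate anchor-text terms per target host.

    `links` is an iterable of (target_url, anchor_text) pairs,
    an assumed input format for this illustration.
    """
    keywords = defaultdict(Counter)
    for target_url, anchor_text in links:
        host = urlsplit(target_url).hostname
        if host:  # skip links without a resolvable host
            keywords[host].update(_terms(anchor_text))
    return keywords


def search_hosts(keywords, query):
    """Rank hosts by how often the query terms appear in their anchor text."""
    terms = _terms(query)
    scores = {
        host: sum(counts[term] for term in terms)
        for host, counts in keywords.items()
    }
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [host for host, score in ranked if score > 0]


# Hypothetical link data used purely for illustration.
links = [
    ("http://example.org/", "Example digital library"),
    ("http://example.org/about", "digital library home"),
    ("http://example.net/", "weather news"),
]
keywords = host_keywords(links)
print(search_hosts(keywords, "digital library"))
```

The per-host term counts produced here are the kind of text that would then be indexed as a document for each host, so a keyword query can return hosts whose own pages never use those words.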