System: Searchable Directory of User Agents: Indexing Tools

The following is a directory of user agents, including their source and general purpose as far as we can determine. Most entries link to an "official" site containing more detailed information. You can also paste a UA from your logs into the form below, hit [Go!] and see a list of the relevant agents.

We currently have 946 distinct user agents in our database representing everything from search engines to software components and spambots. These have been collected from our log files over a number of years and researched manually.

Search for User Agent

To use this form just copy and paste an entire User-Agent string from your server log file into the input box and then submit the form. The search is case-sensitive so "nokia" will not match "Nokia".

Most user agent strings now contain a number of separate components so the search will return a list of everything that has a match in the database.
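The lookup described above can be sketched roughly as follows. This is only an illustration of a case-sensitive substring search that returns every matching entry; the `KNOWN_AGENTS` table, its category assignments and the `find_agents` function are hypothetical and do not reflect the actual database or code behind the form.

```python
# Hypothetical sample of database entries: (token, category).
# The real database and its schema are not shown on this page.
KNOWN_AGENTS = [
    ("Nutch", "Indexing Tools"),
    ("htdig", "Indexing Tools"),
    ("MS Search", "Indexing Tools"),
    ("Nokia", "Devices"),
    ("MSIE", "Web Browsers"),
]


def find_agents(ua_string, known_agents=KNOWN_AGENTS):
    """Return every (token, category) whose token occurs in ua_string.

    Python's `in` operator on strings is case-sensitive, so "nokia"
    does not match "Nokia".
    """
    return [
        (token, category)
        for token, category in known_agents
        if token in ua_string
    ]


# A full User-Agent string as it might appear in a server log.
ua = "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 4.0 Robot) Microsoft"
for token, category in find_agents(ua):
    print(f"{token}: {category}")
```

Because a single string can contain several components, one query may return more than one entry, which is why the function returns a list rather than a single match.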


Indexing Tools


This is software that enables local or remote indexing of web pages and other content for the purposes of setting up a search engine.

1) AlkalineBOT - alkaline.vestris.com/docs/alkaline-faq/
2) ASPseek - www.aspseek.org/
ASPseek is an Internet search engine software developed by SWsoft and licensed as free software under GNU GPL.
3) Beholder - www.mesadynamics.com/beholder.htm
Beholder allows you to quickly search for images on the web, also scanning local image folders and iPhoto libraries.
4) COMBINE - www.lub.lu.se/combine/
Combine is an open system for harvesting and threshing (indexing) Internet resources.
5) CrawlConvera - www.convera.com/
Convera is a provider of search and categorization solutions.
6) DataparkSearch - www.dataparksearch.org/
DataparkSearch Engine is a web-based search engine released under the GPL and designed to organize search within a website, group of websites, intranet or local system.
7) DepSpid - about.depspid.net/
The DepSpid spider visits domains, analyses links and finally calculates scores about the link dependencies between individual domains.
8) dtSearchSpider - www.dtsearch.com/spider.html
9) egothor - www.egothor.org/
EGOTHOR is an Open Source, high-performance, full-featured text search engine written entirely in Java.
10) Enterprise_Search - www.innerprise.net/es-spider.asp
11) facebookexternalhit - www.facebook.com/externalhit_uatext.php
The Facebook system retrieves certain images or details only after a user provides us with a link. You may have found this page because a Facebook user sent a link from your website to other Facebook users.
12) FDSE - www.xav.com/scripts/search/
FDSE is an easy-to-install search engine for local and remote sites. It returns fast, accurate results from a template-driven architecture.
13) findlinks - wortschatz.uni-leipzig.de/nextlinks/findlinks.html
The objective of FindLinks is to provide NextLinks with data.
14) Ful/Text - www.hummingbird.com/products/searchserver/
Hummingbird SearchServer
15) GammaSpider - www.gammasite.com/
GammaSite develops and markets automatic categorization and tagging software
16) grub-client - grub.org/
Leveraging the power of distributed computing, Grub allows everyone with an Internet connection to participate in the last frontier of discovery. By downloading the unique screensaver, you can donate your computer's unused bandwidth to probing the hidden depths of the Web.
17) gsa-crawler - www.google.com/enterprise/gsa/
Google Search Appliance
18) holmes - www.ucw.cz/holmes/
Sherlock Holmes is a universal search engine - a system for gathering and indexing of textual data (text files, web pages, ...), both locally and over the network.
19) htdig - www.htdig.org/
A complete world wide web indexing and searching system for a small domain or intranet. Source code (GPL).
20) InsumaScout - www.insuma.de/insuma/en/SEscout.html
InsumaScout searches data situated in open data sources.
21) IXE Crawler
22) JavaCrawler
The JavaCrawler, a prototype next generation MetaCrawler written in Java, supports most of the features already present in the MetaCrawler
23) k2spider - www.verity.com/products/ultraseek/fab.html
24) larbin - larbin.sourceforge.net/
Larbin is a web crawler (also called (web) robot, spider, scooter...). It is intended to fetch a large number of web pages to fill the database of a search engine.
25) Mnogosearch - mnogosearch.org/
mnoGoSearch is a full-featured web search engine software for intranet and internet servers. mnoGoSearch for UNIX is a free software covered by the GNU General Public License.
26) mobileGate-Spider - www.mobilegate.at/steiermarksuche.php
Our search engine technology guarantees precise and topically relevant results.
27) Mozilla/4.7 (Windows; I; Win95) - www.panopticsearch.com/
The Panoptic system is based on research by the Enterprise Search Group in CSIRO and the ANU in Canberra, Australia
28) MS Search - www.microsoft.com/sharepoint/
By default, the string for SharePoint Portal Server is:
Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 4.0 Robot) Microsoft
29) NextopiaBOT - www.nextopia.com/
30) NLCrawler - www.nlsearch.com/
Northern Light provides search and content integration technology and solutions for enterprises and individuals.
31) nuSearch Spider - www.nurelm.com/
32) Nutch - www.nutch.org/docs/en/bot.html
When we crawl to populate our index, we advertise the "User-agent" string "NutchOrg". If you see the agent "Nutch" or "NutchCVS", that's probably a developer testing a new version of our robot, or someone running their own instance
33) Oracle Ultra Search - otn.oracle.com/products/ultrasearch/
Ultra Search can be used to search across Collaboration Suite Components, corporate Web servers, databases, mail servers, fileservers and Oracle10g Portal instances.
34) PageFetcher-Google-CoOp; - www.google.com/coop/
Google Co-op is a platform that enables you to customize the web search experience for users of both Google and your own website.
35) perform_crawl - ivia.ucr.edu/useragents.shtml
The Nalanda iVia Focused Crawler (NIFC) is a focused Web crawler.
36) Project XP5 - marty.anstey.ca/projects/robots/
37) RDSIndexer - www.dytech.com.au/projects/RDS.asp
Information Resource Management Tool/Web Portal
38) Reaper - marty.anstey.ca/projects/robots/reaper.html
39) SemioTagger - www.entrieva.com/entrieva/products/semiotagger.asp?Hdr=semiotagger
Entrieva's SemioTagger is a categorization and indexing engine
40) SiteScanGa - sitescanga.com/
SiteScan is designed to help you configure Google Analytics.
41) SwishSpider - swish-e.org/
SWISH-E is a fast, powerful, flexible, free, and easy to use system for indexing collections of Web pages or other files
42) TeraText AGLS Harvester - www.teratext.com/
A text database system and search engine built for handling large text collections
43) TeraXML - www.doclinx.com/products/ftxml.html
44) T-H-U-N-D-E-R-S-T-O-N-E - www.thunderstone.com/texis/site/pages/webinator.html
45) Ultraseek - www.verity.com/products/ultraseek/
46) URL_Spider_Pro - www.innerprise.net/usp-bi.asp
With URL Spider Pro, you can create a search engine for any topic, no matter how specific.
47) vspider - www.macromedia.com/cfusion/search/?term=vspider
ColdFusion MX includes several Verity utilities to diagnose and manage your collections. These tools include the mkvdk, rcvdk, rck2, and vspider utilities...
48) WebRACE - www.cs.ucy.ac.cy/Projects/eRACE/webrace.html
WebRACE is a prototype HTTP Retrieval, Annotation and Caching Engine developed in Java
49) WSB - websearchbench.cs.uni-dortmund.de/websearch/features.html
WebSearchBench consists of the two software components Web Crawler and Search Engine (Repository, Indexer and search software)
50) Xbot - cdrnet.ch/projects/xbot/
The xbot software is a modular bot environment based on the .net framework for autonomous neuronal network, script or map driven mobile omniwheel robots using the SV203 controller.

For more information on the user agents listed you can click on the associated link. If you think any of the information here is incorrect or misleading please let us know using the Feedback link below.

Please be aware that we do not add user agents to the database on request, but rather wait to see them in our log files.

Browse User Agents by Category

Browser Extensions (42)
Browser extensions are programs that change or enhance your web browser. Some of them also collect data by sending information on your browsing habits back to a central server.
Content Management (13)
Data Collection - Commercial (47)
These are sites that collect information for commercial benefit. As far as we are aware no useful information or reports are provided to the public.
Data Collection - Research (29)
These agents are conducting research on the WWW. They may also offer commercial services.
Devices (23)
Mobile phones and other gadgets with browser technology.
Download Managers (39)
Programs that enable users to download or extract information from a website or web server.
Indexing Tools (50)
This is software that enables local or remote indexing of web pages and other content for the purposes of setting up a search engine.
Link Checking Utilities (41)
This is software that conducts remote or local link checking.
Media Players (5)
Applications for playing music, video and other media over the Internet.
Other Resources (12)
Links to online resources relating to robots and spiders.
Proxies (7)
If several clients request the same content, the proxy can deliver that content from its cache, rather than requesting it from the origin server each time.
RSS/Atom Aggregators (43)
These are browser extensions or search spiders that focus on indexing or aggregating RSS and Atom feeds.
Search Engine Spiders (220)
These agents conduct Internet-wide indexing for various search engines.
Server Platforms (6)
Server Software (31)
Site Monitoring Services (15)
Software Components (58)
These are code libraries or application development packages that can be used to build Internet-related applications. How they are used depends on the developer.
Spambots? (45)
These are programs that are used predominately to harvest email addresses, find open guestbooks to post to, etc. They may also have legitimate uses.
Unclassified (174)
The following user agents have either not been identified or do not fit neatly into other categories. New agents appear every day that have limited lifespans. Most (but not all) legitimate user agents identify themselves with a URI or email address.
Validation Tools (10)
These are programs and sites that can be used to validate various aspects of your site: HTML, CSS, META tags, etc.
Web Browsers (36)
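
The note under Unclassified says that most legitimate user agents identify themselves with a URI or email address. A rough way to check a log entry for either is sketched below. The patterns are deliberately loose assumptions for illustration, not a complete URI or email grammar, and `contact_info` and the "ExampleBot" string are hypothetical.

```python
import re

# Loose, illustrative patterns (assumptions, not full URI/email grammars).
URL_RE = re.compile(r"(?:https?://|www\.)[^\s;()<>]+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def contact_info(ua_string):
    """Return (urls, emails) found in a User-Agent string."""
    return URL_RE.findall(ua_string), EMAIL_RE.findall(ua_string)


# Hypothetical agent string that identifies itself with both.
urls, emails = contact_info(
    "ExampleBot/1.0 (+http://www.example.com/bot.html; bot@example.com)"
)
print(urls, emails)
```

An agent that yields neither a URI nor an email address is not necessarily illegitimate, as the note itself points out.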
