This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Server Report


This is an automated email reply to acknowledge your message to
slurp@inktomi.com.

Slurp is Inktomi Corporation's web-indexing robot. It collects documents
from the World Wide Web to build a searchable index for search services
using the Inktomi search engine. For more information or to see a list
of partners with whom we work, please see:

http://www.inktomi.com/products/web_search/partners.html

For answers to some frequently asked questions about Slurp and Inktomi's
web crawling and search technology, please see the links later in this email.

Some web site administrators do not want robots to index their site, or 
certain areas of their site.  This is particularly true for sections
that contain dynamically generated pages, CGI scripts, and so on.  There
is a de facto standard called the "Robots Exclusion Standard" (RES)
that allows web administrators to tell robots which areas of the site
they may visit and which are off limits.

The RES involves putting a file called "robots.txt" in the document root of
the web site; this file is parsed by a visiting robot to determine what
restrictions exist.  Every robot that visits your site should request
the robots.txt file from your web server. If there is no robots.txt
file, the RES specifies that the robot may visit all parts of the site
if it wishes.
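
As a purely illustrative sketch (the directory names below are
hypothetical, not taken from any real site), a minimal robots.txt that
keeps all robots out of a CGI directory while leaving the rest of the
site open might look like this:

    # Rules for all robots; the paths are example placeholders
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /private/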

Slurp obeys robots.txt restrictions. Every time Slurp visits your site it
will request the robots.txt file. If your site doesn't have a robots.txt,
Slurp will follow its own rules about which URLs to visit (for example, Slurp
doesn't visit the standard places to store CGI scripts even if robots.txt
allows it to). If you don't have a problem with Slurp's visits, or other
robots' visits, then you don't need a robots.txt file. If you just want to
stop getting "robots.txt not found" errors in your server logs, you can
simply create an empty robots.txt file.
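
If it helps to see how a robot applies these rules in practice, here is
a short sketch using Python's standard urllib.robotparser module. It is
only an illustration of RES checking in general, not Slurp's actual
code, and the site and paths are made up:

    from urllib import robotparser

    # Fetch and parse the site's robots.txt (hypothetical URL)
    rp = robotparser.RobotFileParser()
    rp.set_url("http://www.example.com/robots.txt")
    rp.read()

    # A well-behaved robot checks each URL before requesting it
    print(rp.can_fetch("Slurp", "http://www.example.com/cgi-bin/search"))
    print(rp.can_fetch("Slurp", "http://www.example.com/index.html"))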

For more background on Slurp, please see:

http://www.inktomi.com/slurp.html

For more information on the RES, please see:

http://www.robotstxt.org/wc/exclusion.html

A number of frequently asked questions regarding Slurp and Inktomi's search 
system are available at:

http://support.inktomi.com/Search_Engine/Product_Info/FAQ/searchfaq.html

If your question is not answered by the links above, or if you still
need further assistance, please send an email with a full problem
description to slurp-help@inktomi.com.

Thank you for your interest in Slurp!

Inktomi Corporation
-------------------

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

