I would like to crawl file servers at a different level of service than WSS sites. Case in point: .html and .htm files are indexed in my farm because they are allowed file types. When I crawl a file server, however, I don't want them indexed. Is there a way to do that? It would be nice to still capture the name and location of .html or .htm files, but I don't want my indexer to spend the time busting them apart on file servers.
Thanks
Chris Fields

Can I control file types crawled by content type?
vej
I need to exclude everything under http://server/ but include only some URLs, for example http://server/Pages/
I created two crawl rules: http://server/* to exclude all content and http://server/Pages/* to include. I tried reordering the rules, but nothing changes: MOSS doesn't crawl anything. I need only http://server/Pages/* to be crawled.
Wildert
Chris,
You can achieve these results using Crawl Rules.
Example:
You have a file share \\server\sharedfolder that you want to crawl, but you want to exclude *.htm files.
After creating a crawl rule that excludes *.htm files under that share, rerun a full crawl of the file share. Once it completes, view the crawl log, which will show all of the documents that were included, highlight those that were excluded, and state 'Deleted by the gatherer (This item was deleted because it was excluded by a crawl rule.)'
Andrew
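
To reason about how an ordered list of include/exclude rules behaves before running a full crawl, a small simulation can help. The sketch below is not SharePoint code and does not call any SharePoint API. It assumes that rules are checked top to bottom, that the first matching rule wins, and that an item matching no rule is crawled. The function name evaluate, the file:// form of the share path, and the sample URLs are all hypothetical and chosen only for illustration.

```python
from fnmatch import fnmatchcase

def evaluate(url, rules):
    """Return 'include' or 'exclude' for url.

    rules is an ordered list of (wildcard_pattern, action) pairs.
    Assumption: the first rule whose pattern matches decides the outcome.
    """
    for pattern, action in rules:
        # Compare case-insensitively; '*' matches any run of characters.
        if fnmatchcase(url.lower(), pattern.lower()):
            return action
    return "include"  # assumption: no matching rule means the item is crawled

# Chris's case: skip .htm and .html files on one file share only.
file_rules = [
    ("file://server/sharedfolder/*.htm", "exclude"),
    ("file://server/sharedfolder/*.html", "exclude"),
]

# vej's case: the narrower include rule is listed before the broad exclude rule.
site_rules = [
    ("http://server/Pages/*", "include"),
    ("http://server/*", "exclude"),
]

print(evaluate("file://server/sharedfolder/docs/index.htm", file_rules))
print(evaluate("file://server/sharedfolder/report.docx", file_rules))
print(evaluate("http://server/Pages/home.aspx", site_rules))
print(evaluate("http://server/Lists/Tasks/AllItems.aspx", site_rules))
print(evaluate("http://server/", site_rules))
```

Under these assumptions, swapping the two entries in site_rules makes every URL under http://server/ match the exclude rule first, so order matters. Note also that in this sketch the address http://server/ itself matches http://server/*, which may be worth checking if that address is where the crawl starts; the thread does not confirm whether this is the cause of the empty crawl.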