… Or how to fix the warning “The file reached the maximum download limit. Check that the full text of the document can be meaningfully crawled” in the crawl log.
By default, SharePoint Portal Server 2007 can crawl and filter a file with a size of up to 16 MB. After this limit is reached, SharePoint Portal Server enters a warning in the gatherer log “The file reached the maximum download limit. Check that the full text of the document can be meaningfully crawled.”
To change the 16 MB limit, add a new registry entry named MaxDownloadSize:
- Start Registry Editor (Regedit.exe).
- Locate the following key in the registry:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager
- On the Edit menu, point to New and click DWORD Value. Name the value MaxDownloadSize. Double-click it, change the base to Decimal, and type the maximum size (in MB) of files that the gatherer downloads.
- Restart the server.
- Start a full crawl.
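If you prefer to import the value rather than type it in Regedit, the steps above can be sketched as a small script that generates the text of a .reg file. This is only an illustration: the function name `build_reg_file` is hypothetical, and the key path is the one quoted above. A .reg file stores DWORD values in hexadecimal, so 64 MB becomes `dword:00000040`.

```python
# Minimal sketch: build the text of a .reg file that sets MaxDownloadSize.
# build_reg_file is a hypothetical helper name, not part of any SharePoint tool.

GATHERING_MANAGER_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server"
    r"\12.0\Search\Global\Gathering Manager"
)


def build_reg_file(max_download_mb: int, key: str = GATHERING_MANAGER_KEY) -> str:
    """Return .reg file text that sets MaxDownloadSize (in MB) under `key`."""
    # A DWORD is an unsigned 32-bit value; reject anything outside that range.
    if not 0 < max_download_mb <= 0xFFFFFFFF:
        raise ValueError("MaxDownloadSize must be a positive 32-bit integer")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # .reg files write DWORD values as 8 hexadecimal digits.
        f'"MaxDownloadSize"=dword:{max_download_mb:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file(64))
```

Save the printed text to a file with a .reg extension, review it, and import it on the index server. You still need to restart the server and start a full crawl afterwards.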
Use this technique at your own risk! 🙂 Note: Increasing the maximum file size may cause timeout exceptions, because the crawler can time out if a large file takes too long to crawl and index. To increase the timeout value:
- In Central Administration, on the Application Management tab, in the Search section, click Manage search service.
- On the Manage Search Service page, in the Farm-Level Search Settings section, click Farm-level search settings.
- In the Timeout Settings section, increase the Connection time and Request acknowledgement time values.
Note: For WSS 3.0, the registry key is HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Global\Gathering Manager
We can control how much of a single document the indexer will index through registry values on the indexer, under the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager. MaxGrowFactor × MaxDownloadSize = the maximum amount of text (in MB) that can be indexed from a single file.
For example, MaxDownloadSize = 64 (default 16) with MaxGrowFactor = 4 allows the index filter to produce up to 256 MB (64 × 4) of text from a file. With the defaults, 16 MB × 4 = 64 MB of text.
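The arithmetic can be checked with a few lines of Python. This is just a sketch of the formula described here; the function name `max_indexed_text_mb` is made up for illustration, and the defaults are the ones quoted in this post.

```python
# Sketch of the formula: MaxGrowFactor x MaxDownloadSize = max text indexed per file (MB).
# max_indexed_text_mb is a hypothetical name used only for this example.

def max_indexed_text_mb(max_download_mb: int = 16, max_grow_factor: int = 4) -> int:
    """Return the maximum amount of text (in MB) the filter may produce from one file."""
    return max_download_mb * max_grow_factor


print(max_indexed_text_mb())       # defaults: 16 * 4 = 64
print(max_indexed_text_mb(64, 4))  # raised limit: 64 * 4 = 256
```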