Is it possible to obtain full-text contents of documents indexed by WDS through query interfaces?

As far as I know, there is a special system property System.Search.Contents, which represents textual contents of the indexed document. But MSDN states:

System.Search.Contents: The contents of the item. This property is for query restrictions only; it cannot be retrieved in a query result. The Indexing Service friendly name is 'contents'.

And indeed, the contents of this column in query result datasets appear to be DBNull in all cases.

I would like to add some kind of post-processing for all documents processed by WDS. To do this, I need access to the full-text content of the documents being indexed. As far as I understand, the most efficient way would be to extract the text directly from the WDS index, if possible. Are there any WDS developers here? Does WDS store the full-text contents of indexed documents internally? Is it possible to retrieve that text somehow?

I have also considered running IFilter once more for each indexed document on my own, but I ran into problems with e-mail (and presumably all other types of content that are not directly accessible through the file system). There is a property System.ItemURL in WDS, which identifies the indexed document. Theoretically I should be able to obtain the item text by creating a proper IFilter for that item using LoadIFilter or BindIFilterFromStream, but I cannot figure out how to create IFilters for non-filesystem objects. Is it possible to obtain an IStream interface somehow from an ItemURL like this:

"mapi://{S-1-5-21-641218907-4187178781-2627367884-1105}/Personal Folders($b697f890)/Inbox/..."



  • clovernews

    I regret to say that I don't think there is a public way to do it.

    WDS does not keep the complete text of indexed items (and even though the index can be reversed, the result does not contain the complete original text).

    There is a 'preview' of document text though: System.Search.AutoSummary (basically, first 2k of text).
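
    For illustration only, here is a minimal Python sketch of how such a query might look: System.Search.Contents is used purely as a restriction, while System.Search.AutoSummary is the column actually retrieved. The helper names (build_wds_query, run_query) are hypothetical. The connection string and the use of ADO through pywin32 are assumptions based on the commonly used Search.CollatorDSO OLE DB provider, and run_query has not been verified against any particular WDS version.

```python
def build_wds_query(term):
    """Build a Windows Search SQL statement (hypothetical helper).

    System.Search.Contents appears only in the WHERE clause, because it
    cannot be returned in a result set; AutoSummary is selected instead.
    """
    # Drop double quotes (the term is wrapped in them as a phrase) and
    # double any single quotes so the SQL string literal stays valid.
    phrase = term.replace('"', " ").replace("'", "''")
    return (
        "SELECT System.ItemUrl, System.Search.AutoSummary "
        "FROM SystemIndex "
        f"WHERE CONTAINS(System.Search.Contents, '\"{phrase}\"')"
    )


def run_query(sql, max_rows=10):
    """Run the query through ADO and return (url, summary) tuples.

    Assumption: Windows with pywin32 installed and the Search.CollatorDSO
    provider registered. Imported lazily so the module loads elsewhere.
    """
    import win32com.client

    conn = win32com.client.Dispatch("ADODB.Connection")
    conn.Open(
        "Provider=Search.CollatorDSO;"
        "Extended Properties='Application=Windows';"
    )
    rs = win32com.client.Dispatch("ADODB.Recordset")
    rs.Open(sql, conn)
    rows = []
    try:
        while not rs.EOF and len(rows) < max_rows:
            url = rs.Fields.Item("System.ItemUrl").Value
            summary = rs.Fields.Item("System.Search.AutoSummary").Value
            rows.append((url, summary))
            rs.MoveNext()
    finally:
        rs.Close()
        conn.Close()
    return rows


if __name__ == "__main__":
    # Only builds and prints the statement; call run_query(sql) on Windows.
    print(build_wds_query("quarterly report"))
```

    Adding System.Search.Contents to the SELECT list would, per the MSDN note quoted in the question, only yield empty values, so the summary is the most that this route can be expected to return.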


