Get The text of Words from a word document

This may be a topic for some other group, but this group gets replies faster than the Automation group or some of the other Word groups, so maybe someone here knows this.

I really want to thank Cindy for pointing me in the right direction for what I wanted to do. I ended up using the DsoFramer control to host my Word doc, and now I have the document loaded. When I first saw the Word.Document.Words collection, I thought: cool, a collection of string variables for words. Microsoft would never make it so easy.

Does anyone know how I can access the words in the document and highlight each word that matches my search criteria (similar to the way Google highlights your results)?

I am heading to the bookstore to buy an Office book next, but books are like paperweights and become obsolete quickly, so if someone can help, I would rather not buy one if I don't have to.

Thank you,

Corby Nichols

Houston, Texas





  • Carl Peto

    Thank you for replying, but I have no idea how regular expressions could help me loop through each word in the Word document, compare it to a list of words (in a collection), and highlight the word if it matches. I am a guru at C# (self-proclaimed) but a newbie to Office, so code examples or a link to what I want is what I need. Your answer was like the mechanic telling me all I need to do is put in a new carburetor (like I could do that to save my life).

    Thanks,

    Corby Nichols

    Houston, Texas


  • techlist

     Corby C# wrote:

    Thank you for replying, but I have no idea how regular expressions could help me loop through each word in the Word document, compare it to a list of words (in a collection), and highlight the word if it matches. I am a guru at C# (self-proclaimed) but a newbie to Office, so code examples or a link to what I want is what I need. Your answer was like the mechanic telling me all I need to do is put in a new carburetor (like I could do that to save my life).

     

    Thanks,

    Corby Nichols

    Houston, Texas

    OK, here's the VB.NET code. C# should be just a few semicolons away:

    Imports System.Runtime.InteropServices
    Imports Microsoft.Office.Interop.Word

    Public Class Form1
        Inherits System.Windows.Forms.Form

        Private Sub btnFind_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnFind.Click

            Dim strFileName As String = "C:\Users\Adam\Desktop\Resume-AdamTurner.doc"
            Dim word As New Microsoft.Office.Interop.Word.Application
            Dim myapp As Microsoft.Office.Interop.Word.Document
            ' Start at 0 so that a document with no matches reports 0
            Dim mycounter As Integer = 0

            myapp = word.Documents.Open(strFileName)
            word.Visible = True

            With word.Selection.Find

                .ClearFormatting()
                .Text = "test"
                ' Stop at the end of the document instead of wrapping around
                .Wrap = WdFindWrap.wdFindStop
                .Execute()

                ' Each successful Execute moves the selection to the next match
                Do Until .Found = False
                    mycounter = mycounter + 1
                    .Execute()
                Loop

            End With

            MessageBox.Show("Found " & mycounter & " matches")

        End Sub
    End Class



  • Ozberg

    Thank you again for replying. That was code this time, so we are making great progress.

    The only problem is that although I have 5 years of Visual Basic experience (versions 3-6), I became a C# guy in Jan. 2001.

    Office automation is the first place I have ever liked the features of VB.NET better than C#. Any idea what a C# equivalent of that looks like?

    I can play around, but how do I get to the Selection object? Is it Word.Document.Selection?

    Thanks,

    Corby Nichols

    Houston, Texas



  • Mbowles

    Why not use the Words collection with Regex? That would make your search quite easy.

    // Needs: using Microsoft.Office.Interop.Word; using System.Text.RegularExpressions;
    ApplicationClass wordApp = new ApplicationClass();
    object file = @"H:\Werk\DiamondDocs\DiamondDocs\bin\Debug\Archive\2007\B\Belastingdienst.doc";
    object oFalse = false;
    object nullobj = System.Reflection.Missing.Value;
    Range range = null;

    Document doc = wordApp.Documents.Open(ref file, ref nullobj, ref nullobj, ref nullobj,
        ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj,
        ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj, ref nullobj);

    // Full document text (not used by the loop below)
    object startIndex = 0;
    object eindIndex = (object)doc.Characters.Count;
    range = doc.Range(ref startIndex, ref eindIndex);
    string tekst = range.Text;

    string zoek = "aanslag";
    // Build the pattern once instead of once per word
    Regex reg = new Regex(zoek);

    // Each item in doc.Words is a Range covering one word
    foreach (Range r in doc.Words)
    {
        Match m = reg.Match(r.Text);
        if (m.Success)
        {
            r.HighlightColorIndex = WdColorIndex.wdBlue;
            r.Select();
        }
    }

    // Close every open document without saving, then quit Word
    foreach (Document d in wordApp.Documents)
    {
        d.Close(ref oFalse, ref nullobj, ref nullobj);
    }
    wordApp.Quit(ref oFalse, ref nullobj, ref nullobj);
    wordApp = null;
    doc = null;
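
    Since the original question was about matching a whole list of words rather than one, the pattern in the loop above could be built from that list. What follows is only a rough sketch in plain C# with no Word interop: the class name WordListMatcher, the second search word "belasting", the whole-word anchors, and the case-insensitive option are all assumptions for illustration, not part of the code above.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;

    // Hypothetical helper, not part of the Word object model.
    public static class WordListMatcher
    {
        // Builds one case-insensitive pattern such as \b(?:aanslag|belasting)\b.
        // Regex.Escape keeps characters like '+' or '.' in a search word literal.
        public static Regex Build(IEnumerable<string> searchWords)
        {
            string alternatives = string.Join("|", searchWords.Select(w => Regex.Escape(w)));
            return new Regex(@"\b(?:" + alternatives + @")\b", RegexOptions.IgnoreCase);
        }

        public static void Main()
        {
            Regex reg = Build(new[] { "aanslag", "belasting" });

            // Items in doc.Words typically include trailing whitespace, e.g. "aanslag "
            Console.WriteLine(reg.IsMatch("aanslag "));   // whole word, matches
            Console.WriteLine(reg.IsMatch("Belasting"));  // case differs, still matches
            Console.WriteLine(reg.IsMatch("aanslagen"));  // longer word, no match
        }
    }
    ```

    The resulting Regex would take the place of reg in the foreach loop. Two caveats with this sketch: an empty list produces a pattern that matches almost anything, and the trailing \b will not match search words that end in a non-letter such as "C#", so those would need a different anchor.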


  • Jim Karr

    You are looking for "regular expressions"

    Adamus



  • hrubesh

    I found what I needed. I had actually read the article before; I just wasn't able to get Word up at the time.

    http://www.builderau.com.au/program/dotnet/soa/Easily_utilise_Microsoft_Word_functionality_in_your_NET_application/0,339028399,339198903,00.htm

    Word.Range range = null;
    object startPosition = 0;
    object endPosition = (object) wordDoc.Characters.Count;
    range = wordDoc.Range(ref startPosition, ref endPosition);
    string text = range.Text;
    string wordSearch = "SQL Server";

    // Find the first instance of the search phrase
    int startIndex = text.IndexOf(wordSearch) + 1;
    int endIndex = startIndex + wordSearch.Length;

    // convert startIndex & endIndex to startPosition & endPosition
    startPosition = (object) startIndex;
    endPosition = (object) endIndex;
    range = wordDoc.Range(ref startPosition, ref endPosition);
    range.HighlightColorIndex = Microsoft.Office.Interop.Word.WdColorIndex.wdYellow;
    range.Select();

    In the first section, the string text holds every word from the doc. What I have to do next is put a do/while loop around the search code to find all occurrences of the word.

    I will post that when I finish it.
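
    Until then, the looping part on its own might look roughly like the sketch below. It works on a plain string only and uses a while loop rather than do/while. The class name OccurrenceFinder, the sample sentence, and the case-insensitive comparison are assumptions made for illustration.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical helper, not part of the Word object model.
    public static class OccurrenceFinder
    {
        // Returns the 0-based start offset of every occurrence of phrase in text.
        public static List<int> FindAll(string text, string phrase)
        {
            var offsets = new List<int>();
            if (string.IsNullOrEmpty(phrase)) return offsets;

            // Assumption: ignore case, as a search-engine style highlight would
            int index = text.IndexOf(phrase, StringComparison.OrdinalIgnoreCase);
            while (index >= 0)
            {
                offsets.Add(index);
                // Resume the search just past the current match
                index = text.IndexOf(phrase, index + phrase.Length, StringComparison.OrdinalIgnoreCase);
            }
            return offsets;
        }

        public static void Main()
        {
            string text = "SQL Server 2005 replaced SQL Server 2000.";
            foreach (int start in FindAll(text, "SQL Server"))
            {
                // Print the start and end offset of each match
                Console.WriteLine(start + "-" + (start + "SQL Server".Length));
            }
        }
    }
    ```

    Each start/end pair would then go through the same wordDoc.Range and HighlightColorIndex calls as the single match above. Whether the offsets need the + 1 adjustment, and whether string offsets line up with document positions when the document contains fields or tables, is not something this sketch verifies.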

    Thanks for your help everyone, now I can get back to what I was working on before I started messing with Word, whatever that was.

    Thank you,

    Corby Nichols


  • gkostel

    Corby C# wrote:

    Thank you for replying, but I have no idea how regular expressions could help me loop through each word in the Word document, compare it to a list of words (in a collection), and highlight the word if it matches. I am a guru at C# (self-proclaimed) but a newbie to Office, so code examples or a link to what I want is what I need. Your answer was like the mechanic telling me all I need to do is put in a new carburetor (like I could do that to save my life).

    Thanks,

    Corby Nichols

    Houston, Texas

    Selection.Find.ClearFormatting
    With Selection.Find
    .Text = "test" 'text to find
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
    End With
    Selection.Find.Execute



  • Alex Farber

    Also, if you want to find and highlight all matches in the document, just change the following line:

            .MatchAllWordForms = True

    Adamus



  • ramesh_1031

    In all the examples of automating Word, the samples set Word.Visible = true. Why? I'd rather not have Word popping up in the background while I process the document. But if I set Visible = false, I get an error trying to open the document.

  • alex ivanov 3

    Also, a very simple way to have Office generate the code for you:

    1. Tools | Macro | Record New Macro
    2. Complete your task
    3. Stop the macro recorder
    4. Alt + F11
    5. Your code is now inside the module.

    This will work about 85% of the time and comes in very handy for VBA. Of course, in some cases, you'll have to modify or hard code your own solution.

    Adamus



  • Rocinante8

    What is a culling application for resumes?

    The calling application (is that what you meant?) is a contact management program for a recruiting company, so yes, it manages resumes, contacts, experiences, companies, skills, and the like.

    Thank you,

    Corby Nichols


  • GlenAtMotorola

    Corby111 wrote:

    Thank you again for replying. That was code this time, so we are making great progress.

    The only problem is that although I have 5 years of Visual Basic experience (versions 3-6), I became a C# guy in Jan. 2001.

    Office automation is the first place I have ever liked the features of VB.NET better than C#. Any idea what a C# equivalent of that looks like?

    I can play around, but how do I get to the Selection object? Is it Word.Document.Selection?

    Thanks,

    Corby Nichols

    Houston, Texas

    ThisDocument.Select

    Adamus



  • Matt Lin

    Just out of curiosity, are you writing a culling application for resumes?

    Adamus



  • neo1000

     Corby111 wrote:

    Office automation is the first place I have ever liked the features of VB.NET better than C#. Any idea what a C# equivalent of that looks like?

    I'm more of a vb.net/asp.net/sql server guy myself. I've never had much problem converting VBA to .net. Vb.net & C# share the same libraries, so you shouldn't have much of a problem converting it.

    Here's a good article on the C# interop for MS Word: http://www.builderau.com.au/program/dotnet/soa/Easily_utilise_Microsoft_Word_functionality_in_your_NET_application/0,339028399,339198903,00.htm

    Adamus


