Nacho

Reputation: 962

C#: get the visible text in an MS Word window

I'm trying to obtain the text shown in an MS Word window in C# using Microsoft.Office.Interop.Word. Please note that I don't want the whole document or even the whole page, just the content the user currently sees.

The following code seems to work with simple documents:

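// Start Word and create a new document based on example.docx.
Application word = new Application();
word.Visible = true;
object fileName = @"example.docx";
word.Documents.Add(ref fileName, Type.Missing, Type.Missing, true);

// Screen rectangle of the UI element that currently has keyboard focus.
Rect rect = AutomationElement.FocusedElement.Current.BoundingRectangle;

// Map the rectangle's top-left and bottom-right corners to document ranges.
Range r1 = word.ActiveWindow.RangeFromPoint((int)rect.Left, (int)rect.Top);
Range r2 = word.ActiveWindow.RangeFromPoint((int)rect.Left + (int)rect.Width, (int)rect.Top + (int)rect.Height);
// Stretch r1 so that it ends where r2 starts.
r1.End = r2.Start;

// Word uses "\r" as the paragraph mark; convert it for console output.
Console.WriteLine(r1.Text.Replace("\r", "\r\n"));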

However, when the document includes other structures such as headers, only parts of the text are returned.

So, what's the correct way to achieve this?

Thanks a lot!

Updated Code

// Screen rectangle of the UI element that currently has keyboard focus.
Rect rect = AutomationElement.FocusedElement.Current.BoundingRectangle;

foreach (Range r in word.ActiveDocument.StoryRanges) {
    int left = 0, top = 0, width = 0, height = 0;
    try {
        try {
            // Screen position of this story; fall back to the focused rectangle if the call fails.
            word.ActiveWindow.GetPoint(out left, out top, out width, out height, r);
        } catch {
            left = (int)rect.Left;
            top = (int)rect.Top;
            width = (int)rect.Width;
            height = (int)rect.Height;
        }
        Rect newRect = new Rect(left, top, width, height);
        Rect inter;
        // Only process stories that overlap the focused rectangle.
        if ((inter = Rect.Intersect(rect, newRect)) != Rect.Empty) {
            Range r1 = word.ActiveWindow.RangeFromPoint((int)inter.Left, (int)inter.Top);
            Range r2 = word.ActiveWindow.RangeFromPoint((int)inter.Right, (int)inter.Bottom);
            r.SetRange(r1.Start, r2.Start);

            Console.WriteLine(r.Text.Replace("\r", "\r\n"));
        }
    } catch { } // Any other failure is ignored and the story is skipped.
}

Upvotes: 7

Views: 2472

Answers (4)

Abhay Kalariya

Reputation: 61

I have a similar requirement in my Word add-in.

Try the code below; it works for me.

// Walk from Word's main window down to the child window that hosts the document view.
IntPtr h = Process.GetCurrentProcess().MainWindowHandle;

h = NativeMethodsActiveScreen.FindWindowExW(h, new IntPtr(0), "_WwF", "");
h = NativeMethodsActiveScreen.FindWindowExW(h, new IntPtr(0), "_WwB", null);
h = NativeMethodsActiveScreen.FindWindowExW(h, new IntPtr(0), "_WwG", null);

// Screen rectangle of the document view.
NativeMethodsActiveScreen.tagRECT t = new NativeMethodsActiveScreen.tagRECT();
NativeMethodsActiveScreen.GetWindowRect(h, out t);

var Aw = RibbonHelper.SharedApplicationInstance.ActiveWindow;
Range fullDocRange = RibbonHelper.SharedApplicationInstance.ActiveDocument.Range();
// Ranges under the top-left and bottom-right corners of the view.
Range r1 = RibbonHelper.SharedApplicationInstance.ActiveWindow.RangeFromPoint(t.left, t.top);
Range r2 = RibbonHelper.SharedApplicationInstance.ActiveWindow.RangeFromPoint(t.right, t.bottom);
// The visible text is taken to be everything between the two.
Range r = RibbonHelper.SharedApplicationInstance.ActiveDocument.Range(r1.Start, r2.Start);

If it helps, please mark the answer as helpful.

Thanks

Upvotes: 0

LeoJinDev

Reputation: 405

The approaches discussed above are very specific to particular Office versions.

I think the following code will work in all cases.

        IntPtr h = (IntPtr)Globals.ThisAddIn.Application.ActiveWindow.Hwnd;
        String strText = NativeInvoker.GetWindowText(h);
        if (strText != null && strText.StartsWith(Globals.ThisAddIn.Application.ActiveWindow.Caption))
        {
            h = NativeInvoker.FindWindowEx(h, IntPtr.Zero, "_WwF", "");
            h = NativeInvoker.FindWindowEx(h, IntPtr.Zero, "_WwB", null);
            h = NativeInvoker.FindWindowEx(h, IntPtr.Zero, "_WwG", null);

            Rect t;
            if (NativeInvoker.GetWindowRect(h, out t))
            {
                Range r1 = (Range)Globals.ThisAddIn.Application.ActiveWindow.RangeFromPoint((int)t.Left, (int)t.Top);
                Range r2 = (Range)Globals.ThisAddIn.Application.ActiveWindow.RangeFromPoint((int)t.Right, (int)t.Bottom);
                Range r = Globals.ThisAddIn.Application.ActiveDocument.Range(r1.Start, r2.Start);
                ....

You can look up the contents of the NativeInvoker class anywhere.
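Since the NativeInvoker class itself is not shown, here is a minimal sketch of what such a wrapper could look like. It is a reconstruction, not the answerer's actual class: the nested Rect struct, its field names, and the GetWindowText helper are assumptions chosen to match the calls in the snippet above. Only the user32.dll functions themselves (FindWindowEx, GetWindowRect, GetWindowText, GetWindowTextLength) are real Win32 APIs.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical reconstruction of the NativeInvoker class used above.
internal static class NativeInvoker
{
    // Mirrors the Win32 RECT structure (screen coordinates in pixels).
    [StructLayout(LayoutKind.Sequential)]
    public struct Rect
    {
        public int Left;
        public int Top;
        public int Right;
        public int Bottom;
    }

    // Finds a child window by class name and (optionally) window title.
    [DllImport("user32.dll", EntryPoint = "FindWindowExW", CharSet = CharSet.Unicode, SetLastError = true)]
    public static extern IntPtr FindWindowEx(IntPtr hwndParent, IntPtr hwndChildAfter,
                                             string lpszClass, string lpszWindow);

    // Retrieves the bounding rectangle of a window in screen coordinates.
    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool GetWindowRect(IntPtr hWnd, out Rect lpRect);

    [DllImport("user32.dll", EntryPoint = "GetWindowTextLengthW", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern int GetWindowTextLength(IntPtr hWnd);

    [DllImport("user32.dll", EntryPoint = "GetWindowTextW", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern int GetWindowTextNative(IntPtr hWnd, StringBuilder lpString, int nMaxCount);

    // Convenience wrapper that returns the window title as a managed string.
    public static string GetWindowText(IntPtr hWnd)
    {
        int length = GetWindowTextLength(hWnd);
        var buffer = new StringBuilder(length + 1); // room for the terminating null
        GetWindowTextNative(hWnd, buffer, buffer.Capacity);
        return buffer.ToString();
    }
}
```

With this sketch, the `Rect t;` declaration in the snippet would have to be written as `NativeInvoker.Rect t;`, unless the struct is moved to the top level. The coordinates returned by GetWindowRect are screen pixels, which is what RangeFromPoint expects.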

I hope my code will help your work.

Phon.

Upvotes: 1

JohnZaj

Reputation: 3230

There may be some problems with this:

  • It's not reliable. Are you truly able to get consistent results each time? For example, on a simple "=rand()" document, run the program five times in a row without changing the state of Word. When I do this, I get a different range printed to the console each time. I would start here: there seems to be something wrong with your logic for getting the ranges. For example, rect.Left returns a different number every time I run it against the same document left untouched on screen.
  • It gets tricky with other stories. Perhaps RangeFromPoint cannot extend across multiple story boundaries. However, let's assume it can. You would still need to enumerate each story, e.g.:

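// StoryRanges is a member of Document, so go through the range's parent document.
IEnumerator enumerator = r1.Document.StoryRanges.GetEnumerator();
while (enumerator.MoveNext()) {
    Range current = (Range)enumerator.Current;
}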
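Note that the StoryRanges collection only hands out the first range of each story type. Stories that occur more than once (for example, headers in several sections, or several text boxes) are chained through NextStoryRange. A minimal sketch of a full walk, assuming a hypothetical helper class named StoryWalker:

```csharp
using System.Collections.Generic;
using Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the Word object model.
static class StoryWalker
{
    // Yields every story range in the document, following the
    // NextStoryRange chain for story types that occur more than once.
    public static IEnumerable<Range> AllStories(Document document)
    {
        foreach (Range first in document.StoryRanges)
        {
            for (Range current = first; current != null; current = current.NextStoryRange)
            {
                yield return current;
            }
        }
    }
}
```

It could be used as `foreach (Range story in StoryWalker.AllStories(word.ActiveDocument)) { ... }`, with the per-story visibility check done inside the loop.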

Have you looked at "How to programmatically extract the text of the currently viewed page of an Office.Interop.Word.Document object"?

Upvotes: 4

Trisped

Reputation: 6007

You are probably seeing the side effects of selecting a range across page elements.
In most cases, if you drag the cursor from the top left of the screen down to the bottom right, only the main body text is selected (no headers or footers). Also, if the document has columns, and those columns start or end off screen, then selecting from the first column through to the last column also selects the text that is off screen.

To my knowledge there is no easy way to achieve your goal unless you are willing to ignore the inconsistencies, or want to deal with all the use cases specifically (images, columns, tables, etc.).
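If you do want to handle a case such as columns specifically, one possible direction is to take the over-wide range and test each paragraph on its own. This is only a sketch and rests on an assumption: that Window.GetPoint fails with an exception when the range it is given is not entirely visible on screen. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the Word object model.
static class VisibleParagraphs
{
    // Returns the text of each paragraph in 'candidate' for which
    // GetPoint succeeds, i.e. (by assumption) the paragraph is on screen.
    public static IEnumerable<string> Collect(Window window, Range candidate)
    {
        foreach (Paragraph paragraph in candidate.Paragraphs)
        {
            bool visible;
            try
            {
                int left, top, width, height;
                window.GetPoint(out left, out top, out width, out height, paragraph.Range);
                visible = true;
            }
            catch (Exception)
            {
                // The exact exception type is not assumed here.
                visible = false;
            }

            if (visible)
            {
                yield return paragraph.Range.Text;
            }
        }
    }
}
```

Under that assumption, paragraphs that are only partly on screen (typically the first and last visible ones) would be dropped. Testing sentences or words instead of paragraphs would be finer-grained, at the cost of many more COM calls.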

If you can tell us what you are trying to do then we can offer alternatives, otherwise please mark an answer as correct.

Upvotes: 1
