How to read current active IE window's html content

Hi there!

I really hope someone can help me with this issue... it is really cracking my brain up :S

What I want to be able to do is load the HTML content of the currently active IE window from a simple IE Frame or console application.
Not using the WebBrowser component. That is possible, but I just don't know how to do it.
Maybe with HtmlDocument?

I have googled and googled... I found this example:

http://www.codeproject.com/vb/net/ByPassAutomation.asp

which is VB, but other than that the example does NOT work... well, I have IE 7.0, so maybe it's because of the tabs.

Can anyone please help me?

Regards,

Qawi





  • Samurai Sjakkie

    I do have this working with a BandObject.

    It takes jumping through a few hoops, but it basically comes down to this:

    The BandObject has an event called ExplorerAttached.

    In Load (or wherever), subscribe to the attached event:

    this.ExplorerAttached += new EventHandler(MyBandObjectToolbar_ExplorerAttached);

    private void MyBandObjectToolbar_ExplorerAttached(object sender, EventArgs e)
    {
        // In this event, subscribe to DocumentComplete.
        Explorer.DocumentComplete += new SHDocVw.DWebBrowserEvents2_DocumentCompleteEventHandler(Explorer_DocumentComplete);
    }

    void Explorer_DocumentComplete(object pDisp, ref object URL)
    {
        // Needed to get the current tab's HTML.
        textBox1.Text = GetTabHtml(URL.ToString());
    }

    // shellWindows is assumed to be a static SHDocVw.ShellWindows field.
    public static string GetTabHtml(string theUrl)
    {
        string tabHtml = "";
        foreach (SHDocVw.WebBrowser tab in shellWindows)
        {
            if (theUrl == tab.LocationURL)
            {
                // Document is not an HTMLDocument for non-HTML windows, so check for null.
                HTMLDocument currentTab = tab.Document as mshtml.HTMLDocument;
                if (currentTab != null && currentTab.body != null)
                    tabHtml = currentTab.body.outerHTML;
            }
        }
        return tabHtml;
    }



  • GrandpaB

    Hi,

     

    I think I got it :) Thanks. However, how do I install it into IE?

    I got this:

     

    Performing Post-Build Event...
    Microsoft (R) .NET Global Assembly Cache Utility. Version 2.0.50727.42
    Copyright (c) Microsoft Corporation. All rights reserved.
    Assembly successfully added to the cache
    Microsoft (R) .NET Framework Assembly Registration Utility 2.0.50727.42
    Copyright (C) Microsoft Corporation 1998-2004. All rights reserved.
    Types registered successfully
    
    So it's okay... but what do I do now?

    I noticed that when I right-click the IE bar it's actually there, but when I select it nothing happens.


  • Charles Tam

    Hi ahmedilyas,

    No, it's not the WebBrowser component in C# :-)
    It is the website open in Internet Explorer 7.0.

    The last active page that has been opened in IE 7.0.

    Do you have a workaround for this?

    I hope so... there is absolutely nothing on this when searching Google other than what I have found, which doesn't work :(



  • Alexey Raga

    Okay... I have almost fixed it!!

    You can see the code below. However, there is one little thing that is still not working :-(
    If the foreground window has tabs and you are viewing tab number 1, 2 or 3 out of 4 tabs, etc., you will always get the content of tab number 4.

    Any tweak for this? By the way, I used the Win32 API function GetForegroundWindow().

    foreach (SHDocVw.WebBrowser ie in shellWindows)
    {
        HTMLDocument doc = ie.Document as mshtml.HTMLDocument;
        if (doc != null)
        {
            IntPtr activeHwnd = GetForegroundWindow();
            if (ie.HWND == activeHwnd.ToInt32())
            {
                richTextBox1.Text = doc.body.outerHTML;
            }
        }
    }



  • Ori'

    Hmm, actually this code is weird :S

    I think I AM on the active tab, because when I call Refresh it refreshes the page!
    But the text I get back is from the last opened tab :S

    richTextBox1.Text = doc.body.outerHTML; ---> wrong tab!

    ie.Refresh(); ---> correct tab!

    So if it is not outerHTML, what is it?

    foreach (SHDocVw.WebBrowser ie in shellWindows)
    {
        HTMLDocument doc = ie.Document as mshtml.HTMLDocument;
        IntPtr activeHwnd = GetForegroundWindow();
        if (doc != null)
        {
            if (ie.HWND == activeHwnd.ToInt32())
            {
                richTextBox1.Text = doc.body.outerHTML;
                ie.Refresh();
            }
        }
    }



  • Michael Hansen

    Are you creating a BandObject (using the CodeProject example)?

  • Cam70

    More than likely a pop-up....or javascript window...

    I have a half-assed pop up blocker....it works on that page.



  • Bravo2007

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Text;
    using System.Windows.Forms;
    using mshtml;

    namespace WindowsApplication1
    {
        public partial class Form1 : Form
        {
            // Enumerates all open shell windows (IE as well as Windows Explorer).
            private SHDocVw.ShellWindows shellWindows = new SHDocVw.ShellWindowsClass();

            public Form1()
            {
                InitializeComponent();
            }

            private void Form1_Load(object sender, EventArgs e)
            {
                foreach (SHDocVw.WebBrowser ie in shellWindows)
                {
                    // Non-HTML windows (e.g. Windows Explorer) yield null here; skip them.
                    HTMLDocument doc = ie.Document as mshtml.HTMLDocument;
                    if (doc == null || doc.body == null)
                        continue;

                    string docBody = doc.body.outerHTML;
                    Console.WriteLine(docBody);
                }
            }
        }
    }



  • MaggieChan

    If you use a WebBrowser control, you can read the HTML document in the DocumentCompleted event and look at the DocumentText property. This gives you the entire page as HTML.

    Is this not what you are after?



  • Nonu_k

    Try:

    if(ie.HWND != 0)

    //get doc



  • Mike!

    Hi again,

    No, I am using your code as "stand-alone".
    However, I modified it a little so I could get the last active window, but there is one last issue: I get the last opened TAB inside this IE window and not the one the user is active in.

    I don't know... I tried to find a Win32 API function for it :S

    If it can be done with the BandObject, that's fine with me... I see that it has a get UrlLocation, but it's not implemented.

    Not sure, but it's not showing up when I create an instance of the BandObject :(



  • Alex Yakhnin - MSFT

    No, I have it included in a much larger shareware project. Sorry.

    That code works fine. Just add those events to the BandObject example.



  • KIWIDOGGIE

    I was wrong before... it refreshed ALL tabs.

    Cablehead >> Do you have that as a working project?

    If you have, can you send it to my email: bwa <at> mondo . dk

    Thanks in advance.



  • sammy chen

    Hey, I got it to work!!!

    LAST issue :)

    On some websites it returns empty, meaning I don't get the HTML of the tab.
    On Google it worked fine... on

    www.eksperten.dk

    it didn't work.

    Any ideas?


