Does anybody have a good reference that covers the IE Object Model that I can use to scrape data from Web pages and put it into Access tables?
IE Object Model (from Access/VBA)
- This topic has 39 replies, 3 voices, and was last updated 21 years, 8 months ago.
WSjscher2000
AskWoody Lounger
September 16, 2003 at 3:13 am #715288
Pat, when you use the document object model, your document has several collections that could be useful here. Assume you have created an object reference to the HTML document:
Dim myHTMLDoc As MSHTML.HTMLDocument
Set myHTMLDoc = (something that returns an HTML document; not important for current purposes)
The collection that seems most relevant (and most specific, which is important to avoid mistakenly targeting some garbage code) is the links collection:
- myHTMLDoc.links.length gives you the count of all links in the entire page; remember that the collection is numbered starting from zero, so the index of the last item in the collection is length-1.
- myHTMLDoc.links.item(0).innerHTML gives you the exact HTML code that is used to generate the visual display associated with the first link; it could be plain text, or text with HTML tags (such as an IMG tag), or just an image tag.
- myHTMLDoc.links.item(0).innerText gives you the visible text, if any, that is associated with the first link; HTML tags are stripped out.
- myHTMLDoc.links.item(0).href gives you the complete path for the first link.
You could loop through the collection looking for a match to the expected “innerText”, or use your imagination.
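The loop suggested above could be sketched roughly as follows. This is only an illustration, not code from the thread: the function name FindLinkHref and its parameters are made up for the example, and it assumes a reference to the Microsoft HTML Object Library is set and that a case-insensitive match on the trimmed visible text is what you want.

```vba
Option Explicit

' Hypothetical helper (not from the thread): returns the href of the first
' link whose visible text matches strWanted, or "" if no link matches.
' Requires a reference to the Microsoft HTML Object Library.
Function FindLinkHref(myHTMLDoc As MSHTML.HTMLDocument, _
                      strWanted As String) As String
    Dim i As Long
    Dim lnk As Object          ' late-bound element from the links collection
    Dim strText As String

    FindLinkHref = ""
    ' The collection is zero-based, so the last index is Length - 1
    For i = 0 To myHTMLDoc.links.Length - 1
        Set lnk = myHTMLDoc.links.Item(i)
        strText = Trim$(lnk.innerText & "")   ' visible text, tags stripped
        If StrComp(strText, strWanted, vbTextCompare) = 0 Then
            FindLinkHref = lnk.href           ' complete path of the link
            Exit For
        End If
    Next i
    Set lnk = Nothing
End Function
```

Called as, say, strURL = FindLinkHref(myHTMLDoc, "Next page"), it hands back an empty string when nothing matches, so the caller can test for that before navigating. The link text "Next page" is just a placeholder.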
-
WSpatt
AskWoody Lounger
September 16, 2003 at 3:45 am #715292
Thanks Jefferson. I’m sorry to be such a pest about this, but I really need to find out about it.
Now you are talking about a Document Object Model rather than the IE Object Model, for which you provided some code. That code works very well, thank you.
Can the Document Object Model read in tables like the IE Object Model can?
Where can I get some documentation to read up on the Document Object Model?
-
WSjscher2000
AskWoody Lounger
September 16, 2003 at 5:47 pm #715588
The MSHTML library contains Microsoft’s encapsulation of the document object model (DOM). It is largely compliant with the W3C model, but has proprietary extensions such as the .all collection, which you will see used frequently in code written for Internet Explorer version 4. In this sense, it both is and is not the Internet Explorer object model.
I hope that clarifies the terminology somewhat.
Strictly speaking, I guess the Internet Explorer object model is the one that contains the InternetExplorer object. I don’t remember the name that appears in the Tools>References dialog, but it could be something like Microsoft Internet Controls.
-
WSjscher2000
AskWoody Lounger
October 5, 2003 at 5:44 am #724448
Here’s some sample code:
Option Explicit

' Declare Sleep API (PtrSafe is required on VBA7 / 64-bit Office)
#If VBA7 Then
Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)
#Else
Private Declare Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)
#End If

Sub RetrieveBODYText()
    ' Jefferson F. Scher 2003-10-04
    ' Uses IE DOM to grab BODY text from web page
    ' SET REFERENCES TO Microsoft HTML Object Library AND Microsoft Internet Controls

    ' Create browser object references
    Dim ieSrc As New InternetExplorer

    ' Load page
    With ieSrc
        .Visible = True ' show window and load page
        .navigate "http://www.microsoft.com/homepage/ms.htm"
        While .readyState <> READYSTATE_COMPLETE
            Sleep 500 ' wait 1/2 sec before trying again
        Wend
    End With

    ' Create document object model references
    Dim ieDocSrc As MSHTML.HTMLDocument
    Set ieDocSrc = ieSrc.Document

    ' Fetch the BODY text
    Dim strBODYtext As String, strBODYhtml As String, colBODYs As Object
    Set colBODYs = ieDocSrc.all.tags("BODY")
    If colBODYs.Length = 0 Then
        MsgBox "Page has no body (maybe it's a frameset?)"
    Else ' get first body
        strBODYtext = colBODYs.Item(0).innerText
        strBODYhtml = colBODYs.Item(0).innerHTML
        Stop ' inspect vars in the Locals and/or Immediate window
    End If

    ' Clean up objects
    Set ieDocSrc = Nothing
    ieSrc.Quit
    Set ieSrc = Nothing
End Sub
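Since the goal in this thread is to get page data into Access tables, here is a rough sketch of how the same kind of document reference might be used to walk the first TABLE on a page and store each cell. It is an illustration under stated assumptions, not code from the thread: the procedure name SaveFirstTable, the table tblScraped and its fields RowNum (Long), ColNum (Long) and CellText (Memo) are all hypothetical, and it needs a reference to a Microsoft DAO library in addition to the Microsoft HTML Object Library.

```vba
Option Explicit

' Hypothetical helper (not from the thread): copies every cell of the first
' TABLE element into an Access table, one record per cell.
' ASSUMPTIONS: tblScraped exists with fields RowNum (Long), ColNum (Long)
' and CellText (Memo); references are set to the Microsoft HTML Object
' Library and a Microsoft DAO library.
Sub SaveFirstTable(ieDocSrc As MSHTML.HTMLDocument)
    Dim colTables As Object, tbl As Object, rw As Object
    Dim rs As DAO.Recordset
    Dim r As Long, c As Long
    Dim strCell As String

    Set colTables = ieDocSrc.all.tags("TABLE")
    If colTables.Length = 0 Then
        MsgBox "Page has no TABLE elements"
        Exit Sub
    End If
    Set tbl = colTables.Item(0)          ' first table on the page

    Set rs = CurrentDb.OpenRecordset("tblScraped", dbOpenDynaset)
    For r = 0 To tbl.rows.Length - 1     ' collections are zero-based
        Set rw = tbl.rows.Item(r)
        For c = 0 To rw.cells.Length - 1
            strCell = rw.cells.Item(c).innerText & ""
            rs.AddNew
            rs!RowNum = r
            rs!ColNum = c
            ' Leave CellText Null for empty cells to avoid zero-length strings
            If Len(strCell) > 0 Then rs!CellText = strCell
            rs.Update
        Next c
    Next r

    rs.Close
    Set rs = Nothing
End Sub
```

It could be called from RetrieveBODYText right after ieDocSrc is set, for example SaveFirstTable ieDocSrc. Storing one record per cell keeps the sketch independent of how many columns the web table has; a real import would more likely map columns to named fields.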
-
WSpatt
AskWoody Lounger
September 14, 2003 at 8:31 pm #714670
I don’t know if W3C is the ticket, but I’ll certainly have a look at this site.
What I want is documentation on how to scrape details from a site (this could include tables and other text). What Jefferson provided was an example of how to get the data from a fixed-column table on a web page, which proved invaluable.
I have modified this somewhat to get what I want, but I would like to be able to access other information on this page, so any other documentation on this topic would be extremely valuable.