Does anybody have a good reference that covers the IE Object Model that I can use to scrape data from Web pages and put it into Access tables?
IE Object Model (from Access/VBA)
- This topic has 39 replies, 3 voices, and was last updated 21 years, 7 months ago.
WSjscher2000
AskWoody Lounger
September 16, 2003 at 3:13 am #715288
Pat, when you use the document object model, your document has several collections that could be useful here. Assume you have created an object reference to the HTML document:
Dim myHTMLDoc As MSHTML.HTMLDocument
Set myHTMLDoc = Something that returns an HTML document…not important for current purposes…
The one that seems most relevant (and most specific, which is important to avoid mistakenly targeting some garbage code) is the links collection:
- myHTMLDoc.links.length gives you the count of all links in the entire page; remember that the collection is numbered starting from zero, so the index of the last item in the collection is length-1.
- myHTMLDoc.links.item(0).innerHTML gives you the exact HTML code that is used to generate the visual display associated with the first link; it could be plain text, or text with HTML tags (such as an IMG tag), or just an image tag.
- myHTMLDoc.links.item(0).innerText gives you the visible text, if any, that is associated with the first link; HTML tags are stripped out.
- myHTMLDoc.links.item(0).href gives you the complete path for the first link.
You could loop through the collection looking for a match to the expected “innerText” or use your imagination.
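The loop over the links collection might look something like the sketch below. It is only an illustration: FindLinkHref and its parameter names are hypothetical, and it assumes a reference to the Microsoft HTML Object Library and an exact (case-insensitive) match on the link's visible text.

```vba
Option Explicit

' Hypothetical helper (not from the thread): returns the href of the first
' link whose visible text matches strLinkText, or "" if nothing matches.
' Requires a reference to the Microsoft HTML Object Library.
Function FindLinkHref(myHTMLDoc As MSHTML.HTMLDocument, _
                      strLinkText As String) As String
    Dim i As Long
    FindLinkHref = ""
    ' The collection is zero-based, so the last index is Length - 1
    For i = 0 To myHTMLDoc.links.Length - 1
        ' Compare the visible text, ignoring case and surrounding spaces
        If StrComp(Trim$(myHTMLDoc.links.Item(i).innerText), _
                   strLinkText, vbTextCompare) = 0 Then
            FindLinkHref = myHTMLDoc.links.Item(i).href
            Exit Function
        End If
    Next i
End Function
```

Image-only links have empty innerText, so matching on innerHTML or href may suit those cases better.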
-
WSpatt
AskWoody Lounger
September 16, 2003 at 3:45 am #715292
Thanks Jefferson, I’m sorry to be such a pest about this, but I really need to find out about it.
Now you are talking about a Document Object Model rather than the IE Object Model, for which you provided some code. That code works very well, thank you.
Can the Document Object Model read in tables like the IE Object Model can?
Where can I get some doco to read up on the Document Object Model?
-
WSjscher2000
AskWoody Lounger
September 16, 2003 at 5:47 pm #715588
The MSHTML library contains Microsoft’s encapsulation of the document object model (DOM). It is largely compliant with the W3C model, but has proprietary extensions such as the .all collection that you will see used frequently in code written for Internet Explorer version 4. In this sense, it both is and is not really the Internet Explorer object model.
I hope that sort of clarifies the terminology.
I guess strictly speaking the Internet Explorer object model is the one that contains the InternetExplorer object. I don’t remember the name that appears in the Tools>References dialog, but it could be similar to Microsoft Internet Controls.
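On the question of reading tables through the DOM, a minimal sketch follows. It assumes a reference to the Microsoft HTML Object Library and a document that has already finished loading; DumpFirstTable is a hypothetical name, and the routine simply prints each cell of the first TABLE element to the Immediate window rather than reproducing any code posted earlier in the thread.

```vba
Option Explicit

' Hypothetical sketch (not from the thread): print every cell of the
' first TABLE in a loaded document to the Immediate window.
' Requires a reference to the Microsoft HTML Object Library.
Sub DumpFirstTable(myHTMLDoc As MSHTML.HTMLDocument)
    Dim colTables As Object, tbl As Object
    Dim r As Long, c As Long

    Set colTables = myHTMLDoc.getElementsByTagName("TABLE")
    If colTables.Length = 0 Then
        Debug.Print "No tables found"
        Exit Sub
    End If

    Set tbl = colTables.Item(0)
    ' Rows and cells are zero-based collections, like links
    For r = 0 To tbl.Rows.Length - 1
        For c = 0 To tbl.Rows.Item(r).Cells.Length - 1
            Debug.Print r, c, tbl.Rows.Item(r).Cells.Item(c).innerText
        Next c
    Next r
End Sub
```

Pages that use tables for layout often nest them, so the first TABLE is not necessarily the one holding the data; picking the right index is left to inspection of the page.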
-
WSjscher2000
AskWoody Lounger
October 5, 2003 at 5:44 am #724448
Here’s some sample code:
Option Explicit
'Declare Sleep API
Private Declare Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)

Sub RetrieveBODYText()
    ' Jefferson F. Scher 2003-10-04
    ' Uses IE DOM to grab BODY text from web page
    ' SET REFERENCES TO Microsoft HTML Object Library AND Microsoft Internet Controls
    'Create browser object references
    Dim ieSrc As New InternetExplorer

    'Load page
    With ieSrc
        .Visible = True 'show window and load page
        .navigate "http://www.microsoft.com/homepage/ms.htm"
        While Not .readyState = READYSTATE_COMPLETE
            Sleep 500 'wait 1/2 sec before trying again
        Wend
    End With

    'Create document object model references
    Dim ieDocSrc As MSHTML.HTMLDocument
    Set ieDocSrc = ieSrc.Document

    'Fetch the BODY Text
    Dim strBODYtext As String, strBODYhtml As String, colBODYs As Variant
    Set colBODYs = ieDocSrc.all.tags("BODY")
    If colBODYs.Length = 0 Then
        MsgBox "Page has no body (maybe it's a frameset?)"
    Else ' get first body
        strBODYtext = colBODYs(0).innerText
        strBODYhtml = colBODYs(0).innerHTML
        Stop 'inspect vars in the Locals and/or Immediate window
    End If

    'Clean up objects
    Set ieDocSrc = Nothing
    ieSrc.Quit
    Set ieSrc = Nothing
End Sub
-
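Since the original goal was to put the scraped data into Access tables, here is a hedged sketch of that last step using DAO. The table name tblScrape and the field names SourceURL and BodyText are assumptions made up for illustration (a Text field and a Memo field would do), and the project needs a reference to a Microsoft DAO object library.

```vba
Option Explicit

' Hypothetical sketch (not from the thread): append one scraped value
' to an Access table. tblScrape, SourceURL and BodyText are assumed
' names; create a matching table before running this.
' Requires a reference to a Microsoft DAO object library.
Sub SaveScrapedText(strURL As String, strBODYtext As String)
    Dim db As DAO.Database
    Dim rs As DAO.Recordset

    Set db = CurrentDb
    Set rs = db.OpenRecordset("tblScrape", dbOpenDynaset)

    rs.AddNew                       'start a new record
    rs!SourceURL = strURL
    rs!BodyText = strBODYtext
    rs.Update                       'commit the record
    rs.Close

    Set rs = Nothing
    Set db = Nothing
End Sub
```

A call such as SaveScrapedText "http://www.microsoft.com/homepage/ms.htm", strBODYtext could replace the Stop statement in RetrieveBODYText once the values look right in the Locals window.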
WSpatt
AskWoody Lounger
September 14, 2003 at 8:31 pm #714670
I don’t know if W3C is the ticket, but I’ll certainly have a look at this site.
What I want is doco on how to scrape details from a site (this could include tables and other text). What Jefferson provided was an example of how to get the data from a fixed-column table on a web page, which proved invaluable.
I have modified this somewhat to get what I want, but I would like to be able to access other information on this page. So any other doco on this topic would be extremely valuable.

Plus Membership
Donations from Plus members keep this site going. You can identify the people who support AskWoody by the Plus badge on their avatars.
AskWoody Plus members not only get access to all of the contents of this site -- including Susan Bradley's frequently updated Patch Watch listing -- they also receive weekly AskWoody Plus Newsletters (formerly Windows Secrets Newsletter) and AskWoody Plus Alerts, emails when there are important breaking developments.
Get Plus!
Welcome to our unique respite from the madness.
It's easy to post questions about Windows 11, Windows 10, Win8.1, Win7, Surface, Office, or browse through our Forums. Post anonymously or register for greater privileges. Keep it civil, please: Decorous Lounge rules strictly enforced. Questions? Contact Customer Support.
Search Newsletters
Search Forums
View the Forum
Search for Topics
Recent Topics
-
“kill switches” found in Chinese made power inverters
by
Alex5723
16 minutes ago -
Windows 11 – InControl vs pausing Windows updates
by
Kathy Stevens
10 minutes ago -
Meet Gemini in Chrome
by
Alex5723
4 hours, 16 minutes ago -
DuckDuckGo’s Duck.ai added GPT-4o mini
by
Alex5723
4 hours, 24 minutes ago -
Trump signs Take It Down Act
by
Alex5723
12 hours, 23 minutes ago -
Do you have a maintenance window?
by
Susan Bradley
3 hours, 37 minutes ago -
Freshly discovered bug in OpenPGP.js undermines whole point of encrypted comms
by
Nibbled To Death By Ducks
4 minutes ago -
Cox Communications and Charter Communications to merge
by
not so anon
15 hours, 43 minutes ago -
Help with WD usb driver on Windows 11
by
Tex265
20 hours, 52 minutes ago -
hibernate activation
by
e_belmont
1 day ago -
Red Hat Enterprise Linux 10 with AI assistant
by
Alex5723
1 day, 4 hours ago -
Windows 11 Insider Preview build 26200.5603 released to DEV
by
joep517
1 day, 7 hours ago -
Windows 11 Insider Preview build 26120.4151 (24H2) released to BETA
by
joep517
1 day, 7 hours ago -
Fixing Windows 24H2 failed KB5058411 install
by
Alex5723
3 hours, 36 minutes ago -
Out of band for Windows 10
by
Susan Bradley
1 day, 12 hours ago -
Giving UniGetUi a test run.
by
RetiredGeek
1 day, 19 hours ago -
Windows 11 Insider Preview Build 26100.4188 (24H2) released to Release Preview
by
joep517
2 days, 2 hours ago -
Microsoft is now putting quantum encryption in Windows builds
by
Alex5723
1 minute ago -
Auto Time Zone Adjustment
by
wadeer
2 days, 7 hours ago -
To download Win 11 Pro 23H2 ISO.
by
Eddieloh
2 days, 4 hours ago -
Manage your browsing experience with Edge
by
Mary Branscombe
4 hours, 27 minutes ago -
Fewer vulnerabilities, larger updates
by
Susan Bradley
22 hours ago -
Hobbies — There’s free software for that!
by
Deanna McElveen
1 day, 4 hours ago -
Apps included with macOS
by
Will Fastie
1 day, 2 hours ago -
Xfinity home internet
by
MrJimPhelps
23 hours, 3 minutes ago -
Convert PowerPoint presentation to Impress
by
RetiredGeek
2 days ago -
Debian 12.11 released
by
Alex5723
3 days, 4 hours ago -
Microsoft: Troubleshoot problems updating Windows
by
Alex5723
3 days, 8 hours ago -
Woman Files for Divorce After ChatGPT “Reads” Husband’s Coffee Cup
by
Alex5723
2 days, 11 hours ago -
Moving fwd, Win 11 Pro,, which is best? Lenovo refurb
by
Deo
8 seconds ago
Recent blog posts
Key Links
Want to Advertise in the free newsletter? How about a gift subscription in honor of a birthday? Send an email to sb@askwoody.com to ask how.
Mastodon profile for DefConPatch
Mastodon profile for AskWoody
Home • About • FAQ • Posts & Privacy • Forums • My Account
Register • Free Newsletter • Plus Membership • Gift Certificates • MS-DEFCON Alerts
Copyright ©2004-2025 by AskWoody Tech LLC. All Rights Reserved.