Does anybody have a good reference that covers the IE Object Model that I can use to scrape data from Web pages and put it into Access tables?
IE Object Model (from Access/VBA)
- This topic has 39 replies, 3 voices, and was last updated 21 years, 8 months ago.
WSjscher2000
AskWoody Lounger
September 16, 2003 at 3:13 am #715288
Pat, when you use the document object model, your document has several collections that could be useful here. Assume you have created an object reference to the HTML document:
Dim myHTMLDoc As MSHTML.HTMLDocument
Set myHTMLDoc = [something that returns an HTML document; not important for current purposes]
Of these collections, the one that seems most relevant (and most specific, which is important to avoid mistakenly targeting some garbage code) is the links collection:
- myHTMLDoc.links.length gives you the count of all links in the entire page; remember that the collection is numbered starting from zero, so the index of the last item in the collection is length-1.
- myHTMLDoc.links.item(0).innerHTML gives you the exact HTML code that is used to generate the visual display associated with the first link; it could be plain text, or text with HTML tags (such as an IMG tag), or just an image tag.
- myHTMLDoc.links.item(0).innerText gives you the visible text, if any, that is associated with the first link; HTML tags are stripped out.
- myHTMLDoc.links.item(0).href gives you the complete path for the first link.
You could loop through the collection looking for a match to the expected “innerText” or use your imagination.
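The loop suggested above could be sketched roughly as follows. This is only an illustration, not code from the thread: FindLinkByText is a hypothetical helper name, the case-insensitive substring match is an assumption about what counts as a "match", and it assumes a reference to the Microsoft HTML Object Library is set.

```vba
Option Explicit

' Sketch only: FindLinkByText is a hypothetical helper, not part of MSHTML.
' Requires a reference to the Microsoft HTML Object Library.
' Returns the href of the first link whose visible text contains strWanted,
' or an empty string when no link matches.
Function FindLinkByText(ByVal doc As MSHTML.HTMLDocument, _
                        ByVal strWanted As String) As String
    Dim i As Long

    FindLinkByText = ""                      ' default: no match found
    For i = 0 To doc.links.Length - 1        ' collection is zero-based
        ' Assumption: a case-insensitive substring match is good enough
        If InStr(1, doc.links.Item(i).innerText, strWanted, vbTextCompare) > 0 Then
            FindLinkByText = doc.links.Item(i).href
            Exit Function                    ' stop at the first match
        End If
    Next i
End Function
```

With a loaded document in myHTMLDoc, something like Debug.Print FindLinkByText(myHTMLDoc, "Download") would print the first matching address in the Immediate window; the search text here is just a placeholder.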
-
WSpatt
AskWoody Lounger
September 16, 2003 at 3:45 am #715292
Thanks Jefferson. I’m sorry to be such a pest about this, but I really need to find out about it.
Now you are talking about a Document Object Model rather than the IE Object Model, for which you provided some code. That code works very well, thank you.
Can the Document Object Model read in tables like the IE Object Model can?
Where can I get some doco to read up on the Document Object Model?
-
WSjscher2000
AskWoody Lounger
September 16, 2003 at 5:47 pm #715588
The MSHTML library contains Microsoft’s encapsulation of the document object model (DOM). It is largely compliant with the W3C model, but has proprietary extensions such as the .all collection that you will see used frequently in code written for Internet Explorer version 4. In this sense, it is and is not really the Internet Explorer object model.
I hope that sort of clarifies the terminology.
I guess strictly speaking the Internet Explorer object model is the one that contains the InternetExplorer object. I don’t remember the name that appears in the Tools>References dialog, but it could be similar to Microsoft Internet Controls.
-
WSjscher2000
AskWoody Lounger
October 5, 2003 at 5:44 am #724448
Here’s some sample code:
Option Explicit
'Declare Sleep API
'(64-bit Office needs "Private Declare PtrSafe Sub Sleep ..." instead)
Private Declare Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)

Sub RetrieveBODYText()
    ' Jefferson F. Scher 2003-10-04
    ' Uses IE DOM to grab BODY text from web page
    ' SET REFERENCES TO Microsoft HTML Object Library AND Microsoft Internet Controls

    'Create browser object references
    Dim ieSrc As New InternetExplorer

    'Load page
    With ieSrc
        .Visible = True 'show window and load page
        .navigate "http://www.microsoft.com/homepage/ms.htm"
        While Not .readyState = READYSTATE_COMPLETE
            Sleep 500 'wait 1/2 sec before trying again
        Wend
    End With

    'Create document object model references
    Dim ieDocSrc As MSHTML.HTMLDocument
    Set ieDocSrc = ieSrc.Document

    'Fetch the BODY Text
    Dim strBODYtext As String, strBODYhtml As String, colBODYs As Variant
    Set colBODYs = ieDocSrc.all.tags("BODY")
    If colBODYs.Length = 0 Then
        MsgBox "Page has no body (maybe it's a frameset?)"
    Else ' get first body
        strBODYtext = colBODYs(0).innerText
        strBODYhtml = colBODYs(0).innerHTML
        Stop 'inspect vars in the Locals and/or Immediate window
    End If

    'Clean up objects
    Set ieDocSrc = Nothing
    ieSrc.Quit
    Set ieSrc = Nothing
End Sub
-
WSpatt
AskWoody Lounger
September 14, 2003 at 8:31 pm #714670
I don’t know if W3C is the ticket but I’ll certainly have a look at this site.
What I want is doco on how to scrape details from a site (this could include tables and other text). What Jefferson provided was an example of how to get the data from a fixed column table on a web page which proved invaluable.
I have modified this somewhat to get what I want, but I would like to be able to access other information on this page. So any other doco on this topic would be extremely valuable.
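For the table case mentioned here, one possible approach through the DOM is to walk a table's rows and cells collections. The sketch below is an illustration under assumptions, not the code Jefferson originally posted: DumpFirstTable is a hypothetical name, it only looks at the first TABLE element on the page, and it assumes a reference to the Microsoft HTML Object Library and an already loaded document.

```vba
Option Explicit

' Sketch only: DumpFirstTable is a hypothetical name, not part of MSHTML.
' Requires a reference to the Microsoft HTML Object Library.
' Prints row index, cell index and cell text for the first table in doc.
Sub DumpFirstTable(ByVal doc As MSHTML.HTMLDocument)
    Dim colTables As Object, tbl As Object
    Dim r As Long, c As Long

    Set colTables = doc.getElementsByTagName("TABLE")
    If colTables.Length = 0 Then
        Debug.Print "No tables found"
        Exit Sub
    End If

    Set tbl = colTables.Item(0)                        ' first table on the page
    For r = 0 To tbl.Rows.Length - 1                   ' rows are zero-based
        For c = 0 To tbl.Rows.Item(r).Cells.Length - 1 ' cells within this row
            Debug.Print r, c, tbl.Rows.Item(r).Cells.Item(c).innerText
        Next c
    Next r
End Sub
```

Replacing Debug.Print with a DAO Recordset AddNew/Update pair would be one way to land the values in an Access table, though the table and field names would depend on your own database.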