W3Schools has many excellent tutorials.  Their tutorial on the HTML DOM is at http://www.w3schools.com/htmldom/default.asp

These pages cover specific properties of the Expression Web object model as they pertain to certain aspects of the DOM.

A good starting point is to open an .htm file to experiment with.  Make sure it does not contain any data that you need.

You can clear most of the ActiveDocument's content by using:

ActiveDocument.DocumentHTML = " "

This will leave only one blank space in the ActiveDocument.

You could also use:

ActiveDocument.DocumentHTML = ""

If you use the quotation marks without a space in between them, the ActiveDocument will be left with:

<html>
<body>
</body>
</html>

It doesn't really matter whether you set DocumentHTML to " " or to "".  With either method, if you run the statement

MsgBox ActiveDocument.all.Length

a message box will pop up and display "3".

This means that the ActiveDocument still has three elements, even though you have erased the entire DocumentHTML.

To see what these three elements are, you can use:

MsgBox ActiveDocument.all.Item(0).tagName
MsgBox ActiveDocument.all.Item(1).tagName
MsgBox ActiveDocument.all.Item(2).tagName

The results will show that
item(0) tagName is "html"
item(1) tagName is "head"
item(2) tagName is "body"

You could also use:

MsgBox ActiveDocument.all(0).tagName
MsgBox ActiveDocument.all(1).tagName
MsgBox ActiveDocument.all(2).tagName

The results will be the same for either
ActiveDocument.all.Item(0) or for
ActiveDocument.all(0). 

I believe that writing out the Item property explicitly is the recommended method, but I will sometimes switch back and forth between the two forms.
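Either form also works inside a loop, which saves writing one MsgBox line per element.  The following is only a sketch, assuming it is run from the Expression Web VBA editor with a page open; the Sub name ListAllElements and the variable names are my own and not part of the object model.

```vba
Sub ListAllElements()
    Dim i As Long
    Dim strList As String

    ' Walk the all collection and collect each element's tagName
    For i = 0 To ActiveDocument.all.Length - 1
        strList = strList & "item(" & i & ") tagName is " & _
            ActiveDocument.all.Item(i).tagName & vbCrLf
    Next i

    ' Show every element in a single message box
    MsgBox strList
End Sub
```

On the emptied page described above, the message box should list the same three elements as the individual MsgBox statements.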

So even if you delete all of the text in your document, it will still have 3 html elements.

This is because the start tag and end tag are optional for the html, head and body elements. 

Deleting the text just removes the start tags and the end tags, but those three elements will still remain in an Expression Web ActiveDocument.

The first thing I usually put on an empty page is the Doctype.  I use:

  Dim strDocType As String

   strDocType = "<!DOCTYPE HTML PUBLIC " & """" & _
   "-//W3C//DTD HTML 4.01 Transitional//EN" & _
   """" & " " & """" & _
   "http://www.w3.org/TR/html4/loose.dtd" & """" & ">"

   ActiveDocument.DocumentHTML = strDocType

The ActiveDocument will now look like:

<!DOCTYPE HTML PUBLIC 
"-//W3C//DTD HTML 4.01 Transitional//EN" 
"http://www.w3.org/TR/html4/loose.dtd">

The above output will probably be on a single continuous line.  I wrapped it here so that it would display in its entirety.
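The Doctype string can also be written with doubled quotation marks, since VBA reads "" inside a string literal as one quotation mark character.  This is a sketch that should build the same string as the code above; the Sub name Set_DocType_Doubled_Quotes is hypothetical.

```vba
Sub Set_DocType_Doubled_Quotes()
    Dim strDocType As String

    ' "" inside a VBA string literal stands for one " character
    strDocType = "<!DOCTYPE HTML PUBLIC " & _
        """-//W3C//DTD HTML 4.01 Transitional//EN"" " & _
        """http://www.w3.org/TR/html4/loose.dtd"">"

    ActiveDocument.DocumentHTML = strDocType
End Sub
```

Which style to use is a matter of taste.  The doubled quotes are shorter, while the separate """" pieces make each quotation mark easier to count.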

The next thing I usually do is add the start tag and the end tag for the html, head and body elements.

To set the outerHTML of these three elements you can use:

Sub Add_HTML_Head_Body_OuterHTML()
   Dim strHTML As String
   Dim strHead As String
   Dim strBody As String

   Dim objHTMLElement As IHTMLElement
   Dim objHeadElement As IHTMLElement
   Dim objBodyElement As IHTMLElement

   ' Markup for each of the three elements
   strHTML = vbCrLf & vbCrLf & "<html>" & _
   vbCrLf & _
   "</html>"

   strHead = "<head>" & _
   vbCrLf & _
   "</head>" & _
   vbCrLf

   strBody = "<body>" & _
   vbCrLf & _
   "</body>" & _
   vbCrLf

   ' Replace the elements one at a time.  Each element is looked up
   ' again after the previous assignment, as in all.Item(n).outerHTML = ...
   Set objHTMLElement = ActiveDocument.all.Item(0)
   objHTMLElement.outerHTML = strHTML

   Set objHeadElement = ActiveDocument.all.Item(1)
   objHeadElement.outerHTML = strHead

   Set objBodyElement = ActiveDocument.all.Item(2)
   objBodyElement.outerHTML = strBody

End Sub

The results will be:

<html>

<head>
</head>
<body>
</body>
</html>

To start examining the hierarchy, you can use:

MsgBox ActiveDocument.all(0).Children.Length

This will show that element (0) has two children. 

You can see the children's tag names with:


MsgBox ActiveDocument.all(0).Children(0).tagName
MsgBox ActiveDocument.all(0).Children(1).tagName

The children's tag names will be "head" and "body".

To see the parent name of these children you could use:

MsgBox ActiveDocument. _
all(0). _
Children(0). _
parentElement.tagName

This will show that the parentElement tagName of the first child is "html".  The same holds true for the other child, "body".

So at this point the ActiveDocument has one root element called "html".

"html" has two children, "head" and "body" 
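The child and parent checks above can be combined into one routine.  This is a minimal sketch using only the Children, Length, tagName and parentElement members shown earlier; the Sub name ListRootChildren is my own, and objRoot is declared As Object so that the sketch does not assume any particular interface name.

```vba
Sub ListRootChildren()
    Dim objRoot As Object
    Dim i As Long
    Dim strList As String

    ' The root element is the first item in the all collection
    Set objRoot = ActiveDocument.all(0)

    ' List each child's tagName together with its parent's tagName
    For i = 0 To objRoot.Children.Length - 1
        strList = strList & objRoot.Children(i).tagName & _
            " (parent: " & _
            objRoot.Children(i).parentElement.tagName & ")" & vbCrLf
    Next i

    MsgBox strList
End Sub
```

With the page built above, the message box should show one line per child of "html", each naming "html" as its parent.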

You should also be able to see the 2 children of "html" by using:

MsgBox ActiveDocument.all(0).innerHTML

The results will incorrectly show as:

<html>

<head>
</head>
<body>
</body>
</html>

I believe this is another bug in Expression Web.

The correct output should be just the HTML between the html start tag and the html end tag which would be:

<head>
</head>
<body>
</body>

I'm not aware of any other elements that show the incorrect innerHTML.

Next - Head Children
