VBA - Read XML file

sopana picture sopana · Feb 19, 2018 · Viewed 7.6k times · Source

I need to read a XML file generated by an application to further do some automation. I am new to Excel VBA and have searched the net in vain!

Sample XML file -

<?xml version="1.0" encoding="ISO-9970-7"?>
<!DOCTYPE MAPPING SYSTEM "mapper.pmp">
<MAPPER CREATION_DATE="01/01/2006 10:47:36" REPOSITORY_VERSION="XXX.YY">
<REPOSITORY NAME="development" VERSION="100" CODEPAGE="Latin1" DATABASETYPE="Oracle">
<FOLDER NAME="ETL" GROUP="" OWNER="Administrator" SHARED="NOTSHARED" DESCRIPTION="This is the test folder for checking xml read" PERMISSIONS="rwx------" UUID="ab16147d-15e7-5fg1-h9i1-jk2548001234">
    <SOURCE BUSINESSNAME ="" DATABASETYPE ="Oracle" DBDNAME ="SAMPLE_DB" DESCRIPTION ="SOURCE DEFINITION FOR SRC_TEST TABLE." NAME ="SRC_TEST" OBJECTVERSION ="1" OWNERNAME ="SAMPLE_DB" VERSIONNUMBER ="1">
        <SOURCEFIELD BUSINESSNAME ="" DATATYPE ="number" DESCRIPTION ="" FIELDNUMBER ="1" FIELDPROPERTY ="0" FIELDTYPE ="ELEMITEM" HIDDEN ="NO" KEYTYPE ="NOT A KEY" LENGTH ="24" LEVEL ="0" NAME ="SRC_TEST_KEY" NULLABLE ="NOTNULL" OCCURS ="0" OFFSET ="0" PHYSICALLENGTH ="15" PHYSICALOFFSET ="0" PICTURETEXT ="" PRECISION ="15" SCALE ="0" USAGE_FLAGS =""/>
        <SOURCEFIELD BUSINESSNAME ="" DATATYPE ="varchar2" DESCRIPTION ="" FIELDNUMBER ="2" FIELDPROPERTY ="0" FIELDTYPE ="ELEMITEM" HIDDEN ="NO" KEYTYPE ="NOT A KEY" LENGTH ="24" LEVEL ="0" NAME ="EMP_NAME" NULLABLE ="NOTNULL" OCCURS ="0" OFFSET ="24" PHYSICALLENGTH ="15" PHYSICALOFFSET ="15" PICTURETEXT ="" PRECISION ="15" SCALE ="0" USAGE_FLAGS =""/>
        <SOURCEFIELD BUSINESSNAME ="" DATATYPE ="varchar2" DESCRIPTION ="" FIELDNUMBER ="3" FIELDPROPERTY ="0" FIELDTYPE ="ELEMITEM" HIDDEN ="NO" KEYTYPE ="NOT A KEY" LENGTH ="24" LEVEL ="0" NAME ="EMP_DEPT" NULLABLE ="NULL" OCCURS ="0" OFFSET ="48" PHYSICALLENGTH ="15" PHYSICALOFFSET ="30" PICTURETEXT ="" PRECISION ="15" SCALE ="0" USAGE_FLAGS =""/>
        <SOURCEFIELD BUSINESSNAME ="" DATATYPE ="number" DESCRIPTION ="" FIELDNUMBER ="4" FIELDPROPERTY ="0" FIELDTYPE ="ELEMITEM" HIDDEN ="NO" KEYTYPE ="NOT A KEY" LENGTH ="24" LEVEL ="0" NAME ="EMP_SAL" NULLABLE ="NOTNULL" OCCURS ="0" OFFSET ="72" PHYSICALLENGTH ="15" PHYSICALOFFSET ="45" PICTURETEXT ="" PRECISION ="100" SCALE ="0" USAGE_FLAGS =""/>
    </SOURCE>
    <TARGET BUSINESSNAME ="" CONSTRAINT ="" DATABASETYPE ="Flat File" DESCRIPTION ="Tagret definition for Flat file 1." NAME ="FLAT_FILE" OBJECTVERSION ="1" TABLEOPTIONS ="" VERSIONNUMBER ="1">
        <FLATFILE CODEPAGE ="MS1252" CONSECDELIMITERSASONE ="NO" DELIMITED ="YES" DELIMITERS ="," ESCAPE_CHARACTER ="" KEEPESCAPECHAR ="NO" LINESEQUENTIAL ="NO" MULTIDELIMITERSASAND ="NO" NULLCHARTYPE ="ASCII" NULL_CHARACTER ="*" PADBYTES ="1" QUOTE_CHARACTER ="NONE" REPEATABLE ="NO" ROWDELIMITER ="0" SKIPROWS ="0" STRIPTRAILINGBLANKS ="NO"/>
        <TARGETFIELD BUSINESSNAME ="" DATATYPE ="string" DESCRIPTION ="" FIELDNUMBER ="1" KEYTYPE ="NOT A KEY" NAME ="EMP_NAME" NULLABLE ="NULL" PICTURETEXT ="" PRECISION ="256" SCALE ="0"/>
        <TABLEATTRIBUTE NAME ="Datetime Format" VALUE ="A  19 mm/dd/yyyy hh24:mi:ss"/>
        <TABLEATTRIBUTE NAME ="Thousand Separator" VALUE ="None"/>
        <TABLEATTRIBUTE NAME ="Decimal Separator" VALUE ="."/>
        <TABLEATTRIBUTE NAME ="Line Endings" VALUE ="System default"/>
    </TARGET>
    <TARGET BUSINESSNAME ="" CONSTRAINT ="" DATABASETYPE ="Flat File" DESCRIPTION ="Tagret definition for Flat file 2." NAME ="FLAT_FILE_LIST" OBJECTVERSION ="1" TABLEOPTIONS ="" VERSIONNUMBER ="1">
        <FLATFILE CODEPAGE ="MS1252" CONSECDELIMITERSASONE ="NO" DELIMITED ="YES" DELIMITERS =";" ESCAPE_CHARACTER ="" KEEPESCAPECHAR ="NO" LINESEQUENTIAL ="NO" MULTIDELIMITERSASAND ="NO" NULLCHARTYPE ="ASCII" NULL_CHARACTER ="*" PADBYTES ="1" QUOTE_CHARACTER ="NONE" REPEATABLE ="NO" ROWDELIMITER ="0" SKIPROWS ="0" STRIPTRAILINGBLANKS ="NO"/>
        <TARGETFIELD BUSINESSNAME ="" DATATYPE ="string" DESCRIPTION ="" FIELDNUMBER ="1" KEYTYPE ="NOT A KEY" NAME ="EMP_DEPT" NULLABLE ="NULL" PICTURETEXT ="" PRECISION ="10" SCALE ="0"/>
        <TARGETFIELD BUSINESSNAME ="" DATATYPE ="string" DESCRIPTION ="" FIELDNUMBER ="2" KEYTYPE ="NOT A KEY" NAME ="EMP_NAME" NULLABLE ="NULL" PICTURETEXT ="" PRECISION ="256" SCALE ="0"/>
        <TARGETFIELD BUSINESSNAME ="" DATATYPE ="number" DESCRIPTION ="" FIELDNUMBER ="3" KEYTYPE ="NOT A KEY" NAME ="EMP_SAL" NULLABLE ="NULL" PICTURETEXT ="" PRECISION ="10" SCALE ="0"/>
        <TABLEATTRIBUTE NAME ="Datetime Format" VALUE ="A  19 mm/dd/yyyy hh24:mi:ss"/>
        <TABLEATTRIBUTE NAME ="Thousand Separator" VALUE ="None"/>
        <TABLEATTRIBUTE NAME ="Decimal Separator" VALUE ="."/>
        <TABLEATTRIBUTE NAME ="Line Endings" VALUE ="System default"/>
    </TARGET>   
</FOLDER>
</REPOSITORY>
</MAPPER>

I want to read 1) the NAME in SOURCEFIELD, for e.g. SRC_TEST_KEY, EMP_NAME, EMP_DEPT and EMP_SAL. 2) Their PHYSICALLENGTH and so on.

Attempted VBA code (I've searched on net)-

Sub read_xml()
    Dim Init, i As Integer
    Dim xmlDoc As MSXML2.DOMDocument
    Dim elements As Object
    Dim el As Variant
    Dim Prop As String
    Dim NumberOfElements As Integer
    Dim n As IXMLDOMNode
    Init = 5

    Set xmlDoc = CreateObject("MSXML2.DOMDocument")
    xmlDoc.Load ("C:\path\to\Sample_xml.xml")
    Set TitleNodes = xmlDoc.SelectNodes("/MAPPER/REPOSITORY/FOLDER/SOURCE/")
    lengthTitleNodes = Len(TitleNodes)

    For i = 0 To lengthTitleNodes
        Title = TitleNodes(i).NodeValue
    Next i
End Sub

But obviously this is not working, It doesn't recognize any Tags, nodes, etc. I'm doing something majorly wrong here but haven't figured it out!! Can someone please help me in pointing to the right direction?

Thanks in advance!

Answer

S Meaden picture S Meaden · Feb 19, 2018

I had to amend your file to make it work. I had to remove the encoding because otherwise it gives parse error System does not support the specified encoding.
Also, I had to remove <!DOCTYPE MAPPING SYSTEM "mapper.pmp"> because it gave parse error DTD is prohibited..

Option Explicit

'* Tools->References
'* MSXML2      Microsoft XML, v6.0     C:\Windows\SysWOW64\msxml6.dll


'* Amendments to make run
'Changed top line to <?xml version="1.0" ?>
'Removed <!DOCTYPE MAPPING SYSTEM "mapper.pmp">

Sub t()
    Dim xmldoc As msxml2.DOMDocument60
    Set xmldoc = New msxml2.DOMDocument60
    'Set xmldoc = CreateObject("MSXML2.DOMDocument.6.0")
    Call xmldoc.setProperty("SelectionLanguage", "XPath")
    xmldoc.Load ("C:\path\to\Sample_xml.xml")
    'xmldoc.Load "n:\SO_Q48862991.xml"
    Debug.Assert xmldoc.parseError.ErrorCode = 0

    Dim TitleNodes As msxml2.IXMLDOMNodeList
    Set TitleNodes = xmldoc.SelectNodes("/MAPPER/REPOSITORY/FOLDER/SOURCE")

    Dim lengthTitleNodes As Long
    lengthTitleNodes = TitleNodes.Length

    Dim i As Long
    For i = 0 To lengthTitleNodes - 1
        Dim Title As msxml2.IXMLDOMElement
        Set Title = TitleNodes(i)

        Dim xmlSourceFields As msxml2.IXMLDOMNodeList
        Set xmlSourceFields = Title.SelectNodes("SOURCEFIELD")

        Dim lSourceFieldLoop
        For lSourceFieldLoop = 0 To xmlSourceFields.Length - 1
            Dim xmlSourceField As msxml2.IXMLDOMElement
            Set xmlSourceField = xmlSourceFields.Item(lSourceFieldLoop)

            Debug.Print xmlSourceField.getAttribute("NAME"), xmlSourceField.getAttribute("PHYSICALLENGTH")



        Next lSourceFieldLoop

    Next i
End Sub

Now outputs

SRC_TEST_KEY  15
EMP_NAME      15
EMP_DEPT      15
EMP_SAL       15