I would like to use VBA in Excel 2007 to step through about 500 receipts, which have been converted to one large OneNote 2010 notebook. Each notebook tab contains a different receipt. I need to get the pertainant details (Receipt #, Receipt Date, Amount, all of the line item data, tax, etc) out of each and I want to create a sheet in Excel with that data.
The data is semi-structured, meaning that once I find "Order Number" I know there is a space character and then the order number follows. But it might be on different lines or even pushed over, etc. But thats OK. I can write the VBA code, that isnt the problem..
I figured it would be easier than data entry or cheaper than hiring someone to key all this in by hand... I dont want to go the OCR route because I need the accuracy that I think I can get from some sort of Office Automation between Excel and OneNote. I just cant find any example of automation using OneNote 2010 (either from the OneNote side nor the Excel side). Can another point me in the right direction? MSDN has a developer site for Office and OneNote but I must be blind to not see any examples or even an object model!
This VBA example code on MSDN might help you. I retrieves a list of all OneNote notebooks. It is written for OneNote 2010 and works for my Office 2010 package, but I hope it works on 2007 also.
I have modified the example source to retrieve all pages and the page content. The page content is XML so you would have to parse that.
Modified MSDN Example:
'Add the following references (adjust to our office version):
'
' - Microsoft OneNote 14.0 Object Library
' - Microsoft XML, v6.0
Sub ListOneNotePages()
' Original example is from http://code.msdn.microsoft.com/office/onenote-2010-retrieve-data-023e69c0
' License: Apache 2.0
' Modified to get all pages & content instead of the notebook list
' Connect to OneNote 2010.
' OneNote will be started if it's not running.
Dim oneNote As OneNote14.Application
Set oneNote = New OneNote14.Application
' Get the XML that represents the OneNote pages
Dim oneNotePagesXml As String
' oneNotePagesXml gets filled in with an XML document providing information
' about all OneNote pages.
' You want all the data. Thus you provide an empty string
' for the bstrStartNodeID parameter.
oneNote.GetHierarchy "", OneNote14.HierarchyScope.hsPages, oneNotePagesXml, xs2010
' Use the MSXML Library to parse the XML.
Dim doc As MSXML2.DOMDocument
Set doc = New MSXML2.DOMDocument
If doc.LoadXML(oneNotePagesXml) Then
' Find all the Page nodes in the one namespace.
Dim nodes As MSXML2.IXMLDOMNodeList
Set nodes = doc.DocumentElement.SelectNodes("//one:Page")
Dim node As MSXML2.IXMLDOMNode
Dim pageName As String
Dim sectionName As String
Dim pageContent As String
Dim temp As String
' Walk the collection of Pages.
' Read attribute values and write them
' out to the Immediate window of your VBA host.
For Each node In nodes
pageName = node.Attributes.getNamedItem("name").Text
Debug.Print "Page name: "; vbCrLf & " " & pageName
Call oneNote.GetPageContent(GetAttributeValueFromNode(node, "ID"), pageContent, piBasic)
Debug.Print " content: " & pageContent
Next
Else
MsgBox "OneNote 2010 XML Data failed to load."
End If
End Sub
Private Function GetAttributeValueFromNode(node As MSXML2.IXMLDOMNode, attributeName As String) As String
If node.Attributes.getNamedItem(attributeName) Is Nothing Then
GetAttributeValueFromNode = "Not found."
Else
GetAttributeValueFromNode = node.Attributes.getNamedItem(attributeName).Text
End If
End Function