How to calculate a weighted average in XPath or XQuery?

cbosson picture cbosson · Jan 2, 2013 · Viewed 8.7k times · Source

I need to calculate a simple weighted average in a XForms form. How can I do that in an elegant and declarative way using XPath and/or XQuery ?

[EDITED] This is the source XML document :

<Examens>
    <Examen>
        <ExamenId>1</ExamenId>
        <Coef>1</Coef>
        <Notes>
            <Note>
                <EleveId>100</EleveId>
                <Valeur>4</Valeur>
            </Note>
            <Note>
                <EleveId>101</EleveId>
                <Valeur>4.2</Valeur>
            </Note>
            <Note>
                <EleveId>102</EleveId>
                <Valeur>3.8</Valeur>
            </Note>
        </Notes>
    </Examen>
    <Examen>
        <ExamenId>2</ExamenId>
        <Coef>2</Coef>
        <Notes>
            <Note>
                <EleveId>100</EleveId>
                <Valeur>5</Valeur>
            </Note>
            <Note>
                <EleveId>101</EleveId>
                <Valeur/>
            </Note>
            <Note>
                <EleveId>102</EleveId>
                <Valeur>3.5</Valeur>
            </Note>
        </Notes>
    </Examen>
    <Examen>
        <ExamenId>3</ExamenId>
        <Coef>3</Coef>
        <Notes>
            <Note>
                <EleveId>100</EleveId>
                <Valeur>6</Valeur>
            </Note>
            <Note>
                <EleveId>101</EleveId>
                <Valeur>5.4</Valeur>
            </Note>
            <Note>
                <EleveId>102</EleveId>
                <Valeur>2</Valeur>
            </Note>
        </Notes>
    </Examen>
</Examens>

The following snippet is correctly displaying the values (= ./Valeur) and the weight (= ./../../Coef) :

<xforms:repeat nodeset="$currentBranche//Note[EleveId=$currentEleveId]">
     <xforms:output ref="./Valeur"/>
     <xforms:output ref="./../../Coef"/>
</xforms:repeat>

BTW, I also need to exclude the nodes for which Valeur is an empty string. For example, in the following simple average calculation with the XPath avg() function, I got an error ("Cannot convert '' to double") if one node's content is an empty string. This is a problem, because the node exist (it's part of a model instance) and the value is an empty string when the user has not yet entered a value.

<xforms:output ref="round(avg($currentBranche//Note[EleveId=$currentEleveId]/Valeur)*100) div 100"/>

[EDITED]

The correct calculations are :

If EleveId=100 : weighted average = (1*4+2*5+3*6) / (1+2+3) = 5.333
If EleveId=101 : weighted average = (1*4.2+3*5.4) / (1+3) = 5.1
If EleveId=102 : weighted average = (1*3.8+2*3.5+3*2) / (1+2+3) = 2.8

Answer

Dimitre Novatchev picture Dimitre Novatchev · Jan 2, 2013

In XPath 1.0 Use:

  sum($currentBranche//Note[EleveId=$currentEleveId]/Valeur[number(.)=number(.)])

 div

  count($currentBranche//Note[EleveId=$currentEleveId]/Valeur)

In Xpath 2.0 (XQuery) use:

round(avg($currentBranche//Note[EleveId=$currentEleveId]/Valeur
                                             [number(.)=number(.)])*100
      ) div 100

If all Valeur values are guaranteed to be castable as xs:decimal, then use:

avg($currentBranche//Note[EleveId=$currentEleveId]/Valeur
                                                 [castable as xs:decimal]
                                                    /xs:decimal(.)
   )

In this case there won't be (noticeable) loss of precision and you can use later the format-number() function to get the wanted number of digits after the decimal point.


II. Producing "weighted average":

Given the provided XML document:

<Examens>
    <Examen>
        <ExamenId>1</ExamenId>
        <Coef>1</Coef>
        <Notes>
            <Note>
                <EleveId>100</EleveId>
                <Valeur>4</Valeur>
            </Note>
            <Note>
                <EleveId>101</EleveId>
                <Valeur>4.2</Valeur>
            </Note>
            <Note>
                <EleveId>102</EleveId>
                <Valeur>3.8</Valeur>
            </Note>
        </Notes>
    </Examen>
    <Examen>
        <ExamenId>2</ExamenId>
        <Coef>2</Coef>
        <Notes>
            <Note>
                <EleveId>100</EleveId>
                <Valeur>5</Valeur>
            </Note>
            <Note>
                <EleveId>101</EleveId>
                <Valeur/>
            </Note>
            <Note>
                <EleveId>102</EleveId>
                <Valeur>3.5</Valeur>
            </Note>
        </Notes>
    </Examen>
    <Examen>
        <ExamenId>3</ExamenId>
        <Coef>3</Coef>
        <Notes>
            <Note>
                <EleveId>100</EleveId>
                <Valeur>6</Valeur>
            </Note>
            <Note>
                <EleveId>101</EleveId>
                <Valeur>5.4</Valeur>
            </Note>
            <Note>
                <EleveId>102</EleveId>
                <Valeur>2</Valeur>
            </Note>
        </Notes>
    </Examen>
</Examens>

this XPath 2.0 expression produces the weighted average:

   for $elevId in distinct-values(/*/*/*/*/EleveId)
    return
      round(100*
             (sum(/*/*/*/Note
                    [EleveId eq $elevId
                   and number(Valeur) eq number(Valeur)
                     ]
                      /(Valeur * ../../Coef)
                  )
            div
              sum(/*/*/*/Note
                    [EleveId eq $elevId
                   and number(Valeur) eq number(Valeur)
                    ]
                     /../../Coef
                  )
              )
           )
    div 100

and the expected, correct result is produced:

5.33 5.1 2.8