Calculating Avg of multiple fields excluding zero values

Is there any easy path to this? Some function that I’ve forgotten or not found?

Fields A,B,C,D or Fields A1, A2, A3, A4

Either way, I’d like to calc the Average but only consider those values that are non-zero.

ie. Field values: 0,200,300,400 = 300 (not 225)

I think I’d build an array of all non-zero items, set a running total variable to zero then do an arrayfilter(, using assign( to increment the running total by each array element. Finally divide the running total by the length of the array to give the average.

Of course that would easier if there were an arraysum( function, which I’ve searched for in vain in the past (so many array functions, yet not always the one you’re looking for…). The ‘sum’ of a text array, i.e. the concatenation of its elements, is just the contents of the text variable or field containing the array, with or without the array separators removed. But there’s no equivalent for the arithmetic sum of a numeric data array, hence the need to add it up an element at a time using arrayfilter(.

Brute force specific solution:

(A+B+C+D)/(?(A=0,0,1)+?(B=0,0,1)+?(C=0,0,1)+?(D=0,0,1))

If all might be zero at once, you could put it within a catcherror( to provide a default for the divide by zero error.

If your fields were line item fields, eg. AmountΩ, and zero values corresponded with empty fields (which I don’t think they automatically do) then:

sum(AmountΩ)/count(AmountΩ)

would do what you want. For a more general solution convert values of the fields you wish to process into a text ARRAY, then:

arraynumerictotal(ARRAY, SEP)/arraysize(arraystrip(arrayfilter(ARRAY, SEP, {?(VAL(IMPORT())=0,“”,IMPORT())}), SEP), SEP)

If all values are zero this gives a result of zero rather than the divide by zero error you might want, as arraysize of an empty text array is 1. You could modify it to check whether the arraystrip result was “” to chose between the arraysize result and zero in your denominator if that matters.

From a solution posted in another recent thread, instead of the
?(A=0,0,1)
construction to sum up the denominator, maybe
Min(1,A) would work and be a little faster.
(A +B+C+D)/(Min(1,A)+Min(1,B)+Min(1,C)+Min(1,D))

I believe the ?(Compare,True,False) construction processes both the True and the False action first, then evaluates the Compare, and hands over the True or the False depending upon the Compare result - at least back in Pan6 days.

I don’t know how “busy” the Min() funcition is.

You could do a CatchError, but it seems to me, more direct and clearer (6 months down the road) if you just check the sum of A+B+C+D first, to make sure it is not zero. You have to add them up anyway.

T’were it me, I’d probably do something like:

Local varsum, varavg,vardenom
varavg = 0
varsum = A+B+C+D
If varesum <> 0
vardenom = (Min(1,A)+Min(1,B)+Min(1,C)+Min(1,D))
varavg = varsum/vardenom
Else
//Your bailout goes here
EndIF

Yes, varsum and vardenom can be replaced by their defining equations, but things always change, though the calculation for the average is constant. I’d “spend” some extra temporary variables so if there’s a change, I just change the summing and not the average calculation. It also gives you a single variable to look at when debugging.

I used to have to fix code in a place were a programmer would pass complicated functions as parameters to subroutines. It worked when it worked. But when it didn’t work, it took a bit of time to break out where, in the series of actions, the problem occurred.

Setting varavg to zero may also be unnecessary, but I like to initialize all my variables.

A little late to the game but I guess it would be proper for me to state that I am attempting to FormulaFill a field with the Std Deviation of those non zero values for 8,000 records. I thinking this would be:

SD=Sqr((?(A>0,(A-Avg)^2,0)+?(B>0,(B-Avg)^2,0)+?(C>0,(C-Avg)^2,0)+?(C>0,(D-Avg)^2,0))/n)

n = Count of non zero fields
Avg = Average of non zero fields

A StdDev function might help, might not.

If negative values are allowed you could have a zero total without all, or even any, parts being zero and those zero totals belong in your final stats.

If you’re only dealing with 4 fields, but want 8000 records and want to work with the total sum and non-zero total count, then use four aggregate( “sum” functions and four aggregate( “Count” functions, the later with query formula for non-zero fields. Divide the sum of the first four by the sum of the last four. Might be able to accomplish the same faster via tweaking summarytable( and using the proper row of its output. But I don’t have time now to play with that.

An equivalent formula for standard deviation is the square root of (the mean of the squares minus the square of the mean).

SD=sqr((A^2+B^2+C^2+D^2)/cache(0-(A≠0)-(B≠0)-(C≠0)-(D≠0),"n") - ((A+B+C+D)/n)^2)

I am using the fact that true is -1 and false is 0 to count the non zero values, and the cache( function so I don’t have to count them twice.

True enough - but you have to start somewhere. It’s like climbing Mt. Hood in Oregon. You can start at Government Camp (small town at the bottom), or at Timberline Lodge, or take the snowcat from the lodge up to 9000 ft and start the climb from there. Given that the RAmeeti didn’t show negative numbers in his example, they weren’t accommodated in the equation. PanX has all kinds of ways to guarantee the “quality” of the data input before you get to the calculation stage.

That said, I’m sure many times I’ve also commented on the possible, unstated, variations of data.

And thanks to Dave for the Win!! I added a column so as to calculate the number of non zero values prior to calculating the std dev results so that that value could be used as my ‘n’ value.

All of this is an effort to calculate the std dev of multiple competitor prices for a part. Not all competitors (Fields A,B,C) participate and sell any particular part so I was needing to remove those zero values prior to doing the Std Dev calculations. Understanding the std dev of competitors allows me to better calculate an appropriate retail price.

Dave gets the overall win, but for our entries a better start would be A * B * C * D ≠ 0

Just change the array separator to “+” and do a formulacalc… You might have to strip white space and replace “+-” with “-” if there are negative numbers…