In Excel, there are several well-known built-in functions whose result depends on the contents of the cells in the parameter range. These are easy to find, as their names end with **IF** or **IFS**, such as **SUMIF**/**SUMIFS** or **AVERAGEIF**/**AVERAGEIFS**.

I’ll use the following example dataset:

Here we have a dataset of time spent by drivers Smith and Jones on 10 selected routes. We want to know the average time spent by each driver on a route and show how large is the deviation of the time spent on each route relative to the driver’s average.

This is a typical case for an **AVERAGEIFS** function (although we’ll have one filter here, I’ll still use the multiple-filter **AVERAGEIFS** function instead of the **AVERAGEIF** function, because the former function has a more intuitive order of parameters). Here’s the result:

The formulae used here are:

- in
**AvgTimeSpend**column: =AVERAGEIFS([TimeSpent], [Driver], [@Driver]) - in
**DevFromAvg**column: =[@TimeSpent]/[@AvgTimeSpent] – 1

Now we see that there are actually some atypical values that skew the distribution of time spent. For instance, Smith used 22 hours on route 7, which is way higher than on other typical routes. Instead of taking into account all observations for a driver, you want to use only *typical* observations. What is considered typical, of course, varies depending on application, but assume here that we want to use the following criterion:

An observation is considered atypical if it is 30% lower or 30% higher than the median for a subgroup.

Now you want to add a column showing median time spent for each driver, but there is no **MEDIANIF** function in Excel. Here’s where on-the-fly filters come into play.

On-the-fly filters in formulae have two components: an **IF()** filter and an array formula. Let’s look on the formula that would have been analogical to a **MEDIANIF** function, had the latter been implemented in Excel:

{=MEDIAN(IF([Driver] = [@Driver], [TimeSpent], “”))}

We call the **MEDIAN** function with a single parameter — a list of values. This list contains numeric values taken from the [TimeSpent] column if the condition is passed (i.e. for the rows of the [Driver] column which are equal to the value of that column in the current row, [@Driver]) or, else, an empty string, “”. The **MEDIAN **function then ignores those elements which are empty strings and calculates the median over the numeric elements of the list. The curly brackets show that this is an array formula, because the **IF** function is applied to the whole column instead of the current row only.

The same approach can be applied to any other function. Just use the following template:

=

function( IF(filter_condition,data_that_pass_filter,“”) )

and don’t forget to press **Ctrl+Shift+Enter** to ensure that you get an array formula.

This was the core of the solution. What’s left is to create a column showing whether a given time spent is atypical:

=OR([@TimeSpent] / [@MedTimeSpent] – 1 < -30%, [@TimeSpent] / [@MedTimeSpent] – 1 > 30%)

and recalculate the average column, taking into account only those values that are typical:

=AVERAGEIFS([TimeSpent], [Driver], [@Driver], [Atypical], FALSE)

Finally, the deviation column has now to be based on the recalculated average column. Here is how the resulting table looks like:

## 2 thoughts on “On-the-Fly Filters in Excel Functions”