On-the-Fly Filters in Excel Functions

In Excel, there are several well-known built-in functions whose result depends on the contents of the cells in the parameter range. These are easy to find, as their names end with IF or IFS, such as SUMIF/SUMIFS or AVERAGEIF/AVERAGEIFS.

I’ll use the following example dataset:


Here we have a dataset of time spent by drivers Smith and Jones on 10 selected routes. We want to know the average time spent by each driver on a route and show how large is the deviation of the time spent on each route relative to the driver’s average.

This is a typical case for an AVERAGEIFS function (although we’ll have one filter here, I’ll still use the multiple-filter AVERAGEIFS function instead of the AVERAGEIF function, because the former function has a more intuitive order of parameters). Here’s the result:


The formulae used here are:

  • in AvgTimeSpend column: =AVERAGEIFS([TimeSpent], [Driver], [@Driver])
  • in DevFromAvg column: =[@TimeSpent]/[@AvgTimeSpent] – 1

Now we see that there are actually some atypical values that skew the distribution of time spent. For instance, Smith used 22 hours on route 7, which is way higher than on other typical routes. Instead of taking into account all observations for a driver, you want to use only typical observations. What is considered typical, of course, varies depending on application, but assume here that we want to use the following criterion:

An observation is considered atypical if it is 30% lower or 30% higher than the median for a subgroup.

Now you want to add a column showing median time spent for each driver, but there is no MEDIANIF function in Excel. Here’s where on-the-fly filters come into play.

On-the-fly filters in formulae have two components: an IF() filter and an array formula. Let’s look on the formula that would have been analogical to a MEDIANIF function, had the latter been implemented in Excel:

{=MEDIAN(IF([Driver] = [@Driver], [TimeSpent], “”))}

We call the MEDIAN function with a single parameter — a list of values. This list contains numeric values taken from the [TimeSpent] column if the condition is passed (i.e. for the rows of the [Driver] column which are equal to the value of that column in the current row, [@Driver]) or, else, an empty string, “”. The MEDIAN function then ignores those elements which are empty strings and calculates the median over the numeric elements of the list. The curly brackets show that this is an array formula, because the IF function is applied to the whole column instead of the current row only.

The same approach can be applied to any other function. Just use the following template:

=function( IF(filter_condition, data_that_pass_filter, “”) )

and don’t forget to press Ctrl+Shift+Enter to ensure that you get an array formula.

This was the core of the solution. What’s left is to create a column showing whether a given time spent is atypical:

=OR([@TimeSpent] / [@MedTimeSpent] – 1 < -30%, [@TimeSpent] / [@MedTimeSpent] – 1 > 30%)

and recalculate the average column, taking into account only those values that are typical:

=AVERAGEIFS([TimeSpent], [Driver], [@Driver], [Atypical], FALSE)

Finally, the deviation column has now to be based on the recalculated average column. Here is how the resulting table looks like:



2 thoughts on “On-the-Fly Filters in Excel Functions

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s