Determining Deliveries from Stations’ Primary and Secondary Terminals

Assume we’re operating in a business that is delivering products from terminals to stations. (These products, of course, are then sold at stations but that’s not the topic of this post.) We have a large dataset showing how much of each product was delivered to each station from each terminal. We are interested in showing, for each station, how much of each product, in absolute terms and percentagewise, was delivered from a station’s top two terminals. We are not interested in smaller terminals (third, fourth etc.), so we want to accurately show the summary statistics for the secondary terminal and reallocate the remaining volume to the primary terminal.

I’ll illustrate the approach on the following dataset in an Excel table called Data:

Station Product Terminal Volume
1 A T1 110102
1 A T2 97448
1 A T3 107668
1 B T1 97631
1 C T1 96481
2 A T4 100970
2 B T3 104833
2 B T4 111438
2 B T5 101601
3 A T2 96478
3 A T4 98585
3 B T1 106350
3 C T3 102795
3 C T5 102129
4 A T1 101276
4 B T1 99078
4 C T1 96343

We have four stations (1, 2, 3 and 4) where we deliver three products (A, B and C) from five terminals (T1 through T5).

The resulting table should look in the following way:

Station Product Terminal 1 Terminal 1 Vol Terminal 1 % Terminal 2 Terminal 2 Vol Terminal 2 %
1 A
1 B
1 C
2 A
2 B
3 A
3 B
3 C
4 A
4 B
4 C

Terminal 1 is the primary terminal for a given combination of station and product and Terminal 2 is the secondary terminal.

There are, of course, several ways to tackle this problem. I’ll illustrate an approach that does not create any intermediate tables. But for it to work, we’ll have to add a few columns to the data table. These are:

  • TotalVol column:   shows the total volume of a product delivered to a station from all terminals; the formula is =SUMIFS([Volume], [Station], [@Station], [Product], [@Product])
  • MaxVol column:   shows the maximum volume of a product delivered to a station across terminals (this is the quantity delivered from the primary terminal before adjustments); the formula is {=MAX(IF(([Station] = [@Station]) * ([Product] = [@Product]) = 1, [Volume], “”))}. Note that this is an array formula that applies an advanced on-the-fly filter to calculate the maximum value from a correct subset of station–product combinations
  • IsPrimary column:   determines whether a given row contains data about the primary terminal of a given station–product combination; the formula is =([@Volume] = [@MaxVol])
  • SMaxVol column:   shows the second maximum volume for a given station–product combination across terminals (this is the quantity delivered from the secondary terminal); the formula is {=MAX(IF(([Station] = [@Station]) * ([Product] = [@Product]) * ([Volume] < [@MaxVol]) = 1, [Volume], “”))}. This is the same formula as for MaxVol column but an additional filtering condition is that we’re only taking terminals where volume delivered is less than maximum volume delivered for a station–product combination. In other words, we’re computing the maximum volume across all non-primary terminals
  • IsSecondary column:   determines whether a given row contains data about the secondary terminal of a given station–product combination; the formula is =([@Volume] = [@SMaxVol])
  • UID column:   a column that will uniquely identify a combination of station and product, which we’ll refer to in the results table; the formula is =[@Station] & “-” & [@Product]
  • UID-Primary column:   a column that will uniquely identify a row containing the primary terminal for a given station–product combination; the formula is =[@UID] & “-” & [@IsPrimary]
  • UID-Secondary column:   a column that will uniquely identify a row containing the secondary terminal for a given station–product combination; the formula is =[@UID] & “-” & [@IsSecondary]

After these manipulations, the Data table becomes:

determining-deliveries-from-stations-primary-and-secondary-terminals-data-modified

Then the formulae in the results table are quite straightforward. Just two remarks are in place. Firstly, recall that we have to adjust the volumes from the primary terminals when deliveries are made from more than two terminals. Secondly, we have to wrap the formulae for the secondary terminal in the IFERROR function, because not everywhere we have a secondary terminal, and without wrapping, we’ll have #ERROR! in our results table. Here the formulae are:

  • Terminal 1 column:   =INDEX(Data[Terminal], MATCH([@Station] & “-” & [@Product] & “-TRUE”, Data[UID-Primary], 0))
  • Terminal 1 Vol column:   =INDEX(Data[TotalVol], MATCH([@Station] & “-” & [@Product] & “-TRUE”, Data[UID-Primary], 0)) – INDEX(Data[SMaxVol], MATCH([@Station] & “-” & [@Product] & “-TRUE”, Data[UID-Primary], 0))
  • Terminal 1 % column:   =[@[Terminal 1 Vol]] / INDEX(Data[TotalVol], MATCH([@Station] & “-” & [@Product] & “-TRUE”, Data[UID-Primary], 0))
  • Terminal 2 column:   =IFERROR(INDEX(Data[Terminal], MATCH([@Station] & “-” & [@Product] & “-TRUE”, Data[UID-Secondary], 0)), “”)
  • Terminal 2 Vol column:   =IFERROR(INDEX(Data[SMaxVol], MATCH([@Station] & “-” & [@Product] & “-TRUE”, Data[UID-Secondary], 0)), “”)
  • Terminal 2% column:   =IFERROR([@[Terminal 2 Vol]] / INDEX(Data[TotalVol], MATCH([@Station] & “-” & [@Product] & “-TRUE”, Data[UID-Secondary], 0)), “”)

Here’s how the results table looks like:

determining-deliveries-from-stations-primary-and-secondary-terminals-result

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s