You are on page 1of 20

Advanced Functions, Add-Ins & VBA

Prepared For IALEIA SWOC November Excel Workshop By Joseph Ryan Glover Police Analyst.com

November 28, 2012 Contact 416-538-0659 joseph.ryan.glover@gmail.com www.policeanalyst.com

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

Table of Contents
Advanced Functions, Add-Ins & VBA 3 Advanced Functions? The FREQUENCY Function 3 4

The Microsoft Analysis ToolPak Add-In 8 Performing a T-Test to Determine Statistical Significance 9

Microsoft Are Not the Only People Who Make Add-Ins 13 The PatternUI Add-In 13

Making Your Own Add-In 16 The DAYOFYEAR Function 16

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

Advanced Functions, Add-Ins & VBA


The plan for this workshop is to start by demonstrating an example of an advanced function in Excel (specifically the FREQUENCY function), then well move to learning how to use the Analysis ToolPak Add-in that Microsoft provides for free with Excel, then Ill show you how to install Add-ins that you can find on the web and finally Ill show you how to create your own Add-In with your own functions using Visual Basic for Applications (VBA). While this material is more advanced than the last workshop Im confident of your success. If we continue to do the examples together well all learn some cool Excel tricks that can help us do our jobs more productively. Good luck and have fun.

Advanced Functions?
When you start talking about advanced functions in Excel you are moving into the realm of packaged functionality that achieves some kind of complicated effect. That is, the work done behind the scenes by the function is more complicated than just counting how long a string is (i.e. LEN) or putting together a date (i.e. DATE). Advanced functions ask the user to have a stronger understanding of what the function is designed to do and, in return, the function works to make hard things easier to accomplish.

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

The FREQUENCY Function


For this workshop I am going to demonstrate how to use the FREQUENCY function as iy is useful for different kinds of analysis that law enforcement analysts might tackle. FREQUENCY is used to tabulate frequency data, which is used to create histogram charts, from a series of data points and bin sizes. A histogram is basically a column chart that counts how many data points fall within each of several bins. The bins are rangeslike 1-5, 6-10, 11-15, etcthat cover the entire range of the data set. The idea is that after binning the data you can plot the bins versus their counts and gain further insights into the character of your data. To get started have a look in the Excel file Advanced_Functions.xlsx for a sheet named Frequency. It contains in column A a list of 100 numbers the represent daily call for service numbers. This column will serve as the data that we want binned. In order to use FREQUENCY we need to figure out the full range of the data and to do that were going to use the MIN and MAX functions to find the minimum and maximum values of the data set, just like in the screen shot below.

Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

From the screen shot we can see that the min value is 205 and the max value is 295. This indicates that bins that cover the range 200 300 will suit us fine. To make things manageable I think the bins should go by 10s, so 201 210, 211 220, 221 230, etc. I set up the bins in column G the screen shot below.

Notice that the bin values have not been entered as ranges (like 201-210) but simply as the largest value each bin will hold. Excel is smart enough to understand that these values represent the bin maximums. Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

Now well use the FREQUENCY function to put each of the 100 data points in column A into one of the 10 bins defined in column G. The FREQUENCY function is a little different than what youre used to, befitting its advanced status, as its whats known as an array function. Unlike traditional Excel functions array functions are passed multiple arrays of values (really just multiple ranges) and the function outputs the results across multiple cells. In order to use FREQUENCY we need to first highlight all the cells the results will fill, and for this example that will be cells H2 through H11 and then type, in the formula bar at the top of the page, =FREQUENCY(A2:A101, G2:G11) just as in the screen shot below. But, before you hit the enter key, a key element of array functions is that when you are finished typing them in you need to hit the CTRL, SHIFT and ENTER all at the same time. If you do this properly you will see two things: the first is that each highlighted cell in the range H2:H11 will get a value and the function in the formula bar will be surrounded by curly braces like this {}.

After successfully entering the function youll see 10 values in H2 through H11. Each number corresponds to the number of data points that falls within each of the bins we defined and their sum is the total number of data points in our data set (in this example there are a hundred). Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

To get a quick look at the histogram highlight columns H2 through H11 and hit the Insert->Charts->Column Charts. Excel will spit out a chart that has the 10 columns and that reveals that the daily number of calls for service follow a normal or bell curve, which, if youre looking to do further analysis on calls for service, is very valuable information. Screen shot below.

So there you have it, an example of an Excel advanced function that not only works in a different way than other functions (i.e. its an array function) but outputs a complex result that could not otherwise easily be achieved (try binning by hand, it takes forever, especially if you have a lot of data points). I admit, even using FREQUENCY is a little complicated and well see in the next section how Microsoft and Excel have worked to make producing these kind of results even easier.

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

The Microsoft Analysis ToolPak Add-In


The Analysis ToolPak is a free Add-In provided by Microsoft as part of the default installation of Excel. Add-Ins are a kind of Excel file, typically with a .xlam extension, that package together a bunch of related functionality. Add-Ins can be made by big companies like Microsoft but they are also be made by third parties (there are number of companies whose whole business is making and selling Add-Ins) and by every day users of Excel (Ill show you how to make an Add-In later in this work shop). To get started with the Analysis ToolPak we need to check if it is installed. You may or may not have it already installed but to check if you do you click the Datatab in Excel and look at the far right. If you see Data Analysis as an option, its already active (see screen shot).

If you dont see it that means we need to turn it on. First, click the green Filetab (or the Excel jewel if youre in 2007) to call up the Save/Open/Close menu. Look around the option named Optionsand click it. This will open up a window with a menu on the left. Click on the menu option named Add-Ins on the left hand side. The screen will change and at the bottom, beside the word Manage, will be a select box and a button labeled Go. Click the Go button to open yet another window and you will see a series of checkboxes. Fingers crossed that one of them says Analysis ToolPak and is unchecked. Click the checkbox and click ok. Now go back to Data tab, look at the far right and youll see Data Analysis. Youre set. If you click the Data Analysis button in the menu youre going to get a new window with a lot of options. Its not possible to cover everything thats in there in one day (or one week). What Im going to do is demonstrate one of the tools that I use often to determine if a result is real or not: its called t-Test: Two-Sample Assuming Unequal Variances. Yes, its a mouthful, dont worry, Ill walk you through it. (As a side note, did you see the Histogram tool in the checkbox list? It does exactly what we did above manually with FREQUENCY.)

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

Performing a T-Test to Determine Statistical Significance


The value of the t-Test is that it can tell you if a difference you are seeing in two measurements is legitimate or if the difference is likely to be just from randomness. In the example below I am going to use the t-Test to determine if the difference in the number of daily motor vehicle collisions during the winter is meaningfully different then the number during the summer. This is just one of a nearinfinite number of uses. First, you need some background to understand what it is were doing. In statistics we are always comparing two hypotheses. The first is named the null hypothesis and the second is named the alternate hypothesis. We almost always want to reject the null hypothesis and accept the alternate one because in that situation it means something interesting has occurred. For our motor vehicle collision example we want to determine if there is a meaningful difference in the number of daily motor vehicle collisions during the first quarter of the year (i.e. the winter) versus the third quarter of the year (i.e. the summer). Now, you may be thinking obviously theres a difference, the weather but lets, for the sake of education, assume that we need to demonstrate statistically that there is a difference. In this study the null hypothesis would be that there is no difference between the quarters, they are the same while the alternate hypothesis would be yes, there is a statistically significant difference between the two quarters. The term statistically significant needs some explanation. When we want something to be statistically significant we want to be able to say, with a particular level of confidence, that the results we are seeing are not just due to chance. First we pick our confidence level or how sure we want to be and then, through the magic of statistics, we are provided a number that our results have to beat in that we can be that sure. In this example were going with the 95% confidence level which means that if we find a difference between the average number of MVCs during the winter and summer we want to be 95% sure its not due to chance. We can up the confidence interval to 99% if we want but, as well see, well need to adhere to even stricter conditions. Whatever we choose just remember that significant in statistically significant doesnt mean important (as significant typically means in everyday conversation) it just means not due to randomness. Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

How do we go about demonstrating that the two quarters have real, non-random differences in the number of daily motor vehicle collisions? The first thing we do randomly select from the first and third quarters 30 different days worth of MVC counts. Ive provided this for you in a sheet named MVCs in the Advanced_Functions.xlsx spreadsheet and screen grabbed the start and end of the data below.

Notice that we there are two columns of data, one marked Winter and one marked Summer and each has 30 entries. Also notice that at the bottom of each column is a number labeled Mean. In those cells I have used Excels AVERAGE function to find the mean of all the numbers (add them all up, divide by 30). And behold, the means are different. That proves the quarters are different, right? Not quite. We still have to deal with the peskiness that is statistical significance. Its possible, after all, that its just through chance those means are different. As discussed above we want to be 95% sure that they arent. This is where the Data Analysis Add-In and the t-Test: Two-Sample Assuming Unequal Variances tool comes into play. We are going to feed the two sets of 30 numbers through the tool and its going to spit out a bunch of results then were going to zero in on the result that gives us the thumbs up or thumbs down on meaningful difference. Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

10

So, click open the Data Analysis window. Click t-Test: Two-Sample Assuming Unequal Variances in the list and click the OK button. This opens up a new window (check the screen shot below) with some options. For the Variable 1 Range click the little arrow button and select all of the Winter data, which is in B2:31. For Variable 2 Range select all the Summer data from C2:C31. For the Hypothesized Mean Difference enter 0 (since were trying to prove they are the same, the difference of the mean should be 0) and everything else can stay the same. However, notice that there is field named Alpha that has the value 0.05. Its not a coincidence that 0.05 = 1 0.95. Alpha is another way of asking how sure we want to be.

Click the OK button. Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

11

Excel will open up a new worksheet populated with a bunch of labels and values just like the one in the screen shot below.

So what are we looking at? Well, there are the means we saw. Next comes the variances, which are measures of how spread out values are, and they are quite different (which is why we needed to use the test that assumed unequal variances) and a little further down we see our t Stat and it has a value of about 4.35. Skip down two lines to the value labeled t Critical one-tail and note that it is about 1.68. Thats good news for us because the rule is that if your t Stat is larger than your t Critical value then the null hypothesis can be rejected and our results are therefore statistically meaningful. This means that we can now state, with the power of math backing us up, that we are 95% sure that the daily average number of MVCs in the winter are greater than the daily average number of MVCs in the summer. But what, exactly, does this demonstration of the Analysis ToolPak Add-In have to do with advanced functions? The answer lies in the sheet labeled Manual t-Test in the Advanced_Functions.xlsx file. On this sheet I worked out all the bits and bobs of the t-Test solution using traditional function calls and look at what a mess it is. I had to Google all over to remind myself how to do this manually and it still looks like gibberish to me. And thats the power of Add-Ins: they package up messiness and make it easier to work with. They make it so non-experts can still use powerful tools.

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

12

Microsoft Are Not the Only People Who Make Add-Ins


There are a thousand different Add-Ins out there. Some are insanely complicated, like SpatialXL (an Add-In that puts a functional GIS inside of Excel) and cost an arm and a leg (SpatialXL is $1000+) and some are available for free and solve a simple problem that someone had. Were going to have a look at one of the latter kind of Add-Ins now

The PatternUI Add-In


The Add-In that were going to install is named PatternUI.xlam. To get the file you can either download it off the web from http:// officeblogs.net/excel/PatternUI.zip or you can find it on your memory stick. However you get it, drag it to your My Documents folder on your computer. To install an Add-In open up Excel and click the on the green File tab. Just like turning on the Analysis ToolPak Add-In go to Options>Add-Ins and click the GO button near the Manage label. However, this time, when the little window with the checkboxes opens up click the Browse button to open the file dialogue. Browse to where you dropped the PatternUI.xlam file, click OK, make sure the checkbox beside the name is checked and click OK again. Continued on the next page...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

13

PatternUI is used for putting patterns on charts so go back for a moment to the t-Test page and highlight the first 5 or so values under the Winter label and click Insert->Column to create a new column chart. Click on the chart and then click the Chart Tools tab named Format to see the button for the new Add-In named Patterns like in the screen shot below.

Continued on next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

14

If you click the columns and then the Patterns button youll be offered a number of different hatching patterns that, if you select one, will be applied to the columns (see screen shot below). I use this Add-In frequently at work because my charts are usually printed in black and white and using patterns instead of colours really allows the differences to stand out in a way that just doesnt happen with colour.

So there you have it. Really useful functionality, bundled into an Add-In and provided for free by a kind soul on the Internet. If you are having a chronic problem in Excel that you think could be solved by an Add-In chances are someone else has had the same problem and possibly published their solution. Hop on Google, look around, and see if you can find a solution to your problem that has been turned into an Add-In by some Good Samaritan.

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

15

Making Your Own Add-In


Sometimes youll find that other people havent solved your problem and maybe youll want to roll up your sleeves and solve it yourself. Now its time to make your own Add-In.

The DAYOFYEAR Function


Close all your Excel workbooks (its best to start fresh) and open a new one but dont save anything yet. The first thing youll need to do is add the Developer tab to your workbook. This process is a little different in Excel 2007 and Excel 2010 so I am going to explain how to do it in 2010 first. Once again, go back to the green File table and select Excel Options. In the menu at the left select Customize Ribbon. You should see something similar to the screen shot below. From the drop down on the left select All Tabs and scan in the left hand list for a checkbox that says Developer. Click on it and click the Add >> button in the middle of the page. This will shift the Developer checkbox over to the right hand column. Click OK.

In 2007 its quite a bit easier to activate the Developer tab. Click on the Excel jewel in the top left hand corner and find the Excel Options button near the bottom of the window. Click it. The new window that opens up has a menu where the topmost option is Popular. Click that if it isnt already selected. There will be a set of four checkboxes and number three will be labelled Show Developer tab in the Ribbon. Click that checkbox, click ok, and youre good to go. You now have the developer tools active in Excel. The developers tools contain all you need to create your own Add-In. Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

16

Back in your workbook youll see the new Developer tab at the top of the page Click it. At the far left you will see a button labeled Visual Basic (see the screen shot below).

Click the Visual Basic button and a whole new window will open up. This is the Visual Basic developer window and a screen shot is below.

Now dont be intimated, most of the stuff is unimportant. But what is important is in the left hand column in the little window named Project. There you will see an entry named Book2 or whatever the name of your current worksheet is. Click the plus symbol to open it up. In there youll see your worksheets and not much else. Right click on the project name and select Insert->Module. All the code you are going to write for this Add-In is going to reside in this module, which is really just a text file with a fancy name. Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

17

Click your cursor inside the text area to left and lets solve a problem. In my first session we briefly talked about how to determine the day of the year from a date. I told you there was no built in Excel function to do that work and we had to make an ugly formula to get the answer. I also told you I would fix that problem later. This is that later. We are going to write a function named DAYOFYEAR that, when given a date, will return the day of the year that date represents. Because this function is going to live in your Add-In it will be available to all your Excel worksheets in the future. The following screen shot lists the entire code for this Add-In. The discussion is below.

Ill break down the code line by line. In the first line we are defining the function we are going to make. We need to tell Excel that a) were making a new function, b) that its name is DAYOFYEAR and c) that were going to pass in a value and that value is going to be a date. In the second line I use the Year function, which works the same in VBA as it does in Excel, to get the year of the date that was passed in. I save that year in a variable named date_in_year. In the third line I create a new date using the VBA function DATESERIAL, which behaves exactly the same as Excels DATE function, and I give it the previously determined date_in_year, the month of January (1) and a day of 0. In the fourth line I set the variable DAYOFYEAR to be equal to the date passed in date_in minus the start_of_year date I calculated in line three. Since the variable in line four is named DAYOFYEAR thats what Excel is going to send back from the function call. The fifth and last line just tells VBA that the function is over. Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

18

Back in your workbook enter a date into cell A2 where you know the day of the year (try something early in January). Then enter into cell B2 the function =DAYOFYEAR(A2) and see if you get something like in the screen shot below. If you get an error like #NAME or #VALUE instead of the right answer then its likely that you have a typo in your code. Go back into the Visual Basic editor and double check that the code is correct.

Try a few different dates where you know the day of year to prove to yourself that this function you just created does what its supposed to. Continued on the next page ...

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

19

Now we need to save our Add-In. We do this by saving the file as we normally would. Click the save button and when the Save As dialogue comes up change the Save as type drop down to read Excel Add-In and name it MyAddIn. Put it in your My Documents folder and click the Save button. The screen shot demonstrates the options.

Once its saved, close all of Excel again. Now open a new worksheet. Type in a date in cell A1 and use DAYOFYEAR in cell B1. Did it work? Not likely. Thats because your Add-In hasnt been installed yet. Just like the PatternUI.xlam file you need to install MyAddIn. xlam the first time before you can use it (it worked before because you were actually in the file). So, once again, click File->Options>Add-Ins. When the Add-Ins window opens, click Browse and locate your Add-In. Click OK and OK. Now try the DAYOFYEAR function call again. Perfect. If you close out of Excel and open it back up, the function call will work without needing to install the Add-In since it will be active until you uninstall it. Congratulations on making your first Excel Add-In. This concludes the Advanced Functions, Add Ins & VBA workshop. Thank you.

Advanced Functions, Add-Ins & VBA. CC BY-NC-SA 2012 Joseph Ryan Glover

20

You might also like