
SAS Programming 1: Essentials

How-To Demonstrations
For SAS 8 and SAS 9

SAS Programming 1: Essentials e-Course Copyright 2010 SAS Institute Inc. Cary, NC, USA. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc. Book code NULL, course code ECPRG1, prepared date 10Mar2010. ECPRG1_002


Table of Contents
Lesson 1 Getting Started with SAS Programming
Lesson 2 Navigating and Using the SAS Interface
Lesson 3 Working with SAS Code
Lesson 4 Working with SAS Libraries and SAS Data Sets
Lesson 5 Creating SAS Data Sets
Lesson 6 Creating SAS Data Sets from Microsoft Excel Worksheets
Lesson 7 Creating SAS Data Sets from Delimited Raw Data Files
Lesson 8 Validating and Cleaning Data
Lesson 9 Manipulating Data
Lesson 10 Combining SAS Data Sets Vertically
Lesson 11 Combining SAS Data Sets Horizontally
Lesson 12 Enhancing Reports
Lesson 13 Using the Output Delivery System to Create Reports
Lesson 14 Creating Summary Reports and Data Sets
Lesson 15 Creating Graphs Using SAS/GRAPH Software


Lesson 1 Getting Started with SAS Programming


Accessing, Managing, Analyzing, and Presenting Data

This demonstration shows you some sample reports in SAS. Later in the course, after you create your practice files, learn how to submit programs and view output in SAS, and learn how to define SAS libraries, you can return here and submit the program reports.sas to produce the following sample reports.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. View the first report, which is an HTML report that lists data. You see columns for employee ID, job title, salary, gender, date of birth, and manager ID. Take a look at Job_Title. The values here are codes that you can't interpret unless you know the data.
2. View the second report. To make the data values more meaningful, you can format them as shown in the second report. For example, you can display I as Warehouse Assistant I or M as Warehouse Manager. You can also display Female and Male instead of F and M. Notice that the report is grouped by gender. Now view the next report, which is a text version of the same report. In SAS, this type of output is called listing output.
3. View the third report, which is a distribution by job title. In this report, which job title represents the highest percentage of the total? Ten individuals have the job title Warehouse Assistant I, making up 55.56 percent of our data.
4. View the fourth report. This report, which is also in HTML format, gives us some simple summary statistics. Take a look at the mean, or average, salary for each job title. Which job title has the highest standard deviation? The Warehouse Assistant II job title has the highest standard deviation for Salary. Warehouse Assistant IV and Warehouse Manager each have only one person, so they have no standard deviation. Now view the next report, which is a listing version of the same report.
5. View the fifth report. You might also want to produce more advanced statistics in your reports.
The fifth report shows the mean, the median, the mode, and other statistical measures for Salary. Scroll


forward to the Extreme Observations table. These are the observations with the five lowest and the five highest values in the data. That's good information to know. Look at the list of highest values. How many individuals in the data have a salary over $40,000? Only one individual, in record 17 (or what we call observation 17 in SAS), has a salary over $40,000.
6. View the sixth report, which is a three-dimensional bar chart. In this frequency report, Job_Title appears on one axis and Gender on the other axis, with frequency count as the height of the bars. Notice that when the mouse hovers over one of those bars, the values for the bar appear. This feature is especially useful if one of the bars is somewhat hidden in the background. View the seventh report, which is a three-dimensional pie chart showing the total salary by job title for warehouse assistants. We've exploded the slice for Warehouse Assistant II to highlight it. Which job title accounts for more than half the total salaries for warehouse assistants? The sum of salaries for the Warehouse Assistant I job title is $262,615, more than half the total.


Running SAS in Interactive Mode

This demonstration shows you how to start a SAS session.

1. Double-click the desktop icon for SAS. When SAS opens, you see the interactive windowing environment. Notice the Results and Explorer windows on the left and the Log and Editor windows on the right.
2. In the Log window, notice that there are some messages, including copyright notes and the version of SAS. One important piece of information is the site number. If you contact SAS technical support by telephone or Web site for assistance, SAS requests this site number. You might want to make a note of that number once you start SAS.
3. What is the site number in the Log window shown here? It is 7857162208.


Viewing the SAS Programming Process

This demonstration shows you the overall programming process in SAS. You haven't learned how to use SAS yet, but if you want to experiment, you can follow these steps. Assume you've already defined your business need and planned your SAS program.

NOTE: You must create practice files for the course before you complete this demonstration. You'll create practice files later in this lesson. For information about creating practice files, see Setting Up Practice Files in Appendix A.

1. Copy and paste this program into the editor:

   data work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18 Job_Title $ 25;
      infile 'my-file-path\newemployees.csv' dlm=',';
      input First_Name $ Last_Name $
            Job_Title $ Salary;
   run;

   title 'New Sales Employees';
   porc print data=work.newsalesemps;
   run;

   proc means data=work.newsalesemps;
      class job_title;
      var salary;
   run;

   title;

2. Replace my-file-path with the location where you stored the practice files.
3. Select Run > Submit to submit the program, which creates a SAS data set and two reports.
4. When the last report appears, view it. Then scroll to navigate to the first report and view it.
5. Part of the program created a SAS data set. What could you do to verify that the data set was created successfully? Check the Log window. Does the log indicate any problems? The program did run, but the log shows that a keyword was misspelled. PROC was misspelled as PORC.
6. Return to the editor and correct the misspelling.
7. Select Run > Submit to resubmit the program. Check the log again. What else could you do to verify that the data set was created successfully? You could also view the data to make sure it looks correct. Navigate to the first report and view it. The data looks correct. At this point you could save your program.


Lesson 2 Navigating and Using the SAS Interface


Opening a SAS Program

This demonstration shows you how to open a SAS program in either the Editor or Program Editor window. You'll also explore some editing features of these windows.

Note: This demonstration does not apply to SAS Enterprise Guide.

1. If the editor (the Editor window or Program Editor window) is not active, click the window to activate it.
2. To open a SAS program, select File > Open Program (or File > Open on UNIX and z/OS) and browse for the program newsalesemps.sas, which is in the location where you stored the practice files. On z/OS, type the fully qualified name of the program. Then click OK.
3. Notice the color-coding of elements in the code such as keywords.
4. Leave the program in the editor for the upcoming practice.

For information about opening a program in UNIX, see Opening a SAS Program in Appendix E: UNIX Notes. For information about opening a program in z/OS, see Opening a SAS Program in Appendix F: z/OS Notes.


Running a SAS Program and Viewing Results

This demonstration shows you how to enter and submit a SAS program. Start with SAS windows cleared. As you learned earlier, you can use the Editor window or the Program Editor window as your editor for entering programming code.

Note: This demonstration does not apply to SAS Enterprise Guide.

1. Copy and paste this program into the editor:

   data work.newemps;
      input First_Name $
            Last_Name $
            Job_Title $
            Salary;
      datalines;
   Steven Worton Auditor 40450
   Merle Hieds Trainee 24025
   ;
   run;

   proc means data=work.newemps;
   run;

   proc print data=work.newemps;
   run;

2. Maximize the editor to see the entire program.
3. Notice the layout of the program. When you write SAS programs, you can enter code in free format. That is, SAS statements can begin and end in any column, be on the same line as other statements, extend over several lines, and be in uppercase or lowercase. However, for readability, most SAS programs follow a suggested standard. Our program here follows the suggested standard. Notice that DATA, PROC, and RUN statements begin in or near column one; all other statements are indented by several columns for readability, a blank line separates steps, and only one statement appears on a line.
4. Look at the INPUT statement, which specifies the variables to create from the raw data file. This statement illustrates another suggested standard, which is to define each SAS variable on a separate line and to align the beginning and ending position numbers.
5. Click the running man tool on the toolbar to submit the program.
6. Activate the Log window by clicking the Log button.
7. Scroll down to review the log.


Specifying Results Preferences

This demonstration shows you how to specify listing results, HTML results, or both.

Note: This demonstration does not apply to SAS Enterprise Guide.

1. Copy and paste this program into the editor:

   data work.newemps;
      input First_Name $
            Last_Name $
            Job_Title $
            Salary;
      datalines;
   Steven Worton Auditor 40450
   Merle Hieds Trainee 24025
   ;
   run;

   proc means data=work.newemps;
   run;

   proc print data=work.newemps;
   run;

2. Submit the program.
3. Scroll through output in the Output window.
4. Select Tools > Options > Preferences and click the Results tab.
5. Clear Create listing and select Create HTML.
6. Click OK.


7. Resubmit the program and view output in the Results Viewer window.
8. Clear all content in the Results window.
9. Navigate to the Results tab again. Leave Create HTML selected and select Create listing. Click OK.
10. Resubmit the program again.
11. Expand the tree in the Results window to show the two types of output for each PROC step.
12. Click each report icon (first listing, and then HTML) for each PROC step to show reports.


Viewing a SAS Data Set

This demonstration shows you how to view a SAS data set using the Explorer window. Note: This practice does not apply to SAS Enterprise Guide.

1. Copy and paste this program into the editor:

   data work.newemps;
      input First_Name $
            Last_Name $
            Job_Title $
            Salary;
      datalines;
   Steven Worton Auditor 40450
   Merle Hieds Trainee 24025
   ;
   run;

   proc means data=work.newemps;
   run;

   proc print data=work.newemps;
   run;

2. Submit the program.
3. Click the Explorer window to activate it, or select View > Explorer from the menu.
4. On Windows and UNIX, double-click the Libraries icon and then double-click the Work icon. On z/OS, click Work.


5. On Windows and UNIX, double-click the Newemps icon. On z/OS, type ? next to Newemps and press Enter. From the menu, select Open. How many variables does the NewEmps data set contain? The data set contains four columns or variables: First_Name, Last_Name, Job_Title, and Salary.
6. Close the data set.


Exploring SAS Help and Documentation

This demonstration shows you how to access SAS Help and Documentation. Note: This demonstration does not apply to SAS Enterprise Guide.

1. Suppose you want more information about the statements in this program. Look over the code:

   ods trace on;
   ods select ExtremeObs;
   proc univariate data=orion.shoes_tracker;
      var Product_ID;
   run;
   ods trace off;

2. Select Help > SAS Help and Documentation.
   UNIX or z/OS: If your SAS software is not configured to use a remote browser, you can use SAS OnlineDoc for this demonstration. To open SAS OnlineDoc, expand Reference in the course menu and click Selected SAS Documentation. Then click the 8.2 link next to SAS OnlineDoc.
3. Expand SAS Products, then Base SAS. Depending on your SAS version, expand either Base SAS 9.2 Procedures Guide or SAS Procedures.
4. Under Procedures, expand information for the UNIVARIATE procedure. Follow the link for the UNIVARIATE procedure, if any. What does PROC UNIVARIATE do? PROC UNIVARIATE produces a variety of statistics, including descriptive statistics, histograms, plots, goodness-of-fit tests, and many more.


5. View details about PROC UNIVARIATE syntax.
6. Close the SAS Help and Documentation window.


Lesson 3 Working with SAS Code

Adding Comments to a SAS Program

1. Copy and paste, then submit the following code to create the temporary data set NewSalesEmps:

   proc copy in=orion out=work;
      select newsalesemps;
   run;

2. Copy and paste the following example code in your editor:

   data work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18 Job_Title $ 25;
      infile 'my-file-path\newemployees.csv' dlm=',';
      input First_Name $ Last_Name $
            Job_Title $ Salary;
   run;

   title 'New Sales Employees';
   proc print data=work.newsalesemps;
   run;

   proc means data=work.newsalesemps;
      class Job_Title;
      var Salary;
   run;

   title;


Windows: Replace my-file-path with the location where you stored the practice files.
UNIX and z/OS: Specify the fully qualified path in your operating environment.
SAS Enterprise Guide: Replace my-file-path with the path to the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

3. Add the word numeric as a comment for the variable Salary to indicate to another programmer that it is a numeric variable.
4. By using the alternate form for comments, add a comment indicating the SAS data set created and used by this piece of code.
5. Assume that you don't want to use the PROC PRINT step at this time, so you comment it out until you can debug the rest of your program.
6. Submit the code and look at the log. Notice that SAS didn't process the portions of code that were commented out.
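
One way the commented program from steps 3 through 5 might look is sketched below. It assumes the block form (/* ... */) is used for the Salary note and for commenting out the PROC PRINT step, and the statement form (* ... ;) is used for the note about the data set. The wording of the comments is illustrative only, and my-file-path is still a placeholder.

```sas
*This program creates and uses the data set work.newsalesemps;
data work.newsalesemps;
   length First_Name $ 12
          Last_Name $ 18 Job_Title $ 25;
   infile 'my-file-path\newemployees.csv' dlm=',';
   input First_Name $ Last_Name $
         Job_Title $ Salary /* numeric */;
run;

title 'New Sales Employees';

   /* PROC PRINT step commented out while debugging
   proc print data=work.newsalesemps;
   run;
   */

proc means data=work.newsalesemps;
   class Job_Title;
   var Salary;
run;

title;
```

The block comment is indented here because, in some operating environments, a /* that begins in columns 1 and 2 can be interpreted as a request to end the job.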


Viewing and Correcting Syntax Errors

This demonstration shows you how to diagnose and correct syntax errors in a SAS program.

1. Copy and paste, then submit the following code to create the temporary data set NewSalesEmps:

   proc copy in=orion out=work;
      select newsalesemps;
   run;

2. Submit the following program to create a data set and print a report from it:

   daat work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18 Job_Title $ 25;
      infile 'my-file-path\newemployees.csv' dlm=',';
      input First_Name $ Last_Name $
            Job_Title $ Salary;
   run;

   title 'New Sales Employees';
   proc print data=work.newsalesemps
   run;

   proc means data=work.newsalesemps average max;
      class Job_Title;
      var Salary;
   run;

   title;

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.
   SAS Enterprise Guide: Replace my-file-path with the path to the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

3. Review the output.
4. Review the log.
5. Notice that there is a WARNING message.
6. Notice that there is a syntax error.
7. To correct the error, modify the program in your editor.
8. Clear the messages from the Log window.
9. Resubmit the code.
10. Review the output.
11. Review the log.
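
A corrected version of the program might look like the following sketch. It assumes the problems to fix are the misspelled DATA keyword, the missing semicolon at the end of the PROC PRINT statement, and the AVERAGE keyword in the PROC MEANS statement (MEAN is the statistic keyword that PROC MEANS accepts). The path my-file-path remains a placeholder.

```sas
/* DAAT corrected to DATA */
data work.newsalesemps;
   length First_Name $ 12
          Last_Name $ 18 Job_Title $ 25;
   infile 'my-file-path\newemployees.csv' dlm=',';
   input First_Name $ Last_Name $
         Job_Title $ Salary;
run;

title 'New Sales Employees';

/* Semicolon added to end the PROC PRINT statement */
proc print data=work.newsalesemps;
run;

/* AVERAGE replaced with the MEAN statistic keyword */
proc means data=work.newsalesemps mean max;
   class Job_Title;
   var Salary;
run;

title;
```

After resubmitting, the log is the place to confirm that no WARNING or ERROR messages remain.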


Correcting Unbalanced Quotation Marks


1. Cancel the submitted statement.
2. Correct the error.
3. Resubmit the code.

Let's see how to cancel submitted statements and correct a missing quotation mark in a SAS program.

Note: This demonstration does not apply to SAS Enterprise Guide.

1. Copy and paste, then submit the following code to create the temporary data set NewSalesEmps:

   proc copy in=orion out=work;
      select newsalesemps;
   run;

2. Copy and paste, then submit the following program. Note the DATA step running message at the top of the active window.

   data work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18 Job_Title $ 25;
      infile 'my-file-path\newemployees.csv' dlm=',;
      input First_Name $ Last_Name $
            Job_Title $ Salary;
   run;

   title 'New Sales Employees';
   proc print data=work.newsalesemps;
   run;

   proc means data=work.newsalesemps;
      class Job_Title;
      var Salary;
   run;

   title;

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.
   SAS Enterprise Guide: Replace my-file-path with the path to the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

3. Cancel the submitted statements. For more information about completing the task in UNIX, see Cancelling Submitted Statements in Appendix E: UNIX Notes. For more information about completing the task in z/OS, see Cancelling Submitted Statements in Appendix F: z/OS Notes.
4. Correct the error. Add the final quotation mark to the DLM= option specification before the semicolon.
5. Submit the corrected code.
6. Check the output and log.
7. Save the program.


Lesson 4 Working with SAS Libraries and SAS Data Sets


Assigning a Libref to a SAS Library

In this demonstration, you run a LIBNAME statement in SAS to define the Orion library.

1. Copy and paste the following code into the editor:

   libname orion 'my-file-path';

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.

2. Check the log to verify that SAS successfully assigned the Orion libref.
3. Verify that the Orion library appears in the Explorer window.
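
As a supplement to reading the log, the libref can also be checked from code. The sketch below is not part of the original demonstration. It uses the LIBREF function, which returns 0 when a libref is assigned, and the LIST option of the LIBNAME statement; my-file-path is still a placeholder.

```sas
libname orion 'my-file-path';

/* LIBREF returns 0 if the Orion libref is assigned, nonzero otherwise */
%put NOTE: LIBREF return code for Orion is %sysfunc(libref(orion));

/* Write the attributes of every currently assigned library to the log */
libname _all_ list;
```

A nonzero return code usually means that the path in the LIBNAME statement needs another look.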


Viewing the Contents of a SAS Library

In this demonstration, you run a PROC CONTENTS step in SAS to see a list of data sets in the Orion library. You also use the Explorer window to see information about the library.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Submit the following code:

   proc contents data=orion._all_ nods;
   run;

2. View the output. The first table, Directory, lists general information about the Orion library. The second table lists all members of the library in alphabetical order, and provides basic information about each member. As the Member Type column indicates, the Orion library contains two types of SAS files: data sets and indexes. Only the data sets are numbered in the first column.
3. Scroll to the bottom of the output. Notice that there are 32 data sets in the Orion library. However, there are more than 32 members, because this library also contains indexes.
4. Revise the PROC CONTENTS step so that the output displays the descriptor portions for all data sets in the Orion library, as follows:

   proc contents data=orion._all_;
   run;

5. Submit the code and view the output. Scroll down to see the descriptor portions.
6. In the Explorer window, double-click Libraries to show all the available libraries.


7. To view the properties of the Orion library, right-click the Orion library and select Properties from the pop-up menu. Notice that the information in the Properties window is similar to the information in the first table in the PROC CONTENTS output. Close the Properties window.
8. Double-click the Orion library to view its contents. Scroll down to see all the data sets that this library contains.


Viewing the Descriptor Portion of a Data Set

In this demonstration, you run a PROC CONTENTS step to view the descriptor portion of Work.NewSalesEmps. Then, you use the Explorer window to view descriptor information.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Make sure you've run the following program to create the temporary data set Work.NewSalesEmps:

   data work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18 Job_Title $ 25;
      infile 'my-file-path\newemployees.csv' dlm=',';
      input First_Name $ Last_Name $
            Job_Title $ Salary;
   run;

Windows: Replace my-file-path with the location where you stored the practice files.
UNIX and z/OS: Specify the fully qualified path in your operating environment.

1. Submit the following code:

   proc contents data=work.newsalesemps;
   run;

2. View the output. The first table shows general information about the data set, such as the data set name, and the time and date that the data set was created. Notice that NewSalesEmps has 71 observations. The second table displays operating environment information. The third table is an alphabetic list of variables in the data set and their attributes.


3. In the Explorer window, double-click the Work library to view its contents. Right-click NewSalesEmps and select Properties from the pop-up menu. The tabs in the Properties window contain information similar to the information in the PROC CONTENTS report. Click Cancel to close the Properties window.


Viewing the Data Portion of a Data Set

In this demonstration, you run a PROC PRINT step to see the data portion of Work.NewSalesEmps. You look at the PROC PRINT output that SAS displays by default. Then, you look at the same data set using the VIEWTABLE window.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Make sure you've run the following program to create the temporary data set Work.NewSalesEmps:

   data work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18 Job_Title $ 25;
      infile 'my-file-path\newemployees.csv' dlm=',';
      input First_Name $ Last_Name $
            Job_Title $ Salary;
   run;

Windows: Replace my-file-path with the location where you stored the practice files.
UNIX and z/OS: Specify the fully qualified path in your operating environment.

1. Copy and paste the following code into the editor, and then submit it:

   proc print data=work.newsalesemps;
   run;

2. View the log to confirm that SAS read 71 observations from NewSalesEmps.
3. View the list report. Notice that, by default, PROC PRINT output displays all observations, all variables, and an Obs column for observation numbers. Notice that NewSalesEmps has three character variables (First_Name, Last_Name, and Job_Title) and one numeric variable (Salary). Notice that Job_Title has a missing value in the third observation and Salary has a missing value in the fourth observation.
4. In the Explorer window, double-click the Work library. Double-click NewSalesEmps to open it in the VIEWTABLE window. Notice the observation numbers, variable names, and values.
5. Right-click the First_Name heading and select Column Attributes from the pop-up menu. View the variable attributes on the General, Colors, and Fonts tabs. Click Close to close the Column Attributes window.
   Note: The VIEWTABLE window has additional features. For example, you can customize the way that the data set appears in this window by changing colors and fonts, removing variables from display, and sorting observations.
6. Close the VIEWTABLE window.


Viewing Selected Information in the Data Portion

In this demonstration, you run a PROC PRINT step to create a list report that shows the observations in NewSalesEmps, but not the Obs column. In the report, you show the three variables in this order: Last_Name, First_Name, and Salary.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Make sure you've run the following program to create the temporary data set Work.NewSalesEmps:

   data work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18 Job_Title $ 25;
      infile 'my-file-path\newemployees.csv' dlm=',';
      input First_Name $ Last_Name $
            Job_Title $ Salary;
   run;

Windows: Replace my-file-path with the location where you stored the practice files.
UNIX and z/OS: Specify the fully qualified path in your operating environment.

1. Copy and paste the following code into the editor:

   proc print data=work.newsalesemps noobs;
      var Last_Name First_Name Salary;
   run;

2. Submit the program and view the output. The Obs column no longer appears. Three variables are shown as specified in the VAR statement.


Sorting a Data Set by One Variable

In this demonstration, you run a PROC SORT step to sort Work.NewSalesEmps by the values of the variable Last_Name.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Make sure you've run the following program to create the temporary data set Work.NewSalesEmps:

   data work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18 Job_Title $ 25;
      infile 'my-file-path\newemployees.csv' dlm=',';
      input First_Name $ Last_Name $
            Job_Title $ Salary;
   run;

Windows: Replace my-file-path with the location where you stored the practice files.
UNIX and z/OS: Specify the fully qualified path in your operating environment.

1. Copy and paste the following code into the editor. This PROC SORT step sorts Work.NewSalesEmps by Last_Name in ascending order and creates the new data set Work.NewSalesEmps2.

   proc sort data=work.newsalesemps
             out=work.newsalesemps2;
      by Last_Name;
   run;


2. Add the following PROC PRINT step to print all variables in Work.NewSalesEmps2, with Last_Name first:

   proc print data=work.newsalesemps2;
      var Last_Name First_Name Job_Title Salary;
   run;

3. Submit the program. View the PROC PRINT output to verify that the observations appear in ascending order by the values of Last_Name.
4. Revise the PROC SORT step to sort by Salary in descending order and overwrite Work.NewSalesEmps2 as the output data set. The revised code is shown below:

   proc sort data=work.newsalesemps
             out=work.newsalesemps2;
      by descending Salary;
   run;

   proc print data=work.newsalesemps2;
      var Last_Name First_Name Job_Title Salary;
   run;

5. Submit the revised program. View the output to verify that the observations are sorted in descending order by the values of Salary. Scroll to the bottom of the output. Notice that the data set has one missing value for Salary. SAS treats a missing value as the lowest possible value, so this observation appears last.
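
The placement of missing values can be seen on a smaller scale with the sketch below. It is not part of the course files: the data set Work.SortDemo and its three observations are hypothetical and exist only to show where a missing numeric value lands in ascending and descending sorts.

```sas
/* Hypothetical data: the second observation has a missing Salary */
data work.sortdemo;
   input Last_Name $ Salary;
   datalines;
Adams 30000
Baker .
Clark 45000
;
run;

/* Ascending sort: the missing value is treated as the lowest value */
proc sort data=work.sortdemo out=work.sortdemo_asc;
   by Salary;
run;

/* Descending sort: the same observation moves to the end */
proc sort data=work.sortdemo out=work.sortdemo_desc;
   by descending Salary;
run;

title 'Ascending by Salary';
proc print data=work.sortdemo_asc;
run;

title 'Descending by Salary';
proc print data=work.sortdemo_desc;
run;

title;
```

Because each PROC SORT step uses OUT=, the original Work.SortDemo data set keeps its input order.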


Sorting a Data Set by Multiple Variables

In this demonstration, you run a PROC SORT step that sorts NewSalesEmps by Job_Title in ascending order, and then by Salary in descending order.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Make sure you've run the following program to create the temporary data set Work.NewSalesEmps:

   data work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18 Job_Title $ 25;
      infile 'my-file-path\newemployees.csv' dlm=',';
      input First_Name $ Last_Name $
            Job_Title $ Salary;
   run;

Windows: Replace my-file-path with the location where you stored the practice files.
UNIX and z/OS: Specify the fully qualified path in your operating environment.

1. Copy and paste the following code into the editor. This PROC SORT step sorts Work.NewSalesEmps and creates the new data set Work.NewSalesEmps3.

   proc sort data=work.newsalesemps
             out=work.newsalesemps3;
      by Job_Title descending Salary;
   run;

2. Add a PROC PRINT step to print the variables in Work.NewSalesEmps3 in the following order: Job_Title, Salary, Last_Name, and First_Name. The code is shown below:

   proc print data=work.newsalesemps3;
      var Job_Title Salary Last_Name First_Name;
   run;

3. Submit the program. View the PROC PRINT output to verify that the observations are sorted by the values of Job_Title in ascending order. Within each job title, verify that the observations are sorted by the values of Salary in descending order. In the first observation, notice that Job_Title has a missing value. This observation appears first because SAS treats a missing value as the lowest possible value.


Lesson 5 Creating SAS Data Sets

Creating a SAS Data Set from an Existing SAS Data Set

Now that you've seen the syntax and rules for reading and creating SAS data sets in the DATA step, let's look at an example in SAS of using a DATA step to create a new data set from an existing data set. Suppose you want to create a temporary SAS data set named Work.Subset1 from the input data set Orion.Sales.

1. Copy and paste the following DATA step:

   data work.subset1;
      set orion.sales;
   run;

2. Submit the program and check the log. You might notice an error message about the Orion libname if you have not already defined the Orion library to the location where you stored the practice files.
3. If your log shows an error message about the Orion libname, add the following LIBNAME statement before the DATA step:

   libname orion 'my-file-path';

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.

4. Submit the program.
5. Check the log to see that the data set was created successfully with 165 observations and 9 variables.
6. Copy and paste the following PROC PRINT step:

   proc print data=work.subset1;
   run;

7. Submit the program and examine the output to see that the data set contains 9 variables and 165 observations.

Subsetting Observations in the DATA Step

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Suppose you want to write a DATA step to create a new data set named Work.Subset1 that contains a subset of the observations from Orion.Sales. You want to include only those employees who are from Australia and who have the word 'Rep' in their job title.

1. Copy and paste the following DATA step:

    data work.subset1;
       set orion.sales;
       where Country='AU' and Job_Title contains 'Rep';
    run;

2. Can you think of a way to write this WHERE statement using different operators or symbols? You could write it like this:

    where Country eq 'AU' and Job_Title like '%Rep%';

3. Submit the step and examine the log to verify that it ran without errors and that the new data set was created with 61 observations.

4. Submit the following PROC PRINT step:

    proc print data=work.subset1;
    run;

5. Examine the output to see a listing of the output data set. Notice that all of the observations have 'AU' as the value for Country and have 'Rep' somewhere in their value for Job_Title.
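As a further sketch of the alternatives raised in step 2, the three WHERE statements below are intended to select the same observations. The question mark is the symbol form of the CONTAINS operator, and the LIKE pattern uses % as a wildcard. The side-by-side layout is an illustration, not part of the original demonstration. The alternatives are commented out because a later WHERE statement in the same DATA step would replace the earlier one.

```sas
data work.subset1;
   set orion.sales;
   /* Form used in the demonstration */
   where Country='AU' and Job_Title contains 'Rep';
   /* Alternative: ? is the symbol form of CONTAINS */
   /* where Country='AU' and Job_Title ? 'Rep'; */
   /* Alternative: mnemonic EQ and the LIKE operator with % wildcards */
   /* where Country eq 'AU' and Job_Title like '%Rep%'; */
run;
```

To try an alternative, comment out the active WHERE statement and uncomment one of the others. The log should report the same number of observations each time.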

Subsetting Variables in the DATA Step

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Suppose you already have a DATA step that subsets the input data to make sure that the output SAS data set includes only those observations that fit your criteria: employees from Australia who have 'Rep' in their job title. Now, suppose you want to further edit this DATA step so that the output SAS data set includes only those variables that you want to see: First_Name, Last_Name, Salary, Job_Title, and Hire_Date.

1. Copy and paste the following program:

    data work.subset1;
       set orion.sales;
       where Country='AU' and Job_Title contains 'Rep';
    run;

    proc print data=work.subset1;
    run;

2. Add a KEEP statement to subset the variables. (Remember, you could use either a KEEP statement or a DROP statement to subset the variables. Because you have the list of variable names that you want to keep, let's use a KEEP statement.) The program should now look like this:

    data work.subset1;
       set orion.sales;
       where Country='AU' and Job_Title contains 'Rep';
       keep First_Name Last_Name Salary Job_Title Hire_Date;
    run;

    proc print data=work.subset1;
    run;

3. Submit the program and examine the output to see a listing of the output SAS data set Work.Subset1. It now contains only five variables.
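For comparison, here is a sketch of the same subset written with a DROP statement instead of KEEP. It assumes that the four unwanted variables in Orion.Sales are named Employee_ID, Gender, Country, and Birth_Date; those names are taken from the sales data described elsewhere in this course and should be checked with PROC CONTENTS before you rely on them.

```sas
data work.subset1;
   set orion.sales;
   /* Country can still be used here: DROP affects only the output data set */
   where Country='AU' and Job_Title contains 'Rep';
   /* Assumed names of the four variables that are not wanted */
   drop Employee_ID Gender Country Birth_Date;
run;

proc print data=work.subset1;
run;
```

When the list of variables to remove is shorter than the list to retain, DROP is usually the more compact choice. Here the two lists are nearly the same length, so either statement works.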

Adding Permanent Labels to a SAS Data Set

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Suppose you have a program to create a temporary SAS data set that includes a subset of observations and variables from the input SAS data set, and then prints a listing of the new SAS data set. You need to edit this program in order to add permanent labels to the descriptor portion of Work.Subset1. Suppose you want to add the label Sales Title to the Job_Title variable, and add the label Date Hired to the Hire_Date variable.

1. Begin with this program in the editor:

    data work.subset1;
       set orion.sales;
       where Country='AU' and Job_Title contains 'Rep';
       keep First_Name Last_Name Salary Job_Title Hire_Date;
    run;

2. Add a LABEL statement after the KEEP statement, as follows:

    data work.subset1;
       set orion.sales;
       where Country='AU' and Job_Title contains 'Rep';
       keep First_Name Last_Name Salary Job_Title Hire_Date;
       label Job_Title='Sales Title'
             Hire_Date='Date Hired';
    run;

3. Add a PROC CONTENTS step after the DATA step so that you can view the descriptor portion of Work.Subset1, and a PROC PRINT step so that you can see a list report that includes the descriptive labels. Your program should look like this:

    data work.subset1;
       set orion.sales;
       where Country='AU' and Job_Title contains 'Rep';
       keep First_Name Last_Name Salary Job_Title Hire_Date;
       label Job_Title='Sales Title'
             Hire_Date='Date Hired';
    run;

    proc contents data=work.subset1;
    run;

    proc print data=work.subset1 label;
    run;

4. Submit the program and examine the output. You can see in the PROC CONTENTS output that the labels have been added to the descriptor portion of the Work.Subset1 data set. In the PROC PRINT output, notice that Sales Title and Date Hired have replaced their variable names as headings.

Adding Permanent Formats to a SAS Data Set

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Suppose you have a program that creates a temporary SAS data set named Work.Subset1 that contains a subset of the observations and variables from the input SAS data set, and includes descriptive labels for two of the variables. Now, suppose you want to format the Salary variable and the Hire_Date variable to make the report more readable.

1. Begin with the following program in the editor:

    data work.subset1;
       set orion.sales;
       where Country='AU' and Job_Title contains 'Rep';
       keep First_Name Last_Name Salary Job_Title Hire_Date;
       label Job_Title='Sales Title'
             Hire_Date='Date Hired';
    run;

    proc contents data=work.subset1;
    run;

    proc print data=work.subset1 label;
    run;

2. Add a FORMAT statement after the LABEL statement. For Salary, use the COMMA8. format, and for Hire_Date use the DDMMYY10. format. The order of the FORMAT and LABEL statements is not important.

    data work.subset1;
       set orion.sales;
       where Country='AU' and Job_Title contains 'Rep';
       keep First_Name Last_Name Salary Job_Title Hire_Date;
       label Job_Title='Sales Title'
             Hire_Date='Date Hired';
       format Salary comma8. Hire_Date ddmmyy10.;
    run;

    proc contents data=work.subset1;
    run;

    proc print data=work.subset1 label;
    run;

3. Submit the program. The PROC CONTENTS output shows that the formats have been added to the descriptor portion of Work.Subset1. The PROC PRINT output shows that the Salary values have been formatted. Remember that the Hire_Date values are now under the descriptive label "Date Hired", and you can see that they have been formatted also.

Lesson 6 Creating SAS Data Sets from Microsoft Excel Worksheets

Lesson 6: Creating SAS Data Sets from Microsoft Excel Worksheets

Accessing Excel Worksheets in SAS

Now that you've seen the syntax, let's try submitting a SAS/ACCESS LIBNAME statement in SAS and see how the resulting libref enables you to access Excel worksheets.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Suppose the Excel workbook that you want to use is the Sales.xls file, which is stored in the location where you stored the practice files. So, you submit this LIBNAME statement:

    libname orionxls 'my-file-path\sales.xls';

   Windows: Replace my-file-path with the location where you stored the practice files.
   SAS Enterprise Guide: If SAS and SAS Enterprise Guide are both installed on your local computer, replace my-file-path with the location where you stored the practice files.

2. Check the log to verify that the libref was successfully assigned.

3. In the Explorer window, notice that the Orionxls library icon has a small globe on it. The globe indicates that this data is outside of SAS. Double-click Orionxls. You can see that there are two items in the library. These represent the two worksheets in the original Excel workbook. At first glance, the Explorer window makes it look as though you are viewing a library that contains two SAS data sets. But take a closer look. Notice that the names end in dollar signs. You'll learn more about these dollar sign references next.

Using PROC CONTENTS on an Excel Worksheet

Now that you've seen the Excel data in its native form and learned how SAS interprets and converts that data, let's go ahead and see what happens when you run a SAS procedure on the Excel data. Let's use a PROC CONTENTS step on the Sales workbook and examine the results.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Submit the following PROC CONTENTS step:

    proc contents data=orionxls._all_;
    run;

2. In the PROC CONTENTS output, notice that the worksheet names end in dollar signs. Notice that the data set name has the worksheet name, including the dollar sign, enclosed in quotation marks and followed by the letter n. This is a SAS name literal.

3. Now, let's look at the Variables and Attributes for the Australia worksheet. Notice that the variable names match the labels except that the variable names contain underscores instead of spaces. Birth_Date and Hire_Date are both listed as numeric values with DATE9. formats because they were both formatted as dates in the original file.
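A name literal can also be used to refer to a single worksheet. The sketch below assumes that the Orionxls libref is still assigned to Sales.xls and that the workbook contains a worksheet named Australia, as described in this lesson.

```sas
/* Describe only the Australia worksheet */
proc contents data=orionxls.'Australia$'n;
run;

/* Print the worksheet as if it were a SAS data set */
proc print data=orionxls.'Australia$'n;
run;
```

The quotation marks and the trailing n are needed because the dollar sign is not a valid character in an ordinary SAS data set name.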

Creating a SAS Data Set from an Excel Worksheet

In the previous demonstration, you saw how to assign the libref Orionxls to the workbook Sales.xls. (Make sure you've defined the Orionxls library to the location where you stored the practice files.) You also saw that Sales.xls contains two worksheets: Australia and United States. Now, suppose you want to create a temporary SAS data set named Subset2 from the data in the Australia worksheet.

1. In the editor, copy and paste the following DATA step:

    data work.subset2;
       set orionxls.'Australia$'n;
       where Job_Title contains 'Rep';
       keep First_Name Last_Name Salary Job_Title Hire_Date;
       label Job_Title='Sales Title'
             Hire_Date='Date Hired';
       format Salary comma10. Hire_Date weekdate.;
    run;

2. Add the following PROC PRINT step and PROC CONTENTS step:

    proc print data=work.subset2 label;
    run;

    proc contents data=work.subset2;
    run;

3. Submit the program and examine the PROC PRINT output to see that Work.Subset2 is identical to the Work.Subset1 data set that you created in the previous lesson.

4. Why do the First_Name and Last_Name variables appear in the report with spaces rather than underscores? The column names in the Excel worksheet contained spaces, so SAS automatically created labels for those variables with spaces rather than underscores.

5. In the PROC CONTENTS output, you can see the labels that you added in the DATA step as well as the labels that SAS added automatically.

6. After you have created a SAS data set from the Excel worksheet, you can disassociate the libref from the workbook. As long as you have the libref associated with the workbook in an active session of SAS, you cannot open the Excel file outside of SAS. Remember, to disassociate a libref, you write a new LIBNAME statement that specifies the libref and the CLEAR option, and then submit it. Submit this LIBNAME statement:

    libname orionxls clear;

Importing an Excel Worksheet into a SAS Data Set

Suppose you want to use the Import Wizard to read data from the Australia worksheet in the Sales.xls file and write it to a SAS data set named Work.Subset2a. Then, you want to save the generated PROC IMPORT code.

Note: The steps in this demonstration do not apply to SAS Enterprise Guide.

1. Select File > Import Data to open the wizard.

2. In the Import Wizard, select the type of file you are importing. Microsoft Excel is the default file type, but you can see that there are many other types of files that you could import. Click Next.

3. Specify the workbook that you want to import. You can type the name and full path if you know it, or browse to it. In this example, you want to use the Sales.xls file, which is stored in the location where you stored the practice files. Select Sales.xls and then click OK.

4. Next, select the worksheet that you want to use as input data. Australia is the first worksheet in the workbook, so it is selected by default. Click Options to specify options for how SAS should interpret and convert data values in the worksheet. Click OK in the Spreadsheet Options window and then click Next.

5. Specify the library and member name where you want SAS to store the imported data. Work is the default library. For Member, specify the data set name, Subset2a.

6. Finally, specify a filename and path to save the generated PROC IMPORT code if you want to save the program. In this example, navigate to the location where you want to save the program and save it as ImportSales.sas. Make note of the location so that you can find the program again later.

7. Click Finish. You can check the log to see that the output data set was successfully created.

8. You can also use a PROC PRINT step to print a list report of the new Subset2a data set. The data set contains nine variables and you can scroll down to see all of the observations.

9. Or, you can use a PROC CONTENTS step to view information about Work.Subset2a.

10. Finally, you can open the saved program to see the PROC IMPORT code that the wizard generated. With the Editor window active, select File > Open Program, navigate to the ImportSales.sas program, and click Open. The program opens in an editor. You can see the code, including the options that you specified in the Spreadsheet Options window.
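The saved program is not reproduced in this demonstration. As a rough sketch only, a PROC IMPORT step for this task might look like the code below. The code that the wizard actually writes depends on your SAS version and on the choices made in the Spreadsheet Options window, so expect it to contain additional statements.

```sas
/* Sketch under stated assumptions, not the wizard's verbatim output */
proc import out=work.subset2a
            datafile='my-file-path\sales.xls'
            dbms=excel replace;
   range='Australia$';   /* worksheet to read */
   getnames=yes;         /* use the first row as variable names */
run;

/* Check the result, as in steps 8 and 9 */
proc print data=work.subset2a;
run;

proc contents data=work.subset2a;
run;
```

As elsewhere in this lesson, replace my-file-path with the location where you stored the practice files.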

Using the DATA Step to Create an Excel Worksheet

Suppose you want to use the DATA step to create an Excel workbook that contains the data from two SAS data sets: Qtr1_2007 and Qtr2_2007.

1. Submit the following LIBNAME statement:

    libname orionxls 'my-file-path\qtr2007a.xls';

   Windows: Replace my-file-path with the location where you stored the practice files.
   SAS Enterprise Guide: If SAS and SAS Enterprise Guide are both installed on your local computer, replace my-file-path with the location where you stored the practice files.

2. Submit the following DATA steps:

    data orionxls.qtr1_2007;
       set orion.qtr1_2007;
    run;

    data orionxls.qtr2_2007;
       set orion.qtr2_2007;
    run;

3. Check the log to verify that the files were successfully created.

4. In the Explorer window, navigate to the Orionxls library. If you double-click it, you can see that there are four items listed. The items whose names end in a dollar sign are the Excel worksheets. The items whose names do not end in a dollar sign are named ranges in the worksheets. SAS creates both the worksheets and the named ranges by default.

5. Submit this LIBNAME statement to disassociate the Orionxls libref so that your newly created files are available in Microsoft Excel:

    libname orionxls clear;

Using PROC COPY to Create Excel Worksheets

Suppose you want to use PROC COPY to create an Excel workbook that contains the data from two SAS data sets: Orion.Qtr1_2007 and Orion.Qtr2_2007.

1. Submit the following program:

    libname orionxls 'my-file-path\qtr2007b.xls';

    proc copy in=orion out=orionxls;
       select qtr1_2007 qtr2_2007;
    run;

    proc contents data=orionxls._all_;
    run;

   Windows: Replace my-file-path with the location where you stored the practice files.
   SAS Enterprise Guide: If SAS and SAS Enterprise Guide are both installed on your local computer, replace my-file-path with the location where you stored the practice files.

2. View the log to verify that the program ran successfully. In the PROC CONTENTS output, you can see that there are four items in Orionxls. The members with dollar signs in their names are worksheets, and the members without dollar signs are named ranges. You can scroll down to see the information about each member.

Exporting a SAS Data Set to an Excel Workbook

Now that you've seen the syntax for a PROC EXPORT step and learned about the Export Wizard, let's see how to use the Export Wizard to generate PROC EXPORT code. Suppose you want to use the Export Wizard to read data from the Orion.Qtr1_2007 data set and write it to an Excel worksheet named qtr1. Then, you want to save the generated PROC EXPORT code.

Note: This demonstration does not apply to SAS Enterprise Guide.

1. To open the Export Wizard, select File > Export Data.

2. In the wizard, specify the data set you want to export by selecting Orion as the library and Qtr1_2007 as the member, and then click Next.

3. Specify the type of file that you want to create, which is a Microsoft Excel workbook in this example. You can see that there are many other options for file type. Click Next.

4. Specify the workbook name and location for the exported file as my-file-path\qtr2007c.xls, and then click OK. (Replace my-file-path with the location where you stored the practice files.) Then, specify the name for the table or worksheet as qtr1, and click Next.

5. Finally, specify the filename and location to save the generated PROC EXPORT code as my-file-path\exportQtr1.sas and click Finish. (Replace my-file-path with the location where you stored the practice files.)

6. Check the log to verify that the file was successfully created.

7. In the Explorer window, navigate to the folder where you stored the exported file to see that it exists. You could run a PROC CONTENTS step or a PROC PRINT step on the file to examine it, but remember that you would need to use a SAS/ACCESS LIBNAME statement to assign a libref to it first.
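The generated program is not shown in this demonstration. The sketch below gives one plausible form of the PROC EXPORT step, followed by the LIBNAME approach mentioned in step 7 for examining the new workbook. The libref qtrxls is a hypothetical name chosen for this example, and the code that the wizard actually saves may differ in detail.

```sas
/* Sketch under stated assumptions, not the wizard's verbatim output */
proc export data=orion.qtr1_2007
            outfile='my-file-path\qtr2007c.xls'
            dbms=excel replace;
   sheet='qtr1';
run;

/* Assign a libref to the workbook before running procedures on it */
libname qtrxls 'my-file-path\qtr2007c.xls';

proc contents data=qtrxls._all_;
run;

/* Release the workbook so that it can be opened in Microsoft Excel */
libname qtrxls clear;
```

Clearing the libref at the end follows the same practice shown earlier in this lesson for Orionxls.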

Lesson 7 Creating SAS Data Sets from Delimited Raw Data Files

Lesson 7: Creating SAS Data Sets from Delimited Raw Data Files

Creating a SAS Data Set from a Delimited Raw Data File

Suppose you want to create a SAS data set named Work.Subset3 from the comma-delimited raw data file named sales.csv. You want to read the following data values from each record in the raw data file: Employee_ID, First_Name, Last_Name, Gender, Salary, Job_Title, and Country. For now, you want to ignore the date values that are also in each record. Employee_ID and Salary are numeric, and the remaining variables that you want to include are character.

1. Copy and paste the following DATA step into the editor:

    data work.subset3;
       infile 'my-file-path\sales.csv' dlm=',';
       input Employee_ID First_Name $ Last_Name $
             Gender $ Salary Job_Title $ Country $;
    run;

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.
   SAS Enterprise Guide: Replace my-file-path with the path to the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

2. Submit the step and check the log to verify that SAS found the raw data file and read 165 records from it. The new data set, Work.Subset3, contains 165 observations and 7 variables.

3. In the editor, submit this PROC PRINT step to view a list report of the new data set:

    proc print data=work.subset3;
    run;

4. Look at the PROC PRINT output. Notice that the values for Job_Title seem to be truncated. Some of the values for First_Name and Last_Name might be truncated as well. It's hard to tell for certain. You'll see why this truncation happened and how to avoid it later in this lesson.

Specifying the Length of Variables Explicitly

1. Copy and paste the following DATA step from the previous demonstration, in which several variables in the output data set had truncated values.

    data work.subset3;
       infile 'my-file-path\sales.csv' dlm=',';
       input Employee_ID First_Name $ Last_Name $
             Gender $ Salary Job_Title $ Country $;
    run;

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.
   SAS Enterprise Guide: Replace my-file-path with the path to the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

2. Add a LENGTH statement to set the length of First_Name to 12 and Last_Name to 18 so that those values are not truncated. Then, set the length of Gender to 1 because you don't need 8 characters. Set Job_Title to 25 and Country to 2. The DATA step should now look like this:

    data work.subset3;
       infile 'my-file-path\sales.csv' dlm=',';
       length First_Name $ 12 Last_Name $ 18
              Gender $ 1 Job_Title $ 25
              Country $ 2;
       input Employee_ID First_Name $ Last_Name $
             Gender $ Salary Job_Title $ Country $;
    run;

3. Add this PROC PRINT step so that you can see a listing of your output data set:

    proc print data=work.subset3;
    run;

4. Submit the program and examine the report. The values for First_Name, Last_Name, and Job_Title are no longer truncated, and the values for Gender and Country are still correct also.

5. Submit this PROC CONTENTS step to view the attributes of the variables to verify the different lengths:

    proc contents data=work.subset3;
    run;

Specifying Informats in the INPUT Statement

1. Suppose you have this DATA step in your editor. Notice that the INPUT statement in this step names seven variables.

    data work.subset3;
       infile 'my-file-path\sales.csv' dlm=',';
       length First_Name $ 12 Last_Name $ 18
              Gender $ 1 Job_Title $ 25
              Country $ 2;
       input Employee_ID First_Name $ Last_Name $
             Gender $ Salary Job_Title $ Country $;
    run;

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.
   SAS Enterprise Guide: Replace my-file-path with the path to the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

2. Add 2 more variables to the INPUT statement. Birth_Date appears in the DATE9. format, so we'll use the DATE9. informat to read those values. Hire_Date appears in the MMDDYY10. format, so we'll use the MMDDYY10. informat to read those values. Remember, we use the colon modifier because the raw data is delimited. The DATA step should now look like this:

    data work.subset3;
       infile 'my-file-path\sales.csv' dlm=',';
       length First_Name $ 12 Last_Name $ 18
              Gender $ 1 Job_Title $ 25
              Country $ 2;
       input Employee_ID First_Name $ Last_Name $
             Gender $ Salary Job_Title $ Country $
             Birth_Date :date9.
             Hire_Date :mmddyy10.;
    run;

3. Add this PROC PRINT step:

    proc print data=work.subset3;
    run;

4. Submit the program and examine the PROC PRINT output. Notice that the data set does include the two date variables, but the values don't look like dates. Remember that these are SAS date values, so to appear as recognizable dates in reports, they need to be formatted.
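One way to confirm that the values were read correctly is to apply formats in the PROC PRINT step. The following sketch assumes that Work.Subset3 was created as shown above. The choice of DATE9. and MMDDYY10. here simply mirrors the informats and is only one of many possibilities.

```sas
proc print data=work.subset3;
   /* A FORMAT statement in a PROC step applies to this report only */
   format Birth_Date date9. Hire_Date mmddyy10.;
run;
```

Because the FORMAT statement appears in the PROC step rather than the DATA step, the formats are not stored in the descriptor portion of the data set.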

Subsetting Variables and Adding Permanent Attributes

1. Suppose you have the following DATA step in your editor:

    data work.subset3;
       length First_Name $ 12 Last_Name $ 18
              Gender $ 1 Job_Title $ 25
              Country $ 2;
       infile 'my-file-path\sales.csv' dlm=',';
       input Employee_ID First_Name $ Last_Name $
             Gender $ Salary Job_Title $ Country $
             Birth_Date :date9.
             Hire_Date :mmddyy10.;
    run;

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.
   SAS Enterprise Guide: Replace my-file-path with the path to the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

2. Add a KEEP statement to include only five variables in your output data set: First_Name, Last_Name, Salary, Job_Title, and Hire_Date. Then, add a LABEL statement to call Job_Title 'Sales Title' and call Hire_Date 'Date Hired'. Finally, add a FORMAT statement to display Salary values in the DOLLAR12. format, and Hire_Date values in the MONYY7. format. Your DATA step should look like this:

    data work.subset3;
       length First_Name $ 12 Last_Name $ 18
              Gender $ 1 Job_Title $ 25
              Country $ 2;
       infile 'my-file-path\sales.csv' dlm=',';
       input Employee_ID First_Name $ Last_Name $
             Gender $ Salary Job_Title $ Country $
             Birth_Date :date9.
             Hire_Date :mmddyy10.;
       keep First_Name Last_Name Salary
            Job_Title Hire_Date;
       label Job_Title='Sales Title'
             Hire_Date='Date Hired';
       format Salary dollar12. Hire_Date monyy7.;
    run;

3. Write this PROC PRINT step:

    proc print data=work.subset3 label;
    run;

4. Submit the program and examine the output. The data set contains only 5 variables and you can see the new labels and formats.

Reading Instream Data

You've seen the syntax for the DATALINES statement. Now, let's try submitting a DATA step that reads instream data in SAS. Suppose you have these three lines of data that you want to read into a data set:

    120102 Tom Zhou
    120103 Wilson Dawes
    120121 Irenie Elvish

1. Copy and paste this DATA step to read these three lines of data into a data set named Work.Subset4:

    data work.subset4;
       input Employee_ID First_Name $ Last_Name $;
       datalines;
    120102 Tom Zhou
    120103 Wilson Dawes
    120121 Irenie Elvish
    ;
    run;

2. Submit the program and check the log to verify that Work.Subset4 contains three observations and three variables.

3. Submit this PROC PRINT step to view the values in the data set:

    proc print data=work.subset4;
    run;

Lesson 8 Validating and Cleaning Data

Lesson 8: Validating and Cleaning Data

Identifying Invalid Values in Data

This demonstration shows you how to identify invalid values in data.

Make sure you've defined the Orion library to the location of your sample data. For more information, see Setting Up Practice Files in the Reference section.

1. Open the Orion.Nonsales data set in the VIEWTABLE window (or the equivalent SAS window in your operating environment).

2. Based on the following requirements, check each variable for invalid values:

   Employee_ID: unique and not missing
   Gender: F or M
   Salary: 24000 to 500000
   Job_Title: not missing
   Country: AU or US
   Birth_Date: earlier than Hire_Date
   Hire_Date: 01/01/1974 or later

3. These are invalid values:

   Employee_ID has duplicate values in observations 6 and 7 and a missing value in observation 14.
   First and Last contain character data, and that's their only requirement, so they're okay.
   Gender has the invalid value G in observation 12.
   Salary has a missing value in observation 4 and a value that's too low in observation 13.
   Observation 10 has a missing value for Job_Title.
   Country has a lowercase value in observation 2. This is an invalid value.
   Birth_Date must be earlier than Hire_Date, so observation 5 contains an invalid value for Birth_Date.
   Observation 9 contains a missing value for Hire_Date, so this value is invalid, too.

   Only First and Last meet all your requirements, so 7 out of 9 of the variables in the data set contain invalid values.

Examining Data Errors

This demonstration shows you how to examine data errors.

1. Copy and paste this program into the editor:

    data work.nonsales;
       length Employee_ID 8 First $ 12 Last $ 18
              Gender $ 1 Salary 8 Job_Title $ 25
              Country $ 2 Birth_Date Hire_Date 8;
       infile 'my-file-path\nonsales.csv' dlm=',';
       input Employee_ID First $ Last $
             Gender $ Salary Job_Title $ Country $
             Birth_Date :date9.
             Hire_Date :date9.;
       format Birth_Date Hire_Date ddmmyy10.;
    run;

    proc print data=work.nonsales;
    run;

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.
   SAS Enterprise Guide: Replace my-file-path with the path to the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

2. Submit the program and view the PROC PRINT output. Notice that a few missing values occur.

3. View the SAS log.

   How many notes about data errors appear? There are two notes about data errors. The first is a problem you saw earlier with Salary. The second is a problem with Hire_Date.

   What's the value in error for Hire_Date? The value is 99NOV1978.

   What's the problem with the value 99NOV1978? 99NOV1978 isn't a valid date, so SAS can't read it.

4. Clear the log.

5. Copy and paste this program into the editor:

    data work.nonsales;
       infile 'my-file-path\nonsales.csv' dlm=',';
       input Employee_ID First $ Last;
    run;

    proc print data=work.nonsales;
    run;

   Windows: Replace my-file-path with the location where you stored the practice files.
   UNIX and z/OS: Specify the fully qualified path in your operating environment.
   SAS Enterprise Guide: Replace my-file-path with the path to the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

6. Submit the program and examine the PROC PRINT output. One missing value occurs for Employee_ID. All values for Last are missing.

7. View the log and check the raw data values for Last.

   Is the raw data bad, or did the programmer read the data incorrectly? The programmer read the data incorrectly. The raw data values for Last are character values, as they should be. But the INPUT statement doesn't specify Last as a character variable. So Last was read as numeric but needs to be read as character.

   What other clue tells you that this is likely to be a programming problem? The fact that the data is wrong for every observation suggests a programming error. Of course, it's always possible that raw data could be systematically invalid, but that's not the case here.

   One more clue: if we scroll to the end of the log, we see an error message from SAS: Limit set by ERRORS= option reached. Further errors of this type will not be printed. This message is often a signal to check your program for errors.
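The diagnosis in step 7 points to a one-character fix. As a minimal sketch, adding a dollar sign after Last in the INPUT statement reads the variable as character. Because no LENGTH statement is used in this short version, First and Last receive the default length of 8 bytes, so longer names may be truncated.

```sas
data work.nonsales;
   infile 'my-file-path\nonsales.csv' dlm=',';
   /* The $ after Last reads it as character rather than numeric */
   input Employee_ID First $ Last $;
run;

proc print data=work.nonsales;
run;
```

After this change, the values for Last should no longer be missing. The missing value for Employee_ID comes from the raw data itself, so it is expected to remain.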

Validating Data Using PROC PRINT

This demonstration shows you how to validate data using PROC PRINT.

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Submit this PROC PRINT step:

    proc print data=orion.nonsales;
       var Employee_ID Gender Salary Job_Title
           Country Birth_Date Hire_Date;
       where Employee_ID = . or
             Gender not in ('F','M') or
             Salary not between 24000 and 500000 or
             Job_Title = ' ' or
             Country not in ('AU','US') or
             Birth_Date > Hire_Date or
             Hire_Date < '01JAN1974'd;
    run;

2. Check the output for invalid values. You can see that five observations have missing values, one of them resulting from the invalid value for Hire_Date. Then you can see an invalid value for Gender and two values outside the valid range for Salary. Country has six values in the wrong case. In one observation, Birth_Date is later than Hire_Date, and another value for Hire_Date is before the earliest valid date.

Validating Data Using PROC FREQ

This demonstration shows you how to validate data using PROC FREQ. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Submit this program:

   proc freq data=orion.nonsales;
      tables Gender Country Employee_ID;
   run;

2. Scroll through the output. Gender and Employee_ID each have one missing value.
3. Add the NLEVELS option to the PROC FREQ statement and submit the program again.
4. View the output. This time you see the Number of Variable Levels table first.
5. Add the NOPRINT option to the TABLES statement, after a slash (/), to suppress the individual frequency tables, and submit the program again. (NOPRINT in the PROC FREQ statement would suppress all displayed output.)
6. View the output. Now you see only the Number of Variable Levels table.
7. To display the number of levels for all variables, replace the three variable names in the TABLES statement with _ALL_ and resubmit the program.
8. View the output. How many employees have the same salary? The variable Salary has 230 levels, and there are 235 employees, so five observations repeat a salary value that already appears elsewhere in the data set.
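The end state of steps 3 through 7 can be sketched as follows. This sketch places NOPRINT in the TABLES statement, which suppresses the frequency tables while keeping the Number of Variable Levels table.

```sas
/* Sketch: show only the number of levels for every variable. */
proc freq data=orion.nonsales nlevels;
   tables _all_ / noprint;
run;
```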

Validating Data Using PROC MEANS

This demonstration shows you one way to validate data using PROC MEANS. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. Submit this program: proc means data=orion.nonsales; run; 2. View the output. Notice that you get results for all numeric variables. 3. Add a VAR statement to select only Salary: proc means data=orion.nonsales; var Salary; run; 4. Resubmit the program and view the output. 5. Specify the statistics N, NMISS, MIN, and MAX. proc means data=orion.nonsales n nmiss min max; var Salary; run; 6. Resubmit the program and view the output. The new output shows you the information you want for Salary.

Validating Data Using PROC UNIVARIATE

This demonstration shows you one way to validate data using PROC UNIVARIATE. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Submit this program:

   proc univariate data=orion.nonsales;
   run;

2. View the output. Notice that you get a separate set of tables for each numeric variable in the data set.
3. To display output for only Salary, add a VAR statement.

   proc univariate data=orion.nonsales;
      var Salary;
   run;

4. Submit the program and view the output. Now you get six tables for Salary only. If you look at the Extreme Observations table, you see the default five lowest and highest values. In the Missing Values table, you see that Salary contains one missing value.
5. Add NEXTROBS=10 to the PROC UNIVARIATE statement to request a different number of extreme observations.

   proc univariate data=orion.nonsales nextrobs=10;
      var Salary;
   run;

6. Submit the program and view the output. Now you see the ten lowest and highest values.

Using the VIEWTABLE Window to Clean Data

This demonstration shows you how to use the VIEWTABLE window to clean data. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Instructions for Windows, UNIX, and z/OS operating environments:

1. Open the Orion.Nonsales data set.
   Windows and UNIX: Activate the Explorer window. Then double-click Libraries to display the active libraries, and double-click Orion to display the members in that library. Finally, double-click Nonsales to open the data set in the VIEWTABLE window.
   z/OS: If menus are not turned on, type PMENU on any command line and press Enter. Select View > Explorer in any window. In the left pane of the Explorer window, click Orion. Type ? next to Nonsales and press Enter. Select Open. The data set opens in the FSVIEW window.
2. To change to edit mode, select Edit > Edit Mode. Select Edit > Update. In the Update window, click OK.
   In observation 7, change the value of Employee_ID to 120109.
   In observation 12, change the value of Gender to F.
   In observation 14, change the value of Employee_ID to 120116.
   In observation 101, change the value of Gender to F.
3. Close the VIEWTABLE window. z/OS: Close the FSVIEW window.

Instructions for SAS Enterprise Guide users:

1. Submit the following code to create a copy of the Nonsales data that you can edit:

   data tempsales;
      set orion.nonsales;
   run;

2. The Tempsales data set opens in Read-Only mode in the Data Grid. Based on your version of SAS Enterprise Guide, switch to Update mode in one of the following ways:
   Version 4.1: Select Data > Read-Only. In the confirmation window, click Yes.
   Version 4.2: Double-click in any table cell. In the confirmation window, click Yes.
3. Click in a table cell to edit individual data values.
   In observation 7, change the value of Employee_ID to 120109.
   In observation 12, change the value of Gender to F.
   In observation 14, change the value of Employee_ID to 120116.
   In observation 101, change the value of Gender to F.
   Close the data set when you are finished editing.
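The same four corrections can also be made in code rather than interactively. The sketch below is only one possible alternative: it assumes that the observation numbers quoted above match the automatic variable _N_ when the data set is read sequentially, and the output name work.nonsales_fixed is hypothetical.

```sas
data work.nonsales_fixed;
   set orion.nonsales;
   /* _N_ counts DATA step iterations, which equals the observation
      number here because every observation is read in order. */
   if _N_=7 then Employee_ID=120109;
   else if _N_=12 then Gender='F';
   else if _N_=14 then Employee_ID=120116;
   else if _N_=101 then Gender='F';
run;
```

Keying corrections to observation numbers is fragile if the data set is ever re-sorted. The IF-THEN/ELSE demonstration later in this lesson keys its corrections to Employee_ID instead.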

Using Assignment Statements to Clean Data

This demonstration shows you how to use assignment statements and the UPCASE function to clean data. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. Copy and paste this program into the editor: data work.nonsales; set orion.nonsales; run; 2. Add an assignment statement to uppercase all values of Country. data work.nonsales; set orion.nonsales; Country=upcase(Country); run; 3. Submit the program. Then view the new data set in the Work library. When you scroll through the data, you should see that all values of Country are uppercase.

Using IF-THEN/ELSE Statements to Clean Data

This demonstration shows you how to use IF-THEN/ELSE statements to clean data. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste this program into the editor:

   data work.nonsales;
      set orion.nonsales;
      Country=upcase(Country);
   run;

2. Submit the program. Then view the new data set in the Work library. When you scroll through the data, you should see that all values of Country are uppercase and that observations 4, 13, and 20 contain invalid values for Salary. Close the data set.
3. Add IF-THEN/ELSE statements to assign values to Salary based on Employee_ID.

   data work.clean;
      set orion.nonsales;
      Country=upcase(Country);
      if Employee_ID=120106 then Salary=26960;
      else if Employee_ID=120115 then Salary=26500;
      else if Employee_ID=120191 then Salary=24015;
   run;

4. Submit the revised program. Then view observations 4, 13, and 20 in the new data set to see that the values for Salary were assigned successfully. Close the data set.

5. Now add IF-THEN/ELSE statements to correct invalid values for Hire_Date in observations 5, 19, and 214.

   data work.clean;
      set orion.nonsales;
      Country=upcase(Country);
      if Employee_ID=120106 then Salary=26960;
      else if Employee_ID=120115 then Salary=26500;
      else if Employee_ID=120191 then Salary=24015;
      else if Employee_ID=120107 then Hire_Date='21JAN1995'd;
      else if Employee_ID=120111 then Hire_Date='01NOV1978'd;
      else if Employee_ID=121011 then Hire_Date='01JAN1998'd;
   run;

6. Submit the revised program. Then view observations 5, 19, and 214 in the new data set to see that the values for Hire_Date were assigned successfully. Close the data set.
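Instead of scrolling to individual observations, the corrected rows can be listed directly. This is a sketch that assumes the six Employee_ID values used in the IF-THEN/ELSE statements above. The DATE9. format is added only to make Hire_Date readable.

```sas
/* Sketch: print only the observations that the DATA step corrected. */
proc print data=work.clean;
   var Employee_ID Salary Hire_Date;
   where Employee_ID in (120106,120115,120191,120107,120111,121011);
   format Hire_Date date9.;
run;
```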

Lesson 9

Manipulating Data

Lesson 9: Manipulating Data

Creating Variables by Using Functions

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. In the editor, type the following code:

   data work.comp;
      set orion.sales;
      Bonus=500;
      Compensation=sum(Salary,Bonus);
      BonusMonth=month(Hire_Date);
   run;

   proc print data=work.comp;
   run;

2. Submit the code.
3. Check the log. How many variables are in Work.Comp?
4. View the output.
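The program computes Compensation with the SUM function rather than the + operator. One practical difference, sketched here with made-up values that are not taken from Orion.Sales, is how missing values are handled.

```sas
data _null_;
   Salary=.;                      /* made-up missing value */
   Bonus=500;
   WithSum=sum(Salary,Bonus);     /* SUM ignores missing arguments */
   WithPlus=Salary+Bonus;         /* + propagates the missing value */
   put WithSum= WithPlus=;
run;
```

The log should show WithSum=500 and WithPlus=. together with a note that missing values were generated.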

Subsetting Variables

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. In the editor, type the following code:

   data work.comp;
      set orion.sales;
      Bonus=500;
      Compensation=sum(Salary,Bonus);
      BonusMonth=month(Hire_Date);
      drop Gender Salary Job_Title Country Birth_Date Hire_Date;
   run;

   proc print data=work.comp;
   run;

2. Submit the code.
3. Check the log. How many variables are in Work.Comp?
4. View the output. Did Bonus, Compensation, and BonusMonth appear appropriately? Were Gender, Salary, Job_Title, Country, Birth_Date, and Hire_Date dropped from the output data set?
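A KEEP statement can express the same subset by naming the variables to retain. The sketch below assumes that Employee_ID, First_Name, and Last_Name are the only variables in Orion.Sales besides those in the DROP list, which the demonstration does not state explicitly.

```sas
data work.comp;
   set orion.sales;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
   /* Assumed list: the variables not named in the DROP statement. */
   keep Employee_ID First_Name Last_Name Bonus Compensation BonusMonth;
run;
```

As with DROP, Salary and Hire_Date remain available for the calculations because the statement affects only what is written to the output data set.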

Creating Two Variables Conditionally

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. In the editor, type the following code:

   data work.bonus;
      set orion.sales;
      if Country='US' then do;
         Bonus=500;
         Freq='Once a Year';
      end;
      else if Country='AU' then do;
         Bonus=300;
         Freq='Twice a Year';
      end;
   run;

   proc print data=work.bonus;
      var First_Name Last_Name Country Bonus Freq;
   run;

2. Submit the code.
3. Check the log.

4. View the output. Notice that for the value Twice a Year, the final R seems to be cut off.
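The truncation happens because SAS fixes the length of a character variable the first time it sees the variable at compile time. The following stand-alone sketch reproduces the effect without Orion.Sales. The variable Len is hypothetical and exists only to report the length.

```sas
data _null_;
   Freq='Once a Year';     /* first use of Freq: length becomes 11 */
   Freq='Twice a Year';    /* 12 characters, so the last one is lost */
   Len=vlength(Freq);      /* compile-time length of Freq */
   put Freq= Len=;
run;
```

The log should show Freq=Twice a Yea and Len=11.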

Adjusting the Program

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. In the editor, type the following code:

   data work.bonus;
      set orion.sales;
      if Country='US' then do;
         Bonus=500;
         Freq='Once a Year';
      end;
      else if Country='AU' then do;
         Bonus=300;
         Freq='Twice a Year';
      end;
   run;

   proc print data=work.bonus;
      var First_Name Last_Name Country Bonus Freq;
   run;

2. Add the following statement above the IF-THEN/ELSE and DO statements:

   length Freq $ 12;

3. In the editor, remove the condition from the second branch so that else if Country='AU' then do; becomes else do;.
4. Submit the code.
5. Check the log.
6. View the output. Notice that the values for Bonus and Freq are based on the value for Country, and the full value Twice a Year is displayed for Freq.
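After both changes, the adjusted program would look roughly like the sketch below. The placement of the LENGTH statement directly after the SET statement is one reasonable choice. It only needs to come before the first assignment to Freq.

```sas
data work.bonus;
   set orion.sales;
   length Freq $ 12;            /* long enough for 'Twice a Year' */
   if Country='US' then do;
      Bonus=500;
      Freq='Once a Year';
   end;
   else do;                     /* every other Country value */
      Bonus=300;
      Freq='Twice a Year';
   end;
run;

proc print data=work.bonus;
   var First_Name Last_Name Country Bonus Freq;
run;
```

Note that a plain ELSE DO assigns the second set of values to every observation whose Country is not US, including any unexpected values.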

Selecting Observations by Using the Subsetting IF Statement

Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. In the editor, type the following code:

   data work.december;
      set orion.sales;
      BonusMonth=month(Hire_Date);
      if Country='AU' and BonusMonth=12;
      Bonus=500;
      Compensation=sum(Salary,Bonus);
   run;

   proc print data=work.december;
      var Employee_ID Bonus BonusMonth Compensation;
   run;

2. Submit the code.
3. Check the log. Notice the number of observations read versus the number of observations written to the output data set.
4. View the output.
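The subsetting IF is used here because BonusMonth is created inside the DATA step, and a WHERE statement can reference only variables that already exist in the input data set. As a sketch of the alternative, the same selection can be written with WHERE by applying the MONTH function to Hire_Date directly.

```sas
data work.december;
   set orion.sales;
   /* WHERE filters before observations enter the DATA step, so it
      must use Hire_Date rather than the new variable BonusMonth. */
   where Country='AU' and month(Hire_Date)=12;
   BonusMonth=month(Hire_Date);
   Bonus=500;
   Compensation=sum(Salary,Bonus);
run;
```

With this version, the log reports only the selected observations as read from Orion.Sales, so the read and written counts match.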

Lesson 10 Combining SAS Data Sets Vertically

Lesson 10: Combining SAS Data Sets Vertically

Appending Data Sets That Have the Same Variables

In this demonstration, you run a PROC APPEND step to append Emps2008 to Emps. These two data sets have identical variables. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. To create the temporary data sets Emps and Emps2008, submit the following code: proc copy in=orion out=work; select emps emps2008; run; 2. Open the Emps data set in the VIEWTABLE window. Notice that it contains three observations. 3. Open the Emps2008 data set in the VIEWTABLE window. Notice that it contains two observations. 4. Clear the editor. 5. Write a PROC APPEND step to append Emps2008 to Emps. That is, specify Emps as the BASE= data set, and specify Emps2008 as the DATA= data set. proc append base=emps data=emps2008; run; 6. Add a PROC PRINT step to print the new Emps data set. proc print data=emps; run;

7. Submit the program and examine the log to verify that Emps now contains five observations. Then, examine the PROC PRINT output to verify that it includes observations from both input data sets.

Appending to a BASE= Data Set with Additional Variables

In this demonstration, you run a PROC APPEND step to append Emps2009 to the BASE= data set Emps, which has an additional variable. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. Also, make sure you've completed the steps in the previous demonstration to create the correct version of the temporary data set Emps. 1. To create the temporary data set Emps2009, submit the following code: proc copy in=orion out=work; select emps2009; run; 2. Verify that Emps2009 was created, and then clear the editor. 3. Copy and paste the following PROC APPEND step, which specifies Emps as the BASE= data set and Emps2009 as the DATA= data set: proc append base=emps data=emps2009; run; 4. Add the following PROC PRINT step to view the Emps data set after Emps2009 is appended: proc print data=emps; run;

5. Submit the program and examine the log. Notice the warning note indicating that the variable Gender was not found in the DATA= data set. Notice also that SAS was able to append Emps2009, and that Emps now has seven observations. 6. Examine the PROC PRINT output. Notice that the appended observations have missing values for Gender.

Appending a DATA= Data Set with Additional Variables

In this demonstration, you run a PROC APPEND step to see how SAS handles a DATA= data set that has an additional variable. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. Also, make sure you've completed all the steps in the previous demonstrations to create the correct version of the temporary data set Emps. 1. To create the temporary data set Emps2010, submit the following code: proc copy in=orion out=work; select emps2010; run; 2. Verify that Emps2010 was created, and then clear the editor. 3. Copy and paste the following code: proc append base=emps data=emps2010; run; 4. Add a PROC PRINT step to view the output of Emps after Emps2010 has been appended. proc print data=emps; run;

5. Submit the program and examine the log. Notice the warning message indicating that the variable Country is not in the BASE= data set. As you know, SAS cannot add this variable to Emps. Notice also that SAS was not able to append Emps2010, but that you can use the FORCE option to force SAS to append the data set. 6. Examine the PROC PRINT output and verify that Emps is unchanged.
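One way to anticipate this kind of mismatch is to compare the variable lists of the two data sets before appending. The sketch below uses PROC CONTENTS for that purpose. It is an optional check, not part of the original demonstration.

```sas
/* Sketch: list the variables in the BASE= and DATA= data sets. */
proc contents data=emps;
run;

proc contents data=emps2010;
run;
```

Any variable that appears only in the DATA= data set, such as Country here, cannot be added to the BASE= data set by PROC APPEND.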

Appending a Data Set Using the FORCE Option

In this demonstration, you run a PROC APPEND step with the FORCE option. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. Also, make sure you've completed the steps in the previous demonstrations to create the correct versions of the temporary data sets Emps and Emps2010.

1. Copy and paste the following PROC APPEND step:

   proc append base=emps data=emps2010 force;
   run;

2. Add the following PROC PRINT step to view Emps after Emps2010 has been appended:

   proc print data=emps;
   run;

3. Submit the program and examine the log. Notice the warning message about variables not found in both data sets. Notice also that SAS was able to append Emps2010, and that Emps now contains nine observations. Another note indicates that the FORCE option was used, so dropping or truncating will occur.
4. Examine the PROC PRINT output. Emps now has nine observations and three variables. Country does not appear.

Concatenating Data Sets with the Same Variables

In this demonstration, you run a DATA step to concatenate the data sets EmpsDK and EmpsFR. These data sets have the same variables. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. To create the temporary data sets EmpsDK and EmpsFR, submit the following code: proc copy in=orion out=work; select empsdk empsfr; run; 2. Verify that EmpsDK and EmpsFR were created, and then clear the editor. 3. Copy and paste the following DATA step, which concatenates EmpsDK and EmpsFR: data empsall1; set empsdk empsfr; run; 4. Add the following PROC PRINT step to view the new Empsall1 data set: proc print data=empsall1; run; 5. Submit the program and examine the log. Notice that the Empsall1 data set was created with five observations and three variables.

6. Examine the PROC PRINT output and verify that the three employees in Denmark are listed first, followed by the two employees in France.

Concatenating Data Sets with Different Variables

In this demonstration, you run a DATA step to concatenate the two data sets EmpsCN and EmpsJP, which have different variables. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. To create the temporary data sets EmpsCN and EmpsJP, submit the following code: proc copy in=orion out=work; select empscn empsjp; run; 2. Verify that EmpsCN and EmpsJP were created, and then clear the editor. 3. Copy and paste the following code, which concatenates EmpsCN and EmpsJP and displays the output data set EmpsAll2: data empsall2; set empscn empsjp; run; proc print data=empsall2; run; 4. Submit the program and check the log to verify that SAS read the input data in the stated order and created the new data set, Empsall2, with four variables.

5. Examine the PROC PRINT output. Verify that there are four variables: First, Gender, Country, and Region. Notice that some observations have missing values for Country and some observations have missing values for Region. This is not the output that you want.

Concatenating Data Sets When a Variable Is Renamed

In this demonstration, you run a revised DATA step, which concatenates the two data sets EmpsCN and EmpsJP while renaming a variable. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. Also, make sure you've completed the steps in the previous demonstration to create the temporary data sets EmpsCN and EmpsJP. 1. Copy and paste the following code, which concatenates EmpsCN and EmpsJP and displays the output data set EmpsAll2: data empsall2; set empscn empsjp(rename=(Region=Country)); run; proc print data=empsall2; run; Note: The output data set EmpsAll2 replaces the output data set that is created in the previous demonstration. 2. Submit the program and examine the log. Notice that the output data set, Empsall2, contains only three variables. 3. Examine the PROC PRINT output to verify that Empsall2 contains only First, Gender, and Country.

Interleaving Data Sets

In this demonstration, you run a DATA step that interleaves the two data sets EmpsCN and EmpsJP. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. Also, make sure that you've completed the steps in previous demonstrations to create the temporary data sets EmpsCN and EmpsJP. 1. Copy and paste the following code, which interleaves EmpsCN and EmpsJP and displays the output data set EmpsName. data empsname; set empscn empsjp(rename=(Region=Country)); by First; run; proc print data=empsname; run; 2. Submit the program and check the log to verify that the program ran without errors. 3. Examine the PROC PRINT output to verify that the observations in EmpsName are listed in ascending alphabetical order.
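Interleaving with a BY statement requires each input data set to be in order by the BY variable. The demonstration does not say whether EmpsCN and EmpsJP are already sorted by First, so the sketch below sorts them first as a precaution. The names empscn_sorted and empsjp_sorted are hypothetical.

```sas
/* Sketch: sort copies of the inputs by First, then interleave them. */
proc sort data=empscn out=empscn_sorted;
   by First;
run;

proc sort data=empsjp out=empsjp_sorted;
   by First;
run;

data empsname;
   set empscn_sorted empsjp_sorted(rename=(Region=Country));
   by First;
run;
```

If the inputs are already in order, the PROC SORT steps are unnecessary but harmless.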

Lesson 11 Combining SAS Data Sets Horizontally

Lesson 11: Combining SAS Data Sets Horizontally

Merging Data Sets One to One

In this demonstration, you run a DATA step to match-merge EmpsAU and PhoneH by EmpID. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. To create the temporary data sets EmpsAU and PhoneH, submit the following code:

   proc copy in=orion out=work;
      select empsau phoneh;
   run;

2. Verify that EmpsAU and PhoneH were created, and then clear the editor.
3. Copy and paste the following code into the editor:

   data empsauh;
      merge empsau phoneh;
      by EmpID;
   run;

   proc print data=empsauh;
   run;

4. The input data sets are already sorted by the BY variable, so submit the program. Check the log to verify that the step ran successfully. Notice that SAS read three observations from each input data set and that the merged data set EmpsAUH has three observations.

5. Examine the PROC PRINT output. You can see that SAS combined the observations by matching employee ID numbers. The first two variables came from the EmpsAU data set. The third variable, the BY variable, is common to the two input data sets. The last variable came from the PhoneH data set.

Lesson 12

Enhancing Reports

Lesson 12: Enhancing Reports

Using Default Settings of SAS System Options for Reports

In this demonstration, you create reports by using the default settings of SAS system options. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. Copy and paste the following code into the editor. In the PROC PRINT step, the VAR statement specifies the four variables that will appear in the report. In the PROC FREQ step, the TABLES statement specifies a frequency table for the variable Country. proc print data=orion.sales; var Employee_ID First_Name Last_Name Salary; run; proc freq data=orion.sales; tables Country; run; 2. Submit the code and examine the listing output in the Output window. At the top of the PROC PRINT report, notice the information that appears by default. After the heading The SAS System is the date and time that you started your SAS session. By default, SAS prints this same date and time on all reports that you create in one session. On the far right is a page number. By default, SAS numbers all pages of output in sequence throughout the entire SAS session, starting with 1. SAS system options determine the width of the output--the line size--as well as the number of lines per page of output--the page size.

3. Scroll to the bottom of the first page of the PROC PRINT report. Notice that 50 observations appear on this page by default. The default line size and page size vary according to your operating environment and the version of SAS that you're using. 4. Scroll to the top of the PROC PRINT report and look at the alignment of text. Notice that SAS centers the heading The SAS System and the data table by default. The other information at the top extends to the right edge of the page. 5. Examine the top of the PROC FREQ report. Notice that the top of this report shows the same information as the PROC PRINT report, and it is aligned the same way. The default SAS system option settings apply to the listing output from all reporting procedures. 6. View the HTML output for the PROC PRINT step in the Results Viewer window. What information at the top of the listing output does not appear in this HTML output? The date and time and the page number do not appear in an HTML report. In addition, the SAS system option settings for line size and page size do not affect HTML output. In fact, as we scroll down, notice that the HTML output has no page breaks. Of the SAS system options covered in this lesson, the only one that applies to both HTML and listing output is centering. Notice that the HTML report is also centered. Note: Your site might use different settings for some SAS system options. If your site's settings are different, any output that you generate on your own might not look exactly like the output that's shown in this demonstration.
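Because the defaults vary by site and operating environment, it can be useful to check the current settings before changing anything. The sketch below uses PROC OPTIONS to write three of the settings discussed above to the log. Checking them one at a time is a conservative choice that should work across SAS versions.

```sas
/* Sketch: write the current value of each system option to the log. */
proc options option=linesize;
run;

proc options option=pagesize;
run;

proc options option=center;
run;
```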

Creating a Report by Changing System Option Settings

In this demonstration, you add an OPTIONS statement to your program to see how changing the settings of SAS system options affects the appearance of the report. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. Copy and paste the following code, which is the PROC PRINT step from the last demonstration, into the editor: proc print data=orion.sales; var Employee_ID First_Name Last_Name Salary; run; 2. Before the PROC PRINT step, copy and paste the following OPTIONS statement. This statement changes the settings of five SAS system options. options nodate nonumber ps=15 ls=80 nocenter; 3. Submit the code and view the listing output. Notice that the date and time and the page number no longer appear at the top of the report. Also, the report is now left-aligned instead of centered. 4. View the HTML output. Notice that only the NOCENTER setting had an effect on this output. 5. To display the date and time and the page number in future reports, modify the OPTIONS statement by removing NODATE and NONUMBER and then adding DATE and NUMBER. The modified OPTIONS statement is shown below:

options date number ps=15 ls=80 nocenter; 6. Remove PS= (PAGESIZE=), LS= (LINESIZE=), and NOCENTER from the OPTIONS statement. You set these options when you ran the OPTIONS statement before. Remember that SAS system option settings remain in effect until you change them or until the end of the SAS session. The modified OPTIONS statement is shown below: options date number; 7. Submit the modified program and view the listing output. Notice that the date and time and the page number appear at the top of the report again.

Resetting the Date and Time and the Page Number

In this demonstration, you modify your program to create a report that displays the date and time that SAS created the report and that starts on page number 1. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. Also, make sure you've completed all the steps in the previous demonstration to set SAS system options. 1. Copy and paste the following code, which is from the last demonstration, into the editor: options date number; proc print data=orion.sales; var Employee_ID First_Name Last_Name Salary; run; 2. In the OPTIONS statement, remove DATE and NUMBER and add DTRESET and PAGENO=1, as shown below: options dtreset pageno=1; 3. Submit the program and view the listing output. Notice that the date and time have changed. Also, the page number is now 1. 4. After the PROC PRINT step, add an OPTIONS statement to restore the page number and the date and time, reset the number of lines per page to 52, and center the output. Submit the OPTIONS statement.

options number date pagesize=52 center;

Specifying Titles and Footnotes for Reports

In this demonstration, you create reports that have a title and footnotes. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. Copy and paste the following code into the editor: footnote1 'By Human Resources Department'; footnote3 'Confidential'; proc means data=orion.sales; var Salary; title 'Orion Star Sales Salaries'; run; 2. Submit the code and view the HTML output. Notice that the custom title Orion Star Sales Salaries appears at the top. At the bottom, the two custom footnotes appear on lines 1 and 3. Line 2 is blank. 3. View the listing output for the report, and verify that the same title and footnotes appear here as in the HTML version. 4. After the PROC MEANS step, add the following PROC FREQ step: proc freq data=orion.sales; tables Gender; run;

5. Submit the modified program and view the HTML output for the PROC FREQ report. Notice that the same title and footnotes appear in this report as in the first report. Remember that any titles or footnotes that you assign remain in effect until you change them, cancel them, or end your SAS session.

Changing and Canceling Titles and Footnotes for Reports

In this demonstration, you modify the titles and footnotes for the second PROC step in the program you ran in the previous demonstration. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. Copy and paste the following code into the editor. After the PROC MEANS step, the FOOTNOTE3 statement cancels the third footnote for the PROC FREQ output. In the PROC FREQ step, the TITLE statement specifies a new title for this report. At the end, the null TITLE and FOOTNOTE statements cancel all titles and footnotes for any output that you might generate later. footnote1 'By Human Resources Department'; footnote3 'Confidential'; proc means data=orion.sales; var Salary; title 'Orion Star Sales Salaries'; run; footnote3; proc freq data=orion.sales; tables Gender; title 'Sales Employees by Gender'; run; title; footnote;

2. To help you verify that the titles and footnotes are canceled, copy and paste this PROC PRINT step after the null TITLE and FOOTNOTE statements: proc print data=orion.sales; var Last_Name; run; 3. Submit the program and view the HTML output from the three PROC steps in order. In the PROC MEANS output, the title and footnotes appear as specified. In the PROC FREQ output, the title has changed and only the first footnote appears. Finally, in the PROC PRINT output, no titles or footnotes appear.

Displaying Temporary Labels in Reports

In this demonstration, you run a program that specifies temporary labels for reports. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section. 1. Copy and paste the following code into the editor: proc freq data=orion.sales; tables Gender; label Gender='Sales Employee Gender'; run; proc print data=orion.sales split='*'; var Employee_ID Job_Title Salary; label Employee_ID='Sales ID' Job_Title='Job Title' Salary='Annual*Salary'; run; 2. Submit the program and view the PROC FREQ output. Notice that a new row appears at the top of the frequency table to display the temporary label that is specified for Gender. Does the original variable name appear anywhere in the PROC FREQ report? The original variable name Gender still appears in the second row of the table. PROC FREQ always displays the variable name here. 3. View the PROC PRINT output. At the top of the table, notice that the temporary labels now appear instead of the variable names. The label Annual Salary has split at the asterisk.

Applying Temporary Formats in Reports

In this demonstration, you run a program that applies temporary formats in reports. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor:

    proc print data=orion.sales split='*';
       var Employee_ID Salary Birth_Date Hire_Date;
       label Employee_ID='Sales ID'
             Salary='Annual*Salary';
       format Salary dollar10.0 Birth_Date Hire_Date monyy7.;
    run;
    proc freq data=orion.sales;
       tables Hire_Date;
       format Hire_Date year4.;
    run;

2. Submit the code and view the PROC PRINT report. Now that the DOLLAR10.0 format is applied to the Salary variable, notice that the annual salary values have dollar signs and commas. After the MONYY7. format is applied, the birth dates and hire dates appear as a three-character month and a four-digit year.

3. View the PROC FREQ report. Now that the YEAR4. format is applied to Hire_Date, the hire dates appear as four-digit years.
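To preview how a format displays a value without running a full report, one option is to write a formatted value to the log with a PUT statement. This sketch is not part of the demonstration; the salary and date values are made up for illustration.

```sas
data _null_;
   Salary = 52000;               /* hypothetical salary value */
   Hire_Date = '15MAR2010'd;     /* hypothetical SAS date constant */
   put Salary dollar10.0;        /* dollar sign and comma: $52,000 */
   put Hire_Date monyy7.;        /* three-character month, four-digit year: MAR2010 */
   put Hire_Date year4.;         /* four-digit year: 2010 */
run;
```

Because a format changes only how a value is displayed, the stored values of Salary and Hire_Date stay the same.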

Specifying a User-Defined Format for a Character Variable

In this demonstration, you apply the user-defined format $CTRYFMT. to the character variable Country in a report. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor:

    proc format;
       value $ctryfmt 'AU' = 'Australia'
                      'US' = 'United States'
                      other = 'Miscoded';
    run;
    proc print data=orion.sales label;
       var Employee_ID Salary Country Hire_Date;
       label Employee_ID='Sales ID'
             Hire_Date='Hire Date';
       format Salary dollar10.0 Hire_Date monyy7.;
    run;

2. Notice that the FORMAT statement in the PROC PRINT step does not yet refer to the user-defined format $CTRYFMT. Submit the program and view the PROC PRINT output to see how the values of Country appear when no format is applied. The values of Country appear as country codes: AU and US.

3. In the FORMAT statement, apply $CTRYFMT. to Country, as shown below:

    format Salary dollar10.0 Hire_Date monyy7.
           Country $ctryfmt.;

4. Submit the modified program and view the PROC PRINT output. Now, the values of Country are full words, as specified in the format.
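Because Orion.Sales contains only the codes AU and US, the report never shows the OTHER label. A quick way to check that branch, sketched here, is to apply the format to a few test values with the PUT function. The code ZZ is a made-up value used only to trigger OTHER.

```sas
proc format;
   value $ctryfmt 'AU' = 'Australia'
                  'US' = 'United States'
                  other = 'Miscoded';
run;

data _null_;
   length Country $2;
   /* 'ZZ' is a hypothetical miscoded value. */
   do Country = 'AU', 'US', 'ZZ';
      Formatted = put(Country, $ctryfmt.);
      put Country= Formatted=;   /* writes each code and its label to the log */
   end;
run;
```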

Specifying a User-Defined Format for a Numeric Variable

In this demonstration, you apply the user-defined format TIERS. to the numeric variable Salary in a report. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor:

    proc format;
       value tiers low-<50000   = 'Tier 1'
                   50000-100000 = 'Tier 2'
                   100000<-high = 'Tier 3';
    run;
    proc print data=orion.sales label;
       var Employee_ID Salary Country Hire_Date;
       label Employee_ID='Sales ID'
             Hire_Date='Hire Date';
       format Hire_Date monyy7. Salary tiers.;
    run;

2. Submit the program and view the PROC PRINT output. Notice that the salary values are now classified as Tier 1, Tier 2, or Tier 3. Dollar amounts are no longer shown.
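The less-than symbols in the ranges decide which tier the boundary values fall into: low-<50000 excludes 50000, and 100000<-high excludes 100000. The following sketch, which is not part of the demonstration, applies the format to made-up values on either side of each boundary.

```sas
proc format;
   value tiers low-<50000   = 'Tier 1'
               50000-100000 = 'Tier 2'
               100000<-high = 'Tier 3';
run;

data _null_;
   /* Hypothetical test values around the two boundaries. */
   do Salary = 49999.99, 50000, 100000, 100000.01;
      Tier = put(Salary, tiers.);
      put Salary= Tier=;
   end;
run;
```

In the log, 49999.99 should map to Tier 1, both 50000 and 100000 to Tier 2, and 100000.01 to Tier 3.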

Subsetting Observations in Reports

In this demonstration, you subset data for reports. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following program into the editor:

    proc print data=orion.sales;
       var Last_Name Job_Title Country Salary;
       format Salary dollar10.0;
       where Salary>75000;
       title 'Sales Salaries Over $75,000';
    run;
    proc means data=orion.sales;
       var Salary;
       where Country='AU';
       title 'Australia Salaries';
    run;

2. View the PROC PRINT output. Notice that all the salary values displayed are greater than $75,000. At Orion Star, only six sales employees earn a salary this high.

3. View the PROC MEANS output. Notice that the statistics are calculated only on the observations for sales employees in Australia. The N column indicates that 63 observations in Orion.Sales have nonmissing values for Salary and a Country value of AU.

4. For the PROC PRINT report, suppose you want to subset the data further to show only the employees with a salary greater than $75,000 who are in Australia. To get these results, can you add a second WHERE statement to the PROC PRINT step to subset on the variable Country? To see what happens, add another WHERE statement to the PROC PRINT step, as shown below, and submit the modified PROC PRINT step.

    proc print data=orion.sales;
       var Last_Name Job_Title Country Salary;
       format Salary dollar10.0;
       where Salary>75000;
       where Country='AU';
       title 'Australia Sales Salaries Over $75,000';
    run;

5. View the PROC PRINT output. This output is not what you want. Salaries lower than $75,000 are now displayed.

6. Check the log to see what happened. A note in the log indicates that SAS replaced the first WHERE clause with the second WHERE clause.

7. In the PROC PRINT step, remove the second WHERE statement. Modify the original WHERE statement by adding the logical operator AND to specify two conditions. The modified WHERE statement is shown below.

    proc print data=orion.sales;
       var Last_Name Job_Title Country Salary;
       format Salary dollar10.0;
       where Salary>75000 and Country='AU';
       title 'Australia Sales Salaries Over $75,000';
    run;

8. Submit the modified PROC PRINT step and view the output. This output looks better. Only two salaries are shown. Both are above $75,000 and both employees are in Australia. Can you subset on a variable that is not shown in the report? Using the WHERE statement, you can subset on any variable in the data set, even variables that are not displayed in the report.
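The demonstration combines both conditions in one WHERE statement. As an alternative that the demonstration does not cover, SAS also lets a second WHERE statement add to the first one instead of replacing it, by using WHERE SAME AND. The sketch below uses a small made-up data set named Work.Staff (hypothetical names and salaries) so that it can run without the practice files.

```sas
/* Hypothetical data, for illustration only. */
data work.staff;
   input Last_Name $ Country $ Salary;
   datalines;
Adams AU 80000
Baker US 90000
Clark AU 30000
;
run;

proc print data=work.staff;
   where Salary > 75000;
   /* SAME AND augments the previous WHERE condition rather than replacing it. */
   where same and Country = 'AU';
run;
```

With this data, only the observation for Adams satisfies both conditions. A single WHERE statement with AND, as in step 7, gives the same result and is usually easier to read.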

Grouping Observations in Reports

In this demonstration, you create a report that displays grouped output. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor. The PROC SORT step sorts the data by Country, then by Gender in descending order, and then by Last_Name. In the PROC PRINT step, the BY statement groups the data by Country, and then by Gender in descending order.

    proc sort data=orion.sales out=work.sort;
       by Country descending Gender Last_Name;
    run;
    proc print data=work.sort;
       by Country descending Gender;
    run;

2. Submit the program and view the output. Notice that a title now appears at the top of the PROC PRINT report, indicating that the observations are grouped by Country and by Gender. This first section of the report contains observations for employees in Australia who are male. Notice that the next report section is for employees in Australia who are female. The Gender values are listed in descending order within each country.

Lesson 13 Using the Output Delivery System to Create Reports

Lesson 13: Using the Output Delivery System to Create Reports

Opening and Closing the LISTING Destination

In this demonstration, you run a program that opens the LISTING destination, creates a report, and then closes the LISTING destination. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

Note: You must have SAS/GRAPH installed in order to run the program in this demonstration.

1. Turn off HTML output.

   SAS in Windows and UNIX: Select Tools > Options > Preferences. On the Results tab, clear the Create HTML check box. Click OK.

   SAS Enterprise Guide: Select Tools > Options. Select Results > General, then clear the HTML check box. Click OK.

2. Submit the following code:

    ods listing;
    proc freq data=orion.sales;
       tables Country;
    run;
    proc gchart data=orion.sales;
       hbar Country / nostats;
    run;
    ods listing close;
    proc print data=orion.sales;
    run;

3. View the output. Notice that the first two procedures created output, but no output was created from the PROC PRINT step.

   Note: In SAS Enterprise Guide, only the output from the FREQ procedure is created because the LISTING destination sends GRAPH output to the GRAPH window, which is not available in SAS Enterprise Guide.

4. View the log. After the PROC PRINT code, notice the warning that indicates that no output destinations were active. SAS cannot display output unless a destination is open. So, make sure that you always leave an ODS destination open at the end of your programs.
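One way to follow the advice in step 4 is to reopen a destination as soon as you no longer need it closed. The sketch below is not part of the demonstration. It assumes that HTML output is turned off, as in step 1, and it uses the SASHELP.CLASS sample data set that ships with SAS so that it does not need SAS/GRAPH or the practice files.

```sas
ods listing close;      /* from here on, no destination is open */

/* This step produces no output; the log notes that no destination is active. */
proc print data=sashelp.class(obs=3);
run;

ods listing;            /* reopen the LISTING destination */

/* This step produces listing output again. */
proc print data=sashelp.class(obs=3);
run;
```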

Creating Reports as HTML, PDF, and RTF Files

In this demonstration, you run a program that sends the output from two procedures to the HTML, PDF, and RTF destinations. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Turn off HTML output.

   SAS in Windows and UNIX: Select Tools > Options > Preferences. On the Results tab, clear the Create HTML check box. Click OK.

   SAS Enterprise Guide: Select Tools > Options. Select Results > General, then clear the HTML check box. Click OK.

2. Copy and paste the following code into the editor:

    ods _all_ close;
    ods html file='my-file-path\myreport.html';
    ods pdf file='my-file-path\myreport.pdf';
    ods rtf file='my-file-path\myreport.rtf';
    proc freq data=orion.sales;
       tables Country;
       title 'Report 1';
    run;
    proc print data=orion.sales;
       var First_Name Last_Name Job_Title Country Salary;
       where Salary > 75000;
       title 'Report 2';
    run;
    title;
    ods _all_ close;
    ods listing;

   Windows: Replace my-file-path with the location where you stored the practice files.

   UNIX and z/OS: Specify the fully qualified path in your operating environment.

   SAS Enterprise Guide: Replace my-file-path with the path to a location where you have write access. In some cases, this might be the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

   z/OS: For the HTML and RTF output, add RS=NONE to the end of your ODS statements, as shown below:

    ods html file='my-file-path\myreport.html' rs=none;
    ods pdf file='my-file-path\myreport.pdf';
    ods rtf file='my-file-path\myreport.rtf' rs=none;

   For more information, see Using the RS= Option in the ODS Statement in Appendix F: z/OS Notes.

3. Examine the code. The first ODS statement closes all destinations. The next three ODS statements open the HTML, PDF, and RTF destinations, respectively. The program contains two PROC steps: PROC FREQ and PROC PRINT. The PROC FREQ report has the title Report 1 and the PROC PRINT report has the title Report 2. How many output files will SAS create? SAS will create three output files, one for each destination. Each file will contain the output from the two procedures. At the end of the program, ODS statements close all destinations and then open the LISTING destination.

4. Submit the program. In the Results window, right-click Freq: Report 1 and select Expand All. Notice that the program created output in HTML, PDF, and RTF format.

   Note: Depending on your computer's settings and the output file type, the output that you send to external files might open in other applications instead of the Results Viewer window in SAS. In SAS Enterprise Guide, shortcuts to your files will be added to the Project Explorer window.

5. Open the HTML output. Notice that both reports are shown here on the same page. HTML output does not have page breaks.

6. Open the PDF output. Notice the table of contents on the left. Also, notice that only Report 1 appears on page 1. Click the arrow to see Report 2 on the next page. In PDF output, each report starts on a new page.

7. Open the RTF output. The RTF output may open in Microsoft Word. As with the PDF output, each report appears on a separate page. When you're finished viewing the RTF output, close Word.

   For more information, see Page Breaks in RTF Documents in Appendix D: Details.

If you are using SAS in the Windows operating environment, you can use Windows Explorer to navigate to the location where you stored the output files. Find the three files named myreport. If you want, you can open your output files directly from the operating environment.

Applying Style Definitions to HTML, PDF, and RTF Output

In this demonstration, you apply style definitions to HTML, PDF, and RTF reports. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Turn off HTML output.

   SAS in Windows and UNIX: Select Tools > Options > Preferences. On the Results tab, clear the Create HTML check box. Click OK.

   SAS Enterprise Guide: Select Tools > Options. Select Results > General, then clear the HTML check box. Click OK.

2. Copy and paste the following code into the editor. Notice that style definitions are specified in the ODS statements that create HTML, PDF, and RTF output. In the last demonstration, you ran this program without style definitions.

    ods _all_ close;
    ods html file='my-file-path\myreport.html' style=sasweb;
    ods pdf file='my-file-path\myreport.pdf' style=journal;
    ods rtf file='my-file-path\myreport.rtf' style=ocean;
    proc freq data=orion.sales;
       tables Country;
       title 'Report 1';
    run;
    proc print data=orion.sales;
       var First_Name Last_Name Job_Title Country Salary;
       where Salary > 75000;
       title 'Report 2';
    run;
    title;
    ods _all_ close;
    ods listing;

   Windows: Replace my-file-path with the location where you stored the practice files.

   UNIX and z/OS: Specify the fully qualified path in your operating environment.

   SAS Enterprise Guide: Replace my-file-path with the path to a location where you have write access. In some cases, this might be the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

   z/OS: For the HTML and RTF output, add RS=NONE to the end of your ODS statements, as shown below:

    ods html file='my-file-path\myreport.html' style=sasweb rs=none;
    ods pdf file='my-file-path\myreport.pdf' style=journal;
    ods rtf file='my-file-path\myreport.rtf' style=ocean rs=none;

   For more information, see Using the RS= Option in the ODS Statement in the z/OS Operating Environment in Appendix F: z/OS Notes.

3. Submit the program.

4. View the HTML output. We applied the sasweb style definition to this output. Notice the blue headings.

5. View the PDF output, which has the journal style definition applied.

6. View the RTF output in Word. The ocean style definition uses subtle colors in the titles and the table headings. When you're finished viewing the RTF output, close Word.

Creating Files That Microsoft Excel Can Open

In this demonstration, you run a program that sends the output from two procedures to the CSVALL, MSOFFICE2K, and EXCELXP destinations. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Turn off HTML output.

   SAS in Windows and UNIX: Select Tools > Options > Preferences. On the Results tab, clear the Create HTML check box. Click OK.

   SAS Enterprise Guide: Select Tools > Options. Select Results > General, then clear the HTML check box. Click OK.

2. Copy and paste the following code into the editor:

    ods _all_ close;
    ods csvall file='my-file-path\myexcel.csv';
    ods msoffice2k file='my-file-path\myexcel.html';
    ods tagsets.excelxp file='my-file-path\myexcel.xml';
    proc freq data=orion.sales;
       tables Country;
       title 'Report 1';
    run;
    proc print data=orion.sales;
       var First_Name Last_Name Job_Title Country Salary;
       where Salary > 75000;
       title 'Report 2';
    run;
    title;
    ods _all_ close;
    ods listing;

   Windows: Replace my-file-path with the location where you stored the practice files.

   UNIX and z/OS: Specify the fully qualified path in your operating environment.

   SAS Enterprise Guide: Replace my-file-path with the path to a location where you have write access. In some cases, this might be the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

   z/OS: Add RS=NONE to the end of your ODS statements, as shown below:

    ods csvall file='my-file-path\myexcel.csv' rs=none;
    ods msoffice2k file='my-file-path\myexcel.html' rs=none;
    ods tagsets.excelxp file='my-file-path\myexcel.xml' rs=none;

   For more information, see Using the RS= Option in the ODS Statement in Appendix F: z/OS Notes. For more information about using options with the EXCELXP destination, see Using Options with the EXCELXP Destination in Appendix D: Details.

3. View the files you created.

   Note: You may not be able to view these files if you are not using SAS in the Windows operating environment.

4. The CSV file may open automatically in Excel. View the file.

5. In Windows Explorer, navigate to the location where you stored your output files. Right-click the HTML file and open it in Excel. View the file.

6. In Windows Explorer, right-click the XML file and open it in Excel. View the file. Notice that each of the three files has a somewhat different appearance by default.

7. Close the files you opened in Excel.

Lesson 14 Creating Summary Reports and Data Sets

Lesson 14: Creating Summary Reports and Data Sets

Creating a Frequency Report without Specifying Variables

In this demonstration, you see what happens when you create a frequency report without specifying variables. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor:

    proc freq data=orion.sales;
    run;

2. Submit the step. Notice that the Results window lists all the tables that this step produced. By default, without a TABLES statement, PROC FREQ creates a separate one-way frequency table for each variable in the data set.

3. Look at the first table, which shows frequency statistics for Employee_ID. This is what a one-way frequency table looks like by default.

4. Scroll down to see the length of the frequency table for Employee_ID. To avoid generating long frequency tables that might or might not be useful, you can use a TABLES statement to control the variables in a PROC FREQ report.

Creating a Report with One-Way and Two-Way Tables

In this demonstration, you run the PROC FREQ step to create a report that has one-way and two-way tables. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following PROC FREQ step into the editor to create two one-way frequency tables:

    proc freq data=orion.sales;
       tables Gender Country;
    run;

2. Submit the step and look at the first table in the output. What percent of Orion Star sales employees are female? Female sales employees represent about 41% of the Orion Star sales force.

3. Look at the second table in the output. Which country has more sales employees? The United States has more sales employees than Australia.

4. To create a two-way crosstabulation table that shows the number of female and male sales employees by country, add an asterisk between the two variable names in the TABLES statement. The modified code is shown below:

    proc freq data=orion.sales;
       tables Gender*Country;
    run;

5. Submit the modified step and view the output. Look at the statistics for female sales employees in Australia. Orion.Sales contains observations for 27 female employees in Australia. These 27 employees are 16% of the total Orion Star sales force, 40% of all female sales employees, and 43% of all Australian sales employees. Which country has a higher percentage of female sales employees? In Australia, over 42% of sales employees are female. In the United States, only about 40% are female.
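Each percentage in a crosstabulation cell is the cell count divided by a different total. As a rough check that is not part of the demonstration, the sketch below recomputes two of the figures for the 27 female employees in Australia. It takes the total of 165 observations and the 63 Australian observations from other demonstrations in this course; the variable names are made up for illustration.

```sas
data _null_;
   cell      = 27;    /* female sales employees in Australia */
   total     = 165;   /* all observations in Orion.Sales */
   col_total = 63;    /* all sales employees in Australia */

   pct_total = 100 * cell / total;       /* percent of the whole table */
   pct_col   = 100 * cell / col_total;   /* percent of the Australia column */

   put pct_total= 6.2 pct_col= 6.2;      /* about 16.36 and 42.86 */
run;
```

The results round to the 16% and 43% quoted in step 5. The row percentage (40% of all female sales employees) is computed the same way, using the total number of female employees as the denominator.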

Creating an Enhanced Frequency Report

In this demonstration, you run a program that creates an enhanced PROC FREQ report. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor. This program is an enhanced version of the PROC FREQ step that you ran in the last demonstration:

    proc format;
       value $ctryfmt 'AU'='Australia'
                      'US'='United States';
    run;
    ods rtf file='my-file-path\myprocfreq1.rtf' style=sasweb;
    proc freq data=orion.sales;
       tables Gender*Country;
       where Job_Title contains 'Rep';
       format Country $ctryfmt.;
       title 'Sales Rep Frequency Report';
    run;
    ods rtf close;

   Windows: Replace my-file-path with the location where you stored the practice files.

   UNIX and z/OS: Specify the fully qualified path in your operating environment.

   SAS Enterprise Guide: Replace my-file-path with the path to a location where you have write access. In some cases, this might be the location where your practice files are stored. If necessary, change the slash that precedes the file name to match those in the path you've entered.

2. Submit the program and view the RTF output in Word. Notice that the style definition that's specified in the ODS statement changed the colors and fonts. As the title indicates, the report is now based only on sales employees who have Rep in their job title. Full country names are now displayed instead of country codes. This enhanced report looks quite different from the basic report!

Controlling the Pages in a Frequency Report

In this demonstration, you run three versions of a PROC FREQ step to control the pagination of the output. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste this PROC FREQ step into the editor:

    proc freq data=orion.sales;
       tables Gender Country Hire_Date;
       format Hire_Date date9.;
    run;

2. Submit the code. View the listing output to see how SAS places the frequency tables on pages by default. Notice that the first two tables appear on the first page of the report. There's a lot of empty space at the bottom of this page. However, SAS starts the third table, for Hire_Date, on page 2. If the third one-way table were small enough to fit entirely on page 1, would SAS place it on page 1? Yes, SAS does place a new one-way table on the current page if the entire table fits on that page. Notice that the Hire_Date table extends to the fourth page of the report.

3. Add the PAGE option to the PROC FREQ step, as shown below:

    proc freq data=orion.sales page;
       tables Gender Country Hire_Date;
       format Hire_Date date9.;
    run;

4. Submit the modified PROC FREQ step and view the listing output. Notice that the first page now contains only the table for Gender. Each table in the report starts on a new page. The entire report has expanded to five pages.

5. Add the COMPRESS option to the PROC FREQ step, as shown below:

    proc freq data=orion.sales compress;
       tables Gender Country Hire_Date;
       format Hire_Date date9.;
    run;

6. Submit the modified PROC FREQ step and view the listing output. Notice that the first page now contains the first two tables as well as the beginning of the table for Hire_Date. The report now ends on page 3, so it is shorter.

Creating an Output Data Set with the Default Variables

In this demonstration, you run three different versions of a program that creates a PROC FREQ output data set with the default variables. You compare the contents of the PROC FREQ report with the contents of the output data set, and then you suppress the PROC FREQ report. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor:

    proc freq data=orion.sales;
       tables Gender Country / out=freq1;
    run;
    proc print data=freq1;
    run;

2. Submit the code. In the Results window, notice that SAS created both a PROC FREQ report and a PROC PRINT report that displays the contents of the output data set. View and compare the reports. The PROC FREQ report contains one-way frequency tables for both Gender and Country, and displays four statistics by default: Frequency, Percent, Cumulative Frequency, and Cumulative Percent. The PROC PRINT report shows that the output data set contains summary statistics for the variable Country, not for Gender. This output data set has two observations, one for each distinct value of the variable Country. In addition to the Country variable, the output data set also contains the automatic variables COUNT and PERCENT. The Obs column, of course, is not a variable in the data set. Which statistics appear in the PROC FREQ report but not in the output data set? The output data set does not contain the cumulative frequency and cumulative percent.

3. Modify the PROC FREQ step so that it creates a two-way crosstabulation table for Gender and Country. Add an asterisk between the two variable names and change the name of the output data set to Freq2, as shown below:

    proc freq data=orion.sales;
       tables Gender*Country / out=freq2;
    run;
    proc print data=freq2;
    run;

4. Submit the code and view the two reports. In the PROC FREQ report, the crosstabulation table displays frequency counts, percentages, row percentages, column percentages, and totals. The PROC PRINT report shows that this output data set, Freq2, has more observations and variables than the output data set you generated before, Freq1. For a crosstabulation table, the output data set contains one observation for each combination of variable values. This data set contains both of the variables Gender and Country as well as the automatic variables COUNT and PERCENT. So, for a crosstabulation table, many of the statistics shown in the PROC FREQ report do not appear in an output data set by default.

5. To suppress the PROC FREQ report, add the NOPRINT option to the PROC FREQ statement, as shown below:

    proc freq data=orion.sales noprint;
       tables Gender*Country / out=freq2;
    run;
    proc print data=freq2;
    run;

6. Submit the code. Notice that only a PROC PRINT report is listed in the Results window.
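The PROC PRINT report in step 2 shows that the OUT= option captured only the last table request, Country. One way to capture a one-way table for each variable, sketched here, is to give each variable its own TABLES statement with its own OUT= option. The data set names Freq_Gender and Freq_Country are hypothetical, and the sketch assumes that the Orion library is defined as in the demonstration.

```sas
proc freq data=orion.sales noprint;
   /* Each TABLES statement writes its own output data set. */
   tables Gender  / out=work.freq_gender;
   tables Country / out=work.freq_country;
run;

proc print data=work.freq_gender;
run;

proc print data=work.freq_country;
run;
```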

Creating Output Data Sets with Additional Statistics

In this demonstration, you use PROC FREQ to create output data sets that have additional statistics. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor. The first PROC FREQ step creates an output data set based on a one-way frequency table. The other PROC FREQ step creates an output data set based on a two-way crosstabulation table.

    proc freq data=orion.sales noprint;
       tables Gender Country / out=freq3;
    run;
    proc print data=freq3;
    run;
    proc freq data=orion.sales noprint;
       tables Gender*Country / out=freq4;
    run;
    proc print data=freq4;
    run;

2. In each PROC FREQ step, add an option to the TABLES statement to add statistics to the output data set. For a one-way frequency table, the OUTCUM option adds the cumulative frequency and the cumulative percentage to the output data set. For a crosstabulation table, the OUTPCT option adds the percentage of column frequency and the percentage of row frequency to the output data set.

    proc freq data=orion.sales noprint;
       tables Gender Country / out=freq3 outcum;
    run;
    proc print data=freq3;
    run;
    proc freq data=orion.sales noprint;
       tables Gender*Country / out=freq4 outpct;
    run;
    proc print data=freq4;
    run;

3. Submit the code and look at the PROC PRINT reports for both output data sets. In the first data set, Freq3, the CUM_FREQ and CUM_PCT columns now appear. The second data set, Freq4, now contains the additional columns PCT_ROW and PCT_COL.

Creating an Output Data Set with Chi-Square Statistics

In this demonstration, you use PROC FREQ to create an output data set that contains chi-square statistics. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor. In this program, the TABLES statement specifies two one-way frequency tables, for Gender and Country. In the OUTPUT statement, the OUT= option creates the output data set Freq5. The CHISQ option indicates that we want the output data set to include chi-square statistics.

    proc freq data=orion.sales;
       tables Gender Country;
       output out=freq5 chisq;
    run;
    proc print data=freq5;
    run;

2. Submit the program. In the Results window, notice that this program created only a PROC FREQ report, not a PROC PRINT report.

3. Check the log to see what happened. In the PROC PRINT step, an error message indicates that Freq5 does not exist. Closer to the PROC FREQ step, a warning message says that no statistics are requested in the TABLES statement. If you don't specify the CHISQ option in the TABLES statement, SAS cannot calculate the chi-square statistics. There were no statistics to include in an output data set, so PROC FREQ could not create the data set.

4. Look at the PROC FREQ report. The TABLES statement does not request any statistics, so the PROC FREQ report contains only two one-way frequency tables.

5. Add a forward slash and the CHISQ option to the TABLES statement, as shown below:

    proc freq data=orion.sales;
       tables Gender Country / chisq;
       output out=freq5 chisq;
    run;
    proc print data=freq5;
    run;

6. Submit the modified program. This time, the Results window shows that the program generated both PROC FREQ and PROC PRINT output.

7. Look at the PROC FREQ report. Now, this report contains a table of chi-square statistics for each variable, in addition to the one-way frequency table.

8. Look at the PROC PRINT report to see the contents of the output data set. The N variable indicates that there are 165 observations. The _PCHI_ variable contains the chi-square statistic. The other two columns contain the degrees of freedom and the p-value, respectively.

Creating Multiple Output Data Sets

In this demonstration, you run a PROC FREQ step to create multiple output data sets. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor. In the PROC FREQ step, notice that the TABLES statement specifies only one variable, Country. After the PROC FREQ step, we've added two PROC PRINT steps to print the two output data sets.

    proc freq data=orion.sales;
       tables Country / chisq out=freq6 outcum;
       output out=freq7 chisq;
    run;
    proc print data=freq6;
    run;
    proc print data=freq7;
    run;

2. Submit the code. This program generated a PROC FREQ report and two PROC PRINT reports. The PROC FREQ report contains the default statistics from the one-way frequency table, the cumulative statistics that the OUTCUM option specifies, and a separate table containing chi-square statistics. The first PROC PRINT report shows that the output data set Freq6 contains the same statistics as the first table in the PROC FREQ report. Notice that, even though the CHISQ option appears in the TABLES statement, this output data set does not contain chi-square statistics. The second PROC PRINT report shows that the output data set Freq7 contains chi-square statistics.

Creating a PROC MEANS Report with Grouped Data

In this demonstration, you create a PROC MEANS report that has grouped data. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste this PROC MEANS step in the editor:

   proc means data=orion.sales;
      var Salary;
      class Gender Country;
   run;

2. Submit the code. Notice that this report has three more columns and more rows than the basic PROC MEANS report we saw earlier. The table now has a column for each of the class variables, Gender and Country. PROC MEANS reports statistics for each class level, so there's a row for each combination of class variable values.

   What is the mean salary for male employees in Australia?
   The mean salary for male employees in Australia is $32,001.39.

3. Notice that, when you use the CLASS statement, PROC MEANS displays the additional statistic N Obs in your report. Remember that the N statistic counts all non-missing values of the analysis variable. However, N Obs reports the total number of observations that PROC MEANS processes for each class level, whether or not there are missing values. If a data set has missing values, the values of N and N Obs for some levels are different.

   Based on the values of N and N Obs, does this data set have missing values of Salary?
   For each level, the values of N and N Obs are identical. This data set does not have missing values.
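
Orion.Sales has no missing salaries, so it cannot show N and N Obs diverging. The following self-contained sketch uses a small made-up data set (work.salary_demo, which is not part of the practice files) with one missing Salary value to illustrate the difference.

```sas
/* Hypothetical four-row data set; one Salary value is missing. */
data work.salary_demo;
   input Gender $ Salary;
   datalines;
F 30000
F .
M 28000
M 32000
;
run;

/* For Gender=F, N Obs is 2 but N is 1, because the missing  */
/* salary is counted as an observation but not as a value.   */
/* For Gender=M, N Obs and N are both 2.                     */
proc means data=work.salary_demo n mean;
   class Gender;
   var Salary;
run;
```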

Specifying and Formatting Statistics in PROC MEANS

In this demonstration, you run a PROC MEANS step that specifies and formats statistics in a report. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor. This is the PROC MEANS step that you ran in the last demonstration with the statistic keywords SUM, MEAN, and RANGE added.

   proc means data=orion.sales sum mean range;
      var Salary;
      class Gender Country;
   run;

2. Submit the code and view the output. The PROC MEANS report now has an N Obs column and the three statistics that are specified in the PROC MEANS statement. The default statistics do not appear. Notice that the sum, mean, and range values are displayed with two decimal places.

3. To display just the sum and the mean, with the mean displayed first, specify MEAN and SUM in the PROC MEANS statement. To set the decimal places to zero, add the option MAXDEC=0. The modified code is shown below:

   proc means data=orion.sales mean sum maxdec=0;
      var Salary;
      class Gender Country;
   run;

4. Submit the code. The PROC MEANS report displays the two statistics we specified. The values of MEAN and SUM have no decimal places.
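
The N Obs column appears in these reports because of the CLASS statement. If you do not want it, the NONOBS option in the PROC MEANS statement should suppress it. This option is not covered in the demonstration; the sketch below is one possible variation on the final step.

```sas
/* Sketch: same report as the last step, without the N Obs column. */
proc means data=orion.sales mean sum maxdec=0 nonobs;
   var Salary;
   class Gender Country;
run;
```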

Creating an Output Data Set with Specified Statistics

In this demonstration, you run a PROC MEANS step to create an output data set with statistics that you specify. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor:

   proc means data=orion.sales noprint;
      var Salary;
      class Gender Country;
      output out=means2 min=minSalary max=maxSalary sum=sumSalary mean=aveSalary;
   run;

   proc print data=means2;
   run;

2. Submit the code. As the PROC PRINT output shows, the first four variables in this data set are the same four variables that appeared before you specified statistics in the OUTPUT statement: the two class variables and the automatic variables _TYPE_ and _FREQ_. However, the _STAT_ and analysis variable columns no longer appear. In their place are the four variables that you named in the OUTPUT statement: minSalary, maxSalary, sumSalary, and aveSalary.
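
With two class variables, the output data set contains rows for several values of _TYPE_: the overall summary (_TYPE_=0), each class variable on its own, and each combination of Gender and Country (_TYPE_=3). If you need only the rows for the full combinations, one option is NWAY in the PROC MEANS statement. This is a sketch, not part of the demonstration, and the data set name work.means_nway is hypothetical.

```sas
/* Sketch: NWAY keeps only the rows with the highest _TYPE_ value, */
/* that is, one row per Gender-Country combination.               */
proc means data=orion.sales noprint nway;
   var Salary;
   class Gender Country;
   output out=work.means_nway mean=aveSalary;   /* hypothetical name */
run;

proc print data=work.means_nway;
run;
```

You could get the same rows from Means2 by subsetting on _TYPE_=3, for example with a WHERE statement in the PROC PRINT step.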

Creating Simple One-, Two-, and Three-Dimensional Tables

In this demonstration, you run a PROC TABULATE step to create simple one-, two-, and three-dimensional tables. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor:

   proc tabulate data=orion.sales;
      class Job_Title Gender Country;
      var Salary;
      table Country;
      table Gender, Country;
      table Job_Title, Gender, Country;
      table Country*Salary;
   run;

2. Submit the code and look at the HTML output. Notice that the default SAS title appears at the top of each table. As expected, the Country variable appears in the columns in the first three tables. In the last table, which crosses Country and Salary in the column dimension, both variables appear in the columns.

3. Look at the listing output. Notice that the output from each TABLE statement appears on a separate page. The three-dimensional table is a series of tables, and each one appears on its own page.
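
The tables above use the default statistics. As a possible next step, you can cross an analysis variable with a statistic keyword and a format in the TABLE statement. The sketch below goes slightly beyond the demonstration: it requests the mean salary for each Gender and Country combination and displays it with the DOLLAR12. format.

```sas
/* Sketch: two-dimensional table with Gender in the rows and the */
/* mean of Salary for each Country in the columns.               */
proc tabulate data=orion.sales;
   class Gender Country;
   var Salary;
   table Gender, Country*Salary*mean*f=dollar12.;
run;
```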

Creating an Output Data Set with the Default Statistics

In this demonstration, you run a PROC TABULATE step to create an output data set with the default statistics. Make sure you've defined the Orion library to the location where you stored the practice files. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following code into the editor:

   proc tabulate data=orion.sales out=tabulate1;
      where Job_Title contains 'Rep';
      class Job_Title Gender Country;
      table Country;
      table Gender, Country;
      table Job_Title, Gender, Country;
   run;

   proc print data=tabulate1;
   run;

2. Submit the program and look at the PROC PRINT output to see the contents of the output data set. Notice that the three class variables are listed first: Job_Title, Gender, and Country. The three automatic variables _TYPE_, _PAGE_, and _TABLE_ come next. The last variable is the statistic N.
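
The output data set stacks the rows from all three TABLE statements, and the automatic variable _TABLE_ records which statement produced each row. Assuming that you have already created Tabulate1 with the step above, a sketch like the following prints only the rows that came from the second TABLE statement (Gender by Country).

```sas
/* Sketch: keep only the rows from the second TABLE statement. */
proc print data=tabulate1 noobs;
   where _TABLE_ = 2;
   var Gender Country N;
run;
```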

Lesson 15 Creating Graphs Using SAS/GRAPH Software

Lesson 15: Creating Graphs Using SAS/GRAPH Software

Viewing Bar Charts, Pie Charts, and Plots

Make sure you've defined the Orion library to the location of your sample data. For more information, see Setting Up Practice Files in the Reference section.

This demonstration shows you examples of bar charts, pie charts, and plots.

1. Submit the following program.

   Note: If you are using SAS Enterprise Guide, add DEV=ACTIVEX after RESET=ALL in the GOPTIONS statement.

   goptions reset=all;

   proc gchart data=orion.staff;
      vbar3d Job_Title / subgroup=Gender type=percent autoref;
      where Job_Title =:'Sales Rep';
      title 'Gender by Job Title';
   run;
   quit;

   proc gchart data=orion.staff;
      hbar job_title / group=Gender nostats;
      where Job_Title =:'Sales Rep';
      label Gender='Gender' Job_Title='Job Title';
      title 'Job Title by Gender';
   run;
   quit;

   proc gchart data=orion.staff;
      vbar3d Salary / autoref levels=5;
      where Job_Title =:'Sales Rep';
      label Gender='Gender' Job_Title='Job Title';
      title 'Five Levels of Salary for Sales Reps';
   run;
   quit;

   proc gchart data=orion.staff;
      pie3d Job_Title / sumvar=Salary type=sum noheading;
      where Job_Title =:'Sales Rep';
      title 'Total Salary by Job Title';
   run;
   quit;

   proc gplot data=orion.budget;
      plot Yr2006*Month Yr2007*Month / overlay haxis=1 to 12 vref=3000000 cframe="#CDD9EF";
      label Yr2006='Budget';
      format Yr2006 dollar12.;
      symbol1 i=join v=dot ci=blue cv=blue;
      symbol2 i=join v=triangle ci=red cv=red;
      title 'Plot of Budget by Month for 2006 and 2007';
   run;
   quit;

2. View the five graphs that the program creates.

   The first graph is a three-dimensional vertical bar chart showing the percent of the total for each job title. Notice that the bars are subgrouped into segments based on Gender. The reference lines, the legend, and the title make the graph easier to interpret.

   The second graph is a two-dimensional horizontal bar chart. Once again the bars represent job titles, but here they're grouped by Gender. The display of default statistics has been suppressed for this chart.

   The third graph is a three-dimensional vertical bar chart of Salary, a numeric variable. SAS/GRAPH has divided the values of Salary into five ranges, and each bar represents the number of sales reps in that salary range.

   The fourth graph is a three-dimensional pie chart. The slices represent the total salary for the job titles shown.

   The final graph is an overlay plot of budget by month. The program specifies reference lines, plotting symbols, a method for connecting plotting symbols, and a background color.

Examining a SAS/GRAPH Program

Make sure you've defined the Orion library to the location of your sample data. For more information, see Setting Up Practice Files in the Reference section.

This demonstration takes a closer look at the SAS/GRAPH program that created the graphs in the previous demonstration.

1. Copy and paste this program into the editor.

   Note: If you are using SAS Enterprise Guide, add DEV=ACTIVEX after RESET=ALL in the GOPTIONS statement.

   goptions reset=all;

   proc gchart data=orion.staff;
      vbar3d Job_Title / subgroup=Gender type=percent autoref;
      where Job_Title =:'Sales Rep';
      title 'Gender by Job Title';
   run;
   quit;

   proc gchart data=orion.staff;
      hbar job_title / group=Gender nostats;
      where Job_Title =:'Sales Rep';
      label Gender='Gender' Job_Title='Job Title';
      title 'Job Title by Gender';
   run;
   quit;

   proc gchart data=orion.staff;
      vbar3d Salary / autoref levels=5;
      where Job_Title =:'Sales Rep';
      label Gender='Gender' Job_Title='Job Title';
      title 'Five Levels of Salary for Sales Reps';
   run;
   quit;

   proc gchart data=orion.staff;
      pie3d Job_Title / sumvar=Salary type=sum noheading;
      where Job_Title =:'Sales Rep';
      title 'Total Salary by Job Title';
   run;
   quit;

   proc gplot data=orion.budget;
      plot Yr2006*Month Yr2007*Month / overlay haxis=1 to 12 vref=3000000 cframe="#CDD9EF";
      label Yr2006='Budget';
      format Yr2006 dollar12.;
      symbol1 i=join v=dot ci=blue cv=blue;
      symbol2 i=join v=triangle ci=red cv=red;
      title 'Plot of Budget by Month for 2006 and 2007';
   run;
   quit;

2. View the structure of the program. It contains a GOPTIONS statement and five PROC steps. Each PROC step contains a PROC GCHART or PROC GPLOT statement, followed by a statement that defines the form of the chart or plot and specifies options. The steps also contain statements you're familiar with, including LABEL, FORMAT, WHERE, and TITLE statements. You can see some unfamiliar statements, too. Each PROC step also contains a RUN statement and ends with a QUIT statement.

3. Submit the program. If your preferences specify HTML output, the output appears in the Results Viewer window. When you specify listing output, the GRAPH window displays your graphs.

Creating Bar Charts

This demonstration shows you how to create bar charts. Make sure you've defined the Orion library to the location of your sample data. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste the following program into the editor.

   Note: If you are using SAS Enterprise Guide, add DEV=ACTIVEX after RESET=ALL in the GOPTIONS statement.

   goptions reset=all cback=white;

   proc gchart data=orion.staff;
      vbar3d Gender;
      where Job_Title =:'Security';
      label Gender='Gender' Job_Title='Job Title';
      title 'Security Personnel';
   run;
   quit;

2. Submit the program and view the output.

3. Change the chart variable to Job_Title, resubmit the step, and view the output.

      vbar3d Job_Title;

4. Change the chart to a pie chart that specifies both Gender and Job_Title as chart variables in the PIE3D statement.

      pie3d Gender Job_Title;

5. Submit the program and view the output. Notice that PROC GCHART creates two pie charts in the same step.

Specifying Options for Bar Charts

This demonstration shows you how to specify options for bar charts. Make sure you've defined the Orion library to the location of your sample data. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste this program into the editor:

   proc gchart data=orion.staff;
      vbar3d Job_Title;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Security Personnel by Job Title';
   run;
   quit;

2. Submit the program and view the output. The program creates a vertical bar chart with Job_Title as the chart variable. The statistic is frequency, and the midpoints are unique values of Job_Title.

3. Add the TYPE= option and change the statistic to PERCENT. Also, add the AUTOREF option to add reference lines to the chart, and then update the TITLE statement to reflect the new statistic.

   proc gchart data=orion.staff;
      vbar3d Job_Title / autoref type=percent;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Percent of Security Personnel by Job Title';
   run;
   quit;

4. Submit the program and view the output.

5. Change the chart form to a two-dimensional horizontal bar chart, change the chart variable to Salary, and update the title again. Also, add the NOSTATS option to remove the statistics to the right of the chart.

   proc gchart data=orion.staff;
      hbar Salary / autoref type=percent nostats;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Percent of Salary for Security Personnel';
   run;
   quit;

6. Submit the program and view the output. Now each bar represents the midpoint of a numeric range.

7. Add the RANGE option to display the range instead of the midpoint value next to the bars, and add LEVELS=3 to specify three midpoints.

   proc gchart data=orion.staff;
      hbar Salary / autoref type=percent nostats range levels=3;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Percent of Salary for Security Personnel';
   run;
   quit;

8. Submit the program and view the output. Now the chart displays three bars and the ranges they represent for the numeric chart variable Salary.

9. Add the DISCRETE option to display a bar for each unique value of Salary.

   proc gchart data=orion.staff;
      hbar Salary / autoref type=percent nostats range levels=3 discrete;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Percent of Salary for Security Personnel';
   run;
   quit;

10. Submit the program and view the output. Notice that when you used DISCRETE, PROC GCHART ignored the LEVELS= option. Now the bar chart displays nine bars.

    What does this number of bars indicate?
    Nine bars indicate that Salary has nine unique values in the data set. The bars are all the same length because only one person has each particular salary.
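
LEVELS= lets PROC GCHART choose the midpoints, and DISCRETE uses every unique value. If you would rather choose the midpoints yourself, the MIDPOINTS= option is a third possibility that this demonstration does not cover. In the sketch below, the midpoint values (25000 to 35000 by 5000) are illustrative assumptions only; they are not taken from the course data, so adjust them to the actual range of Salary.

```sas
/* Sketch: explicit midpoints for a numeric chart variable.        */
/* The midpoint values are assumed for illustration, not derived   */
/* from Orion.Staff.                                               */
proc gchart data=orion.staff;
   hbar Salary / autoref type=percent nostats
                 midpoints=25000 to 35000 by 5000;
   where Job_Title =:'Security';
   label Job_Title='Job Title';
   title 'Percent of Salary for Security Personnel';
run;
quit;
```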

Summarizing, Subgrouping, and Grouping Variable Values

This demonstration shows you how to create more complex bar charts by summarizing a variable within categories, subgrouping and grouping bars, and adding some enhancements. Make sure you've defined the Orion library to the location of your sample data. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste this program into the editor:

   proc gchart data=orion.staff;
      vbar3d Job_Title;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Security Personnel by Job Title';
   run;
   quit;

2. Submit the program and view the output.

3. Make the following changes to the program:

   - Add the SUMVAR= option to summarize Salary for each security job title.
   - Add TYPE=MEAN to request the average salary rather than the total salary.
   - Specify the MEAN option to display the mean statistic above the bars.
   - Add the PATTERNID=MIDPOINT option to assign a different color to each bar.
   - Update the TITLE statement to describe the new chart.

   proc gchart data=orion.staff;
      vbar3d Job_Title / sumvar=Salary type=mean mean patternid=midpoint;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Average Salary for Security Personnel by Job Title';
   run;
   quit;

4. Submit another program to create a horizontal bar chart of Salary.

   proc gchart data=orion.staff;
      hbar Salary / autoref type=percent nostats range levels=3;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Percent of Salary for Security Personnel';
   run;
   quit;

5. Add the SUBGROUP= option to subgroup the chart by Gender.

   proc gchart data=orion.staff;
      hbar Salary / autoref type=percent nostats range levels=3 subgroup=gender;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Percent of Salary for Security Personnel';
   run;
   quit;

6. Submit the program and view the output.

   Which salary range has the highest percentage of males?
   The highest percentage of males is in the $27,500 and under range.

7. Add the GROUP= option to group the bars by Job_Title.

   proc gchart data=orion.staff;
      hbar Salary / autoref type=percent nostats range levels=3 subgroup=Gender group=Job_Title;
      where Job_Title =:'Security';
      label Job_Title='Job Title';
      title 'Percent of Salary for Security Personnel';
   run;
   quit;

8. Submit the program and view the output. Now the chart displays a group for each security job title. Notice that some combinations of variable values don't exist, so no bar appears for them.

9. Now submit this PROC GCHART step, which also specifies a grouping variable.

   proc gchart data=orion.staff;
      hbar3d Job_Title / sumvar=Salary type=mean mean patternid=midpoint group=Gender;
      where Job_Title =:'Security';
      label Job_Title='Job Title' Salary='Salary' Gender='Gender';
      title 'Average Salary for Security Personnel by Job Title and Gender';
   run;
   quit;

10. Finally, reverse the chart variable and the grouping variable:

   proc gchart data=orion.staff;
      hbar3d Gender / sumvar=Salary type=mean mean patternid=midpoint group=Job_Title;
      where Job_Title =:'Security';
      label Job_Title='Job Title' Salary='Salary' Gender='Gender';
      title 'Average Salary for Security Personnel by Job Title and Gender';
   run;
   quit;

11. Submit the program and view the output.
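
The last two charts summarize the mean of Salary for each combination of Job_Title and Gender. If you want to check the bar values against numbers, one way is a PROC MEANS step with the same WHERE condition and the same two variables as class variables. This sketch is an optional cross-check, not part of the demonstration.

```sas
/* Sketch: tabular counterpart to the grouped bar charts.         */
/* Each row shows the mean salary for one Job_Title-Gender pair.  */
proc means data=orion.staff mean maxdec=0;
   where Job_Title =:'Security';
   class Job_Title Gender;
   var Salary;
run;
```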

Creating Pie Charts

This demonstration shows you how to create pie charts. Make sure you've defined the Orion library to the location of your sample data. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste this program into the editor:

   proc gchart data=orion.staff;
      hbar Gender;
      where Job_Title =:'Security';
      label Job_Title='Job Title' Gender='Gender';
      title 'Security Personnel by Gender';
   run;
   quit;

2. Submit the program and view the output.

3. Change the HBAR statement to a PIE statement and resubmit the program.

   proc gchart data=orion.staff;
      pie Gender;
      where Job_Title =:'Security';
      label Job_Title='Job Title' Gender='Gender';
      title 'Security Personnel by Gender';
   run;
   quit;

4. Subgroup the pie chart by Job_Title:

   proc gchart data=orion.staff;
      pie Gender / subgroup=Job_Title;
      where Job_Title =:'Security';
      label Job_Title='Job Title' Gender='Gender';
      title 'Security Personnel by Gender';
   run;
   quit;

5. Submit the program and view the output. You see a separate concentric circle in the pie chart for each value of the subgroup variable Job_Title.

6. Change the subgrouping variable (Job_Title) to a grouping variable:

   proc gchart data=orion.staff;
      pie Gender / group=Job_Title;
      where Job_Title =:'Security';
      label Job_Title='Job Title' Gender='Gender';
      title 'Security Personnel by Gender and Job Title';
   run;
   quit;

7. Submit the program and view the output. PROC GCHART creates a separate pie chart for each value of the group variable.

8. Change the chart variable to Job_Title, remove the group variable, add SUMVAR= to display the total salary for each security job title, and specify the NOHEADING option to suppress the heading under the title. Then update the title to describe the chart.

   proc gchart data=orion.staff;
      pie Job_Title / sumvar=Salary noheading;
      where Job_Title =:'Security';
      label Job_Title='Job Title' Gender='Gender';
      title 'Total Salary by Job Title';
   run;
   quit;

9. Submit the program and view the output. Now you see the total salary for each security job title.
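
The final chart labels each slice with the default statistic. The PIE statement also has slice-labeling options that this demonstration does not use, such as PERCENT=, VALUE=, and SLICE=. The following sketch shows one possible combination; the placement choices are an illustration rather than a recommendation from the course.

```sas
/* Sketch: show each slice's share of total salary inside the     */
/* slice, with the job title and the salary total outside it.     */
proc gchart data=orion.staff;
   pie Job_Title / sumvar=Salary noheading
                   percent=inside value=outside slice=outside;
   where Job_Title =:'Security';
   label Job_Title='Job Title';
   title 'Total Salary by Job Title';
run;
quit;
```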

Creating and Enhancing Plots

This demonstration shows you how to create and enhance single and overlay plots. Make sure you've defined the Orion library to the location of your sample data. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste this program into the editor:

   proc gplot data=orion.yearly_saleshires;
      plot hires*year;
      title 'Single Plot of Hires by Year';
   run;
   quit;

2. Submit the program and view the output. The result is a white graph with plus signs for plotting symbols.

3. Add a SYMBOL statement to specify STAR as the plotting symbol, JOIN as the method of interpolation, and green as the color for both. Also specify a reference line at 25 on the vertical axis.

   proc gplot data=orion.yearly_saleshires;
      plot hires*year / vref=25;
      symbol1 v=star i=join cv=green ci=green;
      title 'Single Plot of Hires by Year';
   run;
   quit;

4. Submit the program and view the output. Now the plotted points are connected, and you see two major increases in hiring, in 1974 and 2006.

5. Create an overlay plot using the data set Orion.Budget:

   - Plot three variables, Yr2003, Yr2004, and Yr2005, against Month.
   - To overlay the three plots, specify the OVERLAY option.
   - Add a reference line at 2,500,000 on the vertical axis.
   - For the two new plots, add two more SYMBOL statements, each with a different line color.
   - Modify the title.
   - Label Yr2003 as Yearly Budget.
   - Add the CFRAME= option to specify the color Very Light Gray as the background color.

   proc gplot data=orion.budget;
      plot Yr2003*Month Yr2004*Month Yr2005*Month / overlay vref=2500000 cframe='Very Light Gray';
      symbol1 v=star i=join cv=green ci=green;
      symbol2 v=star i=join cv=green ci=red;
      symbol3 v=star i=join cv=green ci=blue;
      label Yr2003='Yearly Budget';
      title 'Overlay Plot of Budget by Month';
   run;
   quit;

6. Submit the program and view the output.

   What's the trend in these three plots?
   The budget is higher every year, but otherwise similar by month.
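
In the overlay plot above, all three SYMBOL statements use green stars, so only the line color distinguishes the years. As a possible refinement that is not part of the demonstration, you could match each symbol color to its line color and add the LEGEND option to the PLOT statement so that the plot identifies which line belongs to which variable.

```sas
/* Sketch: overlay plot with matching symbol and line colors      */
/* and a legend that identifies each plotted variable.            */
proc gplot data=orion.budget;
   plot Yr2003*Month Yr2004*Month Yr2005*Month /
        overlay legend vref=2500000;
   symbol1 v=star i=join cv=green ci=green;
   symbol2 v=star i=join cv=red   ci=red;
   symbol3 v=star i=join cv=blue  ci=blue;
   label Yr2003='Yearly Budget';
   title 'Overlay Plot of Budget by Month';
run;
quit;
```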

Enhancing Output and Using RUN-Group Processing

This demonstration shows you how to enhance SAS/GRAPH output and use RUN-group processing. Make sure you've defined the Orion library to the location of your sample data. For more information, see Setting Up Practice Files in the Reference section.

1. Copy and paste this program into the editor:

   proc gchart data=orion.staff;
      vbar3d Job_Title / sumvar=Salary type=mean mean patternid=midpoint;
      where Job_Title=:'Sales Rep';
      label Salary='Average Salary' Job_Title='Job Title';
   run;
   quit;

2. Submit the program and view the output.

3. Specify the ODS style Ocean and resubmit the program.

   Note: You might not be able to access ODS styles on all systems.

   ods html style=Ocean;

   proc gchart data=orion.staff;
      vbar3d Job_Title / sumvar=Salary type=mean mean patternid=midpoint;
      where Job_Title=:'Sales Rep';
      label Salary='Average Salary' Job_Title='Job Title';
   run;
   quit;

4. Notice that the text fonts and colors have changed, along with the background color outside the chart. PATTERNID=MIDPOINT still specifies colors for each bar, but now uses a different palette. Change the ODS style to Gears:

   ods html style=Gears;

   proc gchart data=orion.staff;
      vbar3d Job_Title / sumvar=Salary type=mean mean patternid=midpoint;
      where Job_Title=:'Sales Rep';
      label Salary='Average Salary' Job_Title='Job Title';
   run;
   quit;

5. Submit the program and view the output. The chart colors and text features are different once again, and the chart has a background image.

6. Submit a GOPTIONS statement to change attributes for all the text in graphs. Add a TITLE statement without any text options.

   goptions ftext='Albany AMT' ctext=dark_blue htext=3 pct;
   title 'Average Salary by Job Title';
   ods html style=Gears;

   proc gchart data=orion.staff;
      vbar3d Job_Title / sumvar=Salary type=mean mean patternid=midpoint;
      where Job_Title=:'Sales Rep';
      label Salary='Average Salary' Job_Title='Job Title';
   run;
   quit;

7. Submit the program and view the output. You can see that the GOPTIONS options are overriding the text options in the ODS style and controlling all the text in the graph.

8. Edit the TITLE statement to add text options for color, size, and font. Specify a 5 percent high black title in the font Trebuchet MS.

   goptions ftext='Albany AMT' ctext=dark_blue htext=3 pct;
   title f='Trebuchet MS' c=black h=5 pct 'Average Salary by Job Title';
   ods html style=Gears;

   proc gchart data=orion.staff;
      vbar3d Job_Title / sumvar=Salary type=mean mean patternid=midpoint;
      where Job_Title=:'Sales Rep';
      label Salary='Average Salary' Job_Title='Job Title';
   run;
   quit;

9. Submit the program and view the output. The TITLE statement options override the global text options for the title only.

10. Now use RUN-group processing to add another chart to the PROC GCHART step.

    Note: The remainder of this demonstration is not applicable to SAS Enterprise Guide.

11. Add a PIE statement and a RUN statement. Then submit only these two statements.

    Note: If you are using SAS Enterprise Guide, add these statements and submit the entire program again.

      pie Job_Title / sumvar=Salary;
   run;

12. View the output. The Results window now shows both charts under the GCHART output. Both charts are stored in the same HTML file, so you can scroll in the Results Viewer window to see the pie chart. The ODS style options, GOPTIONS options, and TITLE statement options also apply to this output.

13. Submit a QUIT statement to end the procedure.

    Note: SAS Enterprise Guide automatically submits a QUIT statement.

   quit;
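
The interactive steps above submit the RUN groups one at a time. The same idea can be sketched as a single program: each RUN statement closes one group and produces one chart, and the procedure stays active until QUIT. This is a condensed illustration of the technique rather than the program used in the demonstration.

```sas
/* Sketch: two RUN groups in one PROC GCHART step.               */
proc gchart data=orion.staff;
   /* The WHERE and LABEL statements stay in effect for later    */
   /* RUN groups in the same step.                               */
   where Job_Title=:'Sales Rep';
   label Salary='Average Salary' Job_Title='Job Title';

   /* First RUN group: bar chart of average salary. */
   vbar3d Job_Title / sumvar=Salary type=mean mean patternid=midpoint;
run;

   /* Second RUN group: pie chart of total salary. */
   pie Job_Title / sumvar=Salary;
run;

/* QUIT ends the procedure. */
quit;
```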
