
Exercise 2

Siting an Internet café in Orange County, California

In this exercise, you will have the opportunity to determine the best site for an Internet café in Orange County, California, using GIS, census demographic data, transportation data, and landmark features data. You will also consider the effect that public domain spatial data from the US Census Bureau has on your final decision.

Exercises for The GIS Guide to Public Domain Data

Context
Locating any business in a metropolitan area requires an appreciation of various spatial factors. GIS provides numerous powerful analytical capabilities for tackling the problem. In this activity, you will have the opportunity to site an Internet café using public domain spatial data, ArcGIS, and spatial analysis.

Problem
You are a new franchisee for the company InstantWorld and you wish to open an Internet café in Orange County, California. Demographic analysis indicates that Internet cafés do best in neighborhoods that meet the following criteria:

Where the percentage of 18- to 21-year-olds is over 10 percent of the total population of the neighborhood.

Within 1 kilometer of a high school, university, or college, to attract students.

Within 1 mile of a busy street (with a TIGER Census Feature Class Code (CFCC2) of A1, A2, or A3), to ensure a steady stream of customers.

At least 10 kilometers from the Pacific Ocean. Studies have shown that areas near the ocean are already saturated with Internet cafés catering to tourists checking their e-mail on vacation.
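The four criteria above can be read as a single yes/no test applied to each neighborhood. The following is a minimal Python sketch of that logic, not part of the exercise itself. The record fields (share_18_21, km_to_school, km_to_busy_street, km_to_ocean) and the sample values are hypothetical; in the exercise, the equivalent information is derived step by step in ArcGIS.

```python
from dataclasses import dataclass

KM_PER_MILE = 1.609344  # exact length of the international mile in kilometers


@dataclass
class Neighborhood:
    # Hypothetical, precomputed values for one block group.
    name: str
    share_18_21: float        # fraction aged 18 to 21 (0.10 means 10 percent)
    km_to_school: float       # distance to nearest high school, college, or university
    km_to_busy_street: float  # distance to nearest A1, A2, or A3 street
    km_to_ocean: float        # distance to the Pacific Ocean


def meets_criteria(n: Neighborhood) -> bool:
    """Return True only if all four siting criteria hold."""
    return (
        n.share_18_21 > 0.10
        and n.km_to_school <= 1.0
        and n.km_to_busy_street <= 1.0 * KM_PER_MILE
        and n.km_to_ocean >= 10.0
    )


# Made-up sample values, for illustration only.
candidates = [
    Neighborhood("A", 0.15, 0.6, 1.2, 14.0),  # passes every test
    Neighborhood("B", 0.22, 0.4, 0.3, 2.5),   # too close to the ocean
    Neighborhood("C", 0.08, 0.2, 0.1, 20.0),  # too few 18- to 21-year-olds
]
suitable = [n.name for n in candidates if meets_criteria(n)]
print(suitable)
```

Note that the street criterion is stated in miles while the others are in kilometers, so the sketch converts it before comparing.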

Skills Required
Downloading, formatting, and understanding Census TIGER spatial data and US Census demographic data
Joining tables
Performing tabular and spatial sorting and querying
Performing overlay and proximity analysis
Creating map layouts
Solving a problem based on spatial analysis

Resources
TIME: This exercise contains sixty-two questions and will require four to five hours to complete.
SOFTWARE: ArcGIS 10.0 or later, from Esri.

Work package 1: Accessing, downloading, and projecting data

Steps 1 - 10

1) Manage your data

Create a folder on your computer or on the network where you will store the data. Give this folder a descriptive name, so that you will remember what it contains, and avoid spaces in the name to prevent problems in your GIS-based analysis.

2) Access the data

Access the following website to download the data: http://www.esri.com/data/download/census2000_tigerline/index.html

3) Preview the data

Select Preview and Download from the navigation panel. When the Select a State page opens, select California by clicking the map or selecting from the list.

4) Submit selection

On the next page, select Orange County and click Submit Selection. Do not select any layers yet.

5) Select layers

On the next page, from the Available Data Layers list, select the following layers:
Block Groups 2000
Landmark Points
Landmark Polygons
Line Features - Roads
Water Polygons


Make sure you select block groups and not census blocks. A block group is a collection of census blocks, smaller in size than a census tract; it is the smallest area for which the detailed sample census data may be obtained. See the cartographic boundary files descriptions and metadata documents at the US Census Bureau's website for more information.

From the Available Statewide Layers list, check Census Block Group Demographics. Again, make sure you access block groups and not census blocks. Click Proceed to Download.

6) Download the data

When you receive the notice that your data file is ready (as shown next), click Download File and save your data to the folder you created at the beginning of this exercise.

Census data download notification.

7) Unzip the data

Use WinZip, 7-Zip, Windows OS self-extraction, or another unzipping program to unzip this file. Note that six zip files are nested inside this master zip file, plus a readme file in HTML format. In other words, the file you downloaded is a zip of zip files. These will be named similar to the following, with the master zip file shown first:

Downloaded Census data zip files.

8) Examine the file names

The Federal Information Processing Standard (FIPS) code contains a unique identifier for each geographic area. The FIPS code for California is 06.

8.1) What do you think that the FIPS code for Orange County is?

Unzip each of the archive files to extract the data. You should have the following files:
tgr06000sf1grp.dbf
tgr06059grp00.dbf, .shp, .shx
tgr06059lkA.dbf, .shp, .shx
tgr06059lpt.dbf, .shp, .shx
tgr06059lpy.dbf, .shp, .shx
tgr06059wat.dbf, .shp, .shx
readme.html

9) Examine the data

The files with the extension .shp are shapefiles, and may be thought of as the "G" part of GIS. Recall the discussion in chapter 2 about the relationship between counties, census tracts, block groups, and blocks. Block groups are analogous to neighborhoods and are the smallest statistical area for which the more detailed census data is available. You have just downloaded the block groups for Orange County, California.

Read the table that is embedded in the metadata file called readme.html and use the information to answer the following questions:

9.1) In what year was the data collected?

9.2) Indicate the filename prefix for the files that contain each of the following features:

Feature: File Prefix
Streets: __________
Block Group Boundaries: __________
Census Demographic data on population and housing for Block Groups: __________
Landmark points: __________
Landmark polygons: __________
Water polygons: __________

10) Establish a map projection

Examine the coordinate system information on the location from which you downloaded the data: http://www.esri.com/data/download/census2000_tigerline/index.html

10.1) What is the coordinate system of the data?

Start ArcGIS, open ArcCatalog, and navigate to the workspace where you unzipped your data. Click the metadata tab and look up the spatial metadata for each layer.

10.2) Has a map projection or coordinate system been explicitly defined for these datasets?

You would be fine working statistically with the data as downloaded, but since you want to work with data spatially (drawing buffers and other functions that require spatial analysis), you should define the coordinate system for each dataset using information from the TIGER download site. Otherwise, a buffer drawn from a point in the northern part of your dataset would not have the same dimensions as a buffer in the southern part of your dataset, due to the curvature of the earth.

Start ArcMap using Programs > ArcGIS > ArcMap, or by selecting the ArcMap tool from ArcCatalog.
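The remark above about buffers differing between the north and south of the dataset can be illustrated with a little arithmetic. The sketch below is an approximation only: it treats the earth as a sphere with a mean radius of 6371 km, and the two latitudes are rough, assumed values for the southern and northern edges of Orange County rather than figures taken from the data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


def km_per_degree_longitude(lat_deg: float) -> float:
    """Ground length of one degree of longitude at the given latitude."""
    return math.radians(1.0) * EARTH_RADIUS_KM * math.cos(math.radians(lat_deg))


def km_per_degree_latitude() -> float:
    """Ground length of one degree of latitude (constant on a sphere)."""
    return math.radians(1.0) * EARTH_RADIUS_KM


south, north = 33.4, 33.95  # rough, assumed latitudes for the county
for lat in (south, north):
    length = km_per_degree_longitude(lat)
    print(f"latitude {lat}: 1 degree of longitude is about {length:.2f} km")
print(f"1 degree of latitude is about {km_per_degree_latitude():.2f} km")
```

A degree of longitude covers slightly less ground in the north than in the south, and less than a degree of latitude everywhere in the county, so a distance expressed in degrees does not correspond to one fixed distance on the ground.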

Once inside ArcMap, open the Windows > Catalog window and navigate to System Toolboxes. Go to Data Management Tools > Projections and Transformations > Define Projection, and define the projection for the tgr06059lpt.shp file. Use Select, and on the following screens, navigate to Geographic Coordinate Systems > World > WGS 1984.

For the other four layers, you can save time by using the Import function to define the projection. When you use Import, import the defined coordinates for the tgr06059lpt file and apply that projection to the other layers.

Add all of the layers (shapefiles) you downloaded (streets, block group boundaries, landmark points, landmark polygons, and water polygons) to your data frame. Based on the information you entered into the table in question 9.2, rename the layers to streets, block group boundaries, landmark points, landmark polygons, and water polygons so that the layer names are more intuitive. Change your data frame name from Layers to Orange County. Save the map document in your working folder with a descriptive name, such as Internetcafesites.

10.3) What is the coordinate system of your spatial data?

10.4) What are your display units? Set your display units to kilometers.

Your datasets are now ready to analyze.

End of work package

Work package 2: Analyzing street and neighborhood data

Steps 11 - 16

11) Examine your street data

Make a unique value map for streets, symbolizing the streets based on the attribute cfcc2 (Census Feature Class Code 2). Be sure to add all values to see all the possible values for cfcc2. When you have finished, zoom in on the map and click the Identify (i) button.

11.1) Identify the three major arterial road types that carry the most traffic. What is the CFCC2 attribute for each of these road types?

11.2) What other information exists in the attribute table for streets?

12) Prepare demographic data

Using the Add Data tool, add the table containing your census data on housing and population for Block Groups (the dbf file tgr06000sf1grp). Open the table and examine the data.

12.1) How many records, or block groups, are contained in the table?

Open the attribute table for the Block Groups for Orange County. This indicates how many block groups exist in Orange County.

12.2) How many records are contained in this table, and therefore, how many block groups are there in Orange County?

12.3) What is the geographic extent of the records contained in the tgr06000sf1grp table?

12.4) What is the name of the field, common to both tables, that contains a unique code representing the state, county, tract, and block group numbers?


Clear any previously selected records in both tables. Notice that your block groups layer (block group boundaries) contains block group boundaries but no census data. Your data table tgr06000sf1grp contains census data but no map data. In the next step, you will join the data table to the map layer so that you will be able to map and spatially analyze your block group data.

13) Join demographic data to map layer

Right-click the block group boundaries layer. Join attributes from the demographic data table to this block group boundaries layer as shown next. Click Yes if prompted to spatially index the data.

Joining Block Groups to demographic data.
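Conceptually, the join matches each block group polygon to the one demographic record that carries the same key value. The following Python sketch shows that idea with plain dictionaries; it is an illustration under stated assumptions, not how ArcMap implements joins. The rows, key values, and the join_by_key helper are made up, and STFID is used as the key because that is the field the exercise joins on.

```python
# Hypothetical rows; the real tables have many more fields and records.
block_groups = [
    {"STFID": "060590001001", "geometry": "polygon-1"},
    {"STFID": "060590001002", "geometry": "polygon-2"},
]
demographics = [
    {"STFID": "060590001001", "POP2000": 1200},
    {"STFID": "060590001002", "POP2000": 950},
    {"STFID": "060370001001", "POP2000": 700},  # a record with no matching polygon
]


def join_by_key(left, right, key):
    """Attach matching right-hand attributes to each left-hand row, keeping all left rows."""
    lookup = {row[key]: row for row in right}
    if len(lookup) != len(right):
        # A key that repeats in the join table would make the match ambiguous.
        raise ValueError(f"{key} is not unique in the join table")
    return [{**lookup.get(row[key], {}), **row} for row in left]


joined = join_by_key(block_groups, demographics, "STFID")
for row in joined:
    print(row["STFID"], row["geometry"], row.get("POP2000"))
```

Records in the demographic table that have no matching polygon are simply not attached to anything, which mirrors joining a larger table to a smaller map layer.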

13.1) What is unique about the STFID field that qualifies it to be a suitable field on which to base the join?

14) Make the join permanent

Right-click the Block Groups layer, and use Data > Export Data to export your joined data to a permanent data file named OrangeCountyBlockGroups. Use the same coordinate system as the data frame. When prompted, add the exported map as a layer to your map document table of contents.

14.1) What is unique about the STFID field that qualifies it to be a suitable field on which to base the join?

Save your map document.

15) Examine population spatial patterns in Orange County

Change the symbology on your OrangeCountyBlockGroups layer to examine the county's demographic patterns by creating a graduated color thematic map, using a natural breaks classification for each demographic attribute that you wish to examine.

15.1) Describe the pattern of median age in Orange County.

15.2) Which of the following population characteristics would you say shows the most concentrated pattern: White, Black, Ameri_Es (American Indian), Asian, or Hispanic?

16) 18 to 21 age group

Remember that one of your criteria for siting the Internet café is to consider the 18 to 21 age group. Symbolize your new OrangeCountyBlockGroups layer as a graduated color thematic map of the population aged 18 to 21. Use a five-class Natural Breaks classification.

16.1) Describe the resulting pattern of raw numbers of 18- to 21-year-olds by block group in Orange County.

Next, symbolize the block groups as a graduated color thematic map of the population that ranges in age from 18 to 21, normalized by the 2000 population (POP2000) of the block group.

16.2) What does normalizing the data do to the data and to the map display?

16.3) Describe the geographic pattern of percentage 18- to 21-year-olds by block group in Orange County.

16.4) Examine the landmark points and polygons. Does the location of any of these landmarks influence the distribution of 18- to 21-year-olds? Why?


It will be easier to solve your Internet caf siting problem if you create a field that contains the percentage of 18- to 21-year-olds by block group. To do this, open the attribute table of your OrangeCountyBlockGroups layer, click Table Options, and then click Add field. Name the new field p1821, make it a Float (floating point) data type, and give it a precision (width) of 5 and a scale (number of decimal places) of 2, as shown next.

Adding a new field to the OrangeCountyBlockGroups attribute table.

Right-click the new p1821 field and access the Field Calculator. Fill out the calculation as shown next, to divide the raw number of 18- to 21-year-olds by the total population of that block group in the year 2000.


Calculating the percentage of 18- to 21-year-olds by block group.
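The arithmetic behind the field calculation can be sketched in a few lines of Python. This is an illustration only: the rows are made up, and AGE_18_21 is an assumed name for the field holding the raw count of 18- to 21-year-olds (POP2000 is the total population field named in the exercise).

```python
# Made-up attribute rows; AGE_18_21 is an assumed field name.
rows = [
    {"AGE_18_21": 120, "POP2000": 1500},
    {"AGE_18_21": 45, "POP2000": 900},
    {"AGE_18_21": 330, "POP2000": 2200},
]

for row in rows:
    # Two decimal places, matching the scale chosen for the p1821 field.
    row["p1821"] = round(row["AGE_18_21"] / row["POP2000"], 2)
    print(row)
```

Because the calculation is a simple division, the stored values are fractions: a value of 0.10 in p1821 corresponds to 10 percent of the block group's population.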

Note: You will likely receive a VBA overflow or some other error message. If you do, cancel the calculation and examine the table at the row where the p1821 field stopped calculating.

16.5) Can you determine why the calculation stopped working at this point? If so, state why.

16.6) Can you determine how to select certain records in the table so this problem will not occur again? If so, recalculate and state how you solved the VBA overflow problem.

Make a thematic map based upon your new field p1821.

16.7) Indicate the area(s) in Orange County in which the highest percentages of 18- to 21-year-olds are found.

Sort the attribute table on the percentage of 18- to 21-year-olds. Select the block groups that have the five highest percentages of 18- to 21-year-olds.

16.8) Display and label the landmark polygons and the landmark points. Now, zoom in on each of the five block groups with the highest percentages of 18- to 21-year-olds and name the universities and colleges that are responsible for those high percentages.

Next, make a pie chart map. Create a pie chart for each of the five block groups you identified. Have the two parts of each pie represent the population of 18- to 21-year-olds and the total population in each block group. Experiment with the size and the overlap functions.

16.9) Describe your pie chart map.

Next, make a dot density map of the raw numbers of 18- to 21-year-olds.

16.10) Describe your dot density map.

16.11) In your judgment, which of the maps (graduated color, pie chart, or dot map) is the easiest to understand? Which type would you recommend for mapping this type of variable at a countywide scale, and why?

16.12) How many block groups have 10 percent or more of their population in the 18 to 21 age range?

16.13) How did you obtain the above results?

16.14) Describe the spatial pattern of these block groups.

Save your map document again.

End of work package

Work package 3: Analyzing schools and university data

Steps 17 - 19

17) High schools, colleges and universities

You have determined one of the criteria needed to site your InstantWorld franchise: neighborhoods where the percentage of 18- to 21-year-olds is at least 10 percent. Next, determine where the high schools, colleges, and universities are located. Examine the table for your point landmark map layer.

17.1) Which CFCC code indicates a school point landmark feature?

Select the schools in the landmark points feature layer using Select By Attributes. Notice that the schools you have selected include all schools, rather than only the high schools. Therefore, start the selection over: select only the high schools by using the LIKE operator, as shown next.

Selecting schools by attribute.

Add to the current selection by adding the colleges, using the expression:
"NAME" LIKE '%College%'
Repeat the process for universities, using the expression:
"NAME" LIKE '%University%'
Make certain that you use the "add to current selection" method each time.

17.2) How many high schools, colleges, and universities do you have selected?

In System Toolboxes, use Analysis Tools > Proximity > Buffer to create 1-kilometer buffers around the selected features (schools), as shown next:

Creating a 1-kilometer buffer around the selected schools.
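The way the attribute selection above is built up, one LIKE expression at a time with each result added to the current selection, can be mimicked in plain Python. This is a sketch under assumptions: the landmark names are invented, the helper like_contains is hypothetical, the match is treated as case-sensitive, and the "High School" fragment is a guess, since the high school expression appears only in the exercise's screenshot.

```python
def like_contains(value: str, fragment: str) -> bool:
    """Mimic NAME LIKE '%fragment%' as a case-sensitive substring test."""
    return fragment in value


# Made-up landmark names, for illustration only.
names = [
    "Example High School",
    "Example Elementary School",
    "Example Community College",
    "Example State University",
    "Example Park",
]

selection = set()
for fragment in ("High School", "College", "University"):
    # "Add to current selection": union the new matches with what is already selected.
    selection |= {name for name in names if like_contains(name, fragment)}

print(sorted(selection))
```

Starting a new selection each time, instead of adding to the current one, would leave only the matches from the last expression.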

Next, you need to consider the schools, colleges, and universities that are stored as polygons, rather than as points. Examine the attribute table for your polygon landmark feature class.

17.3) What CFCC indicates that a polygon landmark feature is a school, college, or university?

As you did for the point feature class earlier, select only the high schools, colleges, and universities in the polygon feature class.

17.4) How many high schools, colleges, and university polygons are in your final selection?

Buffer these polygons by 1 kilometer as you did for the landmark points, earlier.

You now have two of the criteria required for locating your business: you know what areas are close to schools, colleges, and universities, and you know which neighborhoods have at least 10 percent of the population in the 18 to 21 age range. You have also picked up some valuable skills downloading, working with, and analyzing public domain spatial data along the way.

18) Identify land near education institutions

The land adjacent to education institutions is stored in two different map layers. In this step, you will use the intersect function to combine these two layers into one layer. In the System Toolboxes, use the Analysis Tools > Overlay > Intersect function to intersect the two 1 kilometer buffer layers. Name your output 1km_buffer_intersect.shp, as shown next:

Intersecting the point and polygon buffers.

Examine your new intersect layer.

18.1) Does the new intersect layer contain more land or less land area than the separate input layers? Why?

19) Identify areas near education institutions and with a high % of 18- to 21-year-olds

Use the Intersect tool again to intersect your new layer that represents all land in Orange County that is near schools with the layer that contains the selected block groups containing at least 10 percent 18- to 21-year-olds. Name your resulting layer education_and_18_21.

Open the attribute table for your new layer and notice the number of educational institutions listed under the field LANDNAME. Summarize this field and name the output table education_uniquevalues. Save the output table in your working database, or if you wish to save it outside the geodatabase, change the output type to dBase table.

19.1) How many areas near schools, colleges, or universities are now under consideration for the InstantWorld site?

19.2) Name the schools that are near the location that you are considering for InstantWorld.

19.3) Which schools in the list are probably not actually schools? Were some locations included on the list just because they have the word "school" in their title? Why are they in the list?

19.4) Which name(s) in the list could represent more than one school?


It may be helpful for you to set a spatial bookmark at each of these locations. Choose View > Bookmarks > Create.

19.5) Find Cypress College and Fullerton College on your map. Why aren't any proposed neighborhoods near Cypress College or Fullerton College under consideration for the InstantWorld site?
Save your map document again.

End of work package

Work package 4: Investigating street and coastline data

Steps 20 - 21

You have now considered proximity to campuses and the percentage of 18- to 21-year-olds in your analysis. Recall that the third criterion for the InstantWorld site is that it has to be within 1 mile of a busy street, with a cfcc2 code of A1, A2, or A3.

20) Select A1, A2, A3 streets

Select the streets that are classified A1, A2, or A3.

20.1) How many street segments meet this busy streets criterion?

Choose the Selection > Select By Location function to intersect the education_and_18_21 layer with the busy streets segments, using a buffer distance of 1 mile, as shown next:

Identifying schools within 1 mile of busy streets.

20.2) Describe the spatial pattern of the resulting selected polygons that are now under consideration.

20.3) How many polygons are under consideration for the InstantWorld site?

20.4) Why isn't the area near Santa Ana High School under consideration any longer?

Choose Data > Export Data to export these polygons from education_and_18_21 into a dataset named education_and_18_21_and_busy_streets.shp. This option requires an ArcInfo license of ArcGIS.

20.5) Could you have used another Overlay > Intersect operation to achieve the same results as the Select by Location function?

21) Include the coastline data

The final criterion for siting considers the coastline. Remember that a marketing study recommended that you avoid the area near the Pacific coast; that area is already oversaturated with Internet cafés.

Turn on the Water Polygons layer, open the attribute table, and select Pacific Ocean. From the System Toolboxes, choose Analysis Tools > Proximity > Buffer and create a 10-kilometer buffer around the Pacific Ocean shoreline. Name the output dataset shoreline_buffer_10km.shp and store it in your data folder.

Now that you know what land is near the ocean, from the System Toolboxes, choose Analysis Tools > Overlay > Erase to create a final map layer that indicates the areas under consideration for InstantWorld. Erase creates a new layer by erasing the input features (the sites under consideration before the shoreline was considered) that fall within the polygons of the erase features (the area near the shoreline). The resulting layer will include suitable areas for siting the Internet café, and none of them will be near the shoreline. Alternatively, you could select by location, then switch selection, and then export your selected set. Read about the Erase tool by accessing the tool and then selecting Tool Help.

21.1) Which areas will be erased from the consideration of the final Internet café sites?

Complete the Erase tool dialog box as shown next and name your output internetcafe_finalsites.shp.

Removing sites that are close to the coast.
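The alternative mentioned above (select by location, switch the selection, then export) amounts to a set difference on whole features. The Python sketch below illustrates that logic with hypothetical site identifiers; note that it is a simplification, since the Erase tool works on geometry and can remove just the part of a polygon that falls inside the erase area.

```python
# Hypothetical feature IDs, for illustration only.
candidate_sites = {"site-1", "site-2", "site-3", "site-4"}
within_10km_of_ocean = {"site-2", "site-4", "site-9"}

# Keep every candidate that is not in the near-shore set.
final_sites = candidate_sites - within_10km_of_ocean
print(sorted(final_sites))
```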

Zoom to the final sites under consideration and label the streets.

21.2) Near which two campuses (Landname) are you considering siting a new InstantWorld?

21.3) Near which class A1, A2, or A3 streets are you considering siting a new InstantWorld?

21.4) Examine these streets on your map. Once you understand the type of street this is, what would prohibit you from placing your café directly adjacent to these streets?

21.5) Find Disneyland in the landmark polygons layer. How far are your proposed InstantWorld Internet café sites from Disneyland?

End of work package

Work package 5: Geocoding addresses

Steps 22 - 26

22) Setting up a locator service

Now that you have found the ideal neighborhoods for your Internet café, you need proposed addresses so that you can file a request with the city. Remember that your TIGER streets are encoded with addresses, so you already have downloaded the files containing the address information. You just need to establish an Address Locator service so that you can access those addresses.

22.1) What are the fields that contain the "from" and "to" addresses for the left and right sides of the streets, respectively?

22.2) Are the left and right sides of the streets, as encoded in the street data, consistent as you move up or down multiple blocks? Why do you suppose this is the case?

23) Selecting a locator

Access the Catalog by using Windows > Catalog and navigate to your working folder. Right-click your working folder and select New > Address Locator. Select the US Address-Dual Ranges style for your address locator as shown next:

Adding a new US Streets locator.
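A dual-ranges street dataset stores a from and to house number for each side of every street segment. One common way to place an address on such a segment is to pick the side whose range has matching odd or even parity and then interpolate linearly along the segment. The Python sketch below shows that general technique; it is an assumption-laden illustration, not a description of how the ArcGIS address locator computes positions, and the function and parameter names are hypothetical.

```python
def locate_on_segment(house_number, left_range, right_range):
    """Return (side, fraction along the segment), or None if the number is not on it.

    Each range is a (from_address, to_address) pair for one side of the street.
    """
    for side, (start, end) in (("left", left_range), ("right", right_range)):
        low, high = min(start, end), max(start, end)
        same_parity = house_number % 2 == start % 2
        if same_parity and low <= house_number <= high:
            if start == end:
                return side, 0.0
            return side, (house_number - start) / (end - start)
    return None


# Made-up segment: odd numbers 101-199 on the left, even numbers 100-198 on the right.
print(locate_on_segment(150, (101, 199), (100, 198)))
print(locate_on_segment(101, (101, 199), (100, 198)))
print(locate_on_segment(500, (101, 199), (100, 198)))
```

Reverse geocoding, as used later with the Address Inspector, runs the same idea in the other direction: from a position along a segment to an estimated house number.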

24) Renaming the locator

Next, point to your streets file as your reference data, name your output address locator as TIGER-Orange County, and accept the defaults, as shown next.

Creating an address locator to geocode addresses in Orange County.

Click OK.

25) Geocode addresses

You are now ready to geocode your addresses.

In ArcMap, display the Geocoding toolbar; choose Customize > Toolbars > Geocoding. On the Geocoding toolbar, select your new geocoding service from the list of available geocoding services (if it is not already selected). Then, click the Address Inspector. Address Inspector provides a reverse geocoding operation; in other words, you already know the locations of your proposed cafés, and now you just need their street addresses.

Geocoding toolbar.

26) Labeling suitable locations

Zoom to the two neighborhoods that meet your criteria. Select at least three locations in those neighborhoods that you believe would be the most suitable. Hold down the left mouse button and press L (to add a Labeled Point) and O (for a callout label). The next illustration shows a sample result.

Selecting locations in neighborhoods suitable for the Internet café.

26.1) Based on the data and the labels, what are the odd and even sides of the north-south streets in your chosen neighborhoods? What are the odd and even sides of the east-west streets in your chosen neighborhoods?
26.2) Write down the addresses of your three proposed InstantWorld locations.

End of work package

Work package 6: Final considerations

Steps 27 - 28

27) Create a layout

Now that you know where you will concentrate your efforts to site a new Internet café, create a map that you can share with the other stakeholders in the project.

27.1) Create an ArcMap layout that indicates the final neighborhoods under consideration for your Internet café. Include the following information in your layout:
streets
street names
map title
scale bar
north arrow
source statement indicating the public domain spatial data (name and date) you are using
your name
map legend

Export the layout as a graphic and insert it into a Microsoft Word document.

28) Review process

Consider the process you used to make decisions using public domain spatial data within a GIS environment, and what steps need to be taken from this point forward. Thus far, you used the population of 18- to 21-year-olds, the location of colleges, universities, and high schools, busy streets, and the Pacific Ocean shoreline to determine the best location for your Internet café.

28.1) What other criteria would also be useful to be able to make an even better decision to site this kind of business?

28.2) What other public domain data layers would you need to consider the criteria that you mentioned above? Why?

28.3) What data do you need that probably are not public domain spatial data? Would you be able to obtain such data? Why or why not?

28.4) Do you think that the additional data you need will be freely available or something you would need to pay for?

28.5) Summarize in a few sentences what you learned in this exercise about public domain spatial data from the US Census Bureau.

28.6) Summarize in a few sentences what you learned in this exercise about GIS and about site selection.

End of work package

Chapter 2 quiz
1) What do the terms "fitness for use" and "truth in labeling" have to do with spatial data quality?

2) A GIS application allows you to zoom in on your spatial data in your GIS at a very large scale. Why is scale still a critical issue with regard to spatial data quality?

3) What is the National Atlas of the United States, and in which formats are the vector files stored on the National Atlas site?

4) Define TIGER and DLG vector data and indicate at least four differences between the two datasets.

5) Fill in the two blanks with the correct statistical area names, from coarse (large) to fine (small) census statistical areas: COUNTIES ___________________ ___________________ BLOCKS

6) List at least five map layers that you can extract from a TIGER/Line file.

7) Describe the process (major steps) for downloading and formatting TIGER and census demographic data to make them ready for the creation and analysis of thematic maps of population and housing characteristics by statistical areas.

8) Describe two ways in which you can map the percentage of a numeric variable within a GIS, as opposed to mapping that variable as raw numbers.

9) Name at least three characteristics of the census data that you used in the Internet café exercise.

10) Name at least three procedures that you used in the Internet café analysis in Orange County using ArcMap.
