tig2aprs: Convert US Census TIGER/Line(R) to DOS APRS

Copyright (c) 1996,1997 E. Alan Crosswell

N2YGK

20 June 1996

Introduction

tig2aprs is a utility to convert US Census TIGER/Line(R) 1994 map data into maps suitable for use with the Automatic Packet/Position Reporting System (APRS) by Bob Bruninga, WB4APR. In the documentation that follows, you are assumed to be very familiar with APRS, which can be found at the TAPR FTP site. tig2aprs is a C program that was developed, and has only been tested, under Linux, a free Unix system. If you port it to other platforms, or otherwise improve upon it, please send me your changes so that I can incorporate them.

tig2aprs reads US Census TIGER/Line(R) 1994 files and generates DOS/APRS maps. tig2aprs reads the collection of TIGER/Line files as a single standard input stream and creates one or more output maps plus one or more maplists. This is well suited to dealing with the TIGER/Line data on CD-ROM or even across the network since you can "unzip -p" the files directly from the CD and pipe them into tig2aprs.

About the US Census TIGER/Line(R) 1994 Files

What Are They

TIGER/Line files consist of point, line, and polygon data that are useful to graph geographic areas of interest within the USA and its territories. These graphical elements are supplemented by data useful for geographically linked statistical analyses, including location information such as street addresses, ZIP codes, Census blocks, municipalities, etc.

The best thing about TIGER/Line files is that they are in the public domain, so you can redistribute APRS maps made from them. This is unlike copyrighted maps such as those found in commercial packages like DeLorme Street Atlas (which are generally derived from TIGER/Line in the first place!).

The map data in TIGER/Line come from US Geological Survey 1:100K Digital Line Graph (DLG) maps, supplemented with USGS 1:24K maps, US Census GBF/DIME files, and other sources.

Unlike USGS DLGs, which are organized by 7.5 or 15 minute map quadrangles, TIGER/Line's basic file unit is the county (or county-equivalent).

A new set of TIGER/Line maps appears to come out from the US Census every couple of years (they've published them in 1992, 1994, and expect to be releasing the next set in 3rd quarter 1996). Besides being used for the decennial census, other US government agencies have collaborated with the Census Bureau to augment the maps. For example, the US Postal Service added geographic linking of ZIP+4 codes.

Documentation

Thorough documentation of the TIGER/Line(R) 1994 Files can be found at the Census Bureau's TIGER/Line home page. It is also included on the TIGER/Line CD-ROMs.

You will need to be familiar with that documentation to understand what I am babbling about when I talk about "record types" below.

Where to find TIGER/Line files

TIGER/Line files are published by the US Census Bureau periodically. They are not, as of this writing, available on the Internet. The 1994 set is available in a series of six CD-ROMs that cover the United States of America, including all its territories. The Census Bureau sells these CDs for $250 apiece or $1500 for the set!

But, before you change channels, there's hope! Many libraries are official depositories of US government documents and data. There is likely a library near you that has a set of TIGER/Line CDs. Whether or not access to them is practical is another question. An example of a library that does it "right" is Columbia University's Electronic Data Service, which provides access to the CDs to members of the general public on Internet-connected PCs. The CDs are not available to be borrowed, but you can copy the files you care about onto floppies or FTP them somewhere. Check your local large public or research university library to see if they have a similar program in place.

Choosing TIGER/Line files of interest

TIGER/Line files are organized into ZIP archives arranged by FIPS state and county code. For example, on the appropriate CD-ROM that includes New York State (FIPS state code 36), Westchester County (FIPS state/county code 36/119) is located in the file
36/tgr36119.zip
The ZIP archive contains:
TGR36119.F61
TGR36119.F62
TGR36119.F63
TGR36119.F64
TGR36119.F65
TGR36119.F66
TGR36119.F67
TGR36119.F68
TGR36119.F69
TGR36119.F6A
TGR36119.F6C
TGR36119.F6H
TGR36119.F6I
TGR36119.F6P
TGR36119.F6R
TGR36119.F6S
TGR36119.F6Z
The last character of the file extension is the TIGER/Line record type, so TGR36119.F61 contains the type 1 records for Westchester County, NY.
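
The naming scheme above can be picked apart mechanically. The following Python sketch is only an illustration and is not part of tig2aprs; the function names parse_tiger_name and archive_path are hypothetical, and the pattern (TGR, two-digit state, three-digit county, ".F6", one record-type character) is inferred from the 1994 example names listed above.

```python
import re

# Pattern inferred from the example names above (an assumption):
# "TGR" + 2-digit FIPS state + 3-digit FIPS county + ".F6" + record type.
TIGER_NAME = re.compile(r"^TGR(\d{2})(\d{3})\.F6([0-9A-Z])$", re.IGNORECASE)


def parse_tiger_name(filename):
    """Split a 1994 TIGER/Line file name into (state, county, record_type)."""
    match = TIGER_NAME.match(filename)
    if match is None:
        raise ValueError("not a TIGER/Line 1994 file name: %r" % filename)
    state, county, record_type = match.groups()
    return state, county, record_type.upper()


def archive_path(state, county):
    """Path of a county's ZIP archive relative to the CD-ROM root."""
    return "%s/tgr%s%s.zip" % (state, state, county)


if __name__ == "__main__":
    print(parse_tiger_name("TGR36119.F61"))
    print(archive_path("36", "119"))
```

For Westchester County this yields state 36, county 119, record type 1, and the archive path 36/tgr36119.zip shown earlier.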

The complete list of state and county codes is in Appendix A of the TIGER/Line 1994 documentation file, app_a.asc.

Using tig2aprs

Order of Input Data Matters

tig2aprs reads a stream of TIGER/Line records from stdin, builds a bunch of internal data structures, crunches for a long time trying to reduce the data into something manageable by DOS APRS, and then spits out a series of gridded, overlapping maps. The order in which the program reads records is important. If type 1 records are being read in and a type 2 record shows up, then no further type 1 records will be accepted. Make sure you concatenate the input files together in the right order!
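
A quick way to catch a badly concatenated stream before a long run is to check that no record type reappears once a different type has started. The Python sketch below is a hypothetical helper, not something tig2aprs provides. It assumes the first character of each record is its record type, and it checks only the "no going back" rule described above, not any particular required sequence of types.

```python
def check_record_order(lines):
    """Return record types in first-seen order.

    Raises ValueError if a record type shows up again after records of
    a different type have begun. Assumes (not confirmed by this README)
    that the first character of each record is its record type.
    """
    seen = []       # record types in the order their runs began
    current = None  # record type of the run being read now
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        record_type = line[0]
        if record_type == current:
            continue
        if record_type in seen:
            raise ValueError(
                "line %d: type %s record after type %s records began"
                % (number, record_type, current)
            )
        seen.append(record_type)
        current = record_type
    return seen


if __name__ == "__main__":
    import sys
    print(" ".join(check_record_order(sys.stdin)))
```

If the sketch were saved as checkorder.py (a made-up name), the same pipeline that feeds tig2aprs could be pointed at it first.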

The tig2aprs file input ordering requirements are:

Example

For example, here's how I do this for Westchester County, NY, which is bordered by something like five or six counties in the tri-state NY-NJ-CT area. I bring in the other counties' data and then proceed to ignore most of it 'cuz the map is rectangular and the County isn't and I want to fill in the edges of the map.

$ cat domap
#!/bin/sh
#36119: Westchester (= ../*1, etc.)
#36087: Rockland
#36071: Orange
#36079: Putnam
#36027: Dutchess
#36111: Ulster
#36061: New York (Manhattan)
#36081: Queens
#36047: Kings (Brooklyn)
#36085: Richmond (Staten Island)
#36005: Bronx
#36059: Nassau
#36103: Suffolk
#09001: Fairfield
#09005: Litchfield
#09009: New Haven
#34003: Bergen
#34031: Passaic
#34013: Essex
#34023: Middlesex
#34039: Union

#more=${more:-""}
two=${two:-"2"}
counties="36005 36087 36071 36079 36061 09001 34003 36059 36081 $more"

for t in 1 $two P C S 7 8 9
do
# I have Westchester already unzipped in the .. directory
   if [ $t = 1 ]; then
     ../src/fixup <../*1
   else	
     cat ../*$t
   fi
# the other Counties' zip files are copied off CD into /other/tmp.
   for c in $counties
   do
     st=`echo $c|sed -e 's/^\(..\).*$/\1/'`
     if [ $st = 34 -a $t = C ]; then
	:
     else
        unzip -p /other/tmp/$st/tgr$c TGR$c.F6$t
     fi
   done
done | "$@"

$ domap tig2aprs -ao -r 16 -d 4 -t 4 -p nywc4 -l maplist.wc4 2>log.wc4

Data Reduction

DOS APRS maps are constrained by design to contain no more than 2999 data points per map. This permits the program to work even on small PCs. About 180 maps can be in a MAPLIST, so this limit isn't really all that much of a problem for detailed mapping, which is what tig2aprs is all about. Most of what tig2aprs does is figure out how to stay within the points limit. [Note for WinAPRS users: tig2aprs can generate unlimited-size DOS APRS-format maps which WinAPRS can read, so it is possible to generate single large, detailed maps for use with WinAPRS.]

The data reduction techniques tig2aprs uses include:

Feature Cutoffs
TIGER/Line identifies all features with a Census Feature Class Code (CFCC), which distinguishes Interstate highways from dirt roads, for example. tig2aprs has a table of feature cutoffs to use based on the level of detail requested.
Water Filtering
There are many "trivial" water features such as small lakes and ponds that are unfortunately not properly identified as such with a CFCC. These features lead to noisy maps and waste precious data points. The tig2aprs water filter will drop all water features that cover less than a specified percentage of the map area.
Joining segments
Pieces of roads or segments (TIGER/Line jargon for these is Complete Chains) are joined together, eliminating unnecessary duplicate data points at each intersection.
Line Smoothing
tig2aprs includes an algorithm to "straighten" out wiggly lines, thereby eliminating intermediate wiggles and their associated data points.
Point Fuzzing
tig2aprs allows you to control the "focus" such that data points that are "near" each other are considered equivalent, allowing for more point elimination.
Tiling
If the above-mentioned point reduction techniques still fail to get down below 3000 points in a map, tig2aprs is able to recursively split a map into quadrants and so on until each resulting map covers a small enough area that the maximum points limit is achieved.
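
To make the last two techniques concrete, here is a minimal Python sketch of point fuzzing and recursive tiling. It is not the tig2aprs implementation: the function names, the grid-snapping approach to "nearness", the treatment of a map as a bag of points in a flat coordinate system, and the meaning given to min_size are all assumptions made for illustration.

```python
def fuzz_points(points, fuzz):
    """Snap points to a grid of spacing `fuzz`, dropping consecutive
    points that land on the same grid cell (assumed notion of "near")."""
    result = []
    for x, y in points:
        snapped = (round(x / fuzz) * fuzz, round(y / fuzz) * fuzz)
        if not result or result[-1] != snapped:
            result.append(snapped)
    return result


def tile(points, west, south, east, north, max_points, min_size):
    """Split a box into quadrants until each tile holds at most
    max_points points, or splitting would make tiles narrower than
    min_size. Returns a list of ((west, south, east, north), points).

    Boxes are half-open: points on the east or north edge are excluded.
    """
    inside = [(x, y) for x, y in points
              if west <= x < east and south <= y < north]
    if len(inside) <= max_points or (east - west) / 2 < min_size:
        return [((west, south, east, north), inside)]
    mid_x = (west + east) / 2
    mid_y = (south + north) / 2
    quadrants = (
        (west, south, mid_x, mid_y),   # south-west
        (mid_x, south, east, mid_y),   # south-east
        (west, mid_y, mid_x, north),   # north-west
        (mid_x, mid_y, east, north),   # north-east
    )
    tiles = []
    for w, s, e, n in quadrants:
        tiles.extend(tile(inside, w, s, e, n, max_points, min_size))
    return tiles


if __name__ == "__main__":
    pts = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5), (1.6, 1.6)]
    for box, contents in tile(pts, 0, 0, 2, 2, max_points=2, min_size=0.1):
        print(box, len(contents))
```

A real map is made of line segments that cross tile borders, so the actual program has more to do at each split than this point-only sketch suggests.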

Seamless Map Transitions

When DOS APRS maps are tiled, it is necessary to overlap them as well, since APRS will only display a map if its borders entirely cover the screen at the current zoom. In practice, this means that a map of range N will only display when APRS is zoomed in to N/2 and the display "window" is completely within the map borders. By overlapping each map with three other maps by 50% each, it is possible to have seamless panning across an area; APRS automatically reloads the next overlapping map at the same level of detail. tig2aprs will automatically generate the three overlap maps for each base map.
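
The geometry of the overlap can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tig2aprs code: it treats each map as a square of side 2 * radius in flat coordinates (miles), and it assumes the three extra maps are shifted half a map width east, south, and both. The real offsets and directions are not documented here, and the function names are hypothetical.

```python
def overlap_centers(center_x, center_y, radius):
    """Centers of a base map and three maps shifted by half a map width."""
    shift = radius  # half of the full side (2 * radius)
    return [
        (center_x, center_y),                  # base map
        (center_x + shift, center_y),          # shifted east (assumed)
        (center_x, center_y - shift),          # shifted south (assumed)
        (center_x + shift, center_y - shift),  # shifted both ways
    ]


def overlap_fraction(a, b, radius):
    """Fraction of square map a's area also covered by same-size map b."""
    side = 2 * radius
    dx = max(0.0, side - abs(a[0] - b[0]))
    dy = max(0.0, side - abs(a[1] - b[1]))
    return (dx * dy) / (side * side)


if __name__ == "__main__":
    centers = overlap_centers(0.0, 0.0, 8.0)
    for other in centers[1:]:
        print(other, overlap_fraction(centers[0], other, 8.0))
```

Under these assumptions the east and south maps each share half of the base map's area, while the diagonal map overlaps by 50% along each axis and therefore shares a quarter of the area.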

Labels

The main advantage of TIGER/Line over standard USGS DLG maps is the added value of place and landmark labels. tig2aprs generates a variety of types of labels, using the DOS APRS special symbols, for places, hospitals, schools, airports, cemeteries, parks, etc.

Labels for areas (towns, villages, parks, etc.) are automatically centered within the visible portion of an area on a given map. Multiword labels (names longer than 12 characters) are broken up and stacked to get around the DOS APRS limit.
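
As an illustration of the stacking idea, the Python sketch below greedily packs words into lines of at most 12 characters. It is only a guess at the behavior: the function name stack_label is hypothetical, and how tig2aprs actually chooses break points or handles a single word longer than the limit is not stated here (this sketch simply truncates such a word).

```python
MAX_LABEL = 12  # DOS APRS label length limit mentioned above


def stack_label(name, limit=MAX_LABEL):
    """Break a multiword name into stacked lines of at most `limit` chars."""
    lines = []
    current = ""
    for word in name.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                lines.append(current)
            # Assumption: a single over-long word is truncated to fit.
            current = word[:limit]
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for line in stack_label("Westchester County Airport"):
        print(line)
```

With this approach "White Plains" (exactly 12 characters) stays on one line, while "Westchester County Airport" becomes three stacked lines.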

Runtime Options

Here's some documentation of the runtime options:
Usage: tig2aprs [-vjnoaCTD] -c lat,lon -r range -d detail -t min_tile
       -f fuzz -F maxfuzz -w filt% -W maxfilt% -p map_prefix
       -s slopefuzz -S maxslope -l maplist -m maxmaplist -M maxAPRSpoints
       -R resolution -L places|landmarks|kgls
 -v = verbose (more -v's for more verbose)
 -j = flip and join road segments
 -n = match segments by name as well as CFCC
 -o = make three overlapping maps
 -a = all the usual (same as -vjnC -w .05 -W .1 -s .05 -S .2)
 -T = print all a road's TLIDs in comment
 -D = create a *.dat file before fuzzing a map
 -c = map center lat,lon in decimal degrees
 -r = map radius in miles
 -d = level of map detail in miles
 -t = make tiles no smaller than this radius in miles
 -f = initial map fuzziness (reducing # of points)
 -F = worst map fuzziness
 -w = toss lakes smaller than x% of map
 -W = worst lake fuzz%
 -s = line smoothing factor
 -S = worst line smoothing factor
 -p = filename prefix for map files (default is 'map')
 -l = maplist filename
 -m = max map names per maplist before splitting up
 -M = max APRS/DOS map points (default 2999)
 -R = resolution (a/k/a pixels per degree)
 -L = label places, landmarks, or key geographic locations (kgls)

Customization

Unfortunately, a bit of customization of tig2aprs requires modifying some source code. Making this driven by a config file is on the "to do" list, but, for now, take a look at the comments in the source and especially: defcuts, cfccrange, and fipsrange.

aprs.tk

aprs.tk is a simple Tcl/Tk script that allows one to view an APRS map without having to run DOS APRS. It is a very rudimentary tool and could use a lot of improvement.