Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 9 additions & 1 deletion Project.toml
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,16 @@ uuid = "9a9a8258-a423-4c9c-ac3d-7cc63de3c137"
authors = ["Anshul Singhvi <[email protected]>", "Jacob Zelko <[email protected]>", "and contributors"]
version = "0.1.0-DEV"

[deps]
Downloads = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
Format = "1fa38f19-a742-5d3f-a2b9-30dd87b9d5f8"
TidierVest = "969b988e-7aed-4820-b60d-bdec252047c4"

[compat]
julia = "1.6"
Aqua = "0.8"
Format = "1.3"
TidierVest = "0.4.3"
julia = "1.10"

[extras]
Aqua = "4c88cf16-eb10-579e-8560-4a9242c79595"
Expand Down
13 changes: 12 additions & 1 deletion src/TigerLine.jl
Original file line number Diff line number Diff line change
@@ -1,5 +1,16 @@
module TigerLine

# Write your package code here.
using Downloads:
download
using Format:
FormatExpr,
printfmt
using TidierVest:
html_elements,
html_table,
read_html

include("constants.jl")
include("downloads.jl")

end
125 changes: 125 additions & 0 deletions src/constants.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,125 @@

"""
Base URL for TIGER/Line data with two parameters for year and layer.
"""
BASE_TIGER_URL = FormatExpr("https://www2.census.gov/geo/tiger/TIGER{}/{}/")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be const

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But maybe it could be a function?

base_tiger_url(year, layer) = "https://www2.census.gov/geo/tiger/TIGER$(year)/$(layer)/"

which would neatly remove the Format.jl dependency as well.


# "https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_state_5m.zip"

"""
A dictionary mapping human-readable keys to TIGER/Line dataset codes and their associated descriptions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not quite true as of now? There seem to be no descriptions


> **Source:** `https://www2.census.gov/geo/tiger/TIGER2017/2017_TL_Shapefiles_File_Name_Definitions.pdf`

## Keys

- `"address_range_rel"` (**ADDR**) - Address Range Relationship File
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the three different things here? julia name, tiger name, description? We should describe this explicitly

- `"address_range_feat"` (**ADDRFEAT**) - Address Range Feature
- `"address_range_feat_name"` (**ADDRFN**) - Address Range-Feature Name Relationship
- `"native_areas"` (**AIANNH**) - American Indian / Alaska Native / Native Hawaiian Areas
- `"native_subdivision"` (**AITSN**) - American Indian Tribal Subdivision National
- `"alaska_native_corp"` (**ANRC**) - Alaska Native Regional Corporation
- `"area_landmark"` (**AREALM**) - Area Landmark
- `"area_water"` (**AREAWATER**) - Area Hydrography
- `"block_group"` (**BG**) - Block Group
- `"metro_micro_area"` (**CBSA**) - Metropolitan Statistical Area / Micropolitan Statistical Area
- `"congressional_district"` (**CD**) - Congressional District
- `"combined_new_england_city_town"` (**CNECTA**) - Combined New England City and Town Area
- `"coastline"` (**COASTLINE**) - Coastline
- `"consolidated_city"` (**CONCITY**) - Consolidated City
- `"county"` (**COUNTY**) - County
- `"county_subdivision"` (**COUSUB**) - County Subdivision
- `"combined_statistical_area"` (**CSA**) - Combined Statistical Area
- `"all_lines"` (**EDGES**) - All Lines
- `"elementary_school_district"` (**ELSD**) - Elementary School District
- `"estate"` (**ESTATE**) - Estate
- `"topo_faces"` (**FACES**) - Topological Faces (Polygons with All Geocodes)
- `"topo_faces_area_hydro"` (**FACESAH**) - Topological Faces-Area Hydrography Relationship File
- `"topo_faces_area_landmark"` (**FACESAL**) - Topological Faces-Area Landmark Relationship File
- `"topo_faces_military"` (**FACESMIL**) - Topological Faces-Military Installation Relationship File
- `"feature_names"` (**FEATNAMES**) - Feature Names Relationship File
- `"linear_hydro"` (**LINEARWATER**) - Linear Hydrography
- `"metro_division"` (**METDIV**) - Metropolitan Division
- `"military_installation"` (**MIL**) - Military Installation
- `"new_england_city_town"` (**NECTA**) - New England City and Town Area
- `"new_england_city_town_div"` (**NECTADIV**) - New England City and Town Area Division
- `"place"` (**PLACE**) - Place
- `"point_landmark"` (**POINTLM**) - Point Landmark
- `"primary_roads"` (**PRIMARYROADS**) - Primary Roads
- `"primary_secondary_roads"` (**PRISECROADS**) - Primary and Secondary Roads
- `"public_microdata_area"` (**PUMA**) - Public Use Microdata Area
- `"rails"` (**RAILS**) - Rails
- `"all_roads"` (**ROADS**) - All Roads
- `"secondary_school_district"` (**SCSD**) - Secondary School Districts
- `"state_legislative_lower"` (**SLDL**) - State Legislative District – Lower Chamber
- `"state_legislative_upper"` (**SLDU**) - State Legislative District – Upper Chamber
- `"state"` (**STATE**) - State and Equivalent
- `"subbarrio"` (**SUBBARRIO**) - SubMinor Civil Division (Subbarios in Puerto Rico)
- `"tabulation_block"` (**TABBLOCK**) - Tabulation (Census) Block
- `"tribal_block_group"` (**TBG**) - Tribal Block Group
- `"census_tract"` (**TRACT**) - Census Tract
- `"tribal_census_tract"` (**TTRACT**) - Tribal Census Tract
- `"urban_area_cluster"` (**UAC**) - Urban Area/Urban Cluster
- `"unified_school_district"` (**UNSD**) - Unified School District
- `"zip_code_area"` (**ZCTA5**) - 5-Digit ZIP Code Tabulation Area

## Example

```julia-repl
julia> TIGER_DICT["county"]
"COUNTY"
```
"""
const TIGER_DICT = Dict(
"address_range_rel" => "ADDR",
"address_range_feat" => "ADDRFEAT",
"address_range_name_rel" => "ADDRFN",
"native_areas" => "AIANNH",
"tribal_subdivision_nat" => "AITSN",
"alaska_native_region" => "ANRC",
"area_landmark" => "AREALM",
"area_water" => "AREAWATER",
"block_group" => "BG",
"metro_micro_area" => "CBSA",
"congressional_district" => "CD",
"combined_necta" => "CNECTA",
"coastline" => "COASTLINE",
"consolidated_city" => "CONCITY",
"county" => "COUNTY",
"county_subdivision" => "COUSUB",
"combined_stat_area" => "CSA",
"all_lines" => "EDGES",
"elementary_school_district" => "ELSD",
"estate" => "ESTATE",
"topo_faces" => "FACES",
"faces_area_hydro" => "FACESAH",
"faces_area_landmark" => "FACESAL",
"faces_military" => "FACESMIL",
"feature_names_rel" => "FEATNAMES",
"linear_hydrography" => "LINEARWATER",
"metro_division" => "METDIV",
"military_installation" => "MIL",
"necta" => "NECTA",
"necta_division" => "NECTADIV",
"place" => "PLACE",
"point_landmark" => "POINTLM",
"primary_roads" => "PRIMARYROADS",
"primary_secondary_roads" => "PRISECROADS",
"puma" => "PUMA",
"rails" => "RAILS",
"all_roads" => "ROADS",
"secondary_school_district" => "SCSD",
"state_leg_district_lower" => "SLDL",
"state_leg_district_upper" => "SLDU",
"state" => "STATE",
"subbarrio" => "SUBBARRIO",
"tab_block" => "TABBLOCK",
"tribal_block_group" => "TBG",
"census_tract" => "TRACT",
"tribal_census_tract" => "TTRACT",
"urban_area_cluster" => "UAC",
"unified_school_district" => "UNSD",
"zip_code_area" => "ZCTA5"
)

export TIGER_DICT
56 changes: 56 additions & 0 deletions src/downloads.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
"""
```julia
download_tiger(output_dir;
year = 2020,
layer = "state"
)
```
Downloads TIGER/Line geographic data from the US Census Bureau for the specified year and geographic layer,
saving the data as shapefiles.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not quite - seems it saves as zip files? Shapefile.jl has zip support via ZipFile.jl but it's a bit dicey.


## Arguments
- `output_dir::String`: The directory where downloaded files will be saved.

## Keyword Arguments
- `year::Int=2020` (optional): The year of the TIGER/Line data to retrieve (e.g., 2020).
- `layer::String="state"` (optional): The geographic layer of the data; look at `TIGER_DICT` for more options.

## Returns
#
- This function does not return anything.

## Example
#
```julia-repl
julia> ?TIGER_DICT

• "county" (COUNTY) - County
• "state" (STATE) - State and Equivalent

julia> download_tiger("./data", year=2020, layer="county")
```

This will download county-level TIGER/Line data for 2020 and store the shapefiles in `./data`.
"""
function download_tiger(output_dir; year = 2020, layer = "state")

url = sprint(printfmt, BASE_TIGER_URL, year, TIGER_DICT[layer])

html = read_html(url)
tables = html_elements(html, ["body", "table"])
data = tables[1] |> html_table
files = data.Name[2:end]

for f in files
@info "Downloading $f for layer, \"$(TIGER_DICT[layer])\", and year, $year."
download(
joinpath(url, f),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will likely be incorrect on windows (backslash)

joinpath(output_dir, f)
)
end

@info "Requested \"$(TIGER_DICT[layer])\" data for $year has been downloaded! 🎉";

end

export download_tiger
Loading