Skip to content Skip to sidebar Skip to footer

Best Way to Download Acs Data in R

Mapping Demography data with R

Information from the United States Census Agency are commonly visualized using maps, given that Census and ACS data are aggregated to enumeration units. This chapter will comprehend the procedure of map-making using the tidycensus R package. Notably, tidycensus enables R users to download simple characteristic geometry for mutual geographies, linking demographic data with their geographic locations in a dataset. In turn, this data model facilitates the creation of both static and interactive demographic maps.

In this chapter, readers will larn how to utilise the geometry parameter in tidycensus functions to download geographic data along with demographic data from the US Census Bureau. The chapter will so embrace how to brand static maps of Census demographic information using the pop ggplot2 and tmap visualization packages. The closing parts of the chapter will then turn to interactive mapping, with a focus on the mapview and Leaflet R packages for interactive cartographic visualization.

Using geometry in tidycensus

As covered in the previous chapter, Census geographies are bachelor from the tigris R package as simple features objects, using the data model from the sf R package. tidycensus wraps several mutual geographic data functions in the tigris package to allow R users to return unproblematic feature geometry pre-linked to downloaded demographic data with a single function call. The key statement to accomplish this is geometry = True, which is available in the cadre information download functions in tidycensus, get_acs(), get_decennial(), and get_estimates().

Traditionally, getting "spatial" Demography data requires a tiresome multi-step procedure that tin involve several software platforms. These steps include:

  • Fetching shapefiles from the Census website;

  • Downloading a CSV of data, then cleaning and formatting it;

  • Loading geometries and information into your desktop GIS of selection;

  • Adjustment key fields in your desktop GIS and joining your data.

A major motivation for developing tidycensus was my frustration with having to go through this procedure over and over again before making a simple map of Census data. geometry = Truthful combines the automatic information download functionality of tidycensus and tigris to allow R users to featherbed this process entirely. The following example illustrates the use of the geometry = TRUE argument, fetching information on median household income for Census tracts in the District of Columbia. As discussed in the previous chapter, the choice tigris_use_cache = True is used to cache the downloaded geographic data on the user's figurer.

                              library                (                tidycensus                )                options                (tigris_use_cache                =                TRUE                )                dc_income                <-                get_acs                (                geography                =                "tract",    variables                =                "B19013_001",   state                =                "DC",    geometry                =                True                )                dc_income                          
            ## Unproblematic feature collection with 179 features and 5 fields ## Geometry type: POLYGON ## Dimension:     XY ## Bounding box:  xmin: -77.11976 ymin: 38.79164 xmax: -76.9094 ymax: 38.99511 ## Geodetic CRS:  NAD83 ## First ten features: ##          GEOID                                                           Name ## 1  11001009509 Census Tract 95.09, District of Columbia, Commune of Columbia ## 2  11001010100   Census Tract 101, District of Columbia, District of Columbia ## 3  11001008301 Census Tract 83.01, District of Columbia, District of Columbia ## four  11001002101 Census Tract 21.01, District of Columbia, District of Columbia ## 5  11001004100    Demography Tract 41, District of Columbia, District of Columbia ## 6  11001008001 Census Tract fourscore.01, District of Columbia, District of Columbia ## 7  11001002900    Demography Tract 29, Commune of Columbia, District of Columbia ## viii  11001005600    Census Tract 56, Commune of Columbia, District of Columbia ## 9  11001007605 Demography Tract 76.05, Commune of Columbia, District of Columbia ## x 11001005900    Census Tract 59, District of Columbia, Commune of Columbia ##      variable guess   moe                       geometry ## 1  B19013_001    75515 19621 POLYGON ((-77.00201 38.9510... ## ii  B19013_001    94861 16089 POLYGON ((-77.03653 38.9056... ## 3  B19013_001   138487 30838 POLYGON ((-77.00352 38.9000... ## four  B19013_001    67984 11327 POLYGON ((-77.02803 38.9610... ## v  B19013_001   156625 27218 POLYGON ((-77.05832 38.9177... ## half dozen  B19013_001   154423 28910 POLYGON ((-76.99025 38.8973... ## vii  B19013_001   116875 20769 POLYGON ((-77.03273 38.9341... ## viii  B19013_001    79357 16304 POLYGON ((-77.05779 38.9025... ## 9  B19013_001    38659  7543 POLYGON ((-76.98436 38.8666... ## 10 B19013_001    98750 21107 POLYGON ((-77.01901 38.8946...          

As shown in the example call, the structure of the object returned by tidycensus resembles the object we've become familiar with to this point in the book. For instance, median household income data are found in the estimate cavalcade with associated margins of fault in the moe column, along with a variable ID, GEOID, and Demography tract name. Nevertheless, there are some notable differences. The geometry column contains polygon characteristic geometry for each Census tract, assuasive for a linking of the estimates and margins of error with their corresponding locations in Washington, DC. Beyond that, the object is associated with coordinate arrangement information - using the NAD 1983 geographic coordinate organisation in which Census geographic datasets are stored by default.

Basic mapping of sf objects with plot()

Such geographic information can exist difficult to understand without visualization. As the returned object is a simple features object, both geometry and attributes tin be visualized with plot(). Key hither is specifying the name of the cavalcade to be plotted inside of brackets, which in this example is "approximate".

                                  plot                  (                  dc_income                  [                  "estimate"                  ]                  )                              

Base R plot of median household income by tract in DC

Figure half dozen.1: Base R plot of median household income by tract in DC

The plot() office returns a simple map showing income variation in Washington, DC. Wealthier areas, as represented with warmer colors, tend to be located in the northwestern part of the District. NA values are represented on the map in white. If desired, the map tin exist modified further with base plotting functions.

The remainder of this chapter, however, will focus on map-making with additional data visualization packages in R. This includes the popular ggplot2 parcel for visualization, which supports directly visualization of simple features objects; the tmap package for thematic mapping, and the leaflet package for interactive map-making which calls the Leaflet JavaScript framework directly from R.

Map-making with ggplot2 and geom_sf

Every bit illustrated in Section 5.2, geom_sf() in ggplot2 can be used for quick plotting of sf objects using familiar ggplot2 syntax. geom_sf() goes far beyond elementary cartographic display. The full power of ggplot2 is available to create highly customized maps and geographic data visualizations.

Choropleth mapping

One of the most mutual ways to visualize statistical information on a map is with choropleth mapping. Choropleth maps use shading to represent how underlying data values vary by feature in a spatial dataset. The income plot of Washington, DC shown earlier in this affiliate is an example of a choropleth map.

In the instance below, tidycensus is used to obtain linked ACS and spatial information on median historic period by state for the l US states plus the District of Columbia and Puerto Rico. For national maps, it is often preferable to generate insets of Alaska, Hawaii, and Puerto Rico so that they tin can all be viewed comparatively with the continental United States. We'll employ the shift_geometry() function in tigris to shift and rescale these areas for national mapping. The argument resolution = "20m" is necessary here for appropriate results, every bit it will omit the long archipelago of islands to the northwest of Hawaii.

Plot of shifted and rescaled US geometry

Effigy 6.2: Plot of shifted and rescaled United states geometry

The country polygons can be styled using ggplot2 conventions and the geom_sf() function. With ii lines of ggplot2 code, a basic map of median age past land can exist created with ggplot2 defaults.

US choropleth map with ggplot2 defaults

Effigy 6.3: US choropleth map with ggplot2 defaults

The geom_sf() part in the to a higher place case interprets the geometry of the sf object (in this instance, polygon) and visualizes the event as a filled choropleth map. In this case, the ACS estimate of median age is mapped to the default blueish night-to-low-cal color ramp in ggplot2, highlighting the youngest states (such as Utah) with darker blues and the oldest states (such equally Maine) with lighter blues.

Customizing ggplot2 maps

In many cases, map-makers using ggplot2 will want to customize this graphic further. For example, a designer may desire to modify the color palette and reverse it and so that darker colors represent older areas. The map would also do good from some boosted data describing its content and information sources. These modifications tin be specified in the same style a user would update a regular ggplot2 graphic. The scale_fill_distiller() part allows users to specify a ColorBrewer palette to use for the map, which includes a broad range of sequential, diverging, and qualitative color palettes (Brewer, Hatchard, and Harrower 2003). The labs() role can then be used to add a title, caption, and improve legend label to the plot. Finally, ggplot2 cartographers will often desire to use the theme_void() function to remove the background and gridlines from the map.

                                  ggplot                  (data                  =                  us_median_age,                  aes                  (fill up                  =                  estimate                  )                  )                  +                  geom_sf                  (                  )                  +                  scale_fill_distiller                  (palette                  =                  "RdPu",                         direction                  =                  1                  )                  +                  labs                  (title                  =                  "  Median Historic period by Land, 2019",        caption                  =                  "Data source: 2019 1-year ACS, U.s.a. Census Bureau",        fill                  =                  "ACS estimate"                  )                  +                  theme_void                  (                  )                              

Styled choropleth of US median age with ggplot2

Figure half-dozen.4: Styled choropleth of US median age with ggplot2

Map-making with tmap

For ggplot2 users, geom_sf() offers a familiar interface for mapping data obtained from the US Census Agency. All the same, ggplot2 is far from the only option for cartographic visualization in R. The tmap bundle (Tennekes 2018) is an splendid alternative for mapping in R that includes a wide range of functionality for custom cartography. The section that follows is an overview of several cartographic techniques implemented with tmap for visualizing US Census data. A full treatment of all-time practices in cartographic design is beyond the scope of this department; recommended resource for learning more than include Peterson (2020) and Brewer (2016).

To brainstorm, let'due south grab some ACS data on race and ethnicity from the American Customs Survey. Nosotros'll be looking at information on non-Hispanic white, not-Hispanic Black, Asian, and Hispanic populations for Census tracts in Hennepin County, Minnesota.

                              hennepin_race                <-                get_acs                (                geography                =                "tract",   land                =                "MN",   canton                =                "Hennepin",   variables                =                c                (White                =                "B03002_003",                 Black                =                "B03002_004",                 Native                =                "B03002_005",                 Asian                =                "B03002_006",                 Hispanic                =                "B03002_012"                ),   summary_var                =                "B03002_001",   geometry                =                TRUE                )                %>%                mutate                (percent                =                100                *                (                estimate                /                summary_est                )                )                          
Table 6.1: Race and ethnicity in Hennepin County, MN
GEOID Proper noun variable estimate moe summary_est summary_moe geometry percent
27053103900 Census Tract 1039, Hennepin Canton, Minnesota White 2940 421 4221 447 MULTIPOLYGON (((-93.23693 iv… 69.6517413
27053103900 Census Tract 1039, Hennepin County, Minnesota Black 372 276 4221 447 MULTIPOLYGON (((-93.23693 iv… 8.8130775
27053103900 Census Tract 1039, Hennepin County, Minnesota Native 10 xv 4221 447 MULTIPOLYGON (((-93.23693 4… 0.2369107
27053103900 Census Tract 1039, Hennepin County, Minnesota Asian 504 153 4221 447 MULTIPOLYGON (((-93.23693 4… 11.9402985
27053103900 Census Tract 1039, Hennepin County, Minnesota Hispanic 214 76 4221 447 MULTIPOLYGON (((-93.23693 4… 5.0698887
27053021602 Census Tract 216.02, Hennepin Canton, Minnesota White 4951 402 5753 316 MULTIPOLYGON (((-93.38036 4… 86.0594472
27053021602 Census Tract 216.02, Hennepin Canton, Minnesota Blackness 175 277 5753 316 MULTIPOLYGON (((-93.38036 4… 3.0418912
27053021602 Census Tract 216.02, Hennepin Canton, Minnesota Native 77 85 5753 316 MULTIPOLYGON (((-93.38036 4… 1.3384321
27053021602 Census Tract 216.02, Hennepin County, Minnesota Asian 24 xxx 5753 316 MULTIPOLYGON (((-93.38036 4… 0.4171736
27053021602 Census Tract 216.02, Hennepin County, Minnesota Hispanic 250 175 5753 316 MULTIPOLYGON (((-93.38036 4… four.3455588

Nosotros've returned ACS data in tidycensus'southward regular "tidy" or long format, which will be useful in a moment for comparative map-making, and completed some bones data wrangling tasks learned in Chapter 3 to calculate group percentages. To become started mapping this data, we'll excerpt a single group from the dataset to illustrate how tmap works.

Choropleth maps with tmap

tmap'due south map-making syntax will be somewhat familiar to users of ggplot2, as information technology uses the concept of layers to specify modifications to the map. The map object is initialized with tm_shape(), which and so allows united states of america to view the Demography tracts with tm_polygons(). We'll first filter our long-form spatial dataset to get a unique set of tract polygons, so visualize them.

Basic polygon plot with tmap

Figure half dozen.five: Basic polygon plot with tmap

Nosotros go a default view of Demography tracts in Hennepin County, Minnesota. Alternatively, the tm_fill() function can be used to produce choropleth maps, every bit illustrated in the ggplot2 examples higher up.

Basic choropleth with tmap

Figure half-dozen.6: Basic choropleth with tmap

You lot'll find that tmap uses a classed color scheme rather than the continuous palette used by ggplot2, past default. This involves the identification of "classes" in the distribution of information values and mapping a color from a color palette to data values that belong to each class. The default nomenclature scheme used by tm_fill() is "pretty", which identifies clean-looking intervals in the information based on the data range. In this instance, data classes alter every x percent. However, this approach will always be sensitive to the distribution of data values. Allow's take a await at our data distribution to empathize why:

                                  hist                  (                  hennepin_black                  $                  per centum                  )                              

Base R histogram of percent Black by Census tract

Effigy 6.7: Base R histogram of percent Black by Census tract

As the histogram illustrates, almost Census tracts in Hennepin County have Black populations below 20 per centum. In plow, variation within this bucket is not visible on the map given that most tracts fall into one course. The manner argument in tm_fill() supports a number of other methods for classification, including quantile breaks ("quantile"), equal intervals ("equal"), and Jenks natural breaks ("jenks"). Allow'due south switch to quantiles below, where each class will incorporate the same number of Census tracts. Nosotros can also change the color palette and add together some contextual text as we did with ggplot2.

                                  tm_shape                  (                  hennepin_black,           projection                  =                  sf                  ::                  st_crs                  (                  26915                  )                  )                  +                  tm_polygons                  (col                  =                  "pct",           style                  =                  "quantile",           n                  =                  5,           palette                  =                  "Purples",           title                  =                  "ACS estimate"                  )                  +                  tm_layout                  (title                  =                  "Pct Black\nby Demography tract",             frame                  =                  False,             legend.outside                  =                  Truthful                  )                              

tmap choropleth with options

Figure vi.viii: tmap choropleth with options

Switching from the default nomenclature scheme to quantiles reveals additional neighborhood-level heterogeneity in Hennepin County'southward Blackness population in suburban areas. However, it does mask some heterogeneity in Minneapolis as the peak class at present includes values ranging from 22 pct to 71 percentage. A "compromise" solution usually used in GIS cartography applications is the Jenks natural-breaks method, which uses an algorithm to identify meaningful breaks in the data for bin boundaries (Jenks 1967). To assist with agreement how the unlike classification methods work, the legend.hist statement in tm_polygons() can exist set to Truthful, adding a histogram to the map with bars colored past the values used on the map.

                                  tm_shape                  (                  hennepin_black,           projection                  =                  sf                  ::                  st_crs                  (                  26915                  )                  )                  +                  tm_polygons                  (col                  =                  "pct",           way                  =                  "jenks",           n                  =                  5,           palette                  =                  "Purples",           championship                  =                  "ACS judge",           legend.hist                  =                  Truthful                  )                  +                  tm_layout                  (title                  =                  "Per centum Black\nby Census tract",             frame                  =                  FALSE,             legend.outside                  =                  TRUE,             bg.color                  =                  "grey70",             legend.hist.width                  =                  5,             fontfamily                  =                  "Verdana"                  )                              

Styled tmap choropleth

Figure six.ix: Styled tmap choropleth

The tm_layout() function is used to customize the styling of the map and histogram, and has many more options across those shown that can be viewed in the office's documentation.

Calculation reference elements to a map

The choropleth map as illustrated in the previous example represents the information as a statistical graphic, with both a map and histogram showing the underlying data distribution. Cartographers coming to R from a desktop GIS background will exist accustomed to adding a diversity of reference elements to their map layouts to provide additional geographical context to the map. These elements may include a basemap, north arrow, and calibration bar, all of which tin exist accommodated past tmap.

The tmaptools add-in bundle to tmap includes a part, read_osm(), that helps users larn basemap tiles from OpenStreetMap for use in tmap. read_osm() has a dependency on the rJava bundle which tin can be difficult to install, nevertheless, and it limits users to pre-made basemaps. The example shown hither uses the mapboxapi R bundle Thousand. Walker (2021c), which gives users with a Mapbox account access to pre-designed Mapbox basemaps besides equally custom-designed basemaps in Mapbox Studio.

To utilize Mapbox basemaps with mapboxapi and tmap, you'll offset demand a Mapbox account and access token. Mapbox accounts are costless; register at the Mapbox website then find your access token. In R, this token can exist set with the mb_access_token() function in mapboxapi.

Once set, basemap tiles for an input spatial dataset can be fetched with the get_static_tiles() function in mapboxapi, which interacts with the Mapbox Static Tiles API. Mapbox Studio users can design a custom basemap style and use the custom style ID along with their username to fetch tiles from that style for mapping; whatever Mapbox user can too use the default Mapbox styles by supplying username = "mapbox" and the appropriate fashion ID. The level of detail of the underlying basemap can exist adapted with the zoom argument.

                                  # If you don't have a Mapbox style to utilise, replace style_id with "low-cal-v9"                  # and username with "mapbox".  If yous do, replace those arguments with your                                    # way ID and user proper noun.                  hennepin_tiles                  <-                  get_static_tiles                  (                  location                  =                  hennepin_black,   zoom                  =                  10,   style_id                  =                  "ckedp72zt059t19nssixpgapb",   username                  =                  "kwalkertcu"                  )                              

In most cases, users should cull a muted, monochrome basemap when designing a map with a choropleth overlay to avert confusing blending of colors.

These basemap tiles are layered into the familiar tmap workflow with the tm_rgb() part. To bear witness the underlying basemap, users should modify the transparency of the Census tract polygons with the alpha argument. Additional tmap functions and so add ancillary map elements. tm_scale_bar() adds a scale bar; tm_compass() adds a north arrow; and tm_credits() helps cartographers give credit for the basemap, which is required when using Mapbox and OpenStreetMap tiles.

                                  tm_shape                  (                  hennepin_tiles                  )                  +                  tm_rgb                  (                  )                  +                  tm_shape                  (                  hennepin_black                  )                  +                  tm_polygons                  (col                  =                  "percent",           style                  =                  "jenks",           n                  =                  5,           palette                  =                  "Purples",           title                  =                  "ACS estimate",           blastoff                  =                  0.seven                  )                  +                  tm_layout                  (title                  =                  "Percent Black\nby Census tract",             legend.exterior                  =                  True,             fontfamily                  =                  "Verdana"                  )                  +                  tm_scale_bar                  (position                  =                  c                  (                  "left",                  "bottom"                  )                  )                  +                  tm_compass                  (position                  =                  c                  (                  "right",                  "tiptop"                  )                  )                  +                  tm_credits                  (                  "(c) Mapbox, OSM    ",               bg.colour                  =                  "white",              position                  =                  c                  (                  "Right",                  "Lesser"                  )                  )                              

Map of percent Black in Hennepin County with reference elements

Effigy 6.10: Map of per centum Black in Hennepin County with reference elements

Depending on the shape of your Census data, the position arguments in tm_scale_bar(), tm_compass(), and tm_credits() can be modified to organize coincident map elements appropriately. When capitalized (as used in tm_credits()), the element will exist positioned tighter to the map frame.

Choosing a color palette

The examples shown in this affiliate thus far have used a variety of color palettes to display statistical variation on choropleth maps. Software packages like sf, ggplot2, and tmap will have colour palettes congenital in as "defaults"; I've shown the default palettes for all iii, then changed the palettes used in ggplot2 and in tmap. Then how do you get about choosing an appropriate color palette? In that location are a multifariousness of considerations to take into account.

Outset, it is important to empathise the type of information you are working with. If your data are quantitative - that is, expressed with numbers, which you'll unremarkably be working with using Census data - you lot'll want a color palette that can show the statistical variation in your data correctly. In the demographic examples shown above, ACS data range from a low value to a loftier value. This type of information is effectively represented with a sequential color palette. Sequential colour palettes apply either a single hue or related hues and then modify the color lightness or intensity to generate a sequence of colors. An example single-hue palette is the "Purples" ColorBrewer palette used in the map above.

Sequential 'Purples' color palette

Figure 6.11: Sequential 'Purples' color palette

With this palette, lighter colors should generally be used to represent lower data values, and darker values should represent college values, suggesting a greater density/concentration of that attribute. In other color palettes, nonetheless, the more than intense colors may be the lighter colors and should be used accordingly to stand for data values. This is the case with the popular viridis colour palettes, implemented in R with the viridis parcel (Garnier et al. 2021) and shown below.

Sequential 'viridis' color palette

Effigy vi.12: Sequential 'viridis' colour palette

Diverging color palettes are best used when the cartographer wants to highlight extreme values on either end of the data distribution and represent neutral values in the eye. The case shown below is the ColorBrewer "RdBu" palette.

Diverging 'RdBu' color palette

Figure 6.xiii: Diverging 'RdBu' color palette

For Census data mapping, diverging palettes are well-suited to maps that visualize change over time. A map of population modify using a diverging palette would highlight extreme population loss and farthermost population proceeds with intense colors on either cease of the palette, and represent minimal population change with a muted, neutral color in the middle.

Qualitative palettes are appropriate for categorical data, as they represent data values with unique, unordered hues. A practiced example is the "Set1" color palette shown below.

Categorical 'Set1' color palette

Figure 6.14: Categorical 'Set1' color palette

While maps of Census data every bit returned past tidycensus will generally use sequential or diverging colour palettes (given the quantitative nature of Census information), derived information products may require qualitative palettes. Illustrative examples in this volume include chiselled dot-density maps (addressed later in this chapter) and visualizations of geodemographic clusters, explored in Section eight.five.one.

Choosing an appropriate color palette for your maps can be a challenge. Fortunately, the ColorBrewer and viridis palettes are appropriate for a wide range of cartographic utilize cases and accept built-in back up in ggplot2 and tmap. An splendid tool for helping decide on a colour palette is tmap's Palette Explorer app, accessible with the command tmaptools::palette_explorer(). Run this command in your R console to launch an interactive app that helps you explore different colour scenarios using ColorBrewer and viridis palettes. Notably, the app includes a color blindness simulator to help you cull colour palettes that are color blindness friendly.

Culling map types with tmap

Choropleth maps are a core part of the Demography data analyst's toolkit, simply they are not ideal for every application. In particular, choropleth maps are best suited for visualizing rates, percentages, or statistical values that are normalized for the population of areal units. They are not ideal when the annotator wants to compare counts (or estimated counts) themselves, even so. Choropleth maps of count data may ultimately reflect the underlying size of a baseline population; additionally, given that the counts are compared visually relative to the irregular shape of the polygons, choropleth maps can make comparisons difficult.

Graduated symbols

An alternative commonly used to visualize count information is the graduated symbol map. Graduated symbol maps employ shapes referenced to geographic units that are sized relative to a data attribute. The example beneath uses tmap's tm_bubbles() function to create a graduated symbol map of the Black population in Hennepin Canton, mapping the gauge cavalcade.

                                      tm_shape                    (                    hennepin_black,           projection                    =                    sf                    ::                    st_crs                    (                    26915                    )                    )                    +                    tm_polygons                    (                    )                    +                    tm_bubbles                    (size                    =                    "estimate", alpha                    =                    0.5,               col                    =                    "navy",              title.size                    =                    "Non-Hispanic Black - ACS estimate"                    )                    +                    tm_layout                    (legend.exterior                    =                    TRUE,             fable.exterior.position                    =                    "bottom"                    )                                  

Graduated symbols with tmap

Figure six.fifteen: Graduated symbols with tmap

The visual comparisons on the map are made betwixt the circles, not the polygons themselves, reflecting differences between population sizes.

Faceted maps

Given that the long-grade race & ethnicity dataset returned by tidycensus includes data on v groups, a cartographer may want to visualize those groups comparatively. A single choropleth map cannot effectively visualize five demographic attributes simultaneously, and creating five split up maps can exist tedious. A solution to this is using faceting, a concept introduced in Chapter ??.

Faceted maps in tmap are created with the tm_facets() function. The by argument specifies the column to be used to identify unique groups in the data. The remaining lawmaking is familiar tmap lawmaking; in this example, tm_fill() is preferred to tm_polygons() to hide the Census tract borders given the smaller sizes of the maps. The fable is besides moved with the legend.position argument in tm_layout() to fill the empty space in the faceted map.

                                      tm_shape                    (                    hennepin_race,          projection                    =                    sf                    ::                    st_crs                    (                    26915                    )                    )                    +                    tm_facets                    (by                    =                    "variable", scale.factor                    =                    4                    )                    +                    tm_fill                    (col                    =                    "pct",           mode                    =                    "quantile",           n                    =                    half dozen,           palette                    =                    "Blues",           title                    =                    "Percent (2015-2019 ACS)",)                    +                    tm_layout                    (bg.color                    =                    "grey",              legend.position                    =                    c                    (                    -                    0.7,                    0.two                    ),             console.characterization.bg.color                    =                    "white"                    )                                  

Faceted map with tmap

Figure 6.16: Faceted map with tmap

The faceted maps exercise a skillful job of showing variations in each group in comparative context. Nevertheless, the mutual legend and nomenclature scheme used means that within-grouping variation is suppressed relative to the need to show consequent comparisons betwixt groups. In turn, the "White" subplot shows little variation among Census tracts in Hennepin County given the large size of that group in the area. One additional disadvantage of carve up maps by group is that they exercise not show neighborhood heterogeneity and multifariousness as well every bit they could. A pop alternative for visualizing within-unit heterogeneity is the dot-density map, covered below.

Dot-density maps

Dot-density maps scatter dots inside areal units relative to the size of a information aspect. This cartographic method is intended to show aspect density through the dot distributions; for a Census map, in areas where the dots are dense, more people live there, whereas sparsely positioned dots reflect sparsity of population. Dot-density maps can also incorporate categories in the data to visualize densities of dissimilar subgroups simultaneously.

Generating a dot-density map in R requires some more advanced code that uses some of the programmatic skills learned in prior chapters and introduces some new techniques. The code below converts the data in the hennepin_race object to a new object, hennepin_dots, that is of geometry type POINT instead of MULTIPOLYGON. Each bespeak in the output spatial dataset will roughly represent an estimated 100 residents of each Census tract who identified as office of a particular racial or ethnic group on the 2015-2019 ACS.

purrr's map_dfr() function will be used to achieve this task. We'll iterate over each unique racial/indigenous group in the dataset, creating dots relative to population sizes for each group. The workhorse function hither is the st_sample() role from the sf packet, which samples points at random positions inside polygons. The size argument within st_sample() can either be a stock-still integer (for the aforementioned number of points in each polygon) or a vector; we'll apply a newly derived column est100, which is the population estimate for each group divided by 100. This point sampling is computed for all five groups, then combined into a common dataset of points. The slice_sample() function from dplyr is critical here, as it retains all points (prop = ane) but shuffles them randomly. This volition ensure that the unlike groups announced in random gild visually on the dot-density map, appropriately showing heterogeneity.

This code can take a few minutes to run, and so be patient when reproducing. If speed of processing is an issue, consider modifying the dots to data ratio and/or setting exact = FALSE in st_sample(), which volition approximate the number of dots in each tract.

The map itself is created with the tm_dots() role, which in this example is combined with a background map using tm_polygons() to prove the relative racial and ethnic heterogeneity of Census tracts in Hennepin County.

                                      background_tracts                    <-                    filter                    (                    hennepin_race,                    variable                    ==                    "White"                    )                    tm_shape                    (                    background_tracts,           projection                    =                    sf                    ::                    st_crs                    (                    26915                    )                    )                    +                    tm_polygons                    (col                    =                    "white",                border.col                    =                    "grey"                    )                    +                    tm_shape                    (                    hennepin_dots                    )                    +                    tm_dots                    (col                    =                    "grouping",            palette                    =                    "Set1",           size                    =                    0.005,            championship                    =                    "Race/ethnicity"                    )                                  

Dot-density map of race and ethnicity in Hennepin County, Minnesota

Figure 6.17: Dot-density map of race and ethnicity in Hennepin County, Minnesota

Problems with dot-density maps can include overplotting of dots which can brand legibility a problem; experiment with different dot sizes and dots to data ratios to amend this. Additionally, the use of Census tract polygons to generate the dots can cause visual bug. As dots are placed randomly within Census tract polygons, they in many cases will be placed in locations where no people live (such as lakes in Hennepin County). Dot distributions will also follow tract boundaries, which can create an bogus impression of abrupt changes in population distributions along polygon boundaries (equally seen on the example map). A solution is the dasymetric dot-density map (K. Due east. Walker 2018), which starting time removes areas from polygons which are known to be uninhabited then runs the dot-generation algorithm over those modified areas.

Cartographic workflows with non-Demography information

In many instances, an analyst may possess data that is available at a Census geography simply is not available through the ACS or decennial Census. This means that the geometry = Truthful functionality in tidycensus, which automatically enriches data with geographic information, is not possible. In these cases, Demography shapes obtained with tigris can be joined to tabular data and so visualized.

This section covers two such workflows. The first reproduces the popular red/blue election map common in presidential election cycles. The 2d focuses on mapping zip code tabulation areas, or ZCTAs, a geography that represents the spatial location of zip codes (postal codes) in the United states.

National election mapping with tigris shapes

While enumeration units like Census tracts and block groups volition generally be used to map Census data, Census shapes representing legal entities are useful for a variety of cartographic purposes. A pop example is the political map, which shows the winner or poll results from an election in a region. We'll utilize data from the Cook Political Study to generate a basic carmine state/blue state map of the 2020 United states of america Presidential election results. This dataset was downloaded on June 5, 2021 and is available at "data/us_vote_2020.csv" in the book GitHub repository.

              ##  [i] "state"            "called"           "final"            "dem_votes"        ##  [v] "rep_votes"        "other_votes"      "dem_percent"      "rep_percent"      ##  [9] "other_percent"    "dem_this_margin"  "margin_shift"     "vote_change"      ## [xiii] "stateid"          "EV"               "X"                "Y"                ## [17] "State_num"        "Center_X"         "Center_Y"         "...20"            ## [21] "2016 Margin"      "Total 2016 Votes"            

The information include a wide variety of columns that can be visualized on a map. Every bit discussed in the previous affiliate, a comparative map of the U.s. tin employ the shift_geometry() function in the tigris package to shift and rescale Alaska and Hawaii. Country geometries are bachelor in tigris with the states() function, which should exist used with the arguments cb = TRUE and resolution = "20m" to appropriately generalize the state geometries for national mapping.

To create the map, the geometry data obtained with tigris must be joined to the election data from the Cook Political Report. This is accomplished with the left_join() function from dplyr. dplyr's *_join() family of functions are supported by elementary features objects, and piece of work in this context coordinating to the mutual "Join" operations in desktop GIS software. The join functions work past matching values in i or more "primal fields" between two tables and merging data from those 2 tables into a single output table. The most mutual join functions you'll utilise for spatial information are left_join(), which retains all rows from the starting time dataset and fills non-matching rows with NA values, and inner_join(), which drops not-matching rows in the output dataset.

Allow's try this out by obtaining depression-resolution state geometry with tigris, shifting and rescaling with shift_geometry(), then merging the political data to those shapes, matching the NAME column in us_states to the state column in vote2020.

Before proceeding we'll want to practice some quality checks. In left_join(), values must match exactly betwixt NAME and state to merge correctly - and this is not always guaranteed when using data from unlike sources. Let's check to see if nosotros have any problems:

              ##  ## False  ##    51            

We've matched all the 50 states plus the District of Columbia correctly. In turn, the joined dataset has retained the shifted and rescaled geometry of the US states and now includes the ballot information from the tabular dataset which can be used for mapping. To achieve this structure, specifying the directionality of the bring together was critical. For spatial data to be retained in a join, the spatial object must be on the left-hand side of the join. In our pipeline, we specified the us_states object first and used left_join() to join the election information to usa object. If we had done this in reverse, we would have lost the spatial course information necessary to make the map.

For a basic ruby-red state/blue country map using ggplot2 and geom_sf(), a transmission color palette can be supplied to the scale_fill_manual() part, filling state polygons based on the chosen column which represents the party for whom the country was called.

                                  ggplot                  (                  us_states_joined,                  aes                  (fill                  =                  chosen                  )                  )                  +                  geom_sf                  (color                  =                  "white", lwd                  =                  0.2                  )                  +                  scale_fill_manual                  (values                  =                  c                  (                  "blue",                  "red"                  )                  )                  +                  theme_void                  (                  )                  +                  labs                  (fill                  =                  "Party",        title                  =                  " 2020 Us presidential ballot results by state",        caption                  =                  "Annotation: Nebraska and Maine split balloter college votes by congressional district"                  )                              

Map of the 2020 US presidential election results with ggplot2

Effigy 6.18: Map of the 2020 US presidential election results with ggplot2

Understanding and working with ZCTAs

The most granular geography at which many agencies release data is at the zip lawmaking level. This is not an ideal geography for visualization, given that zip codes correspond collections of U.s. Postal Service routes (or sometimes even a single building, or Post Part box) that are non guaranteed to form coherent geographies. The U.s.a. Census Bureau allows for an approximation of zippo code mapping with Zippo Lawmaking Tabulation Areas, or ZCTAs. ZCTAs are shapes built from Census blocks in which the most common zip code for addresses in each block determines how blocks are allocated to corresponding ZCTAs. While ZCTAs are non recommended for spatial analysis due to these irregularities, they can be useful for visualizing data distributions when no other granular geographies are bachelor.

An example of this is the Internal Revenue Service'southward Statistics of Income (SOI) information, which includes a broad range of indicators derived from tax returns. The well-nigh detailed geography available is the aught code level in this dataset, significant that within-county visualizations require using ZCTAs. Let's read in the data for 2018 from the IRS website:

                                  irs_data                  <-                  read_csv                  (                  "https://world wide web.irs.gov/pub/irs-soi/18zpallnoagi.csv"                  )                  ncol                  (                  irs_data                  )                              
              ## [one] 153            

The dataset contains 153 columns which are identified in the linked codebook. Geographies are identified by the ZIPCODE cavalcade, which shows aggregated information by country (ZIPCODE == "000000") and past zippo code. We might exist interested in agreement the geography of self-employment income within a given region. We'll retain the variables N09400, which represents the number of revenue enhancement returns with self-employment tax, and N1, which represents the total number of returns.

                                  self_employment                  <-                  irs_data                  %>%                  select                  (                  ZIPCODE, self_emp                  =                  N09400, total                  =                  N1                  )                              

From hither, we'll need to place a region of nix codes for assay. In tigris, the zctas() part allows u.s. to fetch a Zip Lawmaking Tabulation Areas shapefile. Given that some ZCTA geography is irregular and sometimes stretches beyond multiple states, a shapefile for the entire The states must first be downloaded. It is recommended that shapefile caching with options(tigris_use_cache = TRUE) be used with ZCTAs to avoid long data download times.

In the next chapter, you'll learn how to use spatial overlay to excerpt geographic data within a specific region. That said, the starts_with parameter in zctas() allows users to filter down ZCTAs based on a vector of prefixes, which tin can identify an area without using a spatial process. For example, we can become ZCTA data near Boston, MA by using the appropriate prefixes.

Next, we can utilize mapview() to inspect the results:

Figure 6.19: ZCTAs in the Boston, MA area

The ZCTA prefixes 021, 022, and 024 cover much of the Boston metropolitan surface area; "holes" in the region represent areas like Boston Common which are not covered past ZCTAs. Let's take a quick wait at its attributes:

              ## [i] "ZCTA5CE10"  "AFFGEOID10" "GEOID10"    "ALAND10"    "AWATER10"   ## [6] "geometry"            

Either the ZCTA4CE10 column or the GEOID10 column can be matched to the appropriate goose egg code information in the IRS dataset for visualization. The code below joins the IRS data to the spatial dataset and computes a new column representing the percentage of returns with self-employment income.

                                  boston_se_data                  <-                  boston_zctas                  %>%                  left_join                  (                  self_employment, by                  =                  c                  (                  "GEOID10"                  =                  "ZIPCODE"                  )                  )                  %>%                  mutate                  (pct_self_emp                  =                  100                  *                  (                  self_emp                  /                  full                  )                  )                  %>%                  select                  (                  GEOID10,                  self_emp,                  pct_self_emp                  )                              
Table 6.2: Cocky-employment percentages by ZCTA in the Boston, MA area
GEOID10 self_emp pct_self_emp geometry
02461 860 24.15730 MULTIPOLYGON (((-71.22275 4…
02141 930 xi.90781 MULTIPOLYGON (((-71.09475 4…
02139 2820 14.58872 MULTIPOLYGON (((-71.1166 42…
02180 1680 xiii.45076 MULTIPOLYGON (((-71.11976 4…
02457 NA NA MULTIPOLYGON (((-71.27642 4…

At that place are a diverseness of means to visualize this data. Ane such method is a choropleth map, which you've learned about earlier this affiliate:

                                  tm_shape                  (                  boston_se_data, projection                  =                  26918                  )                  +                  tm_fill                  (col                  =                  "pct_self_emp",            palette                  =                  "Purples",            title                  =                  "% cocky-employed,\n2018 IRS SOI information"                  )                              

Simple choropleth of self-employment in Boston

Effigy 6.20: Uncomplicated choropleth of self-employment in Boston

The choropleth map shows that self-employment filings are more common in suburban Boston ZCTAs than nearer to the urban cadre, generally speaking. Even so, nosotros might also be interested in understanding where about self-employment income filings are located rather than their share relative to the total number of returns filed. This requires visualizing the self_emp column straight. As discussed before in this chapter, a graduated symbol map with tm_bubbles() is preferable to a choropleth map for this purpose.

                                  tm_shape                  (                  boston_se_data                  )                  +                  tm_polygons                  (                  )                  +                  tm_bubbles                  (size                  =                  "self_emp",               alpha                  =                  0.5,               col                  =                  "navy",              title.size                  =                  "Self-employed filers,\n2018 IRS SOI data"                  )                              

Graduated symbol map of self-employment by ZCTA in Boston

Effigy 6.21: Graduated symbol map of self-employment past ZCTA in Boston

Interactive mapping

The examples addressed in this chapter thus far have all focused on static maps, where the output is fixed after rendering the map. Mod web technologies - and the integration of those technologies into R with the htmlwidgets parcel, as discussed in Section iv.7.4 - make the creation of interactive, explorable Census data maps straightforward.

Interactive mapping with Leaflet

With over 31,000 GitHub stars as of July 2021, the Leaflet JavaScript library (Agafonkin 2020) is one of the most popular frameworks worldwide for interactive mapping. The RStudio squad brought the Leaflet to R with the leaflet R parcel (Cheng, Karambelkar, and Xie 2021), which now powers several approaches to interactive mapping in R. The following examples cover how to visualize Census data on an interactive Leaflet map using approaches from mapview, tmap, and the cadre leaflet package.

Allow'due south commencement by getting some illustrative information on the pct of the population aged 25 and upward with a bachelor's caste or higher from the 2015-2019 ACS. We'll look at this information by Census tract in Dallas Canton, Texas.

                                  dallas_bachelors                  <-                  get_acs                  (                  geography                  =                  "tract",   variables                  =                  "DP02_0068P",   year                  =                  2019,   land                  =                  "TX",   county                  =                  "Dallas",   geometry                  =                  TRUE                  )                              

In Affiliate five, you learned how to quickly visualize geographic data obtained with tigris on an interactive map by using the mapview() part in the mapview package. The mapview() part likewise includes a parameter zcol that takes a column in the dataset as an argument, and visualizes that column with an interactive choropleth map.

Effigy 6.22: Interactive mapview choropleth

Conversion of tmap maps to interactive Leaflet maps is also straightforward with the command tmap_mode("view"). After inbound this command, all subsequent tmap maps in your R session will exist rendered as interactive Leaflet maps using the same tmap syntax you'd use to make static maps.

Figure half dozen.23: Interactive map with tmap in view mode

To switch dorsum to static plotting mode, run the command tmap_mode("plot").

For more than fine-grained control over your Leaflet maps, the core leaflet package can be used. Below, nosotros'll reproduce the mapview/tmap examples using the leaflet package's native syntax. First, a color palette will exist divers using the colorNumeric() function. This office itself creates a role we're calling pal(), which translates data values to color values for a given color palette. Our chosen color palette in this instance is the viridis magma palette.

              ## [i] "#170F3C" "#430F75" "#6E1E81" "#9B2E7F" "#C83E73"            

The map itself is built with a magrittr pipeline and the following steps:

  • The leaflet() role initializes the map. A data object can be specified here or in a role that comes later in the pipeline.

  • addProviderTiles() helps you add a basemap to the map that will be shown below your data equally a reference. Several providers are built-in to the Leaflet bundle, including the pop Stamen reference maps. If you are just interested in a basic basemap, the addTiles() function returns the standard OpenStreetMap basemap. Utilize the congenital-in providers object to try out different basemaps; it is good practice for choropleth mapping to use a greyscale or muted basemap.

  • addPolygons() adds the tract polygons to the map and styles them. In the lawmaking below, we are using a series of options to specify the input data; to color the polygons relative to the defined colour palette; and to arrange the smoothing betwixt polygon borders, the opacity, and the line weight. The label argument will create a hover tooltip on the map for additional information about the polygons.

  • addLegend() then creates a legend for the map, providing critical information well-nigh how the colors on the map chronicle to the data values.

                                  leaflet                  (                  )                  %>%                  addProviderTiles                  (                  providers                  $                  Stamen.TonerLite                  )                  %>%                  addPolygons                  (data                  =                  dallas_bachelors,               color                  =                  ~                  pal                  (                  approximate                  ),               weight                  =                  0.v,               smoothFactor                  =                  0.ii,               fillOpacity                  =                  0.five,               label                  =                  ~                  estimate                  )                  %>%                  addLegend                  (                  position                  =                  "bottomright",     pal                  =                  pal,     values                  =                  dallas_bachelors                  $                  estimate,     title                  =                  "% with bachelor'south<br/>degree"                  )                              

Effigy vi.24: Interactive leaflet map

This example only scratches the surface of what the leaflet R package can accomplish; I'd encourage y'all to review the documentation for more examples.

Alternative approaches to interactive mapping

Like near interactive mapping platforms, Leaflet uses tiled mapping in the Web Mercator coordinate reference system. Web Mercator works well for tiled web maps that demand to fit within rectangular computer screens, and preserves angles at large scales (zoomed-in areas) which is useful for local navigation. However, information technology grossly distorts the area of geographic features almost the poles, making it inappropriate for small-scale thematic mapping of the globe or world regions (Battersby et al. 2014).

Allow's illustrate this past mapping median home value by state from the 1-year ACS using leaflet. Nosotros'll commencement learn the information with geometry using tidycensus, setting the output resolution to "20m" to get depression-resolution boundaries to speed upwards our interactive mapping.

                                  us_value                  <-                  get_acs                  (                  geography                  =                  "state",   variables                  =                  "B25077_001",   yr                  =                  2019,   survey                  =                  "acs1",   geometry                  =                  Truthful,   resolution                  =                  "20m"                  )                              

The acquired ACS data for the US can be mapped using the same techniques every bit with the educational attainment map for Dallas County.

                                  library                  (                  leaflet                  )                  us_pal                  <-                  colorNumeric                  (                  palette                  =                  "plasma",   domain                  =                  us_value                  $                  estimate                  )                  leaflet                  (                  )                  %>%                  addProviderTiles                  (                  providers                  $                  Stamen.TonerLite                  )                  %>%                  addPolygons                  (data                  =                  us_value,               color                  =                  ~                  us_pal                  (                  approximate                  ),               weight                  =                  0.five,               smoothFactor                  =                  0.2,               fillOpacity                  =                  0.5,               label                  =                  ~                  judge                  )                  %>%                  addLegend                  (                  position                  =                  "bottomright",     pal                  =                  us_pal,     values                  =                  us_value                  $                  judge,     championship                  =                  "Median habitation value"                  )                              

Figure 6.25: Interactive US map using Spider web Mercator

The disadvantages of Spider web Mercator - as well as this general approach to mapping the United states - are on total display. Alaska's surface area is grossly distorted relative to the rest of the United States. It is also hard on this map to compare Alaska and Hawaii to the continental U.s. - which is particularly important in this example as Hawaii'south median home value is the highest in the entire country. The solution proposed elsewhere in this book is to employ tigris::shift_geometry() which adopts appropriate coordinate reference systems for Alaska, Hawaii, and the continental US and arranges them in a better comparative fashion on the map. Even so, this arroyo risks losing the interactivity of a Leaflet map.

A compromise solution tin can involve other R packages that permit for interactive mapping. An excellent option is the ggiraph package (Gohel and Skintzos 2021), which similar the plotly bundle tin can convert ggplot2 graphics into interactive plots. In the example below, interactivity is added to a ggplot2 plot with ggiraph, allowing for panning and zooming with a hover tooltip on a shifted and rescaled map of the United states of america.

                                  library                  (                  ggiraph                  )                  us_value_shifted                  <-                  us_value                  %>%                  shift_geometry                  (position                  =                  "outside"                  )                  %>%                  mutate                  (tooltip                  =                  paste                  (                  NAME,                  estimate, sep                  =                  ": "                  )                  )                  gg                  <-                  ggplot                  (                  us_value_shifted,                  aes                  (fill up                  =                  estimate                  )                  )                  +                  geom_sf_interactive                  (                  aes                  (tooltip                  =                  tooltip, data_id                  =                  Proper name                  ),                        size                  =                  0.1                  )                  +                  scale_fill_viridis_c                  (option                  =                  "plasma", labels                  =                  scales                  ::                  dollar                  )                  +                  labs                  (title                  =                  "Median housing value by State, 2019",        explanation                  =                  "Data source: 2019 1-twelvemonth ACS, Usa Census Bureau",        fill                  =                  "ACS estimate"                  )                  +                  theme_void                  (                  )                  girafe                  (ggobj                  =                  gg                  )                  %>%                  girafe_options                  (                  opts_hover                  (css                  =                  "fill:cyan;"                  ),                  opts_zoom                  (max                  =                  10                  )                  )                              

Effigy half dozen.26: Interactive Usa map with ggiraph

Advanced examples

The examples discussed in this chapter thus far probable cover a large proportion of cartographic use cases for Demography data analysts. Nevertheless, R allows cartographers to get across these cadre map types. This final section of the affiliate introduces some options for more than advanced visualization using data from tidycensus.

Mapping migration flows

In 2021, tidycensus co-author Matt Herman added support for the ACS Migration Flows API in the package, covered briefly in Section two.5. One notable feature of this new functionality, bachelor in the get_flows() part, is built-in support for flow mapping with the argument geometry = TRUE. Geometry operates differently for migration flows data given that the geography of interest is not a single location for a given row, but rather the connexion between those locations. In turn, for get_flows(), geometry = TRUE returns two POINT geometry columns: one for the location itself, and one for the location to which it is linked.

Let's accept a look for one of the nearly popular recent migration destinations in the United States: Travis Canton Texas, home to Austin.

Table vi.3: Top origins for migrants to Travis County, TX
GEOID1 GEOID2 FULL1_NAME FULL2_NAME variable estimate moe centroid1 centroid2
48453 48491 Travis County, Texas Williamson County, Texas MOVEDIN 10147 1198 POINT (-97.78195 30.33469) POINT (-97.60076 30.64804)
48453 48201 Travis County, Texas Harris County, Texas MOVEDIN 5746 742 POINT (-97.78195 thirty.33469) Betoken (-95.39234 29.85728)
48453 48209 Travis County, Texas Hays County, Texas MOVEDIN 4240 839 Betoken (-97.78195 thirty.33469) Signal (-98.03107 30.05815)
48453 48029 Travis County, Texas Bexar County, Texas MOVEDIN 3758 631 Point (-97.78195 30.33469) Signal (-98.52002 29.44896)
48453 48113 Travis County, Texas Dallas Canton, Texas MOVEDIN 3005 657 POINT (-97.78195 xxx.33469) POINT (-96.77787 32.76663)
48453 48439 Travis Canton, Texas Tarrant County, Texas MOVEDIN 2053 527 POINT (-97.78195 30.33469) Betoken (-97.29124 32.77156)
48453 06037 Travis County, Texas Los Angeles County, California MOVEDIN 1770 426 Bespeak (-97.78195 30.33469) Betoken (-118.2339 34.33299)
48453 48021 Travis County, Texas Bastrop County, Texas MOVEDIN 1423 334 Betoken (-97.78195 thirty.33469) Signal (-97.31201 30.10361)
48453 48085 Travis Canton, Texas Collin County, Texas MOVEDIN 1172 514 Point (-97.78195 30.33469) POINT (-96.57237 33.18791)
48453 48141 Travis County, Texas El Paso County, Texas MOVEDIN 1108 445 Point (-97.78195 30.33469) Betoken (-106.2348 31.76855)

The dataset is filtered to focus on in-migration, represented by the MOVEDIN variable, and drops migrations from outside the United States with na.omit() (as these areas do not have a GEOID value).

The mapdeck R bundle (Cooley 2020) offers excellent back up for interactive menstruation mapping with minimal code. mapdeck is a wrapper of deck.gl, a tremendous visualization library originally developed at Uber that offers 3D mapping capabilities congenital on top of Mapbox's GL JS library. Users will need to sign upwardly for a Mapbox account and get a Mapbox access token to apply mapdeck; see the mapdeck documentation for more than information.

Once gear up, flow maps can be created past initializing a mapdeck map with mapdeck() and then using the add_arc() office, which tin can link either X/Y coordinate columns or Signal geometry columns, equally shown below. In this instance, we are using the top thirty origins for migrants to Travis County, and generating a new weight column that makes the proportional arc widths less bulky.

Flow map of in-migration to Travis County, TX with mapdeck

Figure half dozen.27: Menstruation map of in-migration to Travis County, TX with mapdeck

Linking maps and charts

This affiliate and Chapter 4 are linked in many ways. The visualization principles discussed in each chapter apply to each other; the primal divergence is that this chapter focuses on geographic visualization whereas Affiliate 4 does not. In many cases, y'all'll desire to have advantage of the strengths of both geographic and not-geographic visualization. Maps are excellent at showing patterns and trends over space, and offering a familiar reference to viewers; charts are better at showing rankings and ordering of information values betwixt places.

The example below illustrates how to combine two chart types discussed in this book: a choropleth map and a margin of mistake plot. Margins of fault are notoriously difficult to brandish on maps; possible options include superimposing patterns on choropleth maps to highlight areas with high levels of doubtfulness (Wong and Dominicus 2013) or using bivariate choropleth maps to simultaneously visualize estimates and MOEs (Lucchesi and Wikle 2017).

R's visualization tools offering an culling approach: interactive linking of a choropleth map with a chart that clearly visualizes the doubtfulness effectually estimates. This approach involves generating a map and chart with ggplot2, and then combining the plots with patchwork and rendering them as interactive, linked graphics with ggiraph. The fundamental aesthetic to be used here is data_id, which if set in the lawmaking for both plots will highlight corresponding data points on both plots on user hover.

Example code to generate such a linked visualization is beneath.

                                  library                  (                  tidycensus                  )                  library                  (                  ggiraph                  )                  library                  (                  tidyverse                  )                  library                  (                  patchwork                  )                  vt_income                  <-                  get_acs                  (                  geography                  =                  "canton",   variables                  =                  "B19013_001",   state                  =                  "VT",   year                  =                  2019,   geometry                  =                  Truthful                  )                  %>%                  mutate                  (Proper noun                  =                  str_remove                  (                  Proper name,                  " Canton, Vermont"                  )                  )                  vt_map                  <-                  ggplot                  (                  vt_income,                  aes                  (fill                  =                  approximate                  )                  )                  +                  geom_sf_interactive                  (                  aes                  (data_id                  =                  GEOID                  )                  )                  +                  scale_fill_distiller                  (palette                  =                  "Greens",                        direction                  =                  ane,                         guide                  =                  "none"                  )                  +                  theme_void                  (                  )                  vt_plot                  <-                  ggplot                  (                  vt_income,                  aes                  (x                  =                  judge, y                  =                  reorder                  (                  NAME,                  estimate                  ),                                   fill up                  =                  guess                  )                  )                  +                  geom_errorbarh                  (                  aes                  (xmin                  =                  estimate                  -                  moe, xmax                  =                  estimate                  +                  moe                  )                  )                  +                  geom_point_interactive                  (colour                  =                  "blackness", size                  =                  4, shape                  =                  21,                  aes                  (data_id                  =                  GEOID                  )                  )                  +                  scale_fill_distiller                  (palette                  =                  "Greens", management                  =                  one,                        labels                  =                  scales                  ::                  dollar                  )                  +                  scale_x_continuous                  (labels                  =                  scales                  ::                  dollar                  )                  +                  labs                  (title                  =                  "Household income by county in Vermont",        subtitle                  =                  "2015-2019 American Customs Survey",        y                  =                  "",        10                  =                  "ACS estimate (confined represent margin of error)",        fill                  =                  "ACS estimate"                  )                  +                  theme_minimal                  (base_size                  =                  14                  )                  girafe                  (ggobj                  =                  vt_map                  +                  vt_plot, width_svg                  =                  x, height_svg                  =                  5                  )                  %>%                  girafe_options                  (                  opts_hover                  (css                  =                  "fill up:cyan;"                  )                  )                              

Figure 6.28: Linked map and chart with ggiraph

This instance largely re-purposes visualization code readers have learned in other examples in this volume. Try hovering your cursor over any county in Vermont on the map - or any information point on the chart - and notice what happens on the other plot. The corresponding county or data betoken will also be highlighted, allowing for a linked representation of geographic location and margin of error visualization.

Reactive mapping with Shiny

Advanced Demography cartographers may want to have these examples a step farther and build them into full-fledged data dashboards and web-based visualization applications. Fortunately, R users don't have to do this from scratch. Shiny (Chang et al. 2021) is a tremendously popular and powerful framework for the development of interactive web applications in R that can execute R code in response to user inputs. A full treatment of Shiny is beyond the scope of this book; however, Shiny is a "must-acquire" for R information analysts.

An example Shiny visualization app that extends the race/ethnicity example from this chapter is shown below. It includes a drop-downward bill of fare that allows users to select a racial or ethnic grouping in the Twin Cities and visualizes the distribution of that group on an interactive choropleth map that uses Leaflet and the viridis colour palette.

Figure 6.29: Interactive mapping app with Shiny

The code used to create the app is found below; copy-paste this code into your own R script, set your Census API central, and run it! To learn more, I encourage you to review Hadley Wickham'southward Mastering Shiny book and the Leaflet package'south documentation on Shiny integration.

                                  # app.R                  library                  (                  tidycensus                  )                  library                  (                  shiny                  )                  library                  (                  leaflet                  )                  library                  (                  tidyverse                  )                  census_api_key                  (                  "YOUR Cardinal HERE"                  )                  twin_cities_race                  <-                  get_acs                  (                  geography                  =                  "tract",   variables                  =                  c                  (                  hispanic                  =                  "DP05_0071P",     white                  =                  "DP05_0077P",     blackness                  =                  "DP05_0078P",     native                  =                  "DP05_0079P",     asian                  =                  "DP05_0080P"                  ),   state                  =                  "MN",   county                  =                  c                  (                  "Hennepin",                  "Ramsey",                  "Anoka",                  "Washington",                  "Dakota",                  "Carver",                  "Scott"                  ),   geometry                  =                  TRUE                  )                  groups                  <-                  c                  (                  "Hispanic"                  =                  "hispanic",                  "White"                  =                  "white",                  "Black"                  =                  "black",                  "Native American"                  =                  "native",                  "Asian"                  =                  "asian"                  )                  ui                  <-                  fluidPage                  (                  sidebarLayout                  (                  sidebarPanel                  (                  selectInput                  (                  inputId                  =                  "group",         characterization                  =                  "Select a group to map",         choices                  =                  groups                  )                  ),                  mainPanel                  (                  leafletOutput                  (                  "map", height                  =                  "600"                  )                  )                  )                  )                  server                  <-                  function                  (                  input,                  output                  )                  {                  # Reactive function that filters for the selected group in the drib-down menu                  group_to_map                  <-                  reactive                  (                  {                  filter                  (                  twin_cities_race,                  variable                  ==                  input                  $                  grouping                  )                  }                  )                  # Initialize the map object, centered on the Minneapolis-St. Paul area                  output                  $                  map                  <-                  renderLeaflet                  (                  {                  leaflet                  (options                  =                  leafletOptions                  (zoomControl                  =                  FALSE                  )                  )                  %>%                  addProviderTiles                  (                  providers                  $                  Stamen.TonerLite                  )                  %>%                  setView                  (lng                  =                  -                  93.21,               lat                  =                  44.98,               zoom                  =                  eight.v                  )                  }                  )                  observeEvent                  (                  input                  $                  group,                  {                  pal                  <-                  colorNumeric                  (                  "viridis",                  group_to_map                  (                  )                  $                  guess                  )                  leafletProxy                  (                  "map"                  )                  %>%                  clearShapes                  (                  )                  %>%                  clearControls                  (                  )                  %>%                  addPolygons                  (data                  =                  group_to_map                  (                  ),                   color                  =                  ~                  pal                  (                  estimate                  ),                   weight                  =                  0.v,                   fillOpacity                  =                  0.5,                   smoothFactor                  =                  0.2,                   label                  =                  ~                  estimate                  )                  %>%                  addLegend                  (                  position                  =                  "bottomright",         pal                  =                  pal,         values                  =                  group_to_map                  (                  )                  $                  estimate,         title                  =                  "% of population"                  )                  }                  )                  }                  shinyApp                  (ui                  =                  ui, server                  =                  server                  )                              

Working with software exterior of R for cartographic projects

The examples shown in this chapter display all maps inside R. RStudio users running the code in this chapter, for example, will display static plots in the Plots pane, or interactive maps in the interactive Viewer pane. In many cases, R users volition want to export out their maps for display on a website or in a report. In other cases, R users might be interested in using R and tidycensus as a "data pipeline" that tin can generate appropriate Census data for mapping in an external software package like Tableau, ArcGIS, or QGIS. This section covers those use-cases.

Exporting maps from R

Cartographers exporting maps made with ggplot2 will likely want to use the ggsave() command. The map export procedure with ggsave() is like to the procedure described in Section 4.2.3. If theme_void() is used for the map, all the same, the cartographer may desire to supply a colour to the bg parameter, equally information technology would otherwise default to "transparent" for that theme.

The tmap_save() command is the best option for exporting maps made with tmap. tmap_save() requires that a map be stored as an object first; this example volition re-use a map of Hennepin County from earlier in the affiliate and assign information technology to a variable named hennepin_map.

                                  hennepin_map                  <-                  tm_shape                  (                  hennepin_black,           projection                  =                  sf                  ::                  st_crs                  (                  26915                  )                  )                  +                  tm_polygons                  (col                  =                  "pct",           style                  =                  "jenks",           n                  =                  5,           palette                  =                  "Purples",           title                  =                  "ACS estimate",           fable.hist                  =                  Truthful                  )                  +                  tm_layout                  (championship                  =                  "Percent Black\nby Demography tract",             frame                  =                  FALSE,             legend.exterior                  =                  TRUE,             bg.color                  =                  "grey70",             legend.hist.width                  =                  v,             fontfamily                  =                  "Verdana"                  )                              

That map can be saved using similar options to ggsave(). tmap_save() allows for specification of width, height, units, and dpi. If small values are passed to width and top, tmap volition assume that the units are inches; if big values are passed (greater than 50), tmap volition assume that the units correspond pixels.

                                  tmap_save                  (                  tm                  =                  hennepin_map,   filename                  =                  "~/images/hennepin_black_map.png",   height                  =                  v.5,   width                  =                  viii,   dpi                  =                  300                  )                              

Interactive maps designed with leaflet can be written to HTML documents using the saveWidget() function from the htmlwidgets package. A Leaflet map should kickoff be assigned to a variable, which is then passed to saveWidget() along with a specified name and path for the output HTML file. Interactive maps created with mapview() are written to HTML files the same fashion, though the object to exist saved will need to exist accessed from the map slot with the notation @map, as shown beneath.

The argument selfcontained = TRUE is an important 1 to consider when writing interactive maps to HTML files. If True as shown in the example, saveWidget() will packet whatever necessary assets (e.thou. CSS, JavaScript) as a base64-encoded string in the HTML file. This makes the HTML file more portable but tin can lead to large file sizes. The alternative, selfcontained = FALSE, places these avails into an accompanying directory when the HTML file is written out. For interactive maps generated with tmap, tmap_save() tin also be used to write out HTML files in this way.

Interoperability with other visualization software

Although R packages have rich capabilities for designing both static and interactive maps likewise as map-based dashboards, some analysts will want to turn to other specialized tools for information visualization. Such tools might include drag-and-drop dashboard builders like Tableau, or dedicated GIS software like ArcGIS or QGIS that let for more manual command over cartographic layouts and outputs.

This workflow will often involve the use of R, and packages like tidycensus, for data conquering and wrangling, then the apply of an external visualization programme for data visualization and cartography. In this workflow, an R object produced with tidycensus can be written to an external spatial file with the st_write() function in the sf bundle. The code beneath illustrates how to write Census data from R to a shapefile, a common vector spatial data format readable by desktop GIS software and Tableau.

The output file dc_income.shp volition exist written to the user'south electric current working directory. Other spatial data formats like GeoPackage (.gpkg) and GeoJSON (.geojson) are available past specifying the appropriate file extension.

QGIS cartographers tin also accept reward of functionality within the software to run R (and consequently tidycensus functions) straight within the platform. In QGIS, this is enabled with the Processing R Provider plugin. QGIS users should install the plugin from the Plugins drop-downward menu, and then click Processing > Toolbox to access QGIS's suite of tools. Clicking the R icon then Create New R Script… volition open up the R script editor.

The example script beneath will be translated by QGIS into a GIS tool that uses the version of tidycensus installed on a user's system to add a demographic layer at the Census tract level from the ACS to a QGIS project.

                                  ##Variable=cord                  ##State=string                  ##County=string                  ##Output=output vector                  library                  (                  tidycensus                  )                  Output                  =                  get_acs                  (                  geography                  =                  "tract",     variables                  =                  Variable,     state                  =                  Land,     canton                  =                  Canton,     geometry                  =                  True                  )                              

Tool parameters are defined at the beginning of the script with ## notation. Variable, Country, and County will all have strings (text) every bit input, and the result of get_acs() in the tool will exist written to Output, which is added to the QGIS project as a vector layer. Once finished, the script should be saved with an appropriate proper noun and the file extension .rsx in the suggested output directory, then located from the R section of the Processing Toolbox and opened. The GIS tool will look something similar the image below.

Example tidycensus tool in QGIS

Figure 6.30: Example tidycensus tool in QGIS

Fill in the text boxes with a desired ACS variable, state, and county, so click Run. The QGIS tool will call the user's R installation to execute the tool with the specified inputs. If everything runs correctly, a layer volition be added to the user's QGIS project ready for mapping with QGIS's suite of cartographic tools.

Styled layer from tidycensus in QGIS

Effigy 6.31: Styled layer from tidycensus in QGIS

The case shown displays Census tracts in Cook County, Illinois obtained from tidycensus in a QGIS project; these tracts have been styled as a choropleth in QGIS past median household income (the requested variable) later the tool added the layer to the projection.

Exercises

Using one of the mapping frameworks introduced in this chapter (either ggplot2, tmap, or leaflet) complete the post-obit tasks:

  • If you are only getting started with tidycensus/the tidyverse, make a race/ethnicity map by adapting the code provided in this section simply for a unlike county.

  • Next, find a different variable to map with tidycensus::load_variables(). Review the give-and-take of cartographic choices in this chapter and visualize your information appropriately.

DOWNLOAD HERE

Posted by: swanhavoing.blogspot.com

Post a Comment for "Best Way to Download Acs Data in R"