parse_country parses irregular country names to the ISO 3166-1 Alpha-2 code or other standardized code or name format.

parse_country(
  x,
  to = "iso2c",
  how = c("regex", "google"),
  language = c("en", "de"),
  factor = is.factor(x)
)

Arguments

x

A character or factor vector of country names to standardize.

to

Format to which to convert. Defaults to "iso2c"; see codes for more options.

how

How to parse; defaults to "regex". "google" uses the Google Maps geocoding API. See "Details" for more information.

language

If how = "regex", the language from which to parse country names. Currently accepts "en" (default) and "de". Ignored if how = "google".

factor

If TRUE, returns a factor instead of a character vector. If not supplied, defaults to is.factor(x).
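
The default factor = is.factor(x) means the output type follows the input type unless overridden. A minimal base R sketch of that pattern, where standardize() is a hypothetical helper and toupper() merely stands in for the real parsing step:

```r
# Hypothetical helper mirroring the `factor = is.factor(x)` default.
# toupper() is a placeholder for the actual parsing logic.
standardize <- function(x, factor = is.factor(x)) {
  out <- toupper(as.character(x))
  if (factor) factor(out) else out
}

class(standardize(c("us", "de")))                 # character in, character out
#> [1] "character"
class(standardize(factor(c("us", "de"))))         # factor in, factor out
#> [1] "factor"
class(standardize(c("us", "de"), factor = TRUE))  # explicit override
#> [1] "factor"
```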

Value

A character vector or factor of ISO 2-character country codes or other specified codes or names. Warns of any parsing failure.

Details

parse_country tries to parse a character or factor vector of country names to a standardized form: by default, ISO 3166-1 Alpha-2 codes.

When how = "regex" (default), parse_country uses regular expressions to match irregular forms.
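
As a rough illustration of regex-based matching, the sketch below maps a few irregular names to codes in base R. The pattern table and the match_country() helper are hypothetical and are not the patterns parse_country itself uses; they only show the general technique of one case-insensitive pattern per code, with NA and a warning on failure.

```r
# Hypothetical pattern table: one case-insensitive regex per ISO 3166-1 alpha-2 code.
# These patterns are illustrative only.
patterns <- c(
  US = "^(u\\.?s\\.?(a\\.?)?|united states( of america)?)$",
  DE = "^(germany|deutschland)$",
  FR = "^france$"
)

# Return the code of the first matching pattern; unmatched inputs become NA
# and trigger a warning listing them.
match_country <- function(x) {
  x <- trimws(as.character(x))
  out <- rep(NA_character_, length(x))
  for (code in names(patterns)) {
    hit <- is.na(out) & grepl(patterns[[code]], x, ignore.case = TRUE)
    out[hit] <- code
  }
  if (anyNA(out)) {
    warning("Could not parse: ", paste(unique(x[is.na(out)]), collapse = ", "))
  }
  out
}

match_country(c("United States", "U.S.", "usa", "Deutschland"))
#> [1] "US" "US" "US" "DE"
```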

If regular expressions are insufficient, how = "google" uses the Google Maps geocoding API instead, which accepts a much broader range of input formats and languages. The API allows 2500 calls per day, so it should be called judiciously. parse_country makes one call per unique input. For higher volumes, see functions that accept an API key, such as ggmap::geocode() with output = "all" or googleway::google_geocode().
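
The "one call per unique input" idea can be sketched in base R as follows. This is an assumption about the general technique, not the package's actual code: parse_unique(), toy_lookup(), and toy_table are hypothetical names, and toy_lookup() stands in for a rate-limited geocoding request.

```r
# Apply an expensive lookup once per unique input, then expand the results
# back to the original length and order.
parse_unique <- function(x, lookup) {
  x <- as.character(x)
  keys <- unique(x)
  codes <- vapply(keys, lookup, character(1), USE.NAMES = FALSE)
  codes[match(x, keys)]
}

# Toy lookup that counts how often it is called (stand-in for an API request).
toy_table <- c(france = "FR", japan = "JP")
n_calls <- 0
toy_lookup <- function(name) {
  n_calls <<- n_calls + 1
  toy_table[[name]]
}

parse_unique(c("france", "japan", "france", "france"), toy_lookup)
#> [1] "FR" "JP" "FR" "FR"
n_calls
#> [1] 2
```

Four inputs cost only two lookups here, which matters when the daily quota is limited.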

Note that due to their flexibility, the APIs may fail unpredictably, e.g. parse_country("foo", how = "google") returns "CH", whereas how = "regex" fails gracefully, returning NA with a warning.

Examples

parse_country(c("United States", "USA", "U.S.", "us", "United States of America"))
#> [1] "US" "US" "US" "US" "US"
if (FALSE) {
# Unicode support for parsing accented or non-Latin scripts
parse_country(c("\u65e5\u672c", "Japon", "\u0698\u0627\u067e\u0646"), how = "google")
#> [1] "JP" "JP" "JP"

# Parse distinct place names via geocoding APIs
parse_country(c("1600 Pennsylvania Ave, DC", "Eiffel Tower"), how = "google")
#> [1] "US" "FR"
}