| Type: | Package |
| Title: | Travel Smoothly Between Country Name and Code Formats |
| Version: | 0.3.0 |
| Description: | Smooths the process of working with country names and codes via powerful parsing, standardization, and conversion utilities arranged in a simple, consistent API. Country name formats include multiple sources including the Unicode Common Locale Data Repository (CLDR, http://cldr.unicode.org/) common-sense standardized names in hundreds of languages. |
| Depends: | R (≥ 3.1.0) |
| Imports: | stats, utils |
| Suggests: | covr, dplyr, DT, gapminder, ggplot2, jsonlite, knitr, mockr, rmarkdown, testthat, tidyr |
| License: | GPL-3 | file LICENSE |
| URL: | https://github.com/alistaire47/passport, https://alistaire47.github.io/passport/ |
| BugReports: | https://github.com/alistaire47/passport/issues |
| Encoding: | UTF-8 |
| LazyData: | true |
| RoxygenNote: | 7.1.1 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2020-11-07 07:09:00 UTC; alistaire |
| Author: | Edward Visel |
| Maintainer: | Edward Visel <edward.visel@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2020-11-07 07:30:03 UTC |
Convert standardized country names to country codes
Description
as_country_code converts a vector of standardized country names or codes to
country codes
Usage
as_country_code(x, from, to = "iso2c", factor = is.factor(x))
Arguments
x |
A character, factor, or numeric vector of country names or codes |
from |
Format from which to convert. See Details for more options. |
to |
Code format to which to convert. Defaults to |
factor |
If |
Details
as_country_code takes a character, factor, or numeric vector of country
names or codes to translate into the specified code format. The default for
to is "iso2c", the ISO 3166-1 Alpha-2 character codes, but many
alternatives are available.
Several non-unique codes are available as well, including "continent",
"is_independent", ISO 4217 currency codes, etc. Backwards conversion will
not work for such cases.
See codes for all options, or run DT::datatable(codes) for a
searchable widget.
Value
A vector of country codes. Warns if new NA values are added.
See Also
For converting to country names, use as_country_name(), which
offers control of short and variant forms. For parsing non-standardized
country names to codes, use parse_country().
Examples
# Codifies standardized names
as_country_code(c("US", "Taiwan", "Myanmar", "Kosovo", "South Korea"), from = "en")
# Translates codes; if passed a factor, returns a releveled one
as_country_code(factor(c("SAH", "PCN", "OMA", "JPN")),
from = "fifa", to = "iso4217_3c")
Convert standardized country codes to country names
Description
as_country_name converts a vector of standardized country codes to
country names.
Usage
as_country_name(
x,
to = "en",
from = "iso2c",
short = TRUE,
variant = FALSE,
factor = is.factor(x)
)
Arguments
x |
A character, factor, or numeric vector of country codes or names |
to |
Language code of country names desired. Defaults to |
from |
Code format from which to convert. Defaults to |
short |
Whether to use short alternative name when available. Can be
length 1 or the same length as |
variant |
Whether to use variant alternative name when available. Can
be length 1 or the same length as |
factor |
If |
Details
as_country_name takes a character, factor, or numeric vector of country
codes (or names in another standardized format) and converts them to
country names in the specified format. If you are trying to standardize an
existing set of names, see parse_country().
The default "en" is
from Unicode Common Locale Data Repository (CLDR),
which aspires to use the most customary name
e.g. "Switzerland" instead of official ones, which are frequently awkward for
common usage, e.g. "Swiss Confederation". CLDR also supplies names in a huge
variety of languages, allowing for easy translation. Short and variant
alternates are available for some countries; if not, the function will fall
back to the standard form. See LICENSE file for terms of use.
Other name sets are available from
-
the UN Statistics Division(UNSD), which maintains standardized names in English, Chinese, Russian, French, Spanish, and Arabic, here named as
"en_un"etc. -
the ISO,
"en_iso"and"fr_iso", and -
-
"en_cia", which include many longer official forms and shorter practical forms, -
"en_cia_local", which includes transliterations, and -
"en_cia_abbreviation", which includes commonly-used abbreviations.
-
See codes for all options, or run DT::datatable(codes) for a
searchable widget.
Value
A character or factor vector of country names. Warns if new NA
values are added.
See Also
For converting standardized names to codes, use as_country_code().
For standardizing names to codes, use parse_country().
Examples
# Usable names for tough-to-standardize places
as_country_name(c("US", "TW", "MM", "XK", "KR"))
# If passed a factor, will return a releveled one
as_country_name(factor(c("US", "NF", "CD", "SJ")), short = FALSE, variant = TRUE)
# Speaks a lot of languages, knows a lot of codes
as_country_name(c("SAH", "PCN", "OMA", "JPN"), from = "fifa", to = "cy") # to Welsh
Country code and name details and documentation
Description
A codebook data.frame of codes and details for country code and name
conversions available. Contains
Internet Engineering Task Force (IETF) language tags
(e.g. "en-nz" for New Zealand English) for
Unicode Common Locale Data Repository (CLDR) names,
similar approximations for institutional names (e.g. "en-iso"), and short
names (e.g. "iso2c") for country codes.
Usage
codes
Format
A data.frame of 427 rows and 9 variables.
Variables:
columnThe column name in the internal
passport:::countriesdata.frame. Valid for use infromandtoparameters.codecolumnwith hyphens for underscores, which is a valid IANA language tag for Unicode CLDR country names. Valid for use infromandtoparameters.nameFull name or code name for non-CLDR options.
notesThings to note, including deprecations, oddities, etc.
languageFull language name parsed from
code.regionFull country or region name parsed from
code.scriptFull language script name parsed from
code.variantFull variant parsed from
code. Also used for organization-standardized names.extensionFurther specification of name type.
Details
All functions can accept codes separated with underscores _, hyphens -,
or periods ..
Examples
# A searchable widget to find a code or name
if (requireNamespace("DT", quietly = TRUE)) {
DT::datatable(codes)
}
Construct formatter function to format country codes as country names
Description
country_format is a constructor function that returns a function
to format country codes as country names suitable for passing to ggplot2's
scale functions' label parameters.
Usage
country_format(
from = "iso2c",
to = "en",
short = TRUE,
variant = FALSE,
factor
)
Arguments
from |
Code format from which to convert. Defaults to |
to |
Language code of country names desired. Defaults to |
short |
Whether to use short alternative name when available. Can be
length 1 or the same length as |
variant |
Whether to use variant alternative name when available. Can
be length 1 or the same length as |
factor |
If |
Details
A frequent reason to convert country codes back to country names is to make
data visualizations more readable. While both a code and name could be
stored in a data frame, the computation and extra storage required can be
avoided by transforming codes to names directly within the visualization via
a formatter function. as_country_name() could be used without
parentheses to format ISO 2-character codes as English names, but
format_country allows greater flexibility, returning a formatter
function with the specified parameters set.
Value
A function that accepts a vector of country codes and returns them as country names.
See Also
For controlling the order of a discrete scale, pass the results of
order_countries() to limits.
Examples
if (require(ggplot2, quietly = TRUE)) {
ggplot(data.frame(country = c("KOR", "MMR", "TWN", "COG"),
y = 1:4),
aes(x = country, y = y)) +
geom_col() +
scale_x_discrete(labels = country_format(from = "iso3c"))
}
NATO Member Defense Expenditures
Description
A sample dataset of NATO/OTAN member defense expenditures.
Usage
nato
Format
A data.frame of 232 rows and 14 variables.
Variables:
country_stanagCountry code in NATO STANAG format
yearYear, from 2012 to 2019. 2018-2019 numbers may be estimates.
Defense expenditure (USD, current prices)-
Defense expenditures in US dollars, using current prices and exchange rates.
Defense expenditure (USD, 2015 prices)-
Defense expenditures in US dollars, using 2015 prices and exchange rates.
Defense expenditure (% real GDP)-
Defense expenditure as a percentage of real gross domestic product. Based on 2015 prices.
Defense expenditure annual real change (% GDP)-
Annual change in defense expenditure as a percentage of real gross domestic product. Based on 2015 prices.
Real GDP (2015 prices)-
Real gross domestic product in 2015 US dollars and at 2015 exchange rates.
GDP per capita (USD)-
Gross domestic product per capita in 2015 US dollars and at 2015 exchange rates.
Defense expenditure per capita (USD)-
Defense expenditure per capita in 2015 US dollars.
Military personnelNumber of military personnel
Equipment expenditure (%)-
Percent of defense expenditure spent on equipment. Includes major equipment expenditure and R&D devoted to major equipment.
Personnel expenditure (%)-
Percentage of defense expenditure spent on personnel. Includes both military and civilian expenditure and pensions.
Infrastructure expenditure (%)-
Percentage of defense expenditure spent on infrastructure. Includes NATO common infrastructure and national military construction.
Other expenditure (%)-
Percentage of defense expenditure spent on other categories besides equipment, personnel, and infrastructure. Includes operations and maintenance expenditure, other R&D expenditure, and other expenditure not otherwise captured.
Source
https://www.nato.int/cps/en/natohq/news_167080.htm
Examples
as_country_name(nato$country_stanag, from = 'stanag')
Order a vector of countries
Description
order_countries reorders a vector of countries, returning a result useful
for passing to ggplot2's scale functions' limits parameters.
Usage
order_countries(
x,
by,
...,
from = "iso2c",
short = TRUE,
variant = FALSE,
factor = is.factor(x)
)
Arguments
x |
A character, factor, or numeric vector of country codes or names |
by |
Either a length-one country code from |
... |
Parameters passed on to |
from |
Code format from which to convert. Defaults to |
short |
Whether to use short alternative name when available. Can be
length 1 or the same length as |
variant |
Whether to use variant alternative name when available. Can
be length 1 or the same length as |
factor |
If |
Details
order_countries orders a vector of countries by
itself converted to a country code or name if
byis a code fromcodesto which to converta sortable vector if
byis a vector of the same length asx-
xitself if neither is supplied.
Value
The original vector of countries, ordered according to the parameters
passed. Note that factors are not releveled, but are reordered. To
relevel, pass the results to levels<-()
See Also
To change labels of a discrete scale, pass the results of
country_format() to the labels parameter.
Examples
countries <- c("FR", "CP", "UZ", "BH", "BR")
order_countries(countries)
order_countries(countries, "ja")
order_countries(countries, rnorm(5))
order_countries(countries, grepl("F", countries), 1:5, decreasing = TRUE)
if (require(ggplot2, quietly = TRUE)) {
df_countries <- data.frame(country = countries,
y = exp(1:5))
ggplot(df_countries, aes(country, y)) +
geom_col() +
scale_x_discrete(
limits = order_countries(df_countries$country,
df_countries$y)[df_countries$y > 5],
labels = country_format(to = "en-cia-local")
)
}
Parse country names to standardized form
Description
parse_country parses irregular country names to the ISO 3166-1 Alpha-2 code
or other standardized code or name format.
Usage
parse_country(
x,
to = "iso2c",
how = c("regex", "google"),
language = c("en", "de"),
factor = is.factor(x)
)
Arguments
x |
A character or factor vector of country names to standardize |
to |
Format to which to convert. Defaults to |
how |
How to parse; defaults to |
language |
If |
factor |
If |
Details
parse_country tries to parse a character or factor vector of country names
to a standardized form: by default, ISO 3166-1 Alpha-2 codes.
When how = "regex" (default), parse_country uses regular expressions to
match irregular forms.
If regular expressions are insufficient, how = "google" will use the
Google Maps geocoding API instead, which permits a much broader range of
input formats and languages. The API allows 2500 calls per day, and should
thus be called judiciously. parse_country will make one call per unique
input. For more calls, see options that allow passing an API key like
ggmap::geocode() with output = "all" or googleway::google_geocode().
Note that due to their flexibility, the APIs may fail unpredictably, e.g.
parse_country("foo", how = "google") returns "CH" whereas how = "regex"
fails with a graceful NA and warning.
Value
A character vector or factor of ISO 2-character country codes or other specified codes or names. Warns of any parsing failure.
Examples
parse_country(c("United States", "USA", "U.S.", "us", "United States of America"))
## Not run:
# Unicode support for parsing accented or non-Latin scripts
parse_country(c("\u65e5\u672c", "Japon", "\u0698\u0627\u067e\u0646"), how = "google")
#> [1] "JP" "JP" "JP" "JP"
# Parse distinct place names via geocoding APIs
parse_country(c("1600 Pennsylvania Ave, DC", "Eiffel Tower"), how = "google")
#> [1] "US" "FR"
## End(Not run)