The istat package allows you to obtain data from Istat databases from within the R environment. As of September 2024, there are two data sources: I.Stat and IstatData. Istat is replacing I.Stat with the IstatData platform, but I.Stat can still be used as a source. Searching and downloading from the new platform gives you access to more data sets (they can be browsed at https://esploradati.istat.it/databrowser/). This package allows you to search, get, filter and plot data sets. The provider-related functions are explained first, followed by the filter and plot functions; lastly, shinyIstat is introduced.
Note that, when using get_i_stat or get_istatdata, the function may take some time to download data sets.
The functions list_istatdata and search_istatdata have been updated but work in the same way as in the previous version. get_istatdata has also been updated and now requires fewer parameters: there is no need to specify the version and the agencyId anymore, and it works just like get_i_stat. Moreover, the website dati.istat.it has been discontinued, but it is still possible to retrieve data from this provider. We do not know if or when this will stop working, so we recommend relying on the new functions (those ending in _istatdata). Lastly, all functions that access ISTAT web services have been updated to fail gracefully, with informative messages, when Internet resources are unavailable or have changed.
I.Stat is the old Istat data warehouse, which is still accessible. Functions that retrieve data from I.Stat end with _i_stat. Available functions are:
list_i_stat
search_i_stat
get_i_stat
The list_i_stat function allows you to obtain the complete list of available I.Stat data sets, with their ID and name. The default language is Italian (“ita”), but you can also select English as follows:
head(list_i_stat(lang = "eng"))
#> [rsdmx][INFO] Fetching 'http://sdmx.istat.it/SDMXWS/rest/dataflow/all/all/latest/'
#> ID Name
#> 1 101_1015 Crops
#> 2 101_1030 PDO, PGI and TSG quality products
#> 3 101_1033 slaughtering
#> 4 101_1039 Agritourism - municipalities
#> 5 101_1077 PDO, PGI and TSG products: operators - municipalities data
#> 6 101_12 Agricoltural prices
If you find the data that you were looking for, take note of its ID: you will need it to download it through get_i_stat.
If you are looking for a specific data set, you can search for it by keyword with search_i_stat. Suppose you are looking for data about ‘water’. You can search as follows (as before, the default language is Italian):
search_i_stat("water", lang = "eng")
#> [rsdmx][INFO] Fetching 'http://sdmx.istat.it/SDMXWS/rest/dataflow/all/all/latest/'
#> id name
#> [1,] "12_323" "Urban wastewater treatment plants"
#> [2,] "12_340" "Water abstraction for drinkable use"
#> [3,] "12_60" "Public water supply use"
Suppose you decide to download the “Public water supply use” data set. You will need its id, “12_60”, which is used in the example below.
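As a minimal sketch of how the id could be picked up programmatically rather than copied by hand: the bracketed row labels in the printout suggest that the search result is a character matrix with columns id and name, but that is an assumption drawn from the output, not a documented guarantee. The matrix below is rebuilt by hand from the output above so that the code runs offline.

```r
# Hand-built stand-in for search_i_stat("water", lang = "eng").
# Assumption: the real result is a character matrix with columns id and name.
results <- matrix(
  c("12_323", "Urban wastewater treatment plants",
    "12_340", "Water abstraction for drinkable use",
    "12_60",  "Public water supply use"),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("id", "name"))
)

# Select the row by its exact name and keep the id for the download call.
dataset_id <- unname(results[results[, "name"] == "Public water supply use", "id"])
print(dataset_id)

# With network access, the id could then be passed on, for example:
# water <- get_i_stat(id_dataset = dataset_id, lang = "both")
```

If the result turns out to be a data frame instead, the same selection works with `results$name` and `results$id`.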
get_i_stat(id_dataset = "12_60",
           start_period = NULL,
           end_period = NULL,
           recent = FALSE,
           csv = FALSE,
           xlsx = FALSE,
           lang = "both")
This code downloads the entire data set, without any filter, but you can customize the download through the parameters of the function:
start_period: time value for the start (NULL by default).
end_period: time value for the end (NULL by default).
recent: FALSE by default. If TRUE, the function retrieves data from the last 10 years.
csv or xlsx: FALSE by default. If TRUE, the function saves the data set to disk as a .csv/.xlsx file.
lang: language of the labels (“ita” for Italian, “eng” for English; the call above uses “both”).
cache: TRUE by default. If FALSE, the function retrieves the data set without caching.
update_cache: FALSE by default. If TRUE, the cache is updated.
compress_file: TRUE by default. It compresses the RDS file used for caching.
cache_dir: directory where the cache is stored; by default, the cache directory is created in the current working directory.
Note that if recent is TRUE, then both start_period and end_period must be NULL, and vice versa.
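The rule about recent and the period arguments can be sketched as a small check. The helper check_period_args is hypothetical and not part of istat; it only mirrors the constraint stated above, and the value 2015 is a placeholder, since the exact format expected for time values is not specified here.

```r
# Hypothetical helper (not part of istat) mirroring the stated rule:
# recent = TRUE cannot be combined with start_period or end_period.
check_period_args <- function(start_period = NULL, end_period = NULL,
                              recent = FALSE) {
  if (isTRUE(recent) && (!is.null(start_period) || !is.null(end_period))) {
    stop("When recent = TRUE, start_period and end_period must both be NULL.")
  }
  invisible(TRUE)
}

check_period_args(recent = TRUE)        # valid: no explicit period
check_period_args(start_period = 2015)  # valid: 2015 is a placeholder value

# An invalid combination raises an error; it is captured here for display.
res <- tryCatch(
  check_period_args(start_period = 2015, recent = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)
```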
IstatData is the new Istat data warehouse. Functions that retrieve data from IstatData end with _istatdata. Available functions are:
list_istatdata
search_istatdata
get_istatdata
The list_istatdata function allows you to obtain the complete list of available IstatData data sets, with their ID and name. The default language is Italian (“ita”), but you can also select English as follows:
head(list_istatdata(lang = "eng"))
#> ID Name
#> 1 101_1015 Crops
#> 2 101_1015_DF_DCSP_COLTIVAZIONI_1 Areas and production - overall data
#> 3 101_1015_DF_DCSP_COLTIVAZIONI_10 Sowing forecast
#> 4 101_1015_DF_DCSP_COLTIVAZIONI_2 Areas and production - overall data - provinces
#> 5 101_1030 PDO, PGI and TSG quality products
#> 6 101_1030_DF_DCSP_DOPIGP_1 Operators by sector
If you find the data that you were looking for, take note of its ID: you will need it to download it through get_istatdata.
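In the printout above, some short IDs (such as 101_1015) are prefixes of longer ones. The following sketch lists the entries that share a given prefix. It assumes the listing is a data frame with columns ID and Name, which is inferred from the printed output only; the rows are copied by hand so that the code runs offline.

```r
# Hand-copied from the printout above.
# Assumption: the real listing is a data frame with columns ID and Name.
datasets <- data.frame(
  ID = c("101_1015",
         "101_1015_DF_DCSP_COLTIVAZIONI_1",
         "101_1015_DF_DCSP_COLTIVAZIONI_10",
         "101_1015_DF_DCSP_COLTIVAZIONI_2",
         "101_1030",
         "101_1030_DF_DCSP_DOPIGP_1"),
  Name = c("Crops",
           "Areas and production - overall data",
           "Sowing forecast",
           "Areas and production - overall data - provinces",
           "PDO, PGI and TSG quality products",
           "Operators by sector"),
  stringsAsFactors = FALSE
)

# Keep the entries whose ID starts with the short ID followed by "_".
parent_id <- "101_1015"
children <- datasets[startsWith(datasets$ID, paste0(parent_id, "_")), ]
print(children)
```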
If you are looking for a specific data set, you can search for it by keyword with search_istatdata. Suppose you are looking for data about ‘water’. You can search as follows (as before, the default language is Italian):
search_istatdata("water", lang = "eng")
#>
#> id name
#> [1,] "12_323_DF_DCCV_IMPDEP_1" "Urban wastewater treatment plants - reg."
#> [2,] "12_323_DF_DCCV_IMPDEP_2" "Urban wastewater treatment plants - ato"
#> [3,] "12_340_DF_DCCV_PRELACQ_1" "Water abstraction for drinkable use"
#> [4,] "12_60_DF_DCCV_CONSACQUA_2" "Public water supply use - municipalities"
#> [5,] "18_635_DF_DCCV_CENERG_8" "Water system - availability, type and source - reg."
#> [6,] "18_635_DF_DCCV_CENERG_9" "Water system - Type of system and energy source"
#> [7,] "609_1_DF_DCCV_URBANENV_1" "Water - consumption"
#> [8,] "609_1_DF_DCCV_URBANENV_2" "Water - rationing"
#> [9,] "82_87_DF_DCCV_AVQ_FAMIGLIE_19" "House costs, water and other problems with the house"
#>[10,] "83_85_DF_DCCV_AVQ_PERSONE1_211" "Water and carbonate beverages - age detail"
#>[11,] "83_85_DF_DCCV_AVQ_PERSONE1_212" "Water and carbonate beverages - age, educational level"
#>[12,] "83_85_DF_DCCV_AVQ_PERSONE1_213" "Water and carbonate beverages - occupational position"
#>[13,] "83_85_DF_DCCV_AVQ_PERSONE1_214" "Water and carbonate beverages - regions and type of municipality"
#>[14,] "9_951_DF_DCCV_CAVE_MIN_4" "Natural mineral waters extracted for production purposes (in units of weight and volume)"
Suppose you decide to download the “Public water supply use - municipalities” data set. You will need its id, “12_60_DF_DCCV_CONSACQUA_2”, which is used in the example below.
get_istatdata(id_dataset = "12_60_DF_DCCV_CONSACQUA_2",
              start_period = NULL,
              end_period = NULL,
              recent = FALSE,
              csv = FALSE,
              xlsx = FALSE,
              lang = "both")
This code downloads the entire data set, without any filter, but you can customize the download through the parameters of the function:
start_period: time value for the start (NULL by default).
end_period: time value for the end (NULL by default).
recent: FALSE by default. If TRUE, the function retrieves data from the last 10 years.
csv or xlsx: FALSE by default. If TRUE, the function saves the data set to disk as a .csv/.xlsx file.
lang: language of the labels (“ita” for Italian, “eng” for English; the call above uses “both”).
cache: TRUE by default. If FALSE, the function retrieves the data set without caching.
update_cache: FALSE by default. If TRUE, the cache is updated.
compress_file: TRUE by default. It compresses the RDS file used for caching.
cache_dir: directory where the cache is stored; by default, the cache directory is created in the current working directory.
Note that if recent is TRUE, then both start_period and end_period must be NULL, and vice versa.
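To illustrate what the cache-related parameters describe, here is a minimal sketch of RDS-based caching in base R. It is not the package's implementation: cached_fetch and fake_download are hypothetical names, and the file naming scheme is an assumption made for the example.

```r
# Hypothetical caching wrapper; an illustration of the idea, not istat's code.
# `fetch` is any zero-argument function that returns the data.
cached_fetch <- function(key, fetch, cache_dir = tempdir(),
                         update_cache = FALSE, compress_file = TRUE) {
  path <- file.path(cache_dir, paste0(key, ".rds"))
  if (file.exists(path) && !update_cache) {
    return(readRDS(path))                 # cache hit: skip the download
  }
  data <- fetch()                         # cache miss or forced refresh
  saveRDS(data, path, compress = compress_file)
  data
}

# A stand-in for a slow download that counts how often it is called.
calls <- 0
fake_download <- function() {
  calls <<- calls + 1
  head(iris)
}

cache_dir <- tempfile("istat_cache_")
dir.create(cache_dir)

first  <- cached_fetch("demo", fake_download, cache_dir = cache_dir)
second <- cached_fetch("demo", fake_download, cache_dir = cache_dir)
forced <- cached_fetch("demo", fake_download, cache_dir = cache_dir,
                       update_cache = TRUE)
print(calls)  # the second call is served from the cache
```

The second call reads the stored file instead of calling the download function again, while update_cache = TRUE forces a fresh download and overwrites the file.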
The package lets you filter a data set through the function filter_istat; filter_istat_interactive does the same thing interactively. To show how they work, we will use the ‘iris’ data.
With filter_istat, you filter a data set by naming the column(s) to filter on in columns and the value(s) to keep for each column in datatype. In this example, we filter on one column:
data(iris)
filter_istat(iris, columns = "Species", datatype = "setosa")
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3.0 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5.0 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> ...
Now, let’s filter on more than one column:
data(iris)
filter_istat(iris, columns = c("Species", "Petal.Length"), datatype = c("setosa", "1.5"))
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 4 4.6 3.1 1.5 0.2 setosa
#> 8 5.0 3.4 1.5 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> 11 5.4 3.7 1.5 0.2 setosa
#> 16 5.7 4.4 1.5 0.4 setosa
#> 20 5.1 3.8 1.5 0.3 setosa
#> 22 5.1 3.7 1.5 0.4 setosa
#> 28 5.2 3.5 1.5 0.2 setosa
#> 32 5.4 3.4 1.5 0.4 setosa
#> 33 5.2 4.1 1.5 0.1 setosa
#> 35 4.9 3.1 1.5 0.2 setosa
#> 40 5.1 3.4 1.5 0.2 setosa
#> 49 5.3 3.7 1.5 0.2 setosa
And for more than one value per column:
filter_istat(iris, columns = c("Species","Petal.Width"), datatype = list(c("virginica","setosa"), c("0.1","1.9")))
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 10 4.9 3.1 1.5 0.1 setosa
#> 13 4.8 3.0 1.4 0.1 setosa
#> 14 4.3 3.0 1.1 0.1 setosa
#> 33 5.2 4.1 1.5 0.1 setosa
#> 38 4.9 3.6 1.4 0.1 setosa
#> 102 5.8 2.7 5.1 1.9 virginica
#> 112 6.4 2.7 5.3 1.9 virginica
#> 131 7.4 2.8 6.1 1.9 virginica
#> 143 5.8 2.7 5.1 1.9 virginica
#> 147 6.3 2.5 5.0 1.9 virginica
Here, the function kept the rows of ‘iris’ where ‘Species’ is ‘virginica’ or ‘setosa’ and ‘Petal.Width’ is ‘0.1’ or ‘1.9’.
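For comparison, the same selection can be written in base R. This is only an equivalent expression for this example, not a description of how filter_istat is implemented; the numeric column is compared as text to mirror the character values passed to datatype.

```r
data(iris)

# Rows must match one of the allowed values in every listed column:
# values within a column are alternatives, columns are combined with AND.
keep <- iris$Species %in% c("virginica", "setosa") &
  as.character(iris$Petal.Width) %in% c("0.1", "1.9")

filtered <- iris[keep, ]
print(filtered)
```

The resulting rows are the same ten rows shown in the filter_istat output above.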
filter_istat_interactive works in the same way as filter_istat, except that it guides you through the filtering process step by step. An example:
filter_istat_interactive(iris, lang = "eng")
#> Available columns:
#> [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
#> Enter the column(s) (separated by comma): Petal.Width, Species
#> Available values for column Petal.Width :
#> [1] 0.2 0.4 0.3 0.1 0.5 0.6 1.4 1.5 1.3 1.6 1.0 1.1 1.8 1.2 1.7 2.5 1.9 2.1 2.2 2.0 2.4 2.3
#> Enter the chosen values for column Petal.Width (separated by comma): 0.1
#> Available values for column Species :
#> [1] setosa versicolor virginica
#> Levels: setosa versicolor virginica
#> Enter the chosen values for column Species (separated by comma): setosa
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 10 4.9 3.1 1.5 0.1 setosa
#> 13 4.8 3.0 1.4 0.1 setosa
#> 14 4.3 3.0 1.1 0.1 setosa
#> 33 5.2 4.1 1.5 0.1 setosa
#> 38 4.9 3.6 1.4 0.1 setosa
The function plot_interactive allows you to visualize your data graphically; it is intended for exploratory purposes only. The available plots are:
scatter plot
bar plot
pie chart
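To give an idea of the three plot types, here is a static base R sketch on the ‘iris’ data. It does not call plot_interactive, whose arguments are not described here; it only uses standard base graphics functions.

```r
data(iris)

# Scatter plot: relationship between two numeric columns.
plot(iris$Petal.Length, iris$Petal.Width,
     xlab = "Petal.Length", ylab = "Petal.Width",
     main = "Scatter plot")

# Bar plot and pie chart: counts of a categorical column.
species_counts <- table(iris$Species)
barplot(species_counts, main = "Bar plot")
pie(species_counts, main = "Pie chart")
```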
shinyIstat is a Shiny application that integrates the functions of the istat package in a user-friendly interface. The app aims to provide a useful tool to search, get, filter and plot Istat data sets. Here are the main features:
Use the menu on the left to navigate through the app. Inside each panel you will find further help by simply clicking on the green question marks ?.