Calculates a set of statistics defined in a table. Several caveats apply:
- series are always padded to complete years
- the same data threshold is used for all calculations
- to h8gl: only mean from h1
- max_gap is only applied in calculations to y1 and is defined in days
- the use of default_statistic and _inputs_ can lead to surprises in the result
All statistics are defined in a table with the columns "parameter", "statistic", "from" and "to". Each row defines
one statistic for one parameter, with a base interval ("from") and a target interval ("to"). The rows are
grouped by "from", then by "to" and then by "parameter". This yields a list of statistics for each
parameter that is compatible with resample(). If no default_statistic is defined,
default_statistic = "drop" is added. Multi-step statistics are possible. _inputs_ can be used as a substitute for
default_statistic in multi-step calculations if the input in "from" already contains calculated statistics.
The statstable can be written in a compact form with comma-separated values in each cell. The table is expanded
so that each value gets its own row. See statstable_expand().
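The expansion and grouping described above can be illustrated with a small base R sketch. This is not the package's implementation: the helper expand_row() is hypothetical, and the sketch assumes that every combination of the comma-separated values in a row becomes its own row. It only shows how a compact table might turn into one list of statistics per "from", "to" and "parameter".

```r
# Compact table: a cell may hold several comma-separated values.
compact <- data.frame(
  parameter = c("SO2, NO2", "O3"),
  statistic = c("mean", "mean, max"),
  from      = c("input", "input"),
  to        = c("y1", "h1"),
  stringsAsFactors = FALSE
)

sep <- "\\s*,\\s*"  # same pattern as the default of the sep argument

# Hypothetical helper (assumption, not part of the package):
# split every cell of one row and build all combinations of the values.
expand_row <- function(row) {
  cells <- lapply(row, function(cell) strsplit(cell, sep)[[1]])
  expand.grid(cells, stringsAsFactors = FALSE)
}

expanded <- do.call(rbind, lapply(seq_len(nrow(compact)), function(i) {
  expand_row(as.list(compact[i, ]))
}))

# Group by "from", then "to", then "parameter": one vector of statistics
# for each combination that actually occurs.
grouped <- split(
  expanded$statistic,
  list(expanded$from, expanded$to, expanded$parameter),
  drop = TRUE
)

print(expanded)
print(grouped)
```

In this sketch the two compact rows expand to four rows, and the entry for O3 from "input" to "h1" collects both of its statistics.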
Usage
calculate_statstable(
  data,
  statstable,
  sep = "\\s*,\\s*",
  keep_input = FALSE,
  data_thresh = 0.8,
  max_gap = 10,
  order = c("input", "h1", "h8gl", "d1", "m1", "y1")
)
Arguments
- data
 input data in rolf format
- statstable
 description of statistics to calculate in table form
- sep
 separator for combined values in statstable
- keep_input
 should the input data be kept in the returned list as item input? Default FALSE
- data_thresh
 minimum data capture threshold to use, from 0 to 1.0. Default 0.8
- max_gap
 in days. Only used in calculations to y1. Set to NULL to disable. Default 10 days
- order
 defines the order of calculation in the from column
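To make data_thresh and max_gap more concrete, the following base R sketch checks a hypothetical daily series for one year. It rests on two assumptions that the text above does not confirm: that data capture is the share of non-missing values in the interval, and that max_gap limits the longest run of consecutive missing days. The package may apply these checks differently.

```r
# Hypothetical daily series for one year; NA marks a missing day.
values <- rep(1, 365)
values[100:112] <- NA   # gap of 13 consecutive days
values[200:204] <- NA   # gap of 5 consecutive days

data_thresh <- 0.8      # default of calculate_statstable()
max_gap <- 10           # default of calculate_statstable(), in days

# Assumption: data capture is the share of non-missing values.
capture <- mean(!is.na(values))

# Longest run of consecutive missing days (0 if nothing is missing).
runs <- rle(is.na(values))
longest_gap <- max(c(0, runs$lengths[runs$values]))

# Assumed rule: the yearly value is valid only if both checks pass.
# Setting max_gap to NULL skips the gap check.
valid <- capture >= data_thresh &&
  (is.null(max_gap) || longest_gap <= max_gap)

print(c(capture = capture, longest_gap = longest_gap))
print(valid)
```

Here the series passes the data capture threshold but fails the gap check, so under these assumptions the y1 value would be rejected unless max_gap is set to NULL.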
Examples
# calculate LRV statistics
lrv_table <- tibble::tribble(
  ~parameter, ~statistic, ~from, ~to,
  "SO2, NO2, PM10", "mean", "input", "y1",
  "SO2, NO2", "perc95", "input", "y1",
  "O3", "perc98", "input", "m1",
  "O3", "mean", "input", "h1",
  "O3", "n>120", "h1", "y1",
  "SO2, NO2, CO, PM10", "mean", "input", "d1",
  "SO2", "n>100", "d1", "y1",
  "NO2", "n>80", "d1", "y1",
  "CO", "n>8", "d1", "y1",
  "PM10", "n>50", "d1", "y1"
)
fn <- system.file("extdata", "Zch_Stampfenbachstrasse_min30_2017.csv",
                   package = "rOstluft.data", mustWork = TRUE)
data <- read_airmo_csv(fn)
# convert volume concentrations to mass concentrations
data <- calculate_mass_concentrations(data)
stats <- calculate_statstable(data, lrv_table)
# we are only interested in the m1 and y1 results
stats <- dplyr::bind_rows(stats$y1, stats$m1)
stats
#> # A tibble: 22 × 6
#>    starttime           site                    parameter    interval unit  value
#>    <dttm>              <fct>                   <fct>        <fct>    <fct> <dbl>
#>  1 2017-01-01 00:00:00 Zch_Stampfenbachstrasse NO2          y1       µg/m3 30.4 
#>  2 2017-01-01 00:00:00 Zch_Stampfenbachstrasse NO2_95%_min… y1       µg/m3 68.8 
#>  3 2017-01-01 00:00:00 Zch_Stampfenbachstrasse PM10         y1       µg/m3 15.9 
#>  4 2017-01-01 00:00:00 Zch_Stampfenbachstrasse SO2          y1       µg/m3  1.05
#>  5 2017-01-01 00:00:00 Zch_Stampfenbachstrasse SO2_95%_min… y1       µg/m3  2.17
#>  6 2017-01-01 00:00:00 Zch_Stampfenbachstrasse O3_nb_h1>120 y1       1     81   
#>  7 2017-01-01 00:00:00 Zch_Stampfenbachstrasse CO_nb_d1>8   y1       1      0   
#>  8 2017-01-01 00:00:00 Zch_Stampfenbachstrasse NO2_nb_d1>80 y1       1      1   
#>  9 2017-01-01 00:00:00 Zch_Stampfenbachstrasse PM10_nb_d1>… y1       1      8   
#> 10 2017-01-01 00:00:00 Zch_Stampfenbachstrasse SO2_nb_d1>1… y1       1      0   
#> # ℹ 12 more rows
# calculate climate indicators
clima_table <- tibble::tribble(
   ~parameter, ~statistic, ~from, ~to,
   "T", "mean", "input", "h1",
   "T", "max, min", "h1", "d1",
   "T_max_h1", "Sommertage, Hitzetage, Eistage", "d1", "y1",
   "T_min_h1", "Tropennächte, Frosttage", "d1", "y1",
)
clima_stats <- calculate_statstable(data, clima_table)
clima_stats$y1
#> # A tibble: 5 × 6
#>   starttime           site                    parameter    interval unit  value
#>   <dttm>              <fct>                   <fct>        <fct>    <fct> <dbl>
#> 1 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Sommertage   y1       1        66
#> 2 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Hitzetage    y1       1        21
#> 3 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Eistage      y1       1        13
#> 4 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Tropennächte y1       1        14
#> 5 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Frosttage    y1       1        38
