
Calculates multiple statistics defined in a table. Several caveats apply:

  • series always padded to complete years

  • the same data threshold for all calculations

  • to h8gl: only the mean calculated from h1 is supported

  • max_gap is only applied in calculations to y1 and is defined in days

  • the use of default_statistic and _inputs_ can produce surprises in the result

All statistics are defined in a table with the columns "parameter", "statistic", "from" and "to". Each row defines one statistic for one parameter, calculated from a base interval ("from") to a target interval ("to"). The rows are grouped by "from", then by "to" and then by "parameter", which yields a list of statistics for each parameter. This list is compatible with resample(). If no default_statistic is defined, default_statistic = "drop" is added.

Multi-step statistics are possible. _inputs_ can be used as a substitute for default_statistic in a multi-step calculation if the input in "from" already contains calculated statistics.

The statstable can be written in a compact form with comma-separated values in each cell. The table is then expanded so that every value gets its own row. See statstable_expand().
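The expansion and grouping described above can be sketched in base R. This is only an illustration, not the code used by statstable_expand() or calculate_statstable(): the helper expand_compact() is hypothetical, and the sketch assumes that multi-valued cells in several columns combine as a cross product.

```r
# Hypothetical helper, not part of rOstluft: expand a compact table into
# one row per combination of the comma-separated values in each row.
expand_compact <- function(tbl, sep = "\\s*,\\s*") {
  rows <- lapply(seq_len(nrow(tbl)), function(i) {
    # split every cell of row i on the separator
    cells <- lapply(tbl[i, , drop = FALSE], function(x) {
      strsplit(as.character(x), sep)[[1]]
    })
    # assumption: all combinations of the split values are wanted
    expand.grid(cells, stringsAsFactors = FALSE, KEEP.OUT.ATTRS = FALSE)
  })
  do.call(rbind, rows)
}

compact <- data.frame(
  parameter = c("SO2, NO2", "O3"),
  statistic = c("mean, perc95", "mean"),
  from = c("input", "input"),
  to = c("y1", "h1"),
  stringsAsFactors = FALSE
)

expanded <- expand_compact(compact)

# group by "from", then by "to", then collect the statistics per parameter
grouped <- lapply(split(expanded, expanded$from), function(by_from) {
  lapply(split(by_from, by_from$to), function(by_to) {
    split(by_to$statistic, by_to$parameter)
  })
})

# statistics requested for each parameter in the step input -> y1
grouped$input$y1
```

In this sketch, grouped$input$y1 holds one character vector of statistic names per parameter, which mirrors the per-parameter list of statistics described above.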

Usage

calculate_statstable(
  data,
  statstable,
  sep = "\\s*,\\s*",
  keep_input = FALSE,
  data_thresh = 0.8,
  max_gap = 10,
  order = c("input", "h1", "h8gl", "d1", "m1", "y1")
)

Arguments

data

input data in rolf format

statstable

description of statistics to calculate in table form

sep

separator for combined values in statstable

keep_input

should the input data be kept in the returned list as item input? Default FALSE

data_thresh

minimum data capture threshold to use, from 0 to 1.0. Default 0.8

max_gap

maximum gap in days. Only used in calculations to y1. Set to NULL to disable. Default 10 days

order

defines the order in which the intervals in the from column are calculated
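To make data_thresh and max_gap more concrete, here is a minimal base R sketch of how such checks could work. The helper names and the exact rules are assumptions for illustration: the sketch treats data_thresh as the minimum fraction of non-missing values and max_gap as the longest tolerated run of consecutive missing daily values. The package may apply these checks differently.

```r
# Hypothetical helpers to illustrate the two validity checks; names and
# exact rules are assumptions, not the rOstluft implementation.

# fraction of non-missing values in a target interval
data_capture <- function(x) mean(!is.na(x))

# length of the longest run of consecutive missing values
longest_gap <- function(x) {
  runs <- rle(is.na(x))
  gaps <- runs$lengths[runs$values]
  if (length(gaps) == 0) 0L else max(gaps)
}

# mean that is only returned if both checks pass, NA otherwise
checked_mean <- function(x, data_thresh = 0.8, max_gap = 10) {
  enough <- data_capture(x) >= data_thresh
  no_long_gap <- is.null(max_gap) || longest_gap(x) <= max_gap
  if (enough && no_long_gap) mean(x, na.rm = TRUE) else NA_real_
}

# one year of daily values with a gap of 12 days
daily <- rep(20, 365)
daily[100:111] <- NA

checked_mean(daily)                  # NA: the 12 day gap exceeds max_gap = 10
checked_mean(daily, max_gap = NULL)  # 20: gap check disabled, capture is about 0.97
```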

Value

list with one item for every to interval

Examples

# calculate LRV statistics
lrv_table <- tibble::tribble(
  ~parameter, ~statistic, ~from, ~to,
  "SO2, NO2, PM10", "mean", "input", "y1",
  "SO2, NO2", "perc95", "input", "y1",
  "O3", "perc98", "input", "m1",
  "O3", "mean", "input", "h1",
  "O3", "n>120", "h1", "y1",
  "SO2, NO2, CO, PM10", "mean", "input", "d1",
  "SO2", "n>100", "d1", "y1",
  "NO2", "n>80", "d1", "y1",
  "CO", "n>8", "d1", "y1",
  "PM10", "n>50", "d1", "y1"
)

fn <- system.file("extdata", "Zch_Stampfenbachstrasse_min30_2017.csv",
                   package = "rOstluft.data", mustWork = TRUE)

data <- read_airmo_csv(fn)

# convert volume concentrations to mass concentrations
data <- calculate_mass_concentrations(data)

stats <- calculate_statstable(data, lrv_table)

# we are only interested in the m1 and y1 results
stats <- dplyr::bind_rows(stats$y1, stats$m1)
stats
#> # A tibble: 22 × 6
#>    starttime           site                    parameter    interval unit  value
#>    <dttm>              <fct>                   <fct>        <fct>    <fct> <dbl>
#>  1 2017-01-01 00:00:00 Zch_Stampfenbachstrasse NO2          y1       µg/m3 30.4 
#>  2 2017-01-01 00:00:00 Zch_Stampfenbachstrasse NO2_95%_min… y1       µg/m3 68.8 
#>  3 2017-01-01 00:00:00 Zch_Stampfenbachstrasse PM10         y1       µg/m3 15.9 
#>  4 2017-01-01 00:00:00 Zch_Stampfenbachstrasse SO2          y1       µg/m3  1.05
#>  5 2017-01-01 00:00:00 Zch_Stampfenbachstrasse SO2_95%_min… y1       µg/m3  2.17
#>  6 2017-01-01 00:00:00 Zch_Stampfenbachstrasse O3_nb_h1>120 y1       1     81   
#>  7 2017-01-01 00:00:00 Zch_Stampfenbachstrasse CO_nb_d1>8   y1       1      0   
#>  8 2017-01-01 00:00:00 Zch_Stampfenbachstrasse NO2_nb_d1>80 y1       1      1   
#>  9 2017-01-01 00:00:00 Zch_Stampfenbachstrasse PM10_nb_d1>… y1       1      8   
#> 10 2017-01-01 00:00:00 Zch_Stampfenbachstrasse SO2_nb_d1>1… y1       1      0   
#> # ℹ 12 more rows

# calculate climate indicators
clima_table <- tibble::tribble(
   ~parameter, ~statistic, ~from, ~to,
   "T", "mean", "input", "h1",
   "T", "max, min", "h1", "d1",
   "T_max_h1", "Sommertage, Hitzetage, Eistage", "d1", "y1",
   "T_min_h1", "Tropennächte, Frosttage", "d1", "y1",
)
clima_stats <- calculate_statstable(data, clima_table)
clima_stats$y1
#> # A tibble: 5 × 6
#>   starttime           site                    parameter    interval unit  value
#>   <dttm>              <fct>                   <fct>        <fct>    <fct> <dbl>
#> 1 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Sommertage   y1       1        66
#> 2 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Hitzetage    y1       1        21
#> 3 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Eistage      y1       1        13
#> 4 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Tropennächte y1       1        14
#> 5 2017-01-01 00:00:00 Zch_Stampfenbachstrasse Frosttage    y1       1        38