Generic padding function. Generates a sequence from lubridate::floor_date(start_date, "year") to lubridate::ceiling_date(end_date, "year"). The last point is excluded if end_date != max(data[[date_col]]). Under the hood, the heavy lifting is done by tidyr::complete().
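The mechanism described above can be sketched with lubridate and tidyr alone. This is a minimal illustration, not the implementation of pad_to_year(): the toy data, the column names and the unconditional removal of the last point are assumptions made for the example.

```r
library(tibble)
library(lubridate)
library(tidyr)

# Hypothetical toy data: two half-hourly points with a gap in between
toy <- tibble(
  date  = as.POSIXct(c("2013-01-01 00:00:00", "2013-01-01 01:00:00"), tz = "UTC"),
  value = c(1, 2)
)

# Round the observed range outwards to full years
start <- floor_date(min(toy$date), "year")    # start of 2013
end   <- ceiling_date(max(toy$date), "year")  # start of 2014

# Build the full sequence and drop the last point (the first instant of the
# next year); this sketch always drops it, which simplifies the rule above
full_seq <- seq(start, end, by = "30 min")
full_seq <- head(full_seq, -1)

# tidyr::complete() adds a row for every date missing from the data
padded <- complete(toy, date = full_seq)
```

In this sketch padded holds one row per half hour of 2013, with NA in value wherever the toy data had no observation.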

Usage

pad_to_year(
  data,
  date_col,
  interval,
  fill = list(),
  start_date = NULL,
  end_date = NULL
)

Arguments

data

input data

date_col

column containing date information; every date should be unique

interval

interval between two dates

fill

A named list that for each variable supplies a single value to use instead of NA for missing combinations.

start_date

optional start_date instead of min(data[[date_col]])

end_date

optional end_date instead of max(data[[date_col]])
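The wording of the fill argument matches the fill argument of tidyr::complete(), so its effect can be sketched with tidyr directly. The toy table and its column names below are hypothetical, and the assumption that pad_to_year() hands fill to tidyr::complete() unchanged is not confirmed by this page.

```r
library(tibble)
library(tidyr)

# Hypothetical toy data: day 2 is missing
toy <- tibble(
  day   = c(1L, 3L),
  site  = c("A", "A"),
  value = c(10, 30)
)

# Without fill, every non-key column is NA on the added row
unfilled <- complete(toy, day = 1:3)

# With fill, the named columns get the supplied value instead of NA;
# columns not named in fill (here: value) stay NA
filled <- complete(toy, day = 1:3, fill = list(site = "A"))
```

Only columns named in fill are replaced, so measurement columns keep NA on padded rows while identifier columns such as a site name stay usable for grouping.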

Value

padded data

Examples

fn <- rOstluft.data::f("Zch_Stampfenbachstrasse_min30_2013_Jan.csv")
data <- rOstluft::read_airmo_csv(fn)
data <- rOstluft::rolf_to_openair(data)

# last data point is at 2013-01-31 23:30:00
tail(data)
#> # A tibble: 6 × 16
#>   date                site          CO    Hr    NO   NO2   NOx    O3     p  PM10
#>   <dttm>              <fct>      <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2013-01-31 21:00:00 Zch_Stamp… 0.191  67.3 0.675  7.70  4.57  71.0  970.  7.21
#> 2 2013-01-31 21:30:00 Zch_Stamp… 0.195  64.9 0.359  7.72  4.33  69.7  970.  4.89
#> 3 2013-01-31 22:00:00 Zch_Stamp… 0.191  65.1 0.424  6.84  3.92  69.0  970.  6.71
#> 4 2013-01-31 22:30:00 Zch_Stamp… 0.184  67.3 0.353  5.38  3.09  70.5  970.  5.19
#> 5 2013-01-31 23:00:00 Zch_Stamp… 0.186  67.3 0.634  5.87  3.58  70.2  969.  5.79
#> 6 2013-01-31 23:30:00 Zch_Stamp… 0.189  68.7 0.435  6.76  3.88  67.6  969.  7.92
#> # ℹ 6 more variables: RainDur <dbl>, SO2 <dbl>, StrGlo <dbl>, T <dbl>,
#> #   wd <dbl>, ws <dbl>

# the site column gets filled with NA, which could lead to problems
tail(pad_to_year(data, date, "30 min"))
#> # A tibble: 6 × 16
#>   date                site     CO    Hr    NO   NO2   NOx    O3     p  PM10
#>   <dttm>              <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2013-12-31 21:00:00 NA       NA    NA    NA    NA    NA    NA    NA    NA
#> 2 2013-12-31 21:30:00 NA       NA    NA    NA    NA    NA    NA    NA    NA
#> 3 2013-12-31 22:00:00 NA       NA    NA    NA    NA    NA    NA    NA    NA
#> 4 2013-12-31 22:30:00 NA       NA    NA    NA    NA    NA    NA    NA    NA
#> 5 2013-12-31 23:00:00 NA       NA    NA    NA    NA    NA    NA    NA    NA
#> 6 2013-12-31 23:30:00 NA       NA    NA    NA    NA    NA    NA    NA    NA
#> # ℹ 6 more variables: RainDur <dbl>, SO2 <dbl>, StrGlo <dbl>, T <dbl>,
#> #   wd <dbl>, ws <dbl>

# better to provide a fill value; for more complex cases use pad_to_year_fill()
tail(pad_to_year(data, date, "30 min", fill = list(site = "Zch_Stampfenbachstrasse")))
#> # A tibble: 6 × 16
#>   date                site          CO    Hr    NO   NO2   NOx    O3     p  PM10
#>   <dttm>              <fct>      <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2013-12-31 21:00:00 Zch_Stamp…    NA    NA    NA    NA    NA    NA    NA    NA
#> 2 2013-12-31 21:30:00 Zch_Stamp…    NA    NA    NA    NA    NA    NA    NA    NA
#> 3 2013-12-31 22:00:00 Zch_Stamp…    NA    NA    NA    NA    NA    NA    NA    NA
#> 4 2013-12-31 22:30:00 Zch_Stamp…    NA    NA    NA    NA    NA    NA    NA    NA
#> 5 2013-12-31 23:00:00 Zch_Stamp…    NA    NA    NA    NA    NA    NA    NA    NA
#> 6 2013-12-31 23:30:00 Zch_Stamp…    NA    NA    NA    NA    NA    NA    NA    NA
#> # ℹ 6 more variables: RainDur <dbl>, SO2 <dbl>, StrGlo <dbl>, T <dbl>,
#> #   wd <dbl>, ws <dbl>