How to use fpaR?
Time Intelligence
When you execute a time intelligence function, it will return a “ti” class object with a custom print method that explains what the function is doing and a summary of transformation steps and the calendar attributes.
Simply pass your data in either tibble or a lazy DBI object to the time intelligence function and input the required arguments.
Regardless if you pass a tibble or lazy DBI object, all time intelligence functions will return a lazy DBI object for performance reasons
We will use mtd() function to calculate the month-to-date sum of contoso’s company’ revenue (contoso::sales).
Most time intelligence functions follow the same structure:
- specify the date column to index the time intelligence functions
- specify the value column to aggregate
- if there is a period rollback / rollforward then clarify the number of periods
- clarify if we are using a “standard” calendar or non-standard variation (currently supports 5-5-4)
When you execute mtd(), your console return a ti object and will print a summary of the function’s actions, details the calendar’s attributes, describes the main transformation steps and lists out possible next actions.
contoso::sales |>
fpaR::mtd(.date=order_date,.value = revenue,calendar_type = "standard")
── Month-to-date ───────────────────────────────────────────────────────────────
Function: `mtd` was executed
── Description: ──
This creates a daily `cumsum()` of the current month revenue from the start of
the standard calendar month to the end of the month
── Calendar: ──
• The calendar aggregated order_date to the day time unit
• A standard calendar is created with 0 groups
• Calendar ranges from 2021-05-18 to 2024-04-20
• 222 days were missing and replaced with 0
• New date column date, year and month was created from order_date
── Actions: ──
✔Aggregate revenue
✖Shift
✖Compare
✖Proportion Of Total
✖Count Distinct
── Next Steps: ──
• Use `calculate()` to return the results
────────────────────────────────────────────────────────────────────────────────
To return the results to a lazy tibble of results, pass the ti object through to calculate().
contoso::sales |>
fpaR::mtd(.date=order_date,.value = margin,calendar_type = "standard") |>
fpaR::calculate()
If you using a tibble, under the hood, fpaR is converting your data to a duckdb database.
If your data is in a database, the package will leverage dbplyr to execute all the calculations.
Either case use dplyr::collect() to return your results to a local tibble.
contoso::sales |>
fpaR::mtd(.date=order_date,.value = margin,calendar_type = "standard") |>
fpaR::calculate() |>
dplyr::collect() |>
head(10)
# A tibble: 10 × 7
year month date margin missing_date_indicator mtd_margin
<dbl> <dbl> <date> <dbl> <dbl> <dbl>
1 2022 4 2022-04-01 0 1 0
2 2022 4 2022-04-02 0 1 0
3 2022 4 2022-04-03 0 1 0
4 2022 4 2022-04-04 0 1 0
5 2022 4 2022-04-05 0 1 0
6 2022 4 2022-04-06 0 1 0
7 2022 4 2022-04-07 0 1 0
8 2022 4 2022-04-08 0 1 0
9 2022 4 2022-04-09 0 1 0
10 2022 4 2022-04-10 0 1 0
# ℹ 1 more variable: days_in_current_period <dbl>
What if you need the analysis at the group level?
Simply pass through the groups that you want with dplyr::group_by() and time intelligence function will create a custom calendar for each group level.
This will calculate a complete calendar ensuring each group has a complete calendar with no missing dates.
contoso::sales |>
dplyr::group_by(customer_key,store_key) |>
fpaR::yoy(.date=order_date,.value = margin,calendar_type = "standard")
The functions will work with your database even if you don’t have write permission by creatively leveraging CTEs to create interim tables.
Why do we need this package when we have lubridate?
Lubridate is an excellent package and is heavily used by the package. The issue isn’t lubridate but rather issues you may not be aware of in your package.
Time-based comparisons, such as Year-over-Year (YoY), Quarter-over-Quarter (QoQ), and Month-to-Date (MTD), are common for tracking business performance. However, they come with challenges:
Many datasets do not have continuous dates, especially if data is recorded only on business days or for active transactions
Period imbalances between periods (Eg. the different number of days between February vs. January) can create misleading analysis or trends
Your analysis may need to reference a non-standard calendar such as a 5-5-4, 4-4-5, or 13 month calendar
Your data may be in excel sheets, csv or databases and you need to inter-operable framework to switch between all your data types
Issue 1: Continuous Dates
Referencing the Table 1 below, if we were use dplyr::lag() to compare Day-over-Day (DoD) margin, we would be missing 2024-01-02, 2024-01-04, and 2024-01-05 which would lead to incorrect answers or trends.
To correct this, fpaR will automatically complete your calendar for each group for the missing periods to ensure there are no missing periods when calculating trends.
In Table 2 we see a complete calendar set and a new column, “missing_date_indicator” to indicate how many dates were missing.
Issue 2: Period imbalances
When comparing two performance periods with a standard calendar, you often will compare a period with unequal number of days or periods. For example if you want to compare January sales to February you can get misleading conclusions due to the unequal number of weekends and days in those periods.
In practice you have two choices:
compare periods with similar days (eg. the 28th of February compares should only compare up to the 28th of January) and you omit three days of January sales all together
compare have an imbalanced comparison (eg. the 28th of February compares to the 31st of January so that no days are lost).
this package does the second option which is ensure we don’t loose any of January’s sales but to help flag for imbalance, fpaR will add a column to let you know how many periods (eg. days) are in your comparison period to increase transparency to this dynamic.
To create this example, we will use the pmtd() function to calculate the prior month to date cumulative margin in the current month.
When we pass the ti object through to calculate() and filter the results for February 2022, we would see the below Table 3.
On 2022-02-28, we see that is comparing 31 days of the previous period to the 28 days in the current period.
what is going on under the hood
In financial planning and analysis, it is common to compare year-to-date (YTD) metrics against the previous year’s YTD to understand performance trends.
The function yoytd_fn() provides this functionality. While it is an internal, non-exported function, it is called by [yoytd()] and executed through [calculate()].
This vignette explains what the function does, how it works, and how to work with its output.
Function Purpose
yoytd_fn() performs the following:
- Computes current year-to-date (YTD) values.
- Computes previous year-to-date (PYTD) values, optionally applying a lag.
- Joins YTD and PYTD tables together on
date, year, and grouping variables.
- Fills missing dates downward for consistency.
- Summarizes the data by summing the YTD and PYTD values by group.
The output is a DBI object that can be converted into a tibble using dplyr::collect() for further analysis.