vignettes/datatable-implementation.Rmd
datatable-implementation.Rmd
Generally, income taxes are parmeterized by tax tables. In Australia, for the 2017-18 financial year, the table looks like
dollar <- function(x) paste0("$", prettyNum(x, big.mark = ","))
grattan:::tax_table2[fy_year == "2017-18",
.(`Lower bracket` = dollar(lower_bracket),
`Tax payable` = dollar(tax_at),
`Marginal rate` = marginal_rate)] %>%
kable(align = "r")
Lower bracket | Tax payable | Marginal rate |
---|---|---|
$0 | $0 | 0.000 |
$18,200 | $0 | 0.190 |
$37,000 | $3,572 | 0.325 |
$87,000 | $19,822 | 0.370 |
$180,000 | $54,232 | 0.450 |
To calculate the income tax for a given income find the next lowest value in the Lower bracket
column, then add the tax payable and difference between the lower bracket value and the income multiplied by the marginal rate. Originally, we implemented it like this:
income <- 50e3
fy_year <- "2017-18"
ifelse(fy_year == '2017-18',
ifelse(income < 18200,
0,
ifelse(income < 37e3,
0 + 0.19 * (income - 18200),
ifelse(income < 87e3,
3572 + 0.325 * (income - 37000),
ifelse(income < 180e3,
19822 + 0.37 * (income - 87e3),
54232 + 0.45 * (income - 180e3))))),
stop("Not yet implemented."))
## [1] 7797
There were some problems, however. One is that it’s difficult to update or modify. But perhaps more importantly is that it’s not particularly literate: the code merely ‘happens’ to give the right answer, rather than expressing the method too.
A more natural and expressive approach is to treat the income tax as a rolling join.
input <- data.table(income = income)
tax_table2 <- copy(grattan:::tax_table2)
# Record the order if needed
input[, the_order := .I]
input[, fy_year := "2017-18"]
setkey(input, fy_year, income)
tax_table2[input, roll = TRUE] %>%
.[, tax := tax_at + (income - lower_bracket) * marginal_rate] %>%
.[order(the_order)]
## fy_year income lower_bracket marginal_rate tax_at the_order tax
## 1: 2017-18 50000 37000 0.325 3572 1 7797
With this approach, we can just append new tax years or proposed tax rates to tax_table2
, rather than pipetting magic numbers into a nest of ifelse
s. The code and the data are separate and both are easier to manage.