Simple projections of the annual 2% samples of Australian Taxation Office tax returns.

project(sample_file, h = 0L, fy.year.of.sample.file = NULL,
  WEIGHT = 50L, excl_vars = NULL, forecast.dots = list(estimator =
  "mean", pred_interval = 80), wage.series = NULL, lf.series = NULL,
  use_age_pop_forecast = FALSE, .recalculate.inflators = FALSE,
  .copyDT = TRUE, check_fy_sample_file = TRUE,
  differentially_uprate_Sw = TRUE)

Arguments

sample_file

A data.table matching a 2% sample file from the ATO. See package taxstats for an example.

h

An integer. How many years should the sample file be projected?

fy.year.of.sample.file

The financial year of sample_file. If NULL, the default, the number is inferred from the number of rows of sample_file to be one of 2012-13, 2013-14, 2014-15, 2015-16, or 2016-17.

WEIGHT

The sample weight for the sample file. (So a 2% file has WEIGHT = 50.)
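The relationship between the sampling fraction and WEIGHT can be sketched in base R. This is an illustration only: the row count below is a made-up placeholder, not the size of any real sample file.

```r
# A 2% sample means each sampled return stands in for 1 / 0.02 = 50 taxpayers.
sampling_fraction <- 0.02
weight <- 1 / sampling_fraction

# Hypothetical row count, for illustration only.
n_rows <- 1000L

# Summing a constant weight over all rows estimates the taxpayer population.
estimated_population <- n_rows * weight
```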

excl_vars

A character vector of column names in sample_file that should not be inflated. Columns not present in the 2013-14 sample file are not inflated, nor are the columns Ind, Gender, age_range, Occ_code, Partner_status, Region, Lodgment_method, and PHI_Ind.

forecast.dots

A list containing parameters to be passed to generic_inflator.

wage.series

See wage_inflator. Note that Sw_amt will be uprated by differentially_uprate_wage.

lf.series

See lf_inflator_fy.

use_age_pop_forecast

Should the inflation of the number of taxpayers be moderated by the number of resident persons born in a certain year? If TRUE, younger ages will grow at a slightly higher rate beyond 2018 than older ages.

.recalculate.inflators

(logical, default: FALSE) Should generic_inflator() or CG_inflator be called to project the other variables? Setting this to TRUE adds time.

.copyDT

(logical, default: TRUE) Should a copy() of sample_file be made? If set to FALSE, sample_file itself will be updated.
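The distinction matters because data.table updates columns by reference. The sketch below shows general data.table behaviour with toy tables (the table names and values are hypothetical); it does not call project() and is not a description of its internals.

```r
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)

  # Without copy(): both names refer to the same table.
  original <- data.table(Sw_amt = c(100, 200))
  alias <- original
  alias[, Sw_amt := Sw_amt * 2]      # `original` is modified as well

  # With copy(): the source table is protected from later updates.
  protected <- data.table(Sw_amt = c(100, 200))
  duplicate <- copy(protected)
  duplicate[, Sw_amt := Sw_amt * 2]  # `protected` is left unchanged
}
```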

check_fy_sample_file

(logical, default: TRUE) Should fy.year.of.sample.file be checked against sample_file? By default, TRUE, an error is raised if the base is not 2012-13, 2013-14, 2014-15, 2015-16, or 2016-17, and a warning is raised if the number of rows in sample_file is different to the known number of rows in the sample files.

differentially_uprate_Sw

(logical, default: TRUE) Should the salary and wage column (Sw_amt) be differentially uprated using differentially_uprate_wage?

Value

A sample file with the same number of rows as sample_file but with inflated values as a forecast for the sample file in to_fy, the financial year h years after that of sample_file. If WEIGHT is not already a column of sample_file, it will be added and its sum will be the predicted number of taxpayers in to_fy.
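The target financial year follows from the base year and h. A minimal base R sketch, assuming financial years are written as "YYYY-YY"; the helper fy_after is hypothetical and is not part of the package.

```r
# Hypothetical helper: the financial year `h` years after `fy`.
fy_after <- function(fy, h) {
  # First calendar year of the target financial year.
  start_year <- as.integer(substr(fy, 1L, 4L)) + as.integer(h)
  # Last two digits of the following calendar year, zero-padded.
  sprintf("%d-%02d", start_year, (start_year + 1L) %% 100L)
}

fy_after("2013-14", 3L)  # "2016-17"
```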

Details

Currently components of taxable income are individually inflated based on their historical trends in the ATO sample files, with the exception of:

Sw_amt

inflated using differentially_uprate_wage

Alow_ben_amt, ETP_txbl_amt, Rptbl_Empr_spr_cont_amt, Non_emp_spr_amt, MCS_Emplr_Contr, MCS_Prsnl_Contr, MCS_Othr_Contr

inflated using wage_inflator

WRE_car_amt, WRE_trvl_amt, WRE_uniform_amt, WRE_self_amt, WRE_other_amt

inflated using cpi_inflator

WEIGHT

inflated by lf_inflator_fy

Net_CG_amt, Tot_CY_CG_amt

inflated by CG_inflator
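The exceptions listed above can be summarised as a lookup from column name to inflator. This is a sketch transcribed from the list, not the package's internal code; the names inflator_by_column and inflator_for, and the "historical trend" fallback label, are hypothetical.

```r
# Hypothetical lookup table: column name -> inflator named in the list above.
wage_cols <- c("Alow_ben_amt", "ETP_txbl_amt", "Rptbl_Empr_spr_cont_amt",
               "Non_emp_spr_amt", "MCS_Emplr_Contr", "MCS_Prsnl_Contr",
               "MCS_Othr_Contr")
cpi_cols <- c("WRE_car_amt", "WRE_trvl_amt", "WRE_uniform_amt",
              "WRE_self_amt", "WRE_other_amt")
cg_cols <- c("Net_CG_amt", "Tot_CY_CG_amt")

inflator_by_column <- c(
  Sw_amt = "differentially_uprate_wage",
  setNames(rep("wage_inflator", length(wage_cols)), wage_cols),
  setNames(rep("cpi_inflator", length(cpi_cols)), cpi_cols),
  WEIGHT = "lf_inflator_fy",
  setNames(rep("CG_inflator", length(cg_cols)), cg_cols)
)

# Columns not listed fall back to the label for trend-based inflation.
inflator_for <- function(column, default = "historical trend") {
  out <- unname(inflator_by_column[column])
  out[is.na(out)] <- default
  out
}

inflator_for(c("Sw_amt", "WEIGHT", "Net_CG_amt"))
```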

Superannuation balances are inflated by a fixed rate of 5% p.a.
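As a rough illustration of the fixed-rate rule, the sketch below assumes the 5% is compounded annually over h years. The compounding convention, the helper name inflate_super and the example balance are all assumptions, not details taken from the package.

```r
# Hypothetical helper: grow a balance at a fixed annual rate for `h` years,
# assuming annual compounding.
inflate_super <- function(balance, h, rate = 0.05) {
  balance * (1 + rate)^h
}

# A balance of 100,000 projected three years ahead.
inflate_super(100000, h = 3L)
```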

We recommend you use sample_file_1213 over sample_file_1314, unless you need the superannuation variables, as the latter suggests lower-than-recorded tax collections. However, more recent data is of course preferable.

Examples

# install.packages('taxstats', repos = 'https://hughparsonage.github.io/drat')
if (requireNamespace("taxstats", quietly = TRUE) &&
    requireNamespace("data.table", quietly = TRUE)) {
  library(taxstats)
  library(data.table)
  sample_file <- copy(sample_file_1314)
  sample_file_1617 <- project(sample_file,
                              h = 3L, # to "2016-17"
                              fy.year.of.sample.file = "2013-14")
}