Add a column of ntiles to a data table

mutate_ntile(DT, col, n, weights = NULL, by = NULL, keyby = NULL,
  new.col = NULL, character.only = FALSE, overwrite = TRUE,
  check.na = FALSE)

Arguments

DT

A data.table.

col

The column name (quoted or unquoted) for which quantiles are desired.

n

A positive integer: the number of groups into which col is split.

weights

If NULL, the default, use unweighted quantiles. Otherwise, a string designating the column that is passed to weighted_ntile.

by, keyby

Produce a grouped quantile column, as in data.table. keyby will set a key on the result (i.e. order by keyby).

new.col

If not NULL, the name of the column to be added. If NULL (the default), a name is inferred from n. For example, n = 100 gives <col>Percentile and n = 10 gives <col>Decile.

character.only

(logical, default: FALSE) If TRUE, col is evaluated and must yield a string naming the column; it is never treated as an unquoted column name.

overwrite

(logical, default: TRUE) If TRUE and new.col already exists in DT, the column will be overwritten. If FALSE, attempting to overwrite an existing column is an error.

check.na

(logical, default: FALSE) If TRUE, NAs in DT[[col]] will throw an error. If FALSE and NAs are present, the corresponding n-tile may take any value.
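
To give a feel for what supplying weights changes, here is a minimal base-R sketch of one common way to define a weighted n-tile: each observation is assigned according to the share of total weight that lies strictly before it in sorted order. The helper weighted_ntile_sketch is hypothetical and is not part of the package; weighted_ntile itself may use a different rule, particularly at group boundaries.

```r
# Hypothetical helper for illustration only; an assumption, not the
# package's weighted_ntile.
weighted_ntile_sketch <- function(x, w, n) {
  stopifnot(length(x) == length(w), all(w >= 0), sum(w) > 0)
  o <- order(x)
  # Weight accumulated strictly before each observation, in sorted order
  before <- cumsum(w[o]) - w[o]
  out <- integer(length(x))
  # Map the preceding weight share onto groups 1..n
  out[o] <- as.integer(floor(n * before / sum(w))) + 1L
  out
}

x <- c(10, 20, 30, 40)
weighted_ntile_sketch(x, w = c(1, 1, 1, 1), n = 2L)  # 1 1 2 2
weighted_ntile_sketch(x, w = c(5, 1, 1, 1), n = 2L)  # 1 2 2 2
```

With equal weights the sketch reduces to an unweighted split. Giving the smallest value most of the weight pushes every other observation into the upper half.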

Value

DT with a new integer column new.col containing the quantiles. If DT is not a data.table, its class may be preserved unless keyby is used, in which case the result is always a data.table.
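
As a rough illustration of what the integer column contains, the following base-R sketch assigns each value to one of n groups by rank, breaking ties by order of appearance. The helper ntile_sketch is hypothetical and is not the package's implementation. It was chosen because it gives the xDecile and yQuintile values shown in the Examples for DT <- data.table(x = 1:20, y = 2:1); agreement on other inputs is an assumption.

```r
# Hypothetical helper for illustration only; not the package's code.
ntile_sketch <- function(x, n) {
  # Rank with ties broken by position; NAs keep an NA rank
  r <- rank(x, ties.method = "first", na.last = "keep")
  as.integer(floor(n * (r - 1) / sum(!is.na(x)))) + 1L
}

x <- 1:20
y <- rep(2:1, times = 10)

ntile_sketch(x, 10)  # deciles of x
ntile_sketch(y, 5)   # quintiles of y; ties split by position

# Grouped analogue of by = "y": compute deciles of x within each y
ave(x, y, FUN = function(v) ntile_sketch(v, 10))
```

Because ties are broken by position, the ten rows with y equal to 1 are spread across quintiles 1 to 3 rather than sharing a single group.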

Examples

library(data.table)
DT <- data.table(x = 1:20, y = 2:1)
mutate_ntile(DT, "x", n = 10)
#>      x y xDecile
#>  1:  1 2       1
#>  2:  2 1       1
#>  3:  3 2       2
#>  4:  4 1       2
#>  5:  5 2       3
#>  6:  6 1       3
#>  7:  7 2       4
#>  8:  8 1       4
#>  9:  9 2       5
#> 10: 10 1       5
#> 11: 11 2       6
#> 12: 12 1       6
#> 13: 13 2       7
#> 14: 14 1       7
#> 15: 15 2       8
#> 16: 16 1       8
#> 17: 17 2       9
#> 18: 18 1       9
#> 19: 19 2      10
#> 20: 20 1      10
mutate_ntile(DT, "x", n = 5)
#>      x y xDecile xQuintile
#>  1:  1 2       1         1
#>  2:  2 1       1         1
#>  3:  3 2       2         1
#>  4:  4 1       2         1
#>  5:  5 2       3         2
#>  6:  6 1       3         2
#>  7:  7 2       4         2
#>  8:  8 1       4         2
#>  9:  9 2       5         3
#> 10: 10 1       5         3
#> 11: 11 2       6         3
#> 12: 12 1       6         3
#> 13: 13 2       7         4
#> 14: 14 1       7         4
#> 15: 15 2       8         4
#> 16: 16 1       8         4
#> 17: 17 2       9         5
#> 18: 18 1       9         5
#> 19: 19 2      10         5
#> 20: 20 1      10         5
mutate_ntile(DT, "x", n = 10, by = "y")
#>      x y xDecile xQuintile
#>  1:  1 2       1         1
#>  2:  2 1       1         1
#>  3:  3 2       2         1
#>  4:  4 1       2         1
#>  5:  5 2       3         2
#>  6:  6 1       3         2
#>  7:  7 2       4         2
#>  8:  8 1       4         2
#>  9:  9 2       5         3
#> 10: 10 1       5         3
#> 11: 11 2       6         3
#> 12: 12 1       6         3
#> 13: 13 2       7         4
#> 14: 14 1       7         4
#> 15: 15 2       8         4
#> 16: 16 1       8         4
#> 17: 17 2       9         5
#> 18: 18 1       9         5
#> 19: 19 2      10         5
#> 20: 20 1      10         5
mutate_ntile(DT, "x", n = 10, keyby = "y")
#>      x y xDecile xQuintile
#>  1:  2 1       1         1
#>  2:  4 1       2         1
#>  3:  6 1       3         2
#>  4:  8 1       4         2
#>  5: 10 1       5         3
#>  6: 12 1       6         3
#>  7: 14 1       7         4
#>  8: 16 1       8         4
#>  9: 18 1       9         5
#> 10: 20 1      10         5
#> 11:  1 2       1         1
#> 12:  3 2       2         1
#> 13:  5 2       3         2
#> 14:  7 2       4         2
#> 15:  9 2       5         3
#> 16: 11 2       6         3
#> 17: 13 2       7         4
#> 18: 15 2       8         4
#> 19: 17 2       9         5
#> 20: 19 2      10         5
y <- "x"
DT <- data.table(x = 1:20, y = 2:1)
mutate_ntile(DT, y, n = 5) # Use DT$y
#> Warning: Interpreting `col = y` as column `DT[['y']]`, not column `DT[['x']]`, despite an extant object y.
#>      x y yQuintile
#>  1:  1 2         3
#>  2:  2 1         1
#>  3:  3 2         3
#>  4:  4 1         1
#>  5:  5 2         4
#>  6:  6 1         1
#>  7:  7 2         4
#>  8:  8 1         1
#>  9:  9 2         4
#> 10: 10 1         2
#> 11: 11 2         4
#> 12: 12 1         2
#> 13: 13 2         5
#> 14: 14 1         2
#> 15: 15 2         5
#> 16: 16 1         2
#> 17: 17 2         5
#> 18: 18 1         3
#> 19: 19 2         5
#> 20: 20 1         3
mutate_ntile(DT, y, n = 5, character.only = TRUE) # Use DT$x
#>      x y yQuintile xQuintile
#>  1:  1 2         3         1
#>  2:  2 1         1         1
#>  3:  3 2         3         1
#>  4:  4 1         1         1
#>  5:  5 2         4         2
#>  6:  6 1         1         2
#>  7:  7 2         4         2
#>  8:  8 1         1         2
#>  9:  9 2         4         3
#> 10: 10 1         2         3
#> 11: 11 2         4         3
#> 12: 12 1         2         3
#> 13: 13 2         5         4
#> 14: 14 1         2         4
#> 15: 15 2         5         4
#> 16: 16 1         2         4
#> 17: 17 2         5         5
#> 18: 18 1         3         5
#> 19: 19 2         5         5
#> 20: 20 1         3         5