Hello,
I have been trying to find good documentation on how to avoid potential scoping issues.
For instance, consider
library("data.table")
dt = data.table(a = 1:3, b = 4:6)
dt[a %in% 1:3]
So far so good: a is interpreted as a column of dt. But what if we also set a = 4:6 in the calling environment?
Documentation often talks about the . and .. prefixes, but those are specific to the j part of data.table, not the i part, so they obviously wouldn't have the desired effect here.
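To make the clash concrete, here is a minimal sketch. It assumes the outer variable is defined as a = 4:6, as in the question above; the name clash is only for illustration.

```r
library(data.table)

dt <- data.table(a = 1:3, b = 4:6)
a <- 4:6  # outer variable that shares its name with column a

# Inside i, both occurrences of `a` resolve to the column of dt,
# so the outer variable is never consulted and every row matches.
clash <- dt[a %in% a]
nrow(clash)
```

The intent was to compare column a against the outer 4:6, which would match no rows, but the filter silently keeps all of them.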
But the Introduction to data.table doesn't really offer a solution.
https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html
Most Stack Overflow and AI answers are incorrect; they often suggest get(), but that doesn't work either.
The solution is introduced in https://cran.r-project.org/web/packages/data.table/vignettes/datatable-programming.html: use env = list(...), such as:
dt[col %in% value, env = list(col = "a", value = a)]
Yet in some cases the value must also be wrapped in I(): if it is a character string, it would otherwise be interpreted as a column name.
dt[] = lapply(dt, as.character)
a = "1"
dt[col %in% value, env = list(col = "a", value = a)] # fails because "1" is not a column
dt[col %in% value, env = list(col = "a", value = I(a))] # works
dt[col %in% value, env = list(col = as.name("a"), value = I(a))] # safest?
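Putting the pieces together, the following is a self-contained sketch of both cases. It assumes a data.table version whose [ method accepts the env argument described in the "Programming on data.table" vignette; the names res_num, res_chr, dt_chr and val are made up for this example.

```r
library(data.table)  # assumption: a version that supports `env` in `[`

dt <- data.table(a = 1:3, b = 4:6)
a <- 4:6  # outer variable, same name as column a

# col = "a" is turned into the column symbol; the integer vector held
# by the outer `a` is inlined as a plain value.
res_num <- dt[col %in% value, env = list(col = "a", value = a)]
nrow(res_num)  # column a (1:3) has nothing in common with 4:6

# Character case: without I(), the string "1" would itself be turned
# into a symbol and looked up as a column.
dt_chr <- data.table(a = c("1", "2", "3"), b = c("4", "5", "6"))
val <- "1"
res_chr <- dt_chr[col %in% value, env = list(col = "a", value = I(val))]
nrow(res_chr)
```

The design point is that env makes the roles explicit: a bare string names a column, while anything wrapped in I() (or any non-character value) is treated as data from the calling environment.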
All of this is IMO required when the user wants to be specific: this variable names a column in a data.table, while this_one names a variable coming from the outer environment. This way wires won't be crossed.
This should IMO be in the "Introduction to data.table" as a simple case of the much more complex "Programming on data.table".
Apparently, env is a new interface and get(), mget() etc. were at one point interfaces of data.table but were discontinued (likely because they were buggy).
https://stackoverflow.com/a/54800108
tl;dr: Add a note in "Introduction to data.table" about the env interface and how to pass column names/values as variables.