Showing posts with label scoping. Show all posts
Showing posts with label scoping. Show all posts

Friday, September 21, 2018

curve3d can't find local function “fn”

Leave a Comment

I'm trying to use the curve3d function in the emdbook-package to create a contour plot of a function defined locally inside another function as shown in the following minimal example:

library(emdbook) testcurve3d <- function(a) {   fn <- function(x,y) {     x*y*a   }   curve3d(fn(x,y)) } 

Unexpectedly, this generates the error

> testcurve3d(2)  Error in fn(x, y) : could not find function "fn"  

whereas the same idea works fine with the more basic curve function of the base-package:

testcurve <- function(a) {   fn <- function(x) {     x*a   }   curve(a*x) } testcurve(2) 

The question is how curve3d can be rewritten such that it behaves as expected.

3 Answers

Answers 1

You can temporarily attach the function environment to the search path to get it to work:

testcurve3d <- function(a) {   fn <- function(x,y) {     x*y*a   }   e <- environment()   attach(e)   curve3d(fn(x,y))   detach(e) } 

Analysis

The problem comes from this line in curve3d:

eval(expr, envir = env, enclos = parent.frame(2)) 

At this point, we appear to be 10 frames deep, and fn is defined in parent.frame(8). So you can edit the line in curve3d to use that, but I'm not sure how robust this is. Perhaps parent.frame(sys.nframe()-2) might be more robust, but as ?sys.parent warns there can be some strange things going on:

Strictly, sys.parent and parent.frame refer to the context of the parent interpreted function. So internal functions (which may or may not set contexts and so may or may not appear on the call stack) may not be counted, and S3 methods can also do surprising things.

Beware of the effect of lazy evaluation: these two functions look at the call stack at the time they are evaluated, not at the time they are called. Passing calls to them as function arguments is unlikely to be a good idea.

Answers 2

The eval - parse solution bypasses some worries about variable scope. This passes the value of both the variable and function directly as opposed to passing the variable or function names.

library(emdbook)  testcurve3d <- function(a) {   fn <- eval(parse(text = paste0(     "function(x, y) {",     "x*y*", a,     "}"   )))    eval(parse(text = paste0(     "curve3d(", deparse(fn)[3], ")"     ))) }  testcurve3d(2) 

result

Answers 3

I have found other solution that I do not like very much, but maybe it will help you.

You can create the function fn how a call object and eval this in curve3d:

fn <- quote((function(x, y) {x*y*a})(x, y)) eval(call("curve3d", fn)) 

Inside of the other function, the continuous problem exists, a must be in the global environment, but it is can fix with substitute.

Example:

testcurve3d <- function(a) {   fn <- substitute((function(x, y) {                       c <- cos(a*pi*x)                       s <- sin(a*pi*y/3)                       return(c + s)                       })(x, y), list(a = a))   eval(call("curve3d", fn, zlab = "fn")) }  par(mfrow = c(1, 2)) testcurve3d(2) testcurve3d(5) 

enter image description here

Read More

Thursday, April 7, 2016

Using pre-defined variable names within dplyr utility functions

Leave a Comment

Goal

My goal is to define some functions for use within dplyr verbs, that use pre-defined variables. This is because I have some of these functions that take a bunch of arguments, of which many always are the same variable names.

My understanding: This is difficult (and perhaps impossible) because dplyr will lazily evaluate user-specified variables later on, but any default arguments are not in the function call and therefore invisible to dplyr.

Toy example

Consider the following example, where I use dplyr to calculate whether a variable has changed or not (rather meaningless in this case):

library(dplyr) mtcars  %>%   mutate(cyl_change = cyl != lag(cyl)) 

Now, lag also supports alternate ordering like so:

mtcars  %>%   mutate(cyl_change = cyl != lag(cyl, order_by = gear)) 

But what if I'd like to create my own version of lag that always orders by gear?

Failed attempts

The naive approach is this:

lag2 <- function(x, n = 1L, order_by = gear) lag(x, n = n, order_by = order_by)  mtcars %>%   mutate(cyl_change = cyl != lag2(cyl)) 

But this obviously raises the error:

no object named ‘gear’ was found

More realistic options would be these, but they also don't work:

lag2 <- function(x, n = 1L) lag(x, n = n, order_by = ~gear) lag2 <- function(x, n = 1L) lag(x, n = n, order_by = get(gear)) lag2 <- function(x, n = 1L) lag(x, n = n, order_by = getAnywhere(gear)) lag2 <- function(x, n = 1L) lag(x, n = n, order_by = lazyeval::lazy(gear)) 

Question

Is there a way to get lag2 to correctly find gear within the data.frame that dplyr is operating on?

  • One should be able to call lag2 without having to provide gear.
  • One should be able to use lag2 on datasets that are not called mtcars (but do have gear as one it's variables).
  • Preferably gear would be a default argument to the function, so it can still be changed if required, but this is not crucial.

3 Answers

Answers 1

This isn't elegant, as it requires an extra argument. But, by passing the entire data frame we get nearly the required behavior

lag2 <- function(x, df, n = 1L, order_by="gear", ...) {   lag(x, n = n, order_by = df[[order_by]], ...) }  hack <- mtcars  %>%  mutate(cyl_change = cyl != lag2(cyl, .)) ans <- mtcars  %>%  mutate(cyl_change = cyl != lag(cyl, order_by = gear)) all.equal(hack, ans) # [1] TRUE 
  1. One should be able to call lag2 without having to provide gear.

yes, but need to pass .

  1. One should be able to use lag2 on datasets that are not called mtcars (but do have gear as one it's variables).

ok, as far as point 1 goes

  1. Preferably gear would be a default argument to the function, so it can still be changed if required, but this is not crucial.

yes, but need to provide a character, see below. I'm guessing you could get around this with some lazyeval

hack <- df %>%  mutate(cyl_change = cyl != lag2(cyl, ., order_by = 'cyl')) ans_nondefault <- df %>%  mutate(cyl_change = cyl != lag(cyl, order_by = cyl)) all.equal(hack, ans_nondefault) # [1] TRUE 

Addendum

It seems hard to avoid using SE mutate_ as in the answer posed by the OP, to do some simple hackery like in my answer here, or to do something more advanced involving reverse-engineering lazyeval::lazy_dots.

Evidence:

1) dplyr::lag itself doesn't use any NSE wizardry

2) mutate simply calls mutate_(.data, .dots = lazyeval::lazy_dots(...))

Answers 2

Here are two approaches in data.table, however I don't believe that either of them will work in dplyr at the present.

In data.table, whatever is inside the j-expression (aka the 2nd argument of [.data.table) gets parsed by the data.table package first, and not by regular R parser. In a way you can think of it as a separate language parser living inside the regular language parser that is R. What this parser does, is it looks for what variables you have used that are actually columns of the data.table you're operating on, and whatever it finds it puts it in the environment of the j-expression.

What this means, is that you have to let this parser know somehow that gear will be used, or it simply will not be part of the environment. Following are two ideas for accomplishing that.

The "simple" way to do it, is to actually use the column name in the j-expression where you call lag2 (in addition to some monkeying within lag2):

dt = as.data.table(mtcars)  lag2 = function(x) lag(x, order_by = get('gear', sys.frame(4)))  dt[, newvar := {gear; lag2(cyl)}] # or dt[, newvar := {.SD; lag2(cyl)}] 

This solution has 2 undesirable properties imo - first, I'm not sure how fragile that sys.frame(4) is - you put this thing in a function or a package and I don't know what will happen. You can probably work around it and figure out the right frame, but it's kind of a pain. Second - you either have to mention the particular variable you're interested in, anywhere in the expression, or dump all of them in the environment by using .SD, again anywhere.

A second option that I like more, is to take advantage of the fact that the data.table parser evaluates eval expressions in place before the variable lookup, so if you use a variable inside some expression that you eval, that would work:

lag3 = quote(function(x) lag(x, order_by = gear))  dt[, newvar := eval(lag3)(cyl)] 

This doesn't suffer from the issues of the other solution, with the obvious disadvantage of having to type an extra eval.

Answers 3

This solution is coming close:

Consider a slightly easier toy example:

mtcars %>%   mutate(carb2 = lag(carb, order_by = gear)) 

We still use lag and it's order_by argument, but don't do any further computation with it. Instead of sticking to the SE mutate, we switch to NSE mutate_ and make lag2 build a function call as a character vector.

lag2 <- function(x, n = 1, order_by = gear) {   x <- deparse(substitute(x))   order_by <- deparse(substitute(order_by))   paste0('dplyr::lag(x = ', x, ', n = ', n, ', order_by = ', order_by, ')') }  mtcars %>%   mutate_(carb2 = lag2(carb)) 

This gives us an identical result to the above.

The orginial toy example can be achieved with:

mtcars %>%   mutate_(cyl_change = paste('cyl !=', lag2(cyl))) 

Downsides:

  1. We have to use the SE mutate_.
  2. For extended usage as in the original example we need to also use paste.
  3. This is not particularly safe, i.e. it is not immediately clear where gear should come from. Assigning values to gear or carb in the global environment seems to be ok, but my guess is that unexpected bugs could occur in some cases. Using a formula instead of a character vector would be safer, but this requires the correct environment to be assigned for it to work, and that is still a big question mark for me.
Read More