Econometrics and Free Software by Bruno Rodrigues.

# Mapping a list of functions to a list of datasets with a list of columns as arguments

This week I had the opportunity to teach R at my workplace, again. This course was the “advanced R” course, and unlike the one I taught at the end of last year, I had one more day (so 3 days in total) where I could show my colleagues the joys of the `tidyverse` and R.

To finish the section on programming with R, which was the very last section of the whole 3-day course, I wanted to blow their minds. I had already shown them packages from the `tidyverse` in the previous days, such as `dplyr`, `purrr` and `stringr`, among others. I taught them how to use `ggplot2`, `broom` and `modelr`. They also liked `janitor` and `rio` very much. I noticed that it took them a bit more time and effort to digest `purrr::map()` and `purrr::reduce()`, but they all seemed to see how powerful these functions were. To finish on a very high note, I showed them the ultimate `purrr::map()` use case.

Consider the following: imagine you are working on a list of datasets. These datasets might be the same, but for different years or different countries, or they might be completely different datasets. If you use `rio::import_list()` to read them into R, you will have them in a nice list. Let’s consider the following list as an example:

```r
library(tidyverse)

data(mtcars)
data(iris)

data_list = list(mtcars, iris)
```

I made the choice to have completely different datasets. Now, I would like to map some functions to the columns of these datasets. If I only worked on one, for example on `mtcars`, I would do something like:

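```r
# Summarise the columns in `cols` with every function in `funcs`.
# Both arguments are lists of quosures, spliced in with `!!!`.
my_summarise_f = function(dataset, cols, funcs){
  dataset %>%
    summarise_at(vars(!!!cols), funs(!!!funcs))
}
```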

And then I would use my function like so:

```r
mtcars %>%
  my_summarise_f(quos(mpg, drat, hp), quos(mean, sd, max))
```

```
##   mpg_mean drat_mean  hp_mean   mpg_sd   drat_sd    hp_sd mpg_max drat_max
## 1 20.09062  3.596563 146.6875 6.026948 0.5346787 68.56287    33.9     4.93
##   hp_max
## 1    335
```
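A note for readers on a recent `dplyr`: `funs()` has been deprecated since `dplyr` 0.8.0, and `across()` (available from `dplyr` 1.0.0) is now the usual way to apply several functions to several columns. Below is a minimal sketch of the same idea under that assumption. The name `my_summarise_across()` is hypothetical, and unlike the function above it takes a character vector of column names and a named list of functions instead of quosures.

```r
library(dplyr)

# Hypothetical variant of the function above, assuming dplyr >= 1.0.0.
# `cols` is a character vector of column names,
# `funcs` is a named list of functions.
my_summarise_across = function(dataset, cols, funcs){
  dataset %>%
    summarise(across(all_of(cols), funcs, .names = "{.col}_{.fn}"))
}

res = my_summarise_across(
  mtcars,
  cols = c("mpg", "drat", "hp"),
  funcs = list(mean = mean, sd = sd, max = max)
)

names(res)
```

The values are the same, but the columns come out grouped by variable (`mpg_mean`, `mpg_sd`, `mpg_max`, then `drat_*`, and so on) rather than by function.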

`my_summarise_f()` takes a dataset, a list of columns and a list of functions as arguments and uses tidy evaluation to apply `mean()`, `sd()`, and `max()` to the columns `mpg`, `drat` and `hp` of `mtcars`. That’s pretty useful, but not useful enough! Now I want to apply this to the list of datasets I defined above. For this, let’s define the list of columns I want to work on:

```r
cols_mtcars = quos(mpg, drat, hp)
cols_iris = quos(Sepal.Length, Sepal.Width)

cols_list = list(cols_mtcars, cols_iris)
```

Now, let’s use some `purrr` magic to apply the functions I want to the columns I have defined in `cols_list`:

```r
map2(data_list,
     cols_list,
     my_summarise_f, funcs = quos(mean, sd, max))
```

```
## [[1]]
##   mpg_mean drat_mean  hp_mean   mpg_sd   drat_sd    hp_sd mpg_max drat_max
## 1 20.09062  3.596563 146.6875 6.026948 0.5346787 68.56287    33.9     4.93
##   hp_max
## 1    335
##
## [[2]]
##   Sepal.Length_mean Sepal.Width_mean Sepal.Length_sd Sepal.Width_sd
## 1          5.843333         3.057333       0.8280661      0.4358663
##   Sepal.Length_max Sepal.Width_max
## 1              7.9             4.4
```
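If the `map2()` call above looks dense, it may help to see the same mechanics on a toy case. `map2()` walks over two lists in parallel and passes the paired elements as the first two arguments of the function; any extra named argument, like `funcs` above, is handed unchanged to every call. The function `scaled_sum()` below is made up purely for illustration.

```r
library(purrr)

# Toy function: x and y come from the two lists, scale is fixed
scaled_sum = function(x, y, scale){
  (x + y) * scale
}

# Pairs 1 with 10 and 2 with 20; scale = 2 is reused for each pair
res = map2(list(1, 2), list(10, 20), scaled_sum, scale = 2)

res
```

This returns a list of two elements, 22 and 44, one per pair, in the same way that the call above returns one summary per dataset.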

That’s pretty useful, but not useful enough! I also want to apply different functions to different datasets!

Well, let’s define a list of functions then:

```r
funcs_mtcars = quos(mean, sd, max)
funcs_iris = quos(median, min)

funcs_list = list(funcs_mtcars, funcs_iris)
```

Because there is no `map3()`, we need to use `pmap()`:

```r
pmap(
  list(
    dataset = data_list,
    cols = cols_list,
    funcs = funcs_list
  ),
  my_summarise_f)
```

```
## [[1]]
##   mpg_mean drat_mean  hp_mean   mpg_sd   drat_sd    hp_sd mpg_max drat_max
## 1 20.09062  3.596563 146.6875 6.026948 0.5346787 68.56287    33.9     4.93
##   hp_max
## 1    335
##
## [[2]]
##   Sepal.Length_median Sepal.Width_median Sepal.Length_min Sepal.Width_min
## 1                 5.8                  3              4.3               2
```
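One detail of the `pmap()` call above is worth spelling out: because the outer list is named, `pmap()` matches each element to the function's arguments by name, so the order of the elements does not matter. Here is a small sketch on a toy function; `describe()` and its arguments are invented for this illustration.

```r
library(purrr)

# Toy function with three arguments
describe = function(label, value, digits){
  paste0(label, ": ", round(value, digits))
}

# The names match describe()'s arguments; the order is deliberately shuffled
args = list(
  digits = c(1, 2),
  value = c(3.14159, 2.71828),
  label = c("pi", "e")
)

res = pmap_chr(args, describe)

res
```

This gives `"pi: 3.1"` and `"e: 2.72"`. `pmap_chr()` is the variant of `pmap()` that returns a character vector instead of a list.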

Now I’m satisfied! Let me tell you, this blew their minds 😄!

To be able to write code like this, I told them to always solve the problem for a single example first and, from there, try to generalize their solution using the functional programming tools found in `purrr`.
