
It's lists all the way down

There’s a part 2 to this post: read it here.

Today, I had the opportunity to help someone over at the R for Data Science Slack group (read more about this group here) and I thought that the question asked could make for an interesting blog post, so here it is!

Disclaimer: the way I’m doing things here is far from optimal, but I want to illustrate how to map functions over nested lists. I show the better way at the end, so if you are already familiar with purrr, don’t get mad at me.

Suppose you have to do certain data transformation tasks on a data frame, and you write a nice function that does that for you:

library(tidyverse)
data(mtcars)

nice_function = function(df, param1, param2){
  df = df %>%
    filter(cyl == param1, am == param2) %>%
    mutate(result = mpg * param1 * (2 - param2))

  return(df)
}

nice_function(mtcars, 4, 0)
##    mpg cyl  disp hp drat    wt  qsec vs am gear carb result
## 1 24.4   4 146.7 62 3.69 3.190 20.00  1  0    4    2  195.2
## 2 22.8   4 140.8 95 3.92 3.150 22.90  1  0    4    2  182.4
## 3 21.5   4 120.1 97 3.70 2.465 20.01  1  0    3    1  172.0

This might look like a silly function rather than a nice one, and it is of no practical use, but it illustrates the point I want to make (and the question that was asked) very well, so bear with me. Now, suppose that you want to do these operations for each value of cyl and am (of course you can do that without using nice_function()…). First, you might want to fix the value of am to 0, and then loop over the values of cyl. But as I have explained in this other blog post, I prefer using the map() functions included in purrr. For example:

values_cyl = c(4, 6, 8)

(result = map(values_cyl, nice_function, df = mtcars, param2 = 0))
## [[1]]
##    mpg cyl  disp hp drat    wt  qsec vs am gear carb result
## 1 24.4   4 146.7 62 3.69 3.190 20.00  1  0    4    2  195.2
## 2 22.8   4 140.8 95 3.92 3.150 22.90  1  0    4    2  182.4
## 3 21.5   4 120.1 97 3.70 2.465 20.01  1  0    3    1  172.0
## 
## [[2]]
##    mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  256.8
## 2 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  217.2
## 3 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4  230.4
## 4 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4  213.6
## 
## [[3]]
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  299.2
## 2  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4  228.8
## 3  16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3  262.4
## 4  17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3  276.8
## 5  15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3  243.2
## 6  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4  166.4
## 7  10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4  166.4
## 8  14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4  235.2
## 9  15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2  248.0
## 10 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2  243.2
## 11 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4  212.8
## 12 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2  307.2
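For comparison, here is a sketch of the explicit loop that the map() call above replaces. The names result_loop and result_map are introduced only for this illustration; the two results should be the same list of three data frames.

```r
library(dplyr)
library(purrr)

# Same helper as defined earlier in the post
nice_function <- function(df, param1, param2) {
  df %>%
    filter(cyl == param1, am == param2) %>%
    mutate(result = mpg * param1 * (2 - param2))
}

values_cyl <- c(4, 6, 8)

# Loop version: pre-allocate a list, then fill it one element at a time
result_loop <- vector("list", length(values_cyl))
for (i in seq_along(values_cyl)) {
  result_loop[[i]] <- nice_function(mtcars, values_cyl[i], 0)
}

# map() version: each value of values_cyl goes to the first argument
# that is not already supplied by name, which here is param1
result_map <- map(values_cyl, nice_function, df = mtcars, param2 = 0)

identical(result_loop, result_map)
```

With map(), the allocation and indexing are handled for you, which is most of the appeal.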

What you get here is a list with one data frame for each value in values_cyl: one for 4, one for 6 and one for 8. Suppose now that you are feeling adventurous, and want to loop over the values of am too:

values_am = c(0, 1)

So first, we need to map a function to each element of values_am. But which function? Well, for a given value of am, our problem is the same as before; we need to map nice_function() to each value of cyl. So, that’s what we’re going to do:

(result = map(values_am, ~map(values_cyl, nice_function, df = mtcars, param2 = .)))
## [[1]]
## [[1]][[1]]
##    mpg cyl  disp hp drat    wt  qsec vs am gear carb result
## 1 24.4   4 146.7 62 3.69 3.190 20.00  1  0    4    2  195.2
## 2 22.8   4 140.8 95 3.92 3.150 22.90  1  0    4    2  182.4
## 3 21.5   4 120.1 97 3.70 2.465 20.01  1  0    3    1  172.0
## 
## [[1]][[2]]
##    mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  256.8
## 2 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  217.2
## 3 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4  230.4
## 4 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4  213.6
## 
## [[1]][[3]]
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  299.2
## 2  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4  228.8
## 3  16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3  262.4
## 4  17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3  276.8
## 5  15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3  243.2
## 6  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4  166.4
## 7  10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4  166.4
## 8  14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4  235.2
## 9  15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2  248.0
## 10 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2  243.2
## 11 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4  212.8
## 12 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2  307.2
## 
## 
## [[2]]
## [[2]][[1]]
##    mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1   91.2
## 2 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1  129.6
## 3 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2  121.6
## 4 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1  135.6
## 5 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1  109.2
## 6 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2  104.0
## 7 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2  121.6
## 8 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2   85.6
## 
## [[2]][[2]]
##    mpg cyl disp  hp drat    wt  qsec vs am gear carb result
## 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  126.0
## 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4  126.0
## 3 19.7   6  145 175 3.62 2.770 15.50  0  1    5    6  118.2
## 
## [[2]][[3]]
##    mpg cyl disp  hp drat   wt qsec vs am gear carb result
## 1 15.8   8  351 264 4.22 3.17 14.5  0  1    5    4  126.4
## 2 15.0   8  301 335 3.54 3.57 14.6  0  1    5    8  120.0

We now have a list of size 2 (one element for each value of am), where each element is itself a list of size 3 (one element for each value of cyl), and each of those elements is a data frame. Are you still with me? Also, notice that the inner map() is given as a formula (notice the ~ in front of it). This creates an anonymous function, where the parameter is given by the . (think of the . as being the x in f(x)). So the . is the stand-in for each value contained in values_am.
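To make the formula shorthand concrete, here is a sketch of the same nested call written with explicit anonymous functions. The argument names am_value and cyl_value, and the object names nested_formula and nested_explicit, are made up for the illustration.

```r
library(dplyr)
library(purrr)

# Same helper as defined earlier in the post
nice_function <- function(df, param1, param2) {
  df %>%
    filter(cyl == param1, am == param2) %>%
    mutate(result = mpg * param1 * (2 - param2))
}

values_cyl <- c(4, 6, 8)
values_am <- c(0, 1)

# Formula shorthand: the . stands for the current element of values_am
nested_formula <- map(values_am, ~map(values_cyl, nice_function, df = mtcars, param2 = .))

# Explicit version: every argument is spelled out by name
nested_explicit <- map(values_am, function(am_value) {
  map(values_cyl, function(cyl_value) {
    nice_function(df = mtcars, param1 = cyl_value, param2 = am_value)
  })
})

length(nested_explicit)   # outer list: one element per value of am
lengths(nested_explicit)  # inner lists: one data frame per value of cyl
identical(nested_formula, nested_explicit)
```

The explicit version is longer, but it makes it obvious which value ends up in which parameter.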

The people that are familiar with the map() functions must be fuming right now; there is a way to avoid this nested hell. I will talk about it soon, but first I want to play around with this list of lists.

If you have a list of data frames, you can bind their rows together with reduce(list_of_dfs, rbind). You would like to do this here, but because your lists of data frames are contained inside another list… you guessed it, you have to map over it!

(result2 = map(result, ~reduce(., rbind)))
## [[1]]
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2  195.2
## 2  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2  182.4
## 3  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1  172.0
## 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  256.8
## 5  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  217.2
## 6  19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4  230.4
## 7  17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4  213.6
## 8  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  299.2
## 9  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4  228.8
## 10 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3  262.4
## 11 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3  276.8
## 12 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3  243.2
## 13 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4  166.4
## 14 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4  166.4
## 15 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4  235.2
## 16 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2  248.0
## 17 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2  243.2
## 18 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4  212.8
## 19 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2  307.2
## 
## [[2]]
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1   91.2
## 2  32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1  129.6
## 3  30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2  121.6
## 4  33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1  135.6
## 5  27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1  109.2
## 6  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2  104.0
## 7  30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2  121.6
## 8  21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2   85.6
## 9  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4  126.0
## 10 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4  126.0
## 11 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6  118.2
## 12 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4  126.4
## 13 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8  120.0

Here again, I pass reduce() as a formula to map() to create an anonymous function. Again, the . is the stand-in for each element of result. Each element is a list of data frames, which is exactly what reduce(., rbind) expects. Now that we have this, we can use reduce() with rbind() again to get a single data frame:

(result3 = reduce(result2, rbind))
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2  195.2
## 2  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2  182.4
## 3  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1  172.0
## 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  256.8
## 5  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  217.2
## 6  19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4  230.4
## 7  17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4  213.6
## 8  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  299.2
## 9  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4  228.8
## 10 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3  262.4
## 11 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3  276.8
## 12 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3  243.2
## 13 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4  166.4
## 14 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4  166.4
## 15 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4  235.2
## 16 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2  248.0
## 17 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2  243.2
## 18 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4  212.8
## 19 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2  307.2
## 20 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1   91.2
## 21 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1  129.6
## 22 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2  121.6
## 23 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1  135.6
## 24 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1  109.2
## 25 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2  104.0
## 26 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2  121.6
## 27 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2   85.6
## 28 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4  126.0
## 29 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4  126.0
## 30 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6  118.2
## 31 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4  126.4
## 32 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8  120.0
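If reduce() still feels a bit magical, this small sketch shows what it does with rbind(): it combines the first two elements, then combines that result with the third, and so on. The toy data frames df_a, df_b and df_c are invented for the example and are not part of the mtcars analysis.

```r
library(purrr)

# Three small data frames with the same columns (toy data)
df_a <- data.frame(id = 1:2, value = c(10, 20))
df_b <- data.frame(id = 3:4, value = c(30, 40))
df_c <- data.frame(id = 5L, value = 50)

list_of_dfs <- list(df_a, df_b, df_c)

# reduce() folds the list from left to right using rbind()
reduced <- reduce(list_of_dfs, rbind)

# The same computation written out by hand
by_hand <- rbind(rbind(df_a, df_b), df_c)

identical(reduced, by_hand)
```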

Of course, since reduce(list_of_dfs, rbind) is such a common operation, you could have simply used dplyr::bind_rows(), which does the same job here:

(result2 = map(result, bind_rows))
## [[1]]
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2  195.2
## 2  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2  182.4
## 3  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1  172.0
## 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  256.8
## 5  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  217.2
## 6  19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4  230.4
## 7  17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4  213.6
## 8  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  299.2
## 9  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4  228.8
## 10 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3  262.4
## 11 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3  276.8
## 12 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3  243.2
## 13 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4  166.4
## 14 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4  166.4
## 15 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4  235.2
## 16 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2  248.0
## 17 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2  243.2
## 18 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4  212.8
## 19 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2  307.2
## 
## [[2]]
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1   91.2
## 2  32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1  129.6
## 3  30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2  121.6
## 4  33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1  135.6
## 5  27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1  109.2
## 6  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2  104.0
## 7  30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2  121.6
## 8  21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2   85.6
## 9  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4  126.0
## 10 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4  126.0
## 11 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6  118.2
## 12 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4  126.4
## 13 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8  120.0

and then:

(result3 = bind_rows(result2))
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2  195.2
## 2  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2  182.4
## 3  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1  172.0
## 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  256.8
## 5  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  217.2
## 6  19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4  230.4
## 7  17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4  213.6
## 8  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  299.2
## 9  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4  228.8
## 10 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3  262.4
## 11 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3  276.8
## 12 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3  243.2
## 13 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4  166.4
## 14 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4  166.4
## 15 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4  235.2
## 16 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2  248.0
## 17 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2  243.2
## 18 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4  212.8
## 19 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2  307.2
## 20 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1   91.2
## 21 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1  129.6
## 22 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2  121.6
## 23 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1  135.6
## 24 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1  109.2
## 25 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2  104.0
## 26 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2  121.6
## 27 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2   85.6
## 28 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4  126.0
## 29 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4  126.0
## 30 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6  118.2
## 31 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4  126.4
## 32 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8  120.0
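One caveat: the two approaches agree here because every data frame has the same columns, but they are not the same function. Below is a small sketch, using made-up toy data (df_full and df_partial are not from this post), of how they behave when the columns differ.

```r
library(dplyr)

df_full <- data.frame(x = 1:2, y = c("a", "b"))
df_partial <- data.frame(x = 3L)

# bind_rows() matches columns by name and fills the missing y with NA
combined <- bind_rows(df_full, df_partial)
combined

# rbind() stops with an error because the number of columns differs;
# try() captures the error so the script keeps running
rbind_attempt <- try(rbind(df_full, df_partial), silent = TRUE)
inherits(rbind_attempt, "try-error")
```

So bind_rows() is the more forgiving of the two, which is handy but can also hide a column that went missing.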

Of course, things can be even simpler: you can avoid this deeply nested monstrosity by using map_df() instead of map()! map_df() works just like map() but returns a data frame (hence the _df in the name) instead of a list:

(result_df = map_df(values_am, ~map_df(values_cyl, nice_function, df = mtcars, param2 = .)))
##     mpg cyl  disp  hp drat    wt  qsec vs am gear carb result
## 1  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2  195.2
## 2  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2  182.4
## 3  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1  172.0
## 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  256.8
## 5  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  217.2
## 6  19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4  230.4
## 7  17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4  213.6
## 8  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  299.2
## 9  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4  228.8
## 10 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3  262.4
## 11 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3  276.8
## 12 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3  243.2
## 13 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4  166.4
## 14 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4  166.4
## 15 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4  235.2
## 16 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2  248.0
## 17 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2  243.2
## 18 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4  212.8
## 19 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2  307.2
## 20 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1   91.2
## 21 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1  129.6
## 22 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2  121.6
## 23 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1  135.6
## 24 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1  109.2
## 25 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2  104.0
## 26 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2  121.6
## 27 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2   85.6
## 28 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4  126.0
## 29 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4  126.0
## 30 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6  118.2
## 31 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4  126.4
## 32 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8  120.0
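Another way to flatten the nesting, which is not the approach used in this post but may be useful to compare, is to build all the combinations of parameters first and then map over them once. This is only a sketch: param_grid and result_flat are made-up names, and expand.grid() comes from base R.

```r
library(dplyr)
library(purrr)

# Same helper as defined earlier in the post
nice_function <- function(df, param1, param2) {
  df %>%
    filter(cyl == param1, am == param2) %>%
    mutate(result = mpg * param1 * (2 - param2))
}

values_cyl <- c(4, 6, 8)
values_am <- c(0, 1)

# One row per (param1, param2) combination; the first column varies fastest,
# so the order is all cyl values for am = 0, then all cyl values for am = 1
param_grid <- expand.grid(param1 = values_cyl, param2 = values_am)

# pmap() walks over the rows of the grid and passes the columns
# to nice_function() as the named arguments param1 and param2
result_flat <- bind_rows(pmap(param_grid, nice_function, df = mtcars))

nrow(result_flat)
```

The trade-off is that the grid has to fit the argument names of the function, which is why its columns are called param1 and param2.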

If you look at the source code of map_df() you see that dplyr::bind_rows gets called at the end:

map_df
## function (.x, .f, ..., .id = NULL) 
## {
##     if (!is_installed("dplyr")) {
##         abort("`map_df()` requires dplyr")
##     }
##     .f <- as_mapper(.f, ...)
##     res <- map(.x, .f, ...)
##     dplyr::bind_rows(res, .id = .id)
## }
## <bytecode: 0x55dad486e6a0>
## <environment: namespace:purrr>
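In other words, map_df() is map() followed by bind_rows(). Here is a quick sketch that checks this on the example from this post, for am fixed to 0 (with_map_df and by_hand are illustrative names):

```r
library(dplyr)
library(purrr)

# Same helper as defined earlier in the post
nice_function <- function(df, param1, param2) {
  df %>%
    filter(cyl == param1, am == param2) %>%
    mutate(result = mpg * param1 * (2 - param2))
}

values_cyl <- c(4, 6, 8)

# The shortcut
with_map_df <- map_df(values_cyl, nice_function, df = mtcars, param2 = 0)

# The two steps that the shortcut performs internally
by_hand <- bind_rows(map(values_cyl, nice_function, df = mtcars, param2 = 0))

all.equal(with_map_df, by_hand)
```

Note that recent releases of purrr (1.0.0 and later, to my knowledge) mark map_df() as superseded and suggest map() followed by list_rbind() instead; map_df() still works, so the code in this post remains valid.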

So, what is the moral of the story? There are a lot of variants of the common purrr::map() functions (as well as of dplyr verbs, such as filter_at(), select_if(), etc.) and learning about them can save you a lot of pain! However, if you need to apply a function to nested lists, this is still possible; you just have to think about the structure of the nested list for a bit. There is also another function that you might want to study, modify_depth(), which solves related issues, but I will end the blog post here. I might talk about it in a future blog post.
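As a small teaser, here is a minimal sketch of modify_depth() on a toy nested list (not the data from this post). The depth argument says how many levels down the function should be applied; the names nested, doubled and doubled_by_hand are made up for the example.

```r
library(purrr)

# A list of two lists, each holding three numbers
nested <- list(list(1, 2, 3), list(4, 5, 6))

# Apply the function two levels down, that is, to each number
doubled <- modify_depth(nested, 2, function(x) x * 2)

# The same thing written as nested map() calls
doubled_by_hand <- map(nested, function(inner) map(inner, function(x) x * 2))

identical(doubled, doubled_by_hand)
```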

Also, if you want to learn more about R and the tidyverse, do read the link I posted in the introduction of the post and join the R4DS Slack group! There are a lot of very nice people there who want to help you get better with your R-fu. This is also where I got the inspiration to write this blog post, and I am thankful to the people there for the discussions; I feel comfortable with R, but I still learn new tips and tricks every day!

If you enjoy these blog posts, you can follow me on twitter. And happy new yeaR!