The never-ending editor war (?)

The creation of this blog post was prompted by this tweet, asking an age-old question:

This is actually a very important question, one that I have been asking myself for a long time. IDEs and plain text editors are very important tools for anyone writing code. Most working hours are spent within such a program, which means that one has to be careful about choosing the right one, and once a choice is made, one has, in my humble opinion, to learn as many features of this program as possible to become as efficient as possible.

As you can notice from the tweet above, I suggested the use of Spacemacs… and my tweet did not get any likes or retweets (as of the 19th of May, sympathetic readers of this blog have liked the tweet). It is to set this great injustice straight that I decided to write this blog post.

Spacemacs is a strange beast; if vi and Emacs had a baby, it would certainly look like Spacemacs. So first of all, to understand what Spacemacs is, one has to know a bit about vi and Emacs.

vi is a text editor with 43 years of history now. You might have heard of Vim (Vi IMproved), which is a modern clone of vi, from 1991. More recently, another clone has been getting popular: Neovim, started in 2014. Whichever version of vi you use, however, its basic way of functioning remains the same. vi is a modal editor, meaning that the user has to switch between different modes to work on a text file. When vi is first started, the program will be in Normal mode. In this mode, trying to type a word will likely result in nothing, or in behaviour that is unexpected if you’re not familiar with vi. For instance, in Normal mode, typing j will not show the character j on your screen. Instead, this will move the cursor down one line. Typing p will paste, u will undo the last action, y will yank (copy), etc.

To type text, one first has to enter Insert mode, by typing i while in Normal mode. Only then is it possible to write text. To go back to Normal mode, press ESC. Other modes are Visual mode (from Normal mode press v), which allows the user to select text, and Command-line mode, which can be entered by keying : from Normal mode and lets you enter commands.

Now you might be wondering why anyone would use such a convoluted way to type text. Well, this is because one can chain these commands quite easily to perform repetitive tasks very quickly. For instance, to delete a word, one types daw (in Normal mode), for delete a word. To delete the next 3 words, you can type 3daw. To edit the text between, for instance, (), you would type ci( (while in Normal mode and anywhere between the parentheses containing the text to edit), for change in (. The same logic applies to ci[, for instance. Can you guess what ciw does? If you are in Normal mode, and you want to change the word the cursor is on, this command will erase the word and put you in Insert mode so that you can write the new word.

These are just the basic reasons why vi (or any of its clones) is awesome. It is also possible to automate very long and complex tasks using macros. One starts a macro by typing q and then any letter of the alphabet to name it, for instance a. The user then performs the actions needed, types q again to stop the recording of the macro, and can then execute the macro with @a. If the user needs to execute the macro, say, 10 times, 10@a does the trick. It is possible to extend vi’s functionalities by using plugins, but more on that down below.

vi keybindings have inspired a lot of other programs. For instance, you can get extensions for popular web browsers that mimic vi keybindings, such as Tridactyl for Firefox, or Vimium for Chromium (or Google Chrome). There are even browsers that are built from scratch with support for vi keybindings, such as my personal favorite, qutebrowser. You can even go further and use a tiling window manager on GNU/Linux, for instance i3, which I use, or xmonad. You might need to configure those to behave more like vi, but it is possible. This means that by learning one set of keyboard shortcuts (and the logic behind chaining the keystrokes to achieve what you want), you can master several different programs. This blog post only deals with the editor part, but as you can see, if you go far enough down the rabbit hole, a new exciting world opens up.

I will show some common vi operations below, but before that let’s discuss Emacs.

I am not really familiar with Emacs; I know that Emacs users swear by it (just like vi users swear by vi), and that Emacs is not a modal editor. However, it contains a lot of functions that you can use by pressing ESC, CTRL, ALT or META followed by regular keys (on a regular PC keyboard, META is usually mapped to the ALT key). So the approach is different, but it is widely accepted that the productivity of proficient Emacs users is very high too. The original Emacs dates back to 1976, and its most popular implementation, GNU Emacs, was first released in 1985. Emacs also features modes, but not in the same sense as vi. There are major and minor modes. For instance, if you’re editing a Python script, Emacs will be in Python mode, or if you’re editing a Markdown file, Emacs will be in Markdown mode. This changes the functions available to the user, and provides other niceties, such as auto-completion. Emacs is also easily extensible, which is another reason why it is so popular. Users can install packages for Emacs, just like R users would do for R, to extend Emacs’ capabilities. For instance, a very important package if you plan to use Emacs for statistics or data science is ESS, Emacs Speaks Statistics. There are other very high-quality packages for Emacs, and it seems to me (but don’t quote me on that) that Emacs’ packages are more mature and feature-rich than vi’s plugins. However, vi keybindings are really awesome. This is, I believe, what Sylvain Benner was thinking when he developed Spacemacs.

Spacemacs’ motto is “The best editor is neither Emacs nor Vim, it’s Emacs and Vim!”. Spacemacs is a version, or distribution, of Emacs that has a very specific way of doing things. However, since it’s built on top of Emacs, all of Emacs’ packages are available to the user, notably Evil, which is a package that makes Emacs mimic vi’s modal editing and keybindings (the name of this package tells you everything you need to know about what Emacs users think of vi users 😀).

Not only does Spacemacs support Emacs packages, but Spacemacs also features so-called layers, which are configuration files that integrate one or several packages seamlessly into Spacemacs’ particular workflow. This particular workflow is what gave Spacemacs its name. Instead of relying on ESC, CTRL, ALT or META like Emacs, users can launch functions by typing Space in Normal mode and then a sequence of letters. For instance, Space q r restarts Spacemacs. And what’s more, you don’t actually need to learn these new key sequences. When you type Space, the minibuffer, a little popup window at the bottom of Spacemacs, appears and shows you all the options that you can type. For instance, typing b after Space opens up the buffer menu. Buffers are what could be called tabs in RStudio. Here you can choose to delete a buffer with d, create a new buffer with N, and many more options.
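To give an idea of how such Space sequences are defined, here is a minimal sketch of adding a personal entry to that menu. It relies on `spacemacs/declare-prefix` and `spacemacs/set-leader-keys`, two helpers provided by Spacemacs; the choice of `ibuffer` as the command and the label "personal" are arbitrary assumptions made for illustration.

```emacs-lisp
;; Goes inside dotspacemacs/user-config in ~/.spacemacs.
;; "o" is the prefix that Spacemacs leaves free for user-defined bindings;
;; the label "personal" is an arbitrary choice for this illustration.
(spacemacs/declare-prefix "o" "personal")

;; Space o b now opens ibuffer, a buffer list that ships with Emacs.
(spacemacs/set-leader-keys "ob" 'ibuffer)
```

After restarting Spacemacs, typing Space should list o with the chosen label alongside the built-in entries.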

Enough text, let’s get into the videos. But keep in mind the following: the videos below show the keystrokes I am typing to perform the actions. However, because I use the BÉPO keyboard layout, which is the French equivalent of the DVORAK layout, the keystrokes will be different from those in a regular vi guide, since such guides are mainly written for the QWERTY layout. Also, to use Spacemacs for R, you need to enable the ESS layer, which I show how to do at the end. Enabling this layer will turn on auto-completion, as well as provide documentation in real time for your function in the minibuffer:

The first video shows Spacemacs divided into two windows. On the left, I am navigating around code using the T (move down) and S (move up) keys. To execute a region that I select, I type Space m r r (this stands for Major mode Run Region). Then around second 5, I key O, which switches to Insert mode one line below the current line, type head(mtcars) and then press ESC to switch back to Normal mode and run the line with Space m r l (Major mode Run Line).

In this video, I show you how to switch between windows. Type Space followed by a window number N to switch to window N. At the end, I key dd, which deletes a whole line.

In the video below, I show how to insert the pipe operator with Space m m. This is a keyboard shortcut that I have defined myself. You can also spot the auto-completion at work in this video. To run the code, I first select it with V, which selects the whole line the cursor is currently on and enters Visual mode. I then select the lines below with T and run the region with Space m r r.
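A shortcut like this takes only a few lines of Emacs Lisp. The following is a sketch of one possible way to do it, not necessarily the cleanest one: it wraps the insertion in a named command and binds it under Space o, the prefix Spacemacs reserves for user bindings. The function name `my/insert-pipe` and the key sequence `op` are hypothetical choices made for this example.

```emacs-lisp
;; Goes inside dotspacemacs/user-config in ~/.spacemacs.
;; `my/insert-pipe' is a hypothetical name chosen for this illustration.
(defun my/insert-pipe ()
  "Insert the magrittr pipe at point and switch to Insert state."
  (interactive)
  (insert " %>% ")
  (evil-insert-state))

;; Space o p now inserts the pipe from Normal state.
(spacemacs/set-leader-keys "op" 'my/insert-pipe)
```

Giving the command a name means it also shows up under that name in the Space menu, rather than as an anonymous lambda.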

Here I show how plotting behaves. When a plot is created, a new window is opened with the plot. This is a major shortcoming of using Spacemacs for R programming; there is not a dedicated buffer for plots, and it only shows the very last one created, so there is no way to keep all the plots created in the current session in a neat, dedicated buffer. It seems to be possible using Org-mode, which is an Emacs mode for writing notes, todos, and authoring documents. But I haven’t explored this option yet, mainly because in my case, only looking at one plot at a time is ok.

Here I show how to quickly add text to the top of the document when the cursor is at the bottom: I try to use the tabyl() function found in the {janitor} package, which I forgot to load. I quickly go all the way up with gg, then key yy to copy the first line, then P to paste it on the line below (p would paste it on the same line), type fv to find the letter v from the word “tidyverse”, then type liw (which is the BÉPO equivalent of ciw, for Change In Word) and finally change “tidyverse” to “janitor”. This seems overly complex, but once you get used to this way of working, you will wonder why you hadn’t tried vi sooner.

Here I show how to comment out a block of lines. 8gg jumps to the 8th line, and CTRL v starts block visual mode, which allows me to select a block of text. I select the first column of the text, press G to jump all the way down, then A to enter Insert mode at the end of the selection (actually, it would have been more logical to use I, which enters Insert mode at the beginning of the selection), and then add “#” to comment the lines out.

Here I show how to delete a block of text:

Search and replace, by entering command-line mode (look at the very bottom of the window):

I forgot to add “,” characters on a bunch of lines. I add the first “,” to the first line, go down and press ESC to exit Insert mode. Now in Normal mode, I type . to execute the last command, which is inserting a “,” character and going down a line. This dot command is a feature of vi, and it will always redo the last performed change.

But instead of typing . six times, just type 6. and be done with it:

What if you want to do something more complex, involving several commands? Here the dot command won’t be enough, since it only repeats the last change, nothing more. For this you can record a macro with q and replay it with @. While recording, I look for the “,” character, twice, and put the rest of the characters on the next line with Enter. I then repeat this operation by executing the macro using @a repeatedly (@a because I saved the actions in a, but it could have been any other letter). I then undo my changes and execute the macro 5 times with 5@a.

Here I show the undo tree (by typing Space u a), which is a feature Spacemacs inherited from Emacs: it makes undoing changes and going back to a previous version of your script very easy:

Finally, I show my Spacemacs configuration file. I show where one needs to specify the layers one wishes to use. For R, the ESS layer (which is a configuration file for the ESS Emacs package) is mandatory. As I explained above, it is also possible to use Emacs packages for which no layer is available. These are the packages under dotspacemacs-additional-packages. In my case I use:

dotspacemacs-additional-packages '(polymode
                                  poly-R
                                  poly-noweb
                                  poly-markdown)

which makes working with RMarkdown possible. polymode enables simultaneous Major modes, which is needed for RMarkdown (because RMarkdown files mix Markdown and R).
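For readers who have never opened a .spacemacs file, here is a minimal sketch of where these two settings live. Both variables are set inside the `dotspacemacs/layers` function. The `auto-completion` layer is an assumption added for illustration; this post only states that the ESS layer is required, and a real configuration will list many more layers and options.

```emacs-lisp
;; Excerpt of ~/.spacemacs, reduced to the two settings discussed here.
(defun dotspacemacs/layers ()
  "Declare the layers and extra packages Spacemacs should load."
  (setq-default
   ;; Layers: `ess' is needed for R; `auto-completion' is an
   ;; illustrative assumption, not something this post prescribes.
   dotspacemacs-configuration-layers
   '(auto-completion
     ess)
   ;; Plain Emacs packages for which no layer is used.
   dotspacemacs-additional-packages
   '(polymode
     poly-R
     poly-noweb
     poly-markdown)))
```

Spacemacs installs any missing packages the next time it starts, or when the configuration is reloaded.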

That’s the end of this long post. Spacemacs is really a joy to use, but the learning curve is quite steep. However, it is definitely worth it. There are so many packages available for Emacs (and hence Spacemacs) that allow you to browse the web, play games, listen to music, send and read emails… that a recurrent joke is that Emacs is a very nice operating system, but it lacks a decent editor. If that’s the case, Spacemacs is the perfect operating system, because it includes the greatest editor, vi.

If you’re interested and want to learn more about vi, I advise you to read the following books: Vim Recipes (pdf warning, free), Practical Vim: Edit Text at the Speed of Thought (not free, but worth every cent), and Use Vim Like a Pro, which I have not read, but which looks quite good, and is free too if you want. Now, these only cover the vi part, not the Emacs aspects of Spacemacs, but you don’t really need to know about Emacs to use Spacemacs. I had 0 experience with Emacs, and still have 0 experience with it. I only learned how to configure Spacemacs, which does not require any previous experience. To find the packages you need, as usual, use any search engine of your liking.

The last point I want to address is the built-in Vim mode of RStudio. While it works, it does not behave 100% like regular Vim, and worst of all, it does not support, as far as I know, any keyboard layout other than QWERTY, which is a no-go for me.

In any case, if you’re looking to learn something new that you can use in many programs, including RStudio, learn Vim, and then give Spacemacs a try. Chaining keystrokes to edit text gets addictive very quickly.

Hope you enjoyed! If you found this blog post useful, you might want to follow me on Twitter for blog post updates, and buy me an espresso or support me through paypal.me.

For reference, here is my dotspacemacs/user-config, which is where I defined the shortcut for the %>% operator.

(defun dotspacemacs/user-config ()
  "Configuration for user code:
This function is called at the very end of Spacemacs startup, after layer
configuration.
Put your configuration code here, except for variables that should be set
before packages are loaded."
  ;; R modes: the trailing \\' anchors each pattern to the end of the file name
  (add-to-list 'auto-mode-alist '("\\.md\\'" . poly-markdown-mode))
  (add-to-list 'auto-mode-alist '("\\.Snw\\'" . poly-noweb+r-mode))
  (add-to-list 'auto-mode-alist '("\\.Rnw\\'" . poly-noweb+r-mode))
  (add-to-list 'auto-mode-alist '("\\.Rmd\\'" . poly-markdown+r-mode))

  ;; (require 'poly-R)
  ;; (require 'poly-markdown)
  ;; (add-to-list 'auto-mode-alist '("\\.Rmd" . poly-markdown+r-mode))

  (global-company-mode t)
  (global-hl-line-mode 1) ; Enable/Disable current line highlight
  (setq-default fill-column 99)
  (setq-default auto-fill-mode t)
  ;; ESS shortcuts
  (spacemacs/set-leader-keys "mdt" 'ess-r-devtools-test-package)
  (spacemacs/set-leader-keys "mrl" 'ess-eval-line)
  (spacemacs/set-leader-keys "mrr" 'ess-eval-region)
  (spacemacs/set-leader-keys "mdb" 'ess-r-devtools-build-package)
  (spacemacs/set-leader-keys "mdd" 'ess-r-devtools-document-package)
  (spacemacs/set-leader-keys "mdl" 'ess-r-devtools-load-package)
  (spacemacs/set-leader-keys "mdc" 'ess-r-devtools-check-package)
  (spacemacs/set-leader-keys "mdp" 'ess-r-package-mode)
  (add-hook 'ess-mode-hook
            (lambda ()
              (ess-toggle-underscore nil)))
  ;; Space m m inserts the pipe operator and switches to Insert state
  (define-key evil-normal-state-map (kbd "SPC mm")
    (lambda ()
      (interactive)
      (insert " %>% ")
      (evil-insert-state)))
  ;; Move lines around
  (spacemacs/set-leader-keys "MS" 'move-text-line-up)
  (spacemacs/set-leader-keys "MT" 'move-text-line-down)
  (setq-default whitespace-mode t)
  (setq-default whitespace-style (quote (spaces tabs newline space-mark tab-mark newline-mark)))
  (setq-default whitespace-display-mappings
        ;; all numbers are Unicode codepoint in decimal. try (insert-char 182 ) to see it
        '(
          (space-mark 32 [183] [46]) ; 32 SPACE, 183 MIDDLE DOT 「·」, 46 FULL STOP 「.」
          (newline-mark 10 [9226 10]) ; 10 LINE FEED
          (tab-mark 9 [9655 9] [92 9]) ; 9 TAB, 9655 WHITE RIGHT-POINTING TRIANGLE 「▷」
          ))
  (setq-default TeX-view-program-selection
         '((output-pdf "PDF Viewer")))
  (setq-default TeX-view-program-list
        '(("PDF Viewer" "okular %o")))
  (setq-default indent-tabs-mode nil)
  (setq-default tab-width 2)
  ;; (setq org-default-notes-file (concat org-directory "/agenda/notes.org"))
  (add-hook 'prog-mode-hook 'spacemacs/toggle-fill-column-indicator-on)
  (add-hook 'text-mode-hook 'spacemacs/toggle-fill-column-indicator-on)
  (add-hook 'markdown-mode-hook 'spacemacs/toggle-fill-column-indicator-on)
  )