A Guide to the Pipe in R. R’s most important operator for data… | by Rory Spanton | Aug, 2020

Why should we use the pipe?

The pipe has one big advantage over other ways of processing data in R: it makes processes easy to read. If we read %>% as “then”, the code from the previous section is very easy to digest as a set of instructions in plain English, one step after another.
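To make the “then” reading concrete, here is a minimal sketch. It is not the article's original example: it assumes the dplyr package is installed and uses R's built-in mtcars dataset, and the column choices and the name mean_mpg are purely illustrative.

```r
# Illustrative sketch only: assumes dplyr is installed and uses the
# built-in mtcars dataset, not the data from the original article.
library(dplyr)

result <- mtcars %>%                 # take the mtcars data, then
  filter(cyl == 4) %>%               # keep only four-cylinder cars, then
  group_by(gear) %>%                 # group the rows by number of gears, then
  summarise(mean_mpg = mean(mpg))    # compute the mean mpg within each group

print(result)
```

Read top to bottom, each line is one instruction, and each %>% hands the output of that step to the next one.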

This is far more readable than if we were to express this process in another way. The two options below are different ways of expressing the previous code, but both are worse for a few reasons.

Option 1 gets the job done, but overwriting our output dataframe result on every line is problematic. For one, doing this for a procedure with lots of steps is tedious and creates unnecessary repetition in the code. In some cases, this repetition also makes it harder to identify exactly what is changing on each line.
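A sketch of what this overwrite-on-every-line style can look like is given here. It is an illustrative stand-in rather than the article's own code: it assumes dplyr is installed, uses the built-in mtcars dataset, and the name mean_mpg is hypothetical.

```r
# Illustrative sketch only: assumes dplyr is installed and uses the
# built-in mtcars dataset, not the data from the original article.
library(dplyr)

# Each step reads `result` (or the raw data) and overwrites `result` again.
result <- filter(mtcars, cyl == 4)                   # keep four-cylinder cars
result <- group_by(result, gear)                     # group by number of gears
result <- summarise(result, mean_mpg = mean(mpg))    # mean mpg per group

print(result)
```

The name result appears twice on almost every line, so the part that actually differs between steps is easy to overlook.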

Option 2 is even less practical. Nesting each function we want to use gets ugly fast, especially for long procedures. It’s hard to read, and harder to debug. Because nested calls are evaluated from the inside out, this approach also makes it tough to see the order of steps in the analysis, which is bad news if you want to add new functionality later.
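The nesting problem is easiest to see in code. The sketch below is an illustrative stand-in, not the article's original example: it assumes dplyr is installed, uses the built-in mtcars dataset, and the name mean_mpg is hypothetical.

```r
# Illustrative sketch only: assumes dplyr is installed and uses the
# built-in mtcars dataset, not the data from the original article.
library(dplyr)

# The first step to run (filter) sits innermost; the last (summarise) is outermost.
result <- summarise(
  group_by(
    filter(mtcars, cyl == 4),   # step 1: keep four-cylinder cars
    gear                        # step 2: group by number of gears
  ),
  mean_mpg = mean(mpg)          # step 3: mean mpg per group
)

print(result)
```

Even with careful indentation, the arguments to each function end up separated from the function name, and adding a fourth step means wrapping the whole expression in yet another call.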

It’s easy to see how using the pipe can substantially improve most R scripts. It makes analyses more readable, removes repetition, and simplifies the process of adding and modifying code. Is there anything it can’t do?