Balthazar – Text processing in the shell

This article is part of a self-published book project by Balthazar Rouberol and Etienne Brodu, ex-roommates, friends and colleagues, aiming at empowering the up-and-coming generation of developers. We are currently hard at work on it!

If you are interested in the project, we invite you to join the mailing list!


One of the things that makes the shell an invaluable tool is the number of text processing commands available, and the ability to easily pipe them into each other to build complex text processing workflows. These commands can make it trivial to perform text and data analysis, convert data between different formats, filter lines, etc.

When working with text data, the philosophy is to break any complex problem you have into a set of smaller ones, and to solve each of them with a specialized tool.

Make each program do one thing well.1

The examples in this chapter might seem a little contrived at first, but this is by design. Each of these tools was designed to solve one small problem. They become extremely powerful, however, when combined.
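To make the idea of combining small tools concrete, here is a minimal sketch of a pipeline that finds the most frequent word in a sentence. The sample sentence and the `text` and `top_word` variable names are made up for this illustration, and the commands involved are only standard utilities, each doing one small job.

```shell
# Hypothetical sample input, used only for this illustration
text='the quick fox and the lazy dog and the cat'

# Each stage solves one small problem:
#   tr       puts one word per line
#   sort     groups identical words together
#   uniq -c  counts each group of identical lines
#   sort -rn orders the counts from highest to lowest
#   head     keeps only the first line
#   awk      prints the word, dropping the count
top_word=$(printf '%s\n' "$text" | tr ' ' '\n' | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

printf '%s\n' "$top_word"
```

None of these commands knows anything about word frequencies on its own. The result comes entirely from the way they are chained together.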

We will go over some of the most common and useful text processing commands the shell has to offer, and will demonstrate real-life workflows piping them together. I suggest you take a look at the man pages of these commands to see the full breadth of options at your disposal.

The example CSV (comma-separated values) file is available online.2 Feel free to download it yourself to test these commands.

cat

As seen in the previous chapter, cat is used to concatenate a list of one or more files and display their content on screen.
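As a small sketch of that behavior, the snippet below creates two scratch files and prints them with cat. The file names `a.txt` and `b.txt`, their content, and the temporary directory are assumptions made for this example, not files from the chapter.

```shell
# Hypothetical scratch files, created only for this illustration
tmpdir=$(mktemp -d)
printf 'first file\n' > "$tmpdir/a.txt"
printf 'second file\n' > "$tmpdir/b.txt"

# With a single argument, cat prints the content of that file
cat "$tmpdir/a.txt"

# With several arguments, cat prints each file one after the other,
# in the order they were given
combined=$(cat "$tmpdir/a.txt" "$tmpdir/b.txt")
printf '%s\n' "$combined"
```

Because cat simply writes to standard output, its result can be redirected to a new file or piped into another command.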

https://blog.balthazar-rouberol.com/text-processing-in-the-shell