22nd April 2021, 6 min read

Some useful regular expressions for programmers

9 thoughts on “Some useful regular expressions for programmers”

dsernst says:

April 23, 2021 at 7:06 am

You might like prettier: https://prettier.io.

It handles all this and more automatically for you, for almost every language. It’s like magic.
1. Jennifer says:
  
  April 24, 2021 at 7:34 am
  
  As I understand it, the idea of the article is to teach useful regular expressions, using code formatting as an example.
Kai says:

April 23, 2021 at 7:14 am

Assuming that most of your research is done in C or C++, I wonder why you’re not considering clang-format for these tasks, since regular expressions will only get you so far.
1. Daniel Lemire says:
  
  April 23, 2021 at 12:29 pm
  
  I program using a wide range of programming languages, including C and C++. I do use clang-format and other code reformatters.
Shiv says:

April 23, 2021 at 3:54 pm

Nice tips. You can use \S instead of [^\s] to shorten some of these.
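A quick way to check that the two spellings behave the same, sketched in Python (the sample string is made up for illustration):

```python
import re

# Hypothetical sample line, not taken from the article.
sample = "foo( bar,\tbaz )\n"

# [^\s] is a negated class: any character that is not whitespace.
long_form = re.findall(r"[^\s]+", sample)
# \S is the built-in shorthand for the same set of characters.
short_form = re.findall(r"\S+", sample)

print(long_form == short_form)
```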

“To delete the extra space, I can select it with look-ahead and look-behind expressions such as <(?<=^(\s\s)*)\s(?=[^\s]).”

I think you’ve got an extra < in there, unless that’s some sort of new metacharacter.

“I do not want a space after the opening parenthesis nor before the closing parenthesis. I can check for such a case with (\(\s|\s\)). If I want to remove the spaces, I can detect them with a look-behind expression such as (?<=\()\s.”

This is probably the desired behaviour, but just to note, \s will also match newlines. If you wanted to preserve those, you could use [ \t] instead.
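A small sketch of the difference in Python, assuming the pattern is run over a whole multi-line string rather than one line at a time (the sample inputs are invented):

```python
import re

# Hypothetical input where a line break follows the opening parenthesis.
code = "f(\n    x )"

# \s after "(" also matches the newline, so the line break is removed.
with_s = re.sub(r"(?<=\()\s", "", code)

# [ \t] matches only a space or a tab, so the line break survives.
with_blank = re.sub(r"(?<=\()[ \t]", "", code)

# On a single line, both patterns remove the space after "(".
one_line = re.sub(r"(?<=\()[ \t]", "", "f( x )")
```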

Your use of lookbehind & lookahead is interesting. When I’m doing search-and-replace on code, I always include the prefix and suffix in capturing groups and account for those in the replacement.

…on further consideration, that’s probably because I’m doing it in Emacs, whose native regex engine is rather primitive.
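For comparison, here is a rough Python sketch of both styles applied to the parenthesis-spacing case; the sample line is hypothetical:

```python
import re

# Hypothetical sample line.
line = "call( a, b )"

# Lookaround style: the match is only the space, the parentheses are context.
lookaround = re.sub(r"(?<=\()\s|\s(?=\))", "", line)

# Capture-group style: match the parenthesis too, then put it back with \1.
captured = re.sub(r"(\()\s", r"\1", line)
captured = re.sub(r"\s(\))", r"\1", captured)

print(lookaround, captured)
```

Both produce the same result here. The capture-group version consumes its context, so it needs the replacement string to restore it.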
1. Daniel Lemire says:
  
  April 23, 2021 at 4:13 pm
  
  “This is probably the desired behaviour, but just to note, \s will also match newlines.”
  
  It depends on whether the regular expression is applied to the whole document or to individual lines. Many editors match regular expressions on a line-by-line basis by default.
  1. Shiv says:
    
    April 23, 2021 at 4:41 pm
    
    Oh, my mistake! I don’t think I’ve ever used one like that. Thanks, learnt something new. 😊
2. Greg says:
  
  April 23, 2021 at 7:18 pm
  
  Your habit of using capture groups instead of lookarounds is good. A lot of regex engines don’t support variable-length lookarounds.
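Python's standard `re` module is one such engine. The sketch below tries to compile the indentation expression quoted earlier in the thread (without the stray `<`) and then falls back to a capture group. The helper name `drop_odd_indent_space` is made up for illustration.

```python
import re

# The look-behind contains (\s\s)*, which has no fixed width.
# Python's re module rejects it at compile time.
try:
    re.compile(r"(?<=^(\s\s)*)\s(?=[^\s])")
    lookbehind_supported = True
except re.error:
    lookbehind_supported = False

def drop_odd_indent_space(line):
    """Remove one space from indentation that has an odd number of spaces."""
    # Capture the pairs of whitespace at the start and restore them with \1;
    # the single extra whitespace character that follows is dropped.
    return re.sub(r"^((?:\s\s)*)\s(?=\S)", r"\1", line)

print(lookbehind_supported)
print(repr(drop_odd_indent_space("   x = 1")))
```

The function is meant to be called on one line at a time, which matches the line-by-line behaviour of many editors.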
Dominic Amann says:

April 26, 2021 at 5:32 pm

These are some really useful regular expressions. I use some all the time, but I used your blog as an opportunity to delete all the annoying trailing spaces in our code base (like really, who does that?).
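A minimal Python sketch of that kind of cleanup, assuming the text uses `\n` line endings; the function name and the sample string are illustrative only:

```python
import re

def strip_trailing_whitespace(text):
    """Delete spaces and tabs at the end of every line."""
    # With re.MULTILINE, $ matches before each newline and at the end of the text.
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)

# Hypothetical snippet with trailing spaces and a trailing tab.
source = "int x = 1;  \n\treturn x;\t\n"
cleaned = strip_trailing_whitespace(source)
print(repr(cleaned))
```

Leading indentation is left alone because `$` only matches at the end of a line.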