How .NET Regular Expressions Really Work

My early BASIC programs were littered with IF statements that dissected strings using LEFT$, RIGHT$, MID$, TRIM$, and UCASE$. It took me hours to write a program that parsed a simple text file. Just trying to support whitespace and mixed casing was enough to drive me crazy.

Years later, when I started programming in Java, I discovered the StringTokenizer class. I thought it was a huge leap forward: I no longer had to worry about whitespace. I still had to use functions like "substring" and "toUpperCase", but I thought that was as good as it could get.

And then one day I found regular expressions.

I almost cried when I realized that I could replace parsing code that took me hours to write with a simple regular expression. It still took me several years to become comfortable with the syntax, but the learning curve was worth the power obtained.

And yet, for all of this love, I still had a nagging suspicion that I was doing it wrong. After reading Pragmatic Thinking and Learning, I was determined to try to imagine what life was like inside the code I wrote. But I just couldn't connect with a regular expression.

The last straw came recently when I was trying to help a coworker craft a regex to properly handle name/value string pairs with escaped strings. In the end, our regex worked, but I felt that it was duct-taped together. I knew there was a better way.

