Better Regular Expression Lists
Regular expressions have been one of my favorite programming tools since I first discovered them. They are wonderfully robust, and there are usually many ways to do the same thing with them. For example, here are four ways to match the dotted form of an IPv4 address (none of them checks that each number is at most 255):
  • ^\d\d?\d?\.\d\d?\d?\.\d\d?\d?\.\d\d?\d?$
  • ^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$
  • ^(\d{1,3}\.){3}\d{1,3}$
  • ^([0-9]{1,3}\.){3}[0-9]{1,3}$
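
As a quick sanity check, the four patterns above can be run against the same inputs. This is only a sketch in Python (the language is an assumption, since the post does not name a regex engine); the `SAMPLES` list and the `verdicts` helper are made up for illustration.

```python
import re

# The four patterns from the list above, copied verbatim.
PATTERNS = [
    r"^\d\d?\d?\.\d\d?\d?\.\d\d?\d?\.\d\d?\d?$",
    r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$",
    r"^(\d{1,3}\.){3}\d{1,3}$",
    r"^([0-9]{1,3}\.){3}[0-9]{1,3}$",
]

# Hypothetical sample inputs, chosen only for illustration.
SAMPLES = [
    "192.168.0.1",
    "999.999.999.999",
    "10.0.0",
    "1.2.3.4.5",
    "1234.1.1.1",
    "a.b.c.d",
]


def verdicts(text):
    """Return one True/False per pattern, in the order of PATTERNS."""
    return [re.match(pattern, text) is not None for pattern in PATTERNS]


for sample in SAMPLES:
    print(sample, verdicts(sample))
```

On these ASCII samples all four patterns give the same verdict, and 999.999.999.999 is accepted because none of them checks the range of each number. One engine-specific difference: in Python 3, \d also matches non-ASCII digits unless re.ASCII is set, so the [0-9] variant is slightly stricter there.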

One of my major annoyances, though, has always been lists. I have always written them like ^(REGEX,)*REGEX$, which means spelling out the item's expression twice.

For example, I would do a list of IP addresses like this: ^((\d{1,3}\.){3}\d{1,3},)*(\d{1,3}\.){3}\d{1,3}$.

I recently realized, however, that a list can be done much more elegantly as follows: ^(REGEX(,|$))+(?<!,)$. It works like this:

  • ^: Start of the statement (test string)
  • (REGEX(,|$))+: One or more items, each followed by either a comma or EOS (end of statement). If we keep this regular expression out of multi-line mode (the default), then the EOS can only happen at the end of the statement (some engines also let $ match just before a final trailing newline).
  • (?<!,): This is a negative look-behind assertion saying that the character just before the EOS cannot be a comma. If we didn’t have this, the list could end with a trailing comma, like this: “ITEM,ITEM,ITEM,”.
  • $: The end of the statement
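
The breakdown above can be tried out with a small pattern builder. This is a minimal sketch in Python, not a definitive implementation: the helper names `list_pattern` and `list_pattern_no_lookbehind` and the placeholder item `[A-Z]+` are assumptions made for illustration, and the separator is assumed to be a single character so the look-behind has a fixed width.

```python
import re


def list_pattern(item, sep=","):
    """Build ^(ITEM(SEP|$))+(?<!SEP)$ for a single-character separator."""
    s = re.escape(sep)
    return rf"^({item}({s}|$))+(?<!{s})$"


def list_pattern_no_lookbehind(item, sep=","):
    """Same pattern without the look-behind, to show what it guards against."""
    s = re.escape(sep)
    return rf"^({item}({s}|$))+$"


ITEM = r"[A-Z]+"  # hypothetical item expression standing in for REGEX

with_guard = re.compile(list_pattern(ITEM))
without_guard = re.compile(list_pattern_no_lookbehind(ITEM))

for text in ["ITEM", "ITEM,ITEM,ITEM", "ITEM,ITEM,ITEM,", "ITEM,,ITEM", ""]:
    print(repr(text), bool(with_guard.match(text)), bool(without_guard.match(text)))
```

The only input where the two versions disagree is the one with the trailing comma: without the look-behind it is accepted, with it the match fails. In Python specifically, $ also matches just before a trailing newline, so re.fullmatch is the stricter choice if that matters.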

So the new version of the IP address list would look like this: ^((\d{1,3}\.){3}\d{1,3}(,|$))+(?<!,)$, instead of this: ^((\d{1,3}\.){3}\d{1,3},)*(\d{1,3}\.){3}\d{1,3}$.
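
To check that the two IP address list patterns accept and reject the same inputs, they can be compared side by side. A rough sketch in Python follows; the names `OLD`, `NEW`, `both` and the sample strings are assumptions for illustration, and agreement on a handful of samples is of course not a proof of equivalence.

```python
import re

# Both patterns are copied from the sentence above.
OLD = re.compile(r"^((\d{1,3}\.){3}\d{1,3},)*(\d{1,3}\.){3}\d{1,3}$")
NEW = re.compile(r"^((\d{1,3}\.){3}\d{1,3}(,|$))+(?<!,)$")

# Hypothetical sample inputs, chosen only for illustration.
SAMPLES = [
    "10.0.0.1",
    "10.0.0.1,10.0.0.2",
    "10.0.0.1,10.0.0.2,10.0.0.3",
    "10.0.0.1,",
    ",10.0.0.1",
    "10.0.0.1,,10.0.0.2",
    "10.0.0.1 10.0.0.2",
    "",
]


def both(text):
    """Return (old_result, new_result) for one input string."""
    return (OLD.match(text) is not None, NEW.match(text) is not None)


for sample in SAMPLES:
    old_ok, new_ok = both(sample)
    print(repr(sample), old_ok, new_ok, "same" if old_ok == new_ok else "DIFFERENT")
```

The new pattern spells out the IP address expression once instead of twice, which is where the gain in readability comes from.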


Also, since an IPv4 address is itself just a list of four numbers separated by periods, it could look like this: ^(\d{1,3}(\.|$)){4}(?<!\.)$.
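
The same trick applied to a single address can be checked against the more conventional form. This is a small Python sketch under the same assumptions as before; `AS_LIST`, `CLASSIC`, `check` and the sample strings are hypothetical names and values used only for illustration.

```python
import re

# The "list of numbers" form from the sentence above.
AS_LIST = re.compile(r"^(\d{1,3}(\.|$)){4}(?<!\.)$")
# A conventional form, for comparison.
CLASSIC = re.compile(r"^(\d{1,3}\.){3}\d{1,3}$")

# Hypothetical sample inputs, chosen only for illustration.
SAMPLES = [
    "192.168.0.1",
    "192.168.0.1.",
    "192.168.0",
    "192.168.0.1.5",
    "1..2.3",
    "1234.1.1.1",
]


def check(text):
    """Return (list_form_result, classic_result) for one input string."""
    return (AS_LIST.match(text) is not None, CLASSIC.match(text) is not None)


for sample in SAMPLES:
    print(repr(sample), check(sample))
```

Here {4} replaces + because the number of items is fixed, and the look-behind rejects a trailing period in the same way it rejects a trailing comma in the list version.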

