Perl programmers often seem to write a regex like
/^foo$/
(where “foo” is some arbitrary regex pattern)
but from the context it seems the programmer intended the whole string
to be searched to match the pattern “foo”: nothing between the beginning
of the string and the beginning of “foo”, and nothing between the end of
“foo” and the end of the string.
But of course the regex doesn’t quite do that.
The metacharacter ‘$’ matches not only at the end of the string to be
searched but also just before a newline character at the end of the
string to be searched. (Of course when the ‘m’ flag is specified, ‘$’
behaves differently. But I’d like to concentrate on the behaviour
without the ‘m’ flag for the time being.)
So the above pattern will match “foo” and “foo\n”.
Is that what the programmer really wanted? I think in many cases not.
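To see this for yourself, here is a minimal sketch. The helper name matches_dollar and the sample strings are my own, made up for illustration, and “foo” is taken as the literal pattern:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the string matches /^foo$/, else 0.
sub matches_dollar {
    my ($s) = @_;
    return $s =~ /^foo$/ ? 1 : 0;
}

for my $s ("foo", "foo\n", "foo\n\n", "foobar") {
    # Show newlines as a visible \n so the output stays on one line.
    (my $shown = $s) =~ s/\n/\\n/g;
    printf "%-10s %s\n", qq{"$shown"},
        matches_dollar($s) ? "matches" : "no match";
}
```

Only one trailing newline is tolerated: “foo\n” matches, but “foo\n\n” does not.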
So how can we make the pattern match exactly at the end of the string to
be searched?
The answer is to use the metacharacter ‘\z’. This matches exactly at the
end of the string to be searched.
So to make a regex that matches the pattern “foo”, with the beginning
and end of the pattern bound to the beginning and end of the string to
be searched, we could write:
/^foo\z/
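Where this tends to matter is input validation. As a rough sketch, take \d+ as the “foo” (my choice of pattern, not anything special) and two hypothetical validator subs that differ only in the end anchor:

```perl
use strict;
use warnings;

# Hypothetical validators, named for illustration only.
sub is_digits_loose  { my ($s) = @_; return $s =~ /^\d+$/  ? 1 : 0 }
sub is_digits_strict { my ($s) = @_; return $s =~ /^\d+\z/ ? 1 : 0 }

# For example, a line read from a file and never chomped.
my $input = "42\n";

print "loose:  ", is_digits_loose($input),  "\n";   # prints "loose:  1"
print "strict: ", is_digits_strict($input), "\n";   # prints "strict: 0"
```

The ‘$’ version lets the trailing newline through; the ‘\z’ version rejects it.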
=====
Here endeth the bit about doing the minimum to make the code correctly
reflect the intention of the programmer. The rest is about style,
personal preference, readability, etc.
=====
Some might say that using ‘^’ to match the beginning of the string to be
searched and ‘\z’ at the end is a bit dicey, because the meaning of ‘^’
changes if the ‘m’ flag is used but the meaning of ‘\z’ doesn’t. It
would be nice if there were a metacharacter which matched exactly at the
beginning of the string to be searched, regardless of the ‘m’ flag.
Fortunately there is: ‘\A’. Using that would give:
/\Afoo\z/
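A small sketch of the difference, assuming someone later adds the ‘m’ flag to the regex. The sub names and the test string are mine, purely for illustration:

```perl
use strict;
use warnings;

# Hypothetical helpers: the same pattern with different anchors and flags.
sub caret_plain  { my ($s) = @_; return $s =~ /^foo\z/   ? 1 : 0 }
sub caret_with_m { my ($s) = @_; return $s =~ /^foo\z/m  ? 1 : 0 }
sub big_a_with_m { my ($s) = @_; return $s =~ /\Afoo\z/m ? 1 : 0 }

my $text = "bar\nfoo";

print "^  without m: ", caret_plain($text),  "\n";   # 0: ^ only at start of string
print "^  with m:    ", caret_with_m($text), "\n";   # 1: ^ also matches after the newline
print "\\A with m:    ", big_a_with_m($text), "\n";  # 0: \A ignores the m flag
```

With the ‘m’ flag, ‘^’ quietly starts matching in the middle of the string, while ‘\A’ carries on meaning what it always meant.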
But because ‘\A’ ends with a letter, the regex can be a bit hard to
parse if the ‘\A’ is followed by a pattern which begins with a letter.
So some might say it would be a good idea to use the ‘x’ flag to
allow whitespace inside the regex. That would give:
/ \A foo \z /x
Though I find it a bit hard to read when slash delimiters are combined
with a few initial-backslash metacharacters, so I prefer to use a
different delimiter. So I would use something like:
m{ \A foo \z }x
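One thing to watch with the ‘x’ flag: unescaped whitespace in the pattern is ignored, so a literal space has to be escaped. A minimal sketch, with variable names of my own choosing and qr used only so the patterns can be stored and compared:

```perl
use strict;
use warnings;

# The compact and the spaced-out forms should behave identically.
my $compact  = qr/\Afoo\z/;
my $spacious = qr{ \A foo \z }x;

# Hypothetical helper: 1 if both forms give the same verdict on a string.
sub same_verdict {
    my ($s) = @_;
    my $c = $s =~ $compact  ? 1 : 0;
    my $x = $s =~ $spacious ? 1 : 0;
    return $c == $x ? 1 : 0;
}

print same_verdict($_), "\n" for "foo", "foo\n", " foo ", "bar";

# Under x, the space between "foo" and "bar" is dropped unless escaped.
my $no_space   = qr{ \A foo bar \z }x;    # matches "foobar"
my $with_space = qr{ \A foo\ bar \z }x;   # matches "foo bar"
```

So the spaces around “foo” in the ‘x’ version are there for the reader only.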
- by Bill Blunn
See also http://perldoc.perl.org/perlre.html#Regular-Expressions