Lookahead and Lookbehind Zero Width Assertions

By Sakib Farid

LOOKAHEAD

When using regular expressions, you may sometimes want to match a string that is NOT followed by a particular word (negative lookahead) or that is followed by a particular word (positive lookahead). The constructs for this are

search string (?!some word)
search string (?=some word)

The first example will match `search string` when it is not followed by `some word`; this is a negative lookahead. The second example will match `search string` only when it is followed by `some word`, which is a positive lookahead. In both cases `some word` itself is not part of the match, which is why these are called zero-width assertions. Any regular expression can be used inside the parentheses of a lookahead, and the match can even be captured, e.g.

search string (?=(regex))

In the above example we can match `search string` followed by any regular expression and also capture the text that the expression matched.
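The lookahead forms above can be tried out with a short sketch. This one uses Python's `re` module; the sample text and variable names are made up for illustration, and the space is placed inside the lookahead so that the match itself is exactly `search string`.

```python
import re

# Hypothetical sample text: the first "search string" is followed by
# " some word", the second one is not.
text = "search string some word, search string other word"

# Positive lookahead: match only where " some word" follows.
positive = [m.start() for m in re.finditer(r"search string(?= some word)", text)]

# Negative lookahead: match only where " some word" does NOT follow.
negative = [m.start() for m in re.finditer(r"search string(?! some word)", text)]

# Capturing inside a lookahead: the group is recorded, but the text it
# matched is not consumed and is not part of the overall match.
m = re.search(r"search string(?= (\w+) word)", text)
captured = m.group(1)   # the word seen by the lookahead
whole = m.group(0)      # the overall match, without the lookahead text

print(positive, negative, captured, whole)
```

Note that `whole` contains only `search string`, even though the lookahead inspected and captured the word after it.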

LOOKBEHIND

Lookbehind, on the other hand, works in a slightly different way. You cannot apply just any regular expression in a lookbehind assertion; it has to be fixed width. The constructs for lookbehind are

(?<!some word )search string
(?<=some word )search string

The first example is a negative lookbehind which allows you to match `search string` not preceded by `some word `. The second will match `search string` which is preceded by `some word `.
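As a rough illustration of the two lookbehind forms, here is a small Python sketch. The sample text is invented for the example, and the patterns are the ones shown above.

```python
import re

# Hypothetical sample text: the first "search string" is preceded by
# "some word ", the second one is preceded by "other word ".
text = "some word search string, other word search string"

# Positive lookbehind: match only where "some word " comes directly before.
preceded = [m.start() for m in re.finditer(r"(?<=some word )search string", text)]

# Negative lookbehind: match only where "some word " does NOT come directly before.
not_preceded = [m.start() for m in re.finditer(r"(?<!some word )search string", text)]

print(preceded, not_preceded)
```

As with lookahead, the text inside the lookbehind is only checked; it does not become part of the match.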

Lookbehinds do not allow just any regular expression inside them. You can only use a regular expression whose match length is predetermined, that is, fixed width. This applies to Perl, Python and PHP. Java, on the other hand, may allow some extra options such as the ?
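The fixed-width restriction is easy to see in Python, where the `re` module refuses to compile a lookbehind whose length can vary. The patterns and sample text below are illustrative only, and other engines may behave differently.

```python
import re

# Fixed-width lookbehind (exactly one character): compiles and works.
fixed = re.compile(r"(?<=\$)\d+")
amounts = fixed.findall("cost $42 or 17")

# Variable-width lookbehind (\s+ can match one or more characters):
# Python's re module rejects this when the pattern is compiled.
try:
    re.compile(r"(?<=some\s+)word")
    variable_ok = True
except re.error as exc:
    variable_ok = False
    print("rejected:", exc)

print(amounts, variable_ok)
```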

All in all, lookarounds in regular expressions are a powerful and indispensable tool when matching strings.
