implementations
go
The regexp implementation provided by this package is guaranteed to run in time linear in the size of the input. It uses syntax accepted by RE2 and described at https://golang.org/s/re2syntax, except for \C.
An overview of the syntax can be shown with: go doc regexp/syntax
r := regexp.MustCompile("p([a-z]+)ch")
// MustCompile will panic if there's an error in the syntax. Use Compile for
// error checking (non constant strings).
fmt.Println(r.FindAllString("peach punch pinch", -1))
// [peach punch pinch]
js
Regular expressions can either be constructed with the RegExp constructor or written as a literal value enclosed in forward slashes.
let re1 = new RegExp("abc")
let re2 = /abc/
Using the RegExp constructor, the pattern is written as a normal string with the
usual rules for backslashes. In the second notation, you must escape forward
slashes with a backslash. Additionally, backslashes which are not part of
character codes such as \n
will be preserved rather than ignored as they are in strings.
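A minimal sketch of the backslash difference between the two notations. The variable names are made up for illustration and are not part of the original notes.

```javascript
// In a string passed to the RegExp constructor, the backslash must be doubled
// so that the regex engine receives \d rather than a plain "d".
const fromConstructor = new RegExp("\\d+");
const fromLiteral = /\d+/;

// Both objects hold the same pattern text.
console.log(fromConstructor.source === fromLiteral.source); // true

// In a literal, a forward slash must be escaped; in a string it need not be.
const slashLiteral = /a\/b/;
const slashConstructor = new RegExp("a/b");
console.log(slashLiteral.test("a/b"), slashConstructor.test("a/b")); // true true
```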
testing for matches
/abc/.test("abcde")
This method returns a boolean reporting if the string contains any match of the
pattern.
character sets
[12345]
- Match any of these characters
[0-9]
- Match any character from 0 to 9
\d
- Any digit character, same as [0-9]
\w
- An alphanumeric character or underscore ("word character")
\s
- Any whitespace character (space, tab, newline)
\D
- Any character that is not a digit
\W
- Any nonalphanumeric character
\S
- Any nonwhitespace character
.
- Any character, except newline
^
- Start of string
$
- End of string
\b
- Word boundary (the start or end of a word). Doesn't match an actual character, but
instead just enforces that a word boundary exists at that location.
You can use the backslash codes inside square brackets, for example [\d.]
which matches any digit or period. Period and plus lose their special meaning inside
square brackets.
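A short sketch of a bracketed set and the \b boundary in use. The names digitOrPeriod and wholeWord and the test strings are illustrative only.

```javascript
// [\d.] matches a digit or a literal period; the period is not a wildcard here.
const digitOrPeriod = /^[\d.]+$/;
console.log(digitOrPeriod.test("3.14")); // true
console.log(digitOrPeriod.test("3x14")); // false

// \b requires a word boundary at that position but consumes no characters.
const wholeWord = /\bcat\b/;
console.log(wholeWord.test("a cat sat"));   // true
console.log(wholeWord.test("concatenate")); // false
```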
^
- Invert a set of characters when placed first inside square brackets. [^01]
matches any character other than 0 and 1
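As a quick sketch, an inverted set can check whether a string contains anything besides binary digits. The name notBinary and the sample strings are illustrative.

```javascript
// [^01] matches any single character that is not 0 or 1.
const notBinary = /[^01]/;
console.log(notBinary.test("1100100010100110")); // false
console.log(notBinary.test("1100100010200110")); // true
```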
+
- Match one or more of the preceding element
*
- Match zero or more of the preceding element
?
- Match zero or one time (makes a match "optional")
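The three operators above can be compared side by side. This is only a sketch; the pattern names and inputs are made up for illustration.

```javascript
// + needs at least one digit between the quotes, * accepts none.
const oneOrMore = /'\d+'/;
const zeroOrMore = /'\d*'/;
console.log(oneOrMore.test("'123'")); // true
console.log(oneOrMore.test("''"));    // false
console.log(zeroOrMore.test("''"));   // true

// ? makes the "u" optional, so both spellings match.
const neighbor = /neighbou?r/;
console.log(neighbor.test("neighbour")); // true
console.log(neighbor.test("neighbor"));  // true
```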
To indicate that a pattern should occur a precise number of times, use braces.
Putting {4}
after an element requires it to match exactly 4 times. You can also
specify a range: {2,4}
matches at least twice and at most four times. Or use
{5,}
to match 5 or more times.
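A sketch of the brace forms, assuming a date and time written like "1-30-2003 8:45". The names dateTime, twoToFour and fiveOrMore are hypothetical.

```javascript
// One or two digits for day and month, exactly four for the year,
// then one or two digits for the hour and exactly two for the minutes.
const dateTime = /\d{1,2}-\d{1,2}-\d{4} \d{1,2}:\d{2}/;
console.log(dateTime.test("1-30-2003 8:45")); // true

// {2,4} bounds the repeat count on both sides (anchored to the whole string here).
const twoToFour = /^\d{2,4}$/;
console.log(twoToFour.test("123"));   // true
console.log(twoToFour.test("12345")); // false

// {5,} sets only a lower bound.
const fiveOrMore = /^a{5,}$/;
console.log(fiveOrMore.test("aaaa"));   // false
console.log(fiveOrMore.test("aaaaaa")); // true
```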
To make the whole match case insensitive, add an i
after the second slash.
To find every match rather than only the first, add a g
after the second slash.
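A small sketch of what the two flags change, using replace and match on a made-up string. Variable names are illustrative.

```javascript
// The i flag ignores case.
console.log(/abc/i.test("xABCx")); // true

// Without g only the first match is replaced; with g every match is.
const withoutG = "banana".replace(/a/, "o");
const withG = "banana".replace(/a/g, "o");
console.log(withoutG); // bonana
console.log(withG);    // bonono

// With g, match returns every matching substring.
const allMatches = "banana".match(/an/g);
console.log(allMatches.length); // 2
```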
grouping subexpressions
To use an operator like *
or +
on more than a single element, you must use
parentheses.
let cartoonCrying = /boo+(hoo+)+/i
console.log(cartoonCrying.test("Boohoooohoohoo"))
// true
pipe character
The pipe character denotes a choice between the pattern to its left and the pattern to its right.
let animalCount = /\b\d+ (pig|cow|chicken)s?\b/
console.log(animalCount.test("15 pigs"))
// true
replace
String values have a replace method, and the first argument can be a regular expression:
console.log("Borobudur".replace(/[ou]/g, "a"))
// Barabadar
The most powerful aspect is that we can refer to matched groups in the replacement string with $1, $2, and so on. For example, to swap first and last names:
console.log(
"Liskov, Barbara\nMcCarthy, John\nWadler, Philip"
.replace(/(\w+), (\w+)/g, "$2 $1"))
// Barbara Liskov
// John McCarthy
// Philip Wadler
misc
parse html with regex https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454
wtffff http://sed.sourceforge.net/grabbag/scripts/turing.sed