    Perl User Manual
Regular expressions and pattern matching

Quote regular expression magic characters

  • quotemeta EXPR

  • quotemeta

    Returns the value of EXPR with all the ASCII non-"word" characters backslashed. (That is, all ASCII characters not matching /[A-Za-z_0-9]/ will be preceded by a backslash in the returned string, regardless of any locale settings.) This is the internal function implementing the \Q escape in double-quoted strings. (See below for the behavior on non-ASCII code points.)

    If EXPR is omitted, uses $_.
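
    The basic behavior can be sketched as follows. This is only an illustration; the sample strings and variable names are made up for the example.

```perl
use strict;
use warnings;

# Every ASCII character outside [A-Za-z_0-9] gets a leading backslash;
# letters, digits, and the underscore are left alone.
my $quoted = quotemeta('a.b*c_d 1+1');
print "$quoted\n";    # a\.b\*c_d\ 1\+1

# With no argument, quotemeta operates on $_.
my $from_default;
for ('x+y=z') {
    $from_default = quotemeta;
}
print "$from_default\n";    # x\+y\=z
```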

    quotemeta (and \Q ... \E) are useful when interpolating strings into regular expressions, because by default an interpolated variable will be considered a mini-regular expression. For example:

    my $sentence = 'The quick brown fox jumped over the lazy dog';
    my $substring = 'quick.*?fox';
    $sentence =~ s{$substring}{big bad wolf};

    Will cause $sentence to become 'The big bad wolf jumped over...'.

    On the other hand:

    my $sentence = 'The quick brown fox jumped over the lazy dog';
    my $substring = 'quick.*?fox';
    $sentence =~ s{\Q$substring\E}{big bad wolf};

    Or:

    my $sentence = 'The quick brown fox jumped over the lazy dog';
    my $substring = 'quick.*?fox';
    my $quoted_substring = quotemeta($substring);
    $sentence =~ s{$quoted_substring}{big bad wolf};

    Will both leave the sentence as is. Normally, when accepting literal string input from the user, quotemeta() or \Q must be used.
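
    As a rough sketch of the user-input case, the helper below counts literal occurrences of a search string. The name count_literal and the sample text are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often $needle occurs literally in $text.
sub count_literal {
    my ($text, $needle) = @_;
    return 0 unless length $needle;    # an empty pattern has special meaning
    my $count = () = $text =~ /\Q$needle\E/g;
    return $count;
}

my $text   = 'Price: $5.00 (was $5.00+tax), not $5x00';
my $needle = '$5.00';

my $literal = count_literal($text, $needle);

# Without \Q, "$" is an end-of-string anchor and "." matches any character,
# so the same input behaves as a pattern rather than as literal text.
my $as_pattern = () = $text =~ /$needle/g;

print "literal: $literal, as pattern: $as_pattern\n";
```

    Here the quoted search finds the two literal occurrences, while the unquoted pattern finds none because nothing can follow the end-of-string anchor.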

    In Perl v5.14, all non-ASCII characters are quoted in non-UTF-8-encoded strings, but not quoted in UTF-8 strings.

    Starting in Perl v5.16, Perl adopted a Unicode-defined strategy for quoting non-ASCII characters; the quoting of ASCII characters is unchanged.

    Also unchanged is the quoting of non-UTF-8 strings when outside the scope of a use feature 'unicode_strings', which is to quote all characters in the upper Latin1 range. This provides complete backwards compatibility for old programs which do not use Unicode. (Note that unicode_strings is automatically enabled within the scope of a use v5.12 or greater.)
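
    The effect of the unicode_strings scope on a non-UTF-8 string can be sketched like this, assuming Perl v5.16 or later. The variable names are illustrative only.

```perl
use strict;
use warnings;

# U+00E9 (LATIN SMALL LETTER E WITH ACUTE) held in a non-UTF-8 string.
my $latin1 = "\xE9";

my ($legacy, $unicode);
{
    no feature 'unicode_strings';
    $legacy = quotemeta($latin1);     # upper Latin1 is always quoted here
}
{
    use feature 'unicode_strings';
    $unicode = quotemeta($latin1);    # Unicode rules: a letter is left alone
}

printf "legacy:  %d character(s)\n", length $legacy;
printf "unicode: %d character(s)\n", length $unicode;
```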

    Within the scope of use locale, all non-ASCII Latin1 code points are quoted whether the string is encoded as UTF-8 or not. As mentioned above, locale does not affect the quoting of ASCII-range characters. This protects against those locales where characters such as "|" are considered to be word characters.

    Otherwise, Perl quotes non-ASCII characters using an adaptation from Unicode (see http://www.unicode.org/reports/tr31/). The only code points that are quoted are those that have any of the Unicode properties: Pattern_Syntax, Pattern_White_Space, White_Space, Default_Ignorable_Code_Point, or General_Category=Control.
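
    The property-based rule can be probed with a small sketch such as the one below, assuming Perl v5.16 or later. The helper is_quoted and the choice of sample code points are assumptions made for this example, not part of Perl.

```perl
use strict;
use warnings;
use feature 'unicode_strings';

# Hypothetical helper: true if quotemeta backslashes the given character.
sub is_quoted {
    my ($char) = @_;
    return quotemeta($char) ne $char;
}

# One sample code point per case; the property named is the relevant one.
my @samples = (
    [ "\x{3B1}",  'GREEK SMALL LETTER ALPHA (a letter)' ],
    [ "\x{D7}",   'MULTIPLICATION SIGN (Pattern_Syntax)' ],
    [ "\x{2028}", 'LINE SEPARATOR (Pattern_White_Space)' ],
    [ "\x{A0}",   'NO-BREAK SPACE (White_Space)' ],
    [ "\x{AD}",   'SOFT HYPHEN (Default_Ignorable_Code_Point)' ],
);

for my $sample (@samples) {
    my ($char, $name) = @$sample;
    printf "U+%04X %-45s %s\n", ord $char, $name,
        is_quoted($char) ? 'quoted' : 'not quoted';
}
```

    Only the letter is expected to come back unquoted; the other four each carry one of the listed properties.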

    Of these properties, the two important ones are Pattern_Syntax and Pattern_White_Space. They have been set up by Unicode for exactly this purpose of deciding which characters in a regular expression pattern should be quoted. No character that can be in an identifier has these properties.

    Perl promises that if we ever add regular expression pattern metacharacters to the dozen already defined (\ | ( ) [ { ^ $ * + ? .), we will only use ones that have the Pattern_Syntax property. Perl also promises that if we ever add characters that are considered to be white space in regular expressions (currently mostly affected by /x), they will all have the Pattern_White_Space property.
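
    For the dozen ASCII metacharacters listed above, a short check along these lines shows what quotemeta does to each one. This is a sketch for illustration; the variable names are arbitrary.

```perl
use strict;
use warnings;

# The twelve metacharacters, kept in one single-quoted string
# (the leading \\ is a single backslash).
my @meta = split //, '\\|()[{^$*+?.';

for my $char (@meta) {
    my $quoted = quotemeta($char);

    # Each one gains exactly one leading backslash ...
    die "unexpected quoting for $char" unless $quoted eq "\\$char";

    # ... and the quoted form matches the character literally.
    die "no literal match for $char" unless $char =~ /\A$quoted\z/;
}

print scalar(@meta), " metacharacters quoted\n";
```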

    Unicode promises that the set of code points that have these two properties will never change, so something that is not quoted in v5.16 will never need to be quoted in any future Perl release. (Not all the code points that match Pattern_Syntax have actually had characters assigned to them; so there is room to grow, but they are quoted whether assigned or not. Perl, of course, would never use an unassigned code point as an actual metacharacter.)

    Quoting characters that have the other three properties is done to enhance the readability of the regular expression and not because they actually need to be quoted for regular expression purposes (characters with the White_Space property are likely to be indistinguishable on the page or screen from those with the Pattern_White_Space property; and the other two properties contain non-printing characters).

 
Source: perldoc.perl.org - Official documentation for the Perl programming language
Site maintained by Jon Allen (JJ)     See the project page for more details
Documentation maintained by the Perl 5 Porters