QEDIT Searching With Regular Expressions

In addition to searching for a simple, precise string (or pattern) of characters, the editor also allows you to specify a complex or ambiguous search pattern, called a Regular Expression. The [X] (regular eXpression) search option allows you to define a Regular Expression search pattern.

By using Regular Expressions, you could, for example, find all occurrences of words in a file beginning with "b" or "B", and ending with "ing" (such as being, Beginning, bring). Or you could change a group of names listed as first name followed by last name (example: Kari Hood), to a list composed of last name followed by first name, and separated by a comma (example: Hood, Kari).
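As a rough illustration of the first task, the following sketch uses Python's `re` module rather than the editor's own engine. The syntax is not identical (the `\b` word boundary is a Python feature, not one of the operators described in this section), and the sample text and the name `find_b_ing_words` are invented for the example.

```python
import re

# Python analogue (not QEDIT syntax): a word that starts with b or B
# and ends with "ing".
B_ING_WORD = re.compile(r"\b[bB][a-zA-Z]*ing\b")

def find_b_ing_words(text):
    """Return every word that begins with b/B and ends with 'ing'."""
    return B_ING_WORD.findall(text)

sample = "being Beginning bring sing bingo"
print(find_b_ing_words(sample))
```

Both cases of the first letter are listed explicitly in the class, which mirrors the rule given later that case is significant inside a Class.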

Regular Expressions are composed by combining simple character patterns with special operators to create a complex search pattern. Operators in Regular Expressions allow you to: limit a match to specific characters or a Class of characters; broaden a match to any character; optionally match a pattern; indicate a repeated pattern; and specify alternate patterns to match. Matches across line boundaries, however, are not supported.

The following list identifies and describes the operation of the operator symbols used to create Regular Expressions. (A "Summary List of Regular Expression Operators" is provided at the end of this section.)

Regular Expression Operators
────────────────────────────

Symbol  Regular Expression Operation
______  _____________________________________________________________________

.       In a search pattern, matches any single character.  (This does not
        match the end-of-line position.)

        Example: 

            Search pattern:   wh.t 

            matches a string beginning with the letters wh, followed by any
            single character, followed by the letter t (such as, what or
            whet or wh t (all on one line), but NOT wht or wheat)

^       In a search pattern, anchors the search for the sub-pattern that
        follows, to the beginning of the line (column 1); or, if a block is
        marked and the [L] (Local) option is specified, anchors the search to
        the beginning column of the block on a line.

        Example: 

            Search pattern:   ^This 

            matches the string This beginning in column 1 

$       In a search pattern, anchors the search for the preceding sub-pattern
        to the end of the line; or, if a block is marked and the [L] option is
        specified, anchors the search to the ending column of the block or to
        the end of the line, whichever comes first.

        Example: 

            Search pattern:   that$ 

            matches the string that occurring as the final text on a line 

|       "Or" operator: in a search pattern, matches the preceding or the
        following sub-pattern.

        Example:

            Search pattern:   licens|ce 

            matches the strings license or licence 

        The order of precedence in searching for the indicated patterns is 
        the order in which they are listed in the search string.  Thus, if
        one shorter pattern (such as what) is included in another longer
        "or" pattern (such as whatever), the longer pattern should be
        listed first.

        Example: 

            Search pattern:   {whatever}|{what} 

            matches the strings whatever and what, and locates 
            occurrences of either string in the text   (Use of { } symbols
            is explained later in this section.)

        HOWEVER: 

            Search pattern:   {what}|{whatever} 

            never locates the full string whatever, since the search is 
            immediately satisfied by what
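Ordered alternation is not unique to this editor. As a hedged comparison, Python's `re` module also tries alternatives from left to right, so the same ordering advice applies there. Python groups with `( )` rather than `{ }`, and the sample text is invented for the example.

```python
import re

text = "whatever"

# Longer alternative listed first: the whole word is located.
longer_first = re.search(r"(whatever)|(what)", text).group(0)

# Shorter alternative listed first: the search is satisfied by "what".
shorter_first = re.search(r"(what)|(whatever)", text).group(0)

print(longer_first)
print(shorter_first)
```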

?       In a search pattern, optionally matches the preceding sub-pattern.

        Example: 

            Search pattern:   colou?r 

            matches the strings color or colour 

[ ]     In a search pattern, identifies a Class of characters against which
        to match a single character.

        Within a Class, the case of characters is significant.  (The [I] 
        Ignore-case search option does not apply to a Class.)  If upper-case
        and lower-case characters are to be included in a Class, both cases
        must be specified.

        Example: 

            Search pattern:   [123aAbC] 

            matches any one of the following single characters: 

            1 2 3 a A b C 

[ - ]   In a search pattern, indicates a range of characters (based on ASCII
        sequence) when used BETWEEN characters in a Class.

        The "-" symbol has no special meaning when it occurs as the first or 
        last character within the "[ ]" Class notation, or when used outside
        the Class notation.  In such cases, it is treated as the dash ("-")
        character.

        Example: 

            Search pattern:   [0-9abc-] 

            matches any one of the following single characters: 

            0 1 2 3 4 5 6 7 8 9 a b c - 

[~ ]    In a search pattern, identifies a complement Class of characters to
        match against a single character, when "~" is used as the first
        character within the Class notation (immediately following the "["
        symbol).  It matches against the characters that ARE NOT in the
        specified Class of characters.

        The "~" symbol has no special meaning when it DOES NOT occur as the 
        FIRST character within the Class notation, or when used outside the
        Class notation.  In such cases, it is treated as the tilde ("~")
        character.

        Example: 

            Search pattern:   [~0-2a=] 

            matches any single character OTHER than: 

            0 1 2 a = 

        Example: 

            Search pattern:   [~ ]       (This is ~ followed by a space.) 

            matches any single character OTHER than a space character 
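For readers more familiar with other tools, the Class, range, and complement notations above can be compared with Python's `re` module. This is only an analogy under the assumption that `[~...]` corresponds to Python's `[^...]`; the test strings are invented.

```python
import re

# QEDIT [0-9abc-]  ->  same notation in Python; the trailing dash is literal.
in_class = re.compile(r"[0-9abc-]")

# QEDIT [~0-2a=]   ->  Python writes the complement with ^ instead of ~.
not_in_class = re.compile(r"[^0-2a=]")

inside = in_class.findall("7x-c")
outside = not_in_class.findall("0a=3b~")

print(inside)
print(outside)
```

In the second result the tilde is returned as an ordinary character, since it is only special as the first character of a QEDIT Class and has no special meaning in Python at all.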

*       In a search pattern, matches 0 or more occurrences of the preceding
        sub-pattern, with minimum closure.  (See "Minimum/Maximum Closure"
        below).

        Example: 

            Search pattern:   ba*c 

            matches a string beginning with the letter b, followed by zero 
            or more occurrences of the letter a, followed by the letter c
            (such as, bc, bac or baac)

        Example: 

            Search pattern:   wh.*t 

            matches a string beginning with the letters wh, followed by zero 
            or more occurrences of ANY character, followed by the letter t
            (such as, wh t (all on one line), wht, what or wheat)

+       In a search pattern, matches 1 or more occurrences of the preceding
        sub-pattern, with minimum closure.  (See "Minimum/Maximum Closure"
        below).

        Example: 

            Search pattern:   ba+c 

            matches a string beginning with the letter b, followed by one or 
            more occurrences of the letter a, followed by the letter c
            (such as, bac or baac, but NOT bc)

@       In a search pattern, matches 0 or more occurrences of the preceding
        sub-pattern, with maximum closure.  (See "Minimum/Maximum Closure"
        below).

        Example: 

            Search pattern:   ba@c 

            matches a string beginning with the letter b, followed by zero 
            or more occurrences of the letter a, followed by the letter c
            (such as, bc, bac or baac)

#       In a search pattern, matches 1 or more occurrences of the preceding
        sub-pattern, with maximum closure.  (See "Minimum/Maximum Closure"
        below).

        Example: 

            Search pattern:   ba#c 

            matches a string beginning with the letter b, followed by one or 
            more occurrences of the letter a, followed by the letter c
            (such as, bac or baac, but NOT bc)

        Example (to find identifiers in many programming languages): 

            Search pattern:   [a-zA-Z0-9_]# 

            matches any sequence of one or more characters in the Class 
            consisting of all lower and upper case letters, all numbers, and
            the underscore.

{ }     In a search pattern, serves as a Tag to identify a sub-pattern within
        the full search pattern.  Tagged patterns can be nested.

        Tags are used to define a group of characters as a sub-pattern so 
        that an operator acts on more than one character or character Class.

        Tags are also used to identify a sub-pattern within a Regular 
        Expression so the sub-pattern can be separately referenced in a
        subsequent replacement.  Tagged sub-patterns are implicitly numbered
        from 1 through 9 based on the leftmost "{" symbol.  The sub-pattern
        number can be used within a replacement string to reference a tagged
        sub-pattern, using the following format:

            \n

        where "n" is the actual sub-pattern number from 1 - 9 that 
        represents the appropriate tagged sub-pattern.  To identify the FULL
        search pattern, "n" is "0" (that is, \0).

        Example (defining groups of characters as sub-patterns): 

            Search pattern:   {Begin}|{End}File 

            matches either of the strings BeginFile or EndFile 

        NOTE (without the Tags): 

            Search pattern:   Begin|EndFile 

            matches the strings BeginndFile or BegiEndFile 

        Example (identifying sub-patterns for replacement): 

            Search pattern:   {flip}{flop}s 

            Replace pattern:  \2\1s 

            changes the string flipflops to flopflips 

        Example (identifying sub-patterns for replacement): 

            Rearrange the following list into last name, first name: 

                         (1)  Sammy Mitchell
                         (8)  Steve Watkins
                         (15) Kevin Carr

            Search pattern:   {([0-9]#) +}{[a-zA-Z]#} {[a-zA-Z]#} 

            Replace pattern:  \1\3, \2 

            changes the list to: 

                         (1)  Mitchell, Sammy
                         (8)  Watkins, Steve
                         (15) Carr, Kevin
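The same rearrangement can be sketched with Python's `re` module for comparison. This is an analogue, not the editor's engine: Python tags groups with `( )`, so the literal parentheses around the number must be escaped, and `+` is used where the editor uses `#`. The name `last_name_first` is hypothetical.

```python
import re

# Python analogue of {([0-9]#) +}{[a-zA-Z]#} {[a-zA-Z]#}
# Group 1: "(number)" plus spaces, group 2: first name, group 3: last name.
NAME_LINE = re.compile(r"(\([0-9]+\) +)([a-zA-Z]+) ([a-zA-Z]+)")

def last_name_first(line):
    """Rewrite '(n)  First Last' as '(n)  Last, First'."""
    return NAME_LINE.sub(r"\1\3, \2", line)

roster = ["(1)  Sammy Mitchell", "(8)  Steve Watkins", "(15) Kevin Carr"]
for entry in roster:
    print(last_name_first(entry))
```

As in the editor, the replacement string refers to the tagged sub-patterns by number, so only the order of `\2` and `\3` changes.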

        Example (identifying the full search pattern): 

            Search pattern:   ^.*$    (any character(s) on entire line) 

            Replace pattern:  "\0" 

            encloses the entire line in quotes 

\       In a search or replace pattern, serves as an Escape operator to
        override a Regular Expression operator so the operator is treated as a
        literal character.  For a Regular Expression operator to be treated as
        a literal character in a search or replace pattern, precede the
        operator symbol with the Escape "\" symbol.  Note that in a replace
        pattern, the only symbol recognized as a Regular Expression operator
        is the Escape symbol itself (as described in the following paragraph).

        In a search or replace pattern, the "\" symbol is also used with 
        certain other letters and numbers to indicate specific characters or
        values, such as a formfeed character or a hexadecimal value (as listed
        below).  Further, in a replace pattern, the "\" symbol is used to
        reference a Tagged sub-pattern (see the explanation of "{ }" (Tags)
        above).

        The following examples show use of the "\" symbol as an Escape
        operator.

        Example: 

            Search pattern:   abc\*\* 

            matches the string abc** 

        Example: 

            Search pattern:   abc\\\* 

            matches the string abc\* 

        Example: 

            Search pattern:   abc\\\* 

            Replace pattern:  \\abc* 

            changes the string abc\*  to  \abc* 

        Note that within the "[ ]" Class notation in a search pattern, the 
        only symbols that are recognized as operators are "\", "-", "~", and
        "]" (and then only when placed as indicated in their descriptions).
        All other operators can be used literally in a Class notation without
        the need for the Escape operator.  Thus, to find the Class of
        characters consisting of the question mark, the comma, the dollar
        sign, and the dash, the Class can be designated as "[?,$-]" and DOES
        NOT have to be designated as "[\?,\$\-]".  (Note in this example that
        the dash has not been placed BETWEEN characters within the Class
        notation.)
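The escaping rule can be checked outside the editor with Python's `re` module, whose Escape symbol is also the backslash. The sketch below is only a comparison: `re.escape` is a Python helper with no counterpart described in this section, and Python's set of special characters differs from the editor's.

```python
import re

literal = "abc\\*"            # the five characters a b c \ *
pattern = re.escape(literal)  # backslash-escapes characters Python treats as special

print(pattern)                # the escaped search pattern
is_exact_match = re.fullmatch(pattern, literal) is not None
print(is_exact_match)
```

The escaped pattern has the same shape as the `abc\\\*` search pattern shown above: one backslash to escape the backslash, and one to escape the asterisk.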

The following notations, each beginning with the "\" symbol, are used to indicate certain control characters or to identify a character by its numeric value.

\a      In a search or replace pattern, represents the alert (beep) character
        (^G or ASCII 7).

        Example: 

            Search pattern:   \axyz 

            matches the string xyz following an alert character 

\b      In a search or replace pattern, represents the backspace character
        (^H or ASCII 8).

\c      In a search pattern, designates the placement of the cursor in the
        located string when used with the Find command.  By default, the
        cursor is positioned on the first character of the located string.
        However, if "\c" is used, the cursor is positioned on the character
        immediately following this operator.  If multiple "\c" operators are
        included, the last one is used.  This operator has no effect when
        used with the Replace command.

        Example: 

            Search pattern (with Find):   Hello \cWorld! 

            matches the string Hello World! and positions the cursor on the 
            character "W"

\f      In a search or replace pattern, represents the formfeed character (^L
        or ASCII 12).

\n      In a search pattern, represents the newline (line feed) character
        (^J or ASCII 10).  This operator is intended for use in Binary mode.
        It locates only imbedded newline characters; it does not represent the
        end-of-line position, nor does it cause a search to span lines.

        In a replace pattern, \n causes the line to be split (as if 
        SplitLine() was invoked) at the point in the replace string where
        \n is specified.

\r      In a search or replace pattern, represents the return character (^M
        or ASCII 13).  This operator is intended for use in Binary mode.  It
        locates only imbedded return characters; it does not represent the
        end-of-line position, nor does it cause a search to span lines.

\t      In a search or replace pattern, represents the tab character (^I or
        ASCII 9).

\v      In a search or replace pattern, represents the vertical tab character
        (^K or ASCII 11).

\xnn    In a search or replace pattern, represents the character that is
        equivalent to the indicated hexadecimal value, where "nn" is a value
        from 00 through FF that must be specified as a 2-digit number.

        Example: 

            Search pattern:   ^\x40

            matches the string @ located at the beginning of the line 

\dnnn   In a search or replace pattern, represents the character that is
        equivalent to the indicated decimal value, where "nnn" is a value from
        000 through 255 that must be specified as a 3-digit number.

        Example: 

            Search pattern:   \d064$ 

            matches the string @ located at the end of the line 

\onnn   In a search or replace pattern, represents the character that is
        equivalent to the indicated octal value, where "nnn" is a value from
        000 through 377 that must be specified as a 3-digit number.

        Example: 

            Search pattern:   \\xyz\o100abc 

            matches the string \xyz@abc 
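The three numeric notations in the examples above name the same character. A short Python check of the arithmetic, using only the digit strings that appear in those examples:

```python
# \x40, \d064 and \o100 from the examples above, converted by base.
hex_char = chr(int("40", 16))   # hexadecimal 40
dec_char = chr(int("064", 10))  # decimal 064
oct_char = chr(int("100", 8))   # octal 100

print(hex_char, dec_char, oct_char)
```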

Minimum/Maximum Closure
───────────────────────

Within the editor's Regular Expression search feature, you can indicate that you want a search to be satisfied either by zero or more, or by one or more, occurrences of a pattern, based on minimum or maximum closure. Minimum closure is achieved as soon as a string is located which contains the minimum set of characters that match a specified search pattern. Maximum closure is not achieved until the maximum set of characters is located that matches a specified search pattern. To satisfy either minimum or maximum closure, the editor searches for a matching string that is entirely contained on a single line. The symbols "*" (zero or more) and "+" (1 or more) are minimum closure operators; "@" (0 or more) and "#" (1 or more) are maximum closure operators.

To illustrate the difference between minimum and maximum closure in a search, consider the following:

    Given the text:  This_is_the_issue. 

    then: 

        Search pattern:  Thi.*is  (minimum closure, 0 or more occurrences) 

        matches the string This_is 

    and: 

        Search pattern:  Th.*is   (minimum closure, 0 or more occurrences) 

        matches the string This 

    HOWEVER: 

        Search pattern:  Thi.@is   (maximum closure, 0 or more occurrences) 

        matches the string This_is_the_is 

    and: 

        Search pattern:  Th.@is   (maximum closure, 0 or more occurrences) 

        also matches the string This_is_the_is 
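Many other regular expression engines draw the same distinction under the names "lazy" (or non-greedy) and "greedy" matching. As a sketch, and assuming that Python's `*?` behaves like the editor's `*` and Python's `*` behaves like the editor's `@`, the four searches above can be reproduced as follows:

```python
import re

text = "This_is_the_issue."

# Lazy quantifier (*?) stands in for minimum closure (QEDIT *).
min_thi = re.search(r"Thi.*?is", text).group()
min_th = re.search(r"Th.*?is", text).group()

# Greedy quantifier (*) stands in for maximum closure (QEDIT @).
max_thi = re.search(r"Thi.*is", text).group()
max_th = re.search(r"Th.*is", text).group()

print(min_thi, min_th, max_thi, max_th)
```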

Note that if the LAST character of a Regular Expression is a MINIMUM closure operator, then the number of characters matched will be 0 characters for the "*" operator (0 or more occurrences), and 1 character for the "+" operator (1 or more occurrences). Typically, you would not use either "*" or "+" as the LAST character of a Regular Expression, since the same result can be achieved without the use of these operators. However, you may encounter situations where it is useful to include the MAXIMUM closure operators ("@" and "#") as the last character of a Regular Expression. For example, consider the following:

    Given the text:  abbbxyz 

    then: 

        Search pattern:  ab*      (minimum closure, 0 or more occurrences) 

        matches the string a (because zero occurrences of b satisfies the 
        minimum); thus, you can achieve the same result with the simple Search
        pattern:  a

    and: 

        Search pattern:  ab+      (minimum closure, 1 or more occurrences) 

        matches the string ab (and not abbb, because one occurrence of b 
        satisfies the minimum designated number of occurrences); thus, you can
        achieve the same result with the simple Search pattern:  ab

    HOWEVER: 

        Search patterns:  ab@   or   ab#    (maximum closure) 

        both match the string abbb (because either search is satisfied only 
        by the maximum number of occurrences of b)

    ALSO NOTE: 

        Search patterns:  ab*x   or   ab+x   or   ab@x   or   ab#x

        ALL match the string abbbx 
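The effect of a trailing closure operator can also be observed in Python, again assuming the correspondence `*` to `*?`, `+` to `+?`, `@` to `*`, and `#` to `+`. The dictionary below is a hypothetical mapping made up for this sketch, not a general translator.

```python
import re

text = "abbbxyz"

# QEDIT pattern -> assumed Python counterpart
patterns = {"ab*": r"ab*?", "ab+": r"ab+?", "ab@": r"ab*", "ab#": r"ab+"}

# Closure operator as the LAST character of the pattern.
trailing = {q: re.search(p, text).group() for q, p in patterns.items()}

# Same patterns followed by x: the closure must now reach the x.
followed = {q: re.search(p + "x", text).group() for q, p in patterns.items()}

print(trailing)
print(followed)
```

Once something follows the closure operator, minimum and maximum closure give the same result here, because there is only one way to reach the `x`.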

Here is another example distinguishing minimum and maximum closure:

    Given the text: 

        They suggested we start singing something while sightseeing. 

    then: 

        Search pattern:  s[a-z]*g    (minimum closure, 0 or more occurrences) 

        matches the strings sug, sing, something, sig, and seeing, as 
        marked with carets in the following:

            They suggested we start singing something while sightseeing. 
                 ^^^                ^^^^    ^^^^^^^^^       ^^^  ^^^^^^

    HOWEVER: 

        Search pattern:  s[a-z]@g    (maximum closure, 0 or more occurrences) 

        matches the strings sugg, singing, something, and sightseeing, as 
        marked with carets in the following:

            They suggested we start singing something while sightseeing. 
                 ^^^^               ^^^^^^^ ^^^^^^^^^       ^^^^^^^^^^^
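The two lists of matches can be reproduced with Python's `re.findall`, under the same assumption as before that a lazy `*?` plays the role of minimum closure and a greedy `*` the role of maximum closure:

```python
import re

text = "They suggested we start singing something while sightseeing."

# Minimum closure analogue: stop at the first g that completes a match.
minimum = re.findall(r"s[a-z]*?g", text)

# Maximum closure analogue: extend to the last g within the run of letters.
maximum = re.findall(r"s[a-z]*g", text)

print(minimum)
print(maximum)
```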

Additional Regular Expression Examples
──────────────────────────────────────

Following are additional examples illustrating the use of Regular Expressions.

    ∙ To find any string enclosed in double quotes (""), use the following
      search pattern:

          ".*"

      Use the following variation of this to position the cursor on the last
      character of the located string (so if you search Again, the search will
      begin immediately AFTER the quote at the end of the located string):

          ".*\c"

    ∙ To find any string enclosed in single ('') OR double ("") quotes, use
      the following search pattern:

          {".*"}|{'.*'}

    ∙ To find any white (tab or space) character, use either of the following
      search patterns:

          [ \x09]

          [ \t]

    ∙ To find any non-white character, use either of the following search
      patterns:

          [~ \x09]

          [~ \t]

    ∙ To find a blank line, use the following search pattern:

          ^$

    ∙ To find blank lines, or lines that contain only white characters, use
      either of the following search patterns:

          ^[ \x09]@$

          ^[ \t]@$
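Two of the examples above, quoted strings and blank or white-only lines, are sketched below in Python for comparison. The translation is an assumption: the lazy `.*?` stands in for the minimum closure `.*`, the greedy `[ \t]*` with `re.fullmatch` stands in for `^[ \t]@$`, and the sample data is invented.

```python
import re

line = 'print "hello" and "goodbye"'

# QEDIT ".*" (minimum closure) -> Python ".*?" (lazy)
quoted = re.findall(r'".*?"', line)

lines = ["first", "", " \t ", "last"]

# QEDIT ^[ \t]@$ -> the whole line consists of zero or more spaces/tabs.
blank_indexes = [i for i, ln in enumerate(lines) if re.fullmatch(r"[ \t]*", ln)]

print(quoted)
print(blank_indexes)
```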

Summary List of Regular Expression Operators
────────────────────────────────────────────

Symbol  Regular Expression Operation
______  _____________________________________________________________________

.       Search: matches any single character (except end-of-line)

^       Search: anchors to beginning of line (or block)

$       Search: anchors to end of line (or block)

|       Search "Or" operator: matches preceding or following sub-pattern

?       Search: optionally matches preceding sub-pattern

[ ]     Search: identifies a Class of characters

[ - ]   Search: indicates a range of characters in a Class

[~ ]    Search: identifies a complement Class

*       Search: matches 0 or more occurrences of preceding sub-pattern
        (minimum closure)

+       Search: matches 1 or more occurrences of preceding sub-pattern
        (minimum closure)

@       Search: matches 0 or more occurrences of preceding sub-pattern
        (maximum closure)

#       Search: matches 1 or more occurrences of preceding sub-pattern
        (maximum closure)

{ }     Search: Tags a sub-pattern

\0..\9  Replace: references a Tagged search sub-pattern

\       Search/replace: Escape operator (overrides Regular Expression
        operators)

\a      Search/replace: represents alert (beep) character (^G or ASCII 7)

\b      Search/replace: represents backspace character (^H or ASCII 8)

\c      Search: positions cursor within located string (with Find command)

\f      Search/replace: represents formfeed character (^L or ASCII 12)

\n      Search/replace: represents newline (line feed) character (^J or
        ASCII 10) if used in a "Search for:" specification.  Causes a
        SplitLine() operation to occur (at the specified position) if used in
        a "Replace with:" specification.

\r      Search/replace: represents return character (^M or ASCII 13)

\t      Search/replace: represents tab character (^I or ASCII 9)

\v      Search/replace: represents vertical tab character (^K or ASCII 11)

\xnn    Search/replace: represents hexadecimal value of equivalent character

\dnnn   Search/replace: represents decimal value of equivalent character

\onnn   Search/replace: represents octal value of equivalent character
