Ace-T's Blog 내 검색 [네이버 커넥트 이웃 합니다~^-^/ 요청 大 환영~~]

JBUG - JDK8 Study(정규표현식)- Differences Among Greedy, Reluctant, and Possessive Quantifiers

Study/Study group 2014.07.17 00:41
[Good Comment!!, Good Discussion!!, Good Contens!!]
[ If you think that is useful, please click the finger on the bottom~^-^good~ ]
by ace-T



오늘 이슈가 되었던 내용을 소개 하려고 한다.


링크 : http://docs.oracle.com/javase/tutorial/essential/regex/index.html


Differences Among Greedy, Reluctant, and Possessive Quantifiers

의 내용 이였다.

우선은 아래의 내용을 참고 해보도록 하자.

Quantifiers

Quantifiers allow you to specify the number of occurrences to match against. For convenience, the three sections of the Pattern API specification describing greedy, reluctant, and possessive quantifiers are presented below. At first glance it may appear that the quantifiers X?, X?? and X?+ do exactly the same thing, since they all promise to match "X, once or not at all". There are subtle implementation differences which will be explained near the end of this section.

GreedyReluctantPossessiveMeaning
X?X??X?+X, once or not at all
X*X*?X*+X, zero or more times
X+X+?X++X, one or more times
X{n}X{n}?X{n}+X, exactly n times
X{n,}X{n,}?X{n,}+X, at least n times
X{n,m}X{n,m}?X{n,m}+X, at least n but not more than m times



그리고..초반에 매우 중요한 그림이 있었죠!




문제의 부분

There are subtle differences among greedy, reluctant, and possessive quantifiers.

Greedy quantifiers are considered "greedy" because they force the matcher to read in, or eat, the entire input string prior to attempting the first match. If the first match attempt (the entire input string) fails, the matcher backs off the input string by one character and tries again, repeating the process until a match is found or there are no more characters left to back off from. Depending on the quantifier used in the expression, the last thing it will try matching against is 1 or 0 characters.

The reluctant quantifiers, however, take the opposite approach: They start at the beginning of the input string, then reluctantly eat one character at a time looking for a match. The last thing they try is the entire input string.

Finally, the possessive quantifiers always eat the entire input string, trying once (and only once) for a match. Unlike the greedy quantifiers, possessive quantifiers never back off, even if doing so would allow the overall match to succeed.

To illustrate, consider the input string xfooxxxxxxfoo.

 
Enter your regex: .*foo  // greedy quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.

Enter your regex: .*?foo  // reluctant quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfoo" starting at index 0 and ending at index 4.
I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.

Enter your regex: .*+foo // possessive quantifier
Enter input string to search: xfooxxxxxxfoo
No match found.

The first example uses the greedy quantifier .* to find "anything", zero or more times, followed by the letters "f" "o" "o". Because the quantifier is greedy, the .* portion of the expression first eats the entire input string. At this point, the overall expression cannot succeed, because the last three letters ("f" "o" "o") have already been consumed. So the matcher slowly backs off one letter at a time until the rightmost occurrence of "foo" has been regurgitated, at which point the match succeeds and the search ends.

The second example, however, is reluctant, so it starts by first consuming "nothing". Because "foo" doesn't appear at the beginning of the string, it's forced to swallow the first letter (an "x"), which triggers the first match at 0 and 4. Our test harness continues the process until the input string is exhausted. It finds another match at 4 and 13.

The third example fails to find a match because the quantifier is possessive. In this case, the entire input string is consumed by .*+, leaving nothing left over to satisfy the "foo" at the end of the expression. Use a possessive quantifier for situations where you want to seize all of something without ever backing off; it will outperform the equivalent greedy quantifier in cases where the match is not immediately found.


이슈가 되었던 부분은..


Enter your regex: .*foo  // greedy quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.

Enter your regex: .*?foo  // reluctant quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfoo" starting at index 0 and ending at index 4.
I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.

Enter your regex: .*+foo // possessive quantifier
Enter input string to search: xfooxxxxxxfoo
No match found.

위의 부분에서 greedy quantifier와 possessive quantifier에서 매칭되어지는 과정에서 궁금증이 생겼다.
greedy : 매칭
possessive : No 매칭

Why??? 

생각해보자.

.*foo 의 패턴! xfooxxxxxxfoo 전체 매칭??? greedy에서..처음에 바로 전체매칭이 된다고 의견이

나왔다.

하지만 튜토리얼에도 나온다.

the quantifier is greedy, the .* portion of the expression first eats the entire input string. At this point, the overall expression cannot succeed, because the last three letters ("f" "o" "o") have already been consumed.


possessive 역시 No match found. 이다.

그러면..어떻게...왜?? greedy에서 매칭이 실패한 걸까????

(아..잠이 와서 무척 포스팅 하는데 건성건성~이군요..하하;)


저의 답은..




# greedy가 전체매칭에 실패했다는 내용



지하철에서 생각해보고 집에와서 그려보았네요ㅎㅎ;;


그리고 시큐리티 매니저라는 친구에 대해서도 알아보았드랬죵!


참고 : 2014/07/03 - [Study/Study group] - JBUG - JDK8 Study(Concurrency_01)


     - END -




acet 박태하가 추천하는 readtrend 추천글!

설정

트랙백

댓글

:::: facebook을 이용하시는 분들은 로그인 후 아래에 코멘트를 남겨주세요 ::::

티스토리 툴바