Mastering Regular Expressions
Second Edition
Perl, .NET, Java, and More
Jeffrey E. F. Friedl
Beijing • Cambridge • Farnham • Köln • Paris • Sebastopol • Taipei • Tokyo
Java
Java didn’t come with a regex package until Java 1.4, so early programmers had to
do without regular expressions. Over time, many programmers independently
developed Java regex packages of varying degrees of quality, functionality, and
complexity. With the early-2002 release of Java 1.4, Sun entered the fray with their
java.util.regex package. In preparing this chapter, I looked at Sun’s package,
and a few others (detailed starting on page 372). So which one is best? As you’ll
soon see, there can be many ways to judge that.
In This Chapter
Before looking at what’s in this chapter, it’s important to mention what’s not
in this chapter. In short, this chapter doesn’t restate everything from
Chapters 1 through 6. I understand that some readers interested only in Java may
be inclined to start their reading with this chapter, and I want to encourage them
not to miss the benefits of the preface and the earlier chapters: Chapters 1, 2,
and 3 introduce basic concepts, features, and techniques involved with regular
expressions, while Chapters 4, 5, and 6 offer important keys to regex understanding
that directly apply to every Java regex package that I know of.
As for this chapter, it has several distinct parts. The first part, consisting of “Judging
a Regex Package” and “Object Models,” looks abstractly at some concepts that help
you to understand an unfamiliar package more quickly, and to help judge its suitability
for your needs. The second part, “Packages, Packages, Packages,” moves
away from the abstract to say a few words about the specific packages I looked at
while researching this book. Finally, we get to the real fun, as the third part talks
in specifics about two of the packages, Sun’s java.util.regex and Jakarta’s ORO
package.
Chapter 8: Java
Judging a Regex Package
The first thing most people look at when judging a regex package is the regex flavor
itself, but there are other technical issues as well. On top of that, “political”
issues like source code availability and licensing can be important. The next sections
give an overview of some points of comparison you might use when selecting
a regex package.
Technical Issues
Some of the technical issues to consider are:
• Engine Type? Is the underlying engine an NFA or DFA? If an NFA, is it a
POSIX NFA or a Traditional NFA? (See Chapter 4, page 143.)
• Rich Flavor? How full-featured is the flavor? How many of the items on
page 113 are supported? Are they supported well? Some things are more
important than others: lookaround and lazy quantifiers, for example, are more
important than possessive quantifiers and atomic grouping, because lookaround
and lazy quantifiers can’t be mimicked with other constructs, whereas
possessive quantifiers and atomic grouping can be mimicked with lookahead
that allows capturing parentheses.
• Unicode Support? How well is Unicode supported? Java strings support Unicode
intrinsically, but does \w know which Unicode characters are “word”
characters? What about \d and \s? Does \b understand Unicode? (Does its
idea of a word character match \w’s idea of a word character?) Are Unicode
properties supported? How about blocks? Scripts? (See page 119.) Which version of
Unicode’s mappings do they support: Version 3.0? Version 3.1? Version 3.2?
Does case-insensitive matching work properly with the full breadth of Unicode
characters? For example, does a case-insensitive ‘ß’ really match ‘SS’?
(Even in lookbehind?)
• How Flexible? How flexible are the mechanics? Can the regex engine deal
only with String objects, or the whole breadth of CharSequence objects? Is it
easy to use in a multi-threaded environment?
• How Convenient? The raw engine may be powerful, but are there extra
“convenience functions” that make it easy to do the common things without a
lot of cumbersome overhead? Does it, borrowing a quote from Perl, “make the
easy things easy, and the hard things possible?”
• JRE Requirements? What version of the JRE does it require? Does it need the
latest version, which many may not be using yet, or can it run on even an old
(and perhaps more common) JRE?
• Efficient? How efficient is it? The length of Chapter 6 tells you how much
there is to be said on this subject. How many of the optimizations described
there does it do? Is it efficient with memory, or does it bloat over time? Do
you have any control over resource utilization? Does it employ lazy evaluation
to avoid computing results that are never actually used?
• Does it Work? When it comes down to it, does the package work? Are there
a few major bugs that are “deal-breakers?” Are there many little bugs that
would drive you crazy as you uncover them? Or is it a bulletproof, rock-solid
package that you can rely on?
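To make the “Rich Flavor” point concrete, here is a minimal sketch of mimicking an atomic group with lookahead and capturing parentheses, written against Sun’s java.util.regex. The class name AtomicMimic, its helper methods, and the sample patterns are illustrative assumptions of mine, not part of any package. The mimic relies on the engine not backtracking into a lookahead once it has succeeded, which holds for java.util.regex. It also adds a capturing group, so any later group numbers shift by one.

```java
import java.util.regex.Pattern;

public class AtomicMimic {
    // Real atomic group: whatever "inner" matched is locked in.
    static String atomic(String inner) {
        return "(?>" + inner + ")";
    }

    // Mimic: the lookahead matches "inner" without consuming text and captures it;
    // the backreference \1 then consumes exactly the captured text.
    // Assumption: this is the first capturing group in the regex, and the text
    // that follows the backreference does not begin with a digit.
    static String mimic(String inner) {
        return "(?=(" + inner + "))\\1";
    }

    // True if the regex matches anywhere in the text.
    static boolean found(String regex, CharSequence text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "aaab";
        for (String tail : new String[] {"b", "ab"}) {
            System.out.println("tail=" + tail
                + " greedy=" + found("a+" + tail, text)
                + " atomic=" + found(atomic("a+") + tail, text)
                + " mimic=" + found(mimic("a+") + tail, text));
        }
    }
}
```

With the tail b, all three forms match. With the tail ab, only the plain greedy form matches, because it alone can give back an ‘a’ after a+ has taken them all. The atomic group and the mimic agree with each other in both cases, which is the sense in which one can stand in for the other.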
Of course, this list is just the tip of the iceberg; each of these bullet points could be
expanded out to a full chapter on its own. We’ll touch on them when comparing
packages later in this chapter.
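Many of the Unicode and flexibility questions above can be answered by simply asking the package. The following is a rough sketch of such a probe for java.util.regex. The class name PackageProbe, the helper matches, and the particular test characters are my own choices for illustration. The probe prints what the installed JRE reports rather than assuming any answers, since results can differ between versions and between packages.

```java
import java.util.regex.Pattern;

public class PackageProbe {
    // True if the regex, compiled with the given flags, matches the entire input.
    // The input is a CharSequence, so String, StringBuilder, and friends all work.
    static boolean matches(String regex, int flags, CharSequence input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Do the shorthand classes know about non-ASCII characters?
        System.out.println("\\w matches e-acute (U+00E9): "
            + matches("\\w", 0, "\u00E9"));
        System.out.println("\\d matches Arabic-Indic three (U+0663): "
            + matches("\\d", 0, "\u0663"));

        // Are Unicode properties and blocks supported?
        System.out.println("\\p{Lu} matches E-acute (U+00C9): "
            + matches("\\p{Lu}", 0, "\u00C9"));
        System.out.println("\\p{InGreek} matches alpha (U+03B1): "
            + matches("\\p{InGreek}", 0, "\u03B1"));

        // Does case-insensitive sharp s match "SS"?
        int ci = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        System.out.println("sharp s (U+00DF) matches SS, ignoring case: "
            + matches("\u00DF", ci, "SS"));

        // Can the engine work on something other than a String?
        CharSequence buffer = new StringBuilder("abc123");
        System.out.println("StringBuilder input accepted: "
            + matches("[a-z]+\\d+", 0, buffer));
    }
}
```

On the multi-threading question, the java.util.regex documentation states that a compiled Pattern is immutable and safe to share among threads, while a Matcher is not, so each thread should create its own Matcher from the shared Pattern.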
Social and Political Issues
Some of the non-technical issues to consider are:
• Documented? Does it use Javadoc? Is the documentation complete? Correct?
Approachable? Understandable?
• Maintained? Is the package still being maintained? What’s the turnaround
time for bugs to be fixed? Do the maintainers really care about the package? Is
it being enhanced?
• Support and Popularity? Is there official support, or an active user community
you can turn to for reliable support (and that you can provide support to,
once you become skilled in its use)?
• Ubiquity? Can you assume that the package is available everywhere you go,
or do you have to include it whenever you distribute your programs?
• Licensing? May you redistribute it when you distribute your programs? Are
the terms of the license something you can live with? Is the source code available
for inspection? May you redistribute modified versions of the source
code? Must you?
Well, there are certainly a lot of questions. Although this book can give you the
answers to some of them, it can’t answer the most important question: which is
right for you? I make some recommendations later in this chapter, but only you
can decide which is best for you. So, to give you more background upon which to
base your decision, let’s look at one of the most basic aspects of a regex package:
its object model.