
BreakIterator example

Anonymous

The BreakIterator description contains the following example:

Find the next word:

// requires: import java.text.BreakIterator;
public static int nextWordStartAfter(int pos, String text) {
    BreakIterator wb = BreakIterator.getWordInstance();
    wb.setText(text);
    // first boundary after pos, then the boundary that follows it
    int last = wb.following(pos);
    int current = wb.next();
    while (current != BreakIterator.DONE) {
        // a segment [last, current) that contains a letter counts as a word
        for (int p = last; p < current; p++) {
            if (Character.isLetter(text.codePointAt(p)))
                return last;
        }
        last = current;
        current = wb.next();
    }
    return BreakIterator.DONE;
}

In the inner for loop, what happens if a non-letter surrogate is found?
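
For comparison, here is a minimal sketch of the same method with the inner loop advancing by whole code points (via Character.charCount) instead of "p++", so it never lands on the second half of a surrogate pair. The class name WordStart is only an illustration and is not part of the BreakIterator documentation.

```java
import java.text.BreakIterator;

public class WordStart {
    // Variant of the documented example; only the inner loop step differs.
    public static int nextWordStartAfter(int pos, String text) {
        BreakIterator wb = BreakIterator.getWordInstance();
        wb.setText(text);
        int last = wb.following(pos);
        int current = wb.next();
        while (current != BreakIterator.DONE) {
            for (int p = last; p < current; ) {
                int cp = text.codePointAt(p);
                if (Character.isLetter(cp)) {
                    return last;
                }
                // advance 1 char for a BMP code point, 2 for a surrogate pair
                p += Character.charCount(cp);
            }
            last = current;
            current = wb.next();
        }
        return BreakIterator.DONE;
    }

    public static void main(String[] args) {
        System.out.println(nextWordStartAfter(0, "one two"));
    }
}
```

With the input "one two" and pos 0, the first boundary after 0 is index 3, the space segment is skipped, and the method returns 4, the start of "two".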

Peter
--
Peter B. West
Folio

===========================================================================
To unsubscribe, send email to listserv@java.sun.com and include in the body
of the message "signoff JAVA2D-INTEREST". For general help, send email to
listserv@java.sun.com and include in the body of the message "help".

Phil Race

Peter,

First, the delay in getting any response is because this isn't a Java2D
question, so I don't think there's a whole lot of expertise on this list
in this area.

I am informed that a better place to have asked this question would have
been:

http://developers.sun.com/contact/feedback.jsp?category=j2se&mailsubject=Internationalization%20(I18N)

which I agree is obscure.

Anyway, I asked the experts and the response I received was:

> And for the break iterator question, assuming that by "a non-letter
> surrogate" s/he means "a non-letter supplementary character", I think the
> code happens to work even though the loop is doing "p++", since a
> standalone surrogate code point is always a non-letter. So the loop just
> skips over the low surrogate.
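
The behaviour described in that response can be checked with a small sketch. The class SurrogateCheck and the helper letterAt are hypothetical names used only for this illustration. It relies on String.codePointAt returning the lone surrogate value when the index points at the low half of a pair, and on Character.isLetter returning false for surrogate code points.

```java
public class SurrogateCheck {
    // U+1F600 is a non-letter supplementary character (surrogates D83D DE00).
    static final String EMOJI = "a\uD83D\uDE00b";
    // U+10400 is a supplementary letter (surrogates D801 DC00).
    static final String DESERET = "\uD801\uDC00";

    /** Returns true if the code point read at char index p is a letter. */
    static boolean letterAt(String text, int p) {
        return Character.isLetter(text.codePointAt(p));
    }

    public static void main(String[] args) {
        // Walk by char index with p++, exactly as the documented loop does.
        for (int p = 0; p < EMOJI.length(); p++) {
            int cp = EMOJI.codePointAt(p);
            System.out.printf("index %d: U+%04X letter=%b%n",
                    p, cp, Character.isLetter(cp));
        }
    }
}
```

At index 1 the full code point U+1F600 is read and is not a letter; at index 2 only the low surrogate U+DE00 is read, which is also not a letter, so the extra step is harmless. For a supplementary letter such as U+10400, the letter is already detected at the index of the high surrogate.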

For further clarification I suggest the above mentioned web form.

-Phil.



Phil Race

PS: I was just informed of a Java Internationalization forum:
http://forum.java.sun.com/forum.jspa?forumID=16

That's really the best place for this question.

-phil.
