Mailing List Archive: Lucene: Java-User

UTF-8/unicode input in querying in Lucene

 

 


goksron at gmail

Sep 14, 2007, 1:01 PM

Post #1 of 4
UTF-8/unicode input in querying in Lucene

Hi-

The page http://lucene.apache.org/java/docs/queryparsersyntax.html does not
mention that \uNNNN Unicode syntax is supported.
For example, \u0048\u0045\u004c\u004c\u004f is HELLO.

Please add this to the page, it took experimentation to discover it.

Thanks,

Lance Norskog


hossman_lucene at fucit

Sep 14, 2007, 5:47 PM

Post #2 of 4
Re: UTF-8/unicode input in querying in Lucene

: The page http://lucene.apache.org/java/docs/queryparsersyntax.html does not
: mention that \uNNNN Unicode syntax is supported.
: For example, \u0048\u0045\u004c\u004c\u004f is HELLO.
:
: Please add this to the page, it took experimentation to discover it.

I don't believe the QueryParser actually treats \uNNNN as a special
syntax ... what you may have encountered was that when *javac* parses a
literal string constant, those sequences have special meaning -- but they
are already the literal unicode characters long before QueryParser sees
them.

As far as the query parser is concerned, the backslash in \uNNNN is only
escaping the "u" (all characters can be escaped, whether they need it or
not).
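
To illustrate the javac point, here is a minimal sketch (the class and
constant names are hypothetical, chosen only for this example). It shows
that an escape written directly in a Java string literal is translated by
the compiler, while a doubled backslash keeps the escape text intact, so
that text is what a parser would receive at runtime.

```java
// Sketch: the same escape text written two ways in Java source.
class LiteralVsRuntime {
    // javac translates these escapes at compile time, so this constant
    // is simply the five characters H, E, L, L, O.
    static final String COMPILED = "\u0048\u0045\u004c\u004c\u004f";

    // Doubling each backslash keeps the escape text intact: every character
    // is stored as a backslash, a 'u', and four hex digits (6 chars each).
    static final String RAW = "\\u0048\\u0045\\u004c\\u004c\\u004f";

    public static void main(String[] args) {
        System.out.println(COMPILED);          // HELLO
        System.out.println(COMPILED.length()); // 5
        System.out.println(RAW.length());      // 30
    }
}
```

Text typed into a search box or read from a file behaves like RAW: javac
never sees it, so any decoding has to be done by whatever parses the query.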



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


yonik at apache

Sep 14, 2007, 6:14 PM

Post #3 of 4
Re: UTF-8/unicode input in querying in Lucene

On 9/14/07, Chris Hostetter <hossman_lucene [at] fucit> wrote:
> I don't believe the QueryParser actually treats \uNNNN as a special
> syntax

LUCENE-716 added unicode escapes.
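
As a rough illustration of the behaviour discussed in this thread, the
sketch below decodes backslash escapes in query text that arrives at
runtime: a backslash followed by "u" and four hex digits becomes the
corresponding character, and a backslash before any other character just
yields that character. This is a simplified stand-in, not Lucene's code;
the class and method names are hypothetical, and the real QueryParser may
differ in its error handling and edge cases.

```java
// Simplified sketch of escape decoding; NOT the Lucene implementation.
class UnicodeEscapeSketch {

    // Decodes backslash escapes in text that was read at runtime.
    static String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c != '\\') {
                out.append(c);                    // ordinary character
                i++;
            } else if (i + 1 >= input.length()) {
                throw new IllegalArgumentException("trailing backslash at index " + i);
            } else if (input.charAt(i + 1) == 'u') {
                if (i + 6 > input.length()) {
                    throw new IllegalArgumentException("truncated unicode escape at index " + i);
                }
                int code = 0;
                for (int k = i + 2; k < i + 6; k++) {
                    int digit = Character.digit(input.charAt(k), 16);
                    if (digit < 0) {
                        throw new IllegalArgumentException("bad hex digit at index " + k);
                    }
                    code = code * 16 + digit;     // accumulate four hex digits
                }
                out.append((char) code);
                i += 6;                           // backslash, 'u', four digits
            } else {
                out.append(input.charAt(i + 1));  // plain escaped character
                i += 2;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubled backslashes so javac leaves the escape text for unescape().
        System.out.println(unescape("\\u0048\\u0045\\u004c\\u004c\\u004f")); // HELLO
    }
}
```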

-Yonik



hossman_lucene at fucit

Sep 14, 2007, 7:43 PM

Post #4 of 4
Re: UTF-8/unicode input in querying in Lucene

: > I don't believe the QueryParser actually treats \uNNNN as a special
: > syntax
:
: LUCENE-716 added unicode escapes.

doh! that's what i get for assuming the random solr port i used to sanity
check my assumption was relatively up to date.

LUCENE-1000



-Hoss


