Bug 328302 - can't search only Korean files.
Status: RESOLVED FIXED
Product: beagle
Classification: Other
Component: General
Version: 0.2.0
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Beagle Bugs
QA Contact: Beagle Bugs
Depends on:
Blocks:
Reported: 2006-01-23 14:28 UTC by sangu
Modified: 2006-10-20 16:23 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Hangul support patch for Lucene.Net (4.33 KB, patch)
2006-01-25 07:38 UTC, dittos
Status: none
add Hangul Jamo and Syllable area in Lucene.Net (6.40 KB, patch)
2006-01-26 01:02 UTC, Young-Ho Cha
Status: committed
Hangul Jamo support patch for CVS Head (6.27 KB, patch)
2006-10-08 05:02 UTC, Young-Ho Cha
Status: none

Description sangu 2006-01-23 14:28:02 UTC
Please describe the problem:
Can't search CJK content.

Steps to reproduce:
1. Type a Korean or Japanese string.

Actual results:

Expected results:

Does this happen every time?
Always

Other information:
Fedora development
mono 1.1.13
Comment 1 Joe Shaw 2006-01-23 18:04:45 UTC
Can you provide an example file and an example query, please?
Comment 2 dittos 2006-01-25 07:37:12 UTC
Beagle can't search for strings that contain only Korean characters.
So I made a patch for Lucene.Net that adds support for the Hangul Unicode range.
Comment 3 dittos 2006-01-25 07:38:08 UTC
Created attachment 58063
Hangul support patch for Lucene.Net
Comment 4 dittos 2006-01-25 07:39:06 UTC
Comment on attachment 58063
Hangul support patch for Lucene.Net

>--- beagle.orig/beagled/Lucene.Net/Analysis/Standard/StandardTokenizerTokenManager.cs	2006-01-23 23:04:05.000000000 +0900
>+++ beagle/beagled/Lucene.Net/Analysis/Standard/StandardTokenizerTokenManager.cs	2006-01-23 22:49:17.000000000 +0900
>@@ -78,20 +78,21 @@
> 			JjCheckNAdd(jjnextStates[start]);
> 			JjCheckNAdd(jjnextStates[start + 1]);
> 		}
>-		internal static readonly ulong[] jjbitVec0 = new ulong[]{0x1ff0000000000000L, 0xffffffffffffc000L, 0xffffffffL, 0x600000000000000L};
>+		internal static readonly ulong[] jjbitVec0 = new ulong[]{0x1ff0000000000000L, 0xffffffffffffc000L, 0xfffff000ffffffffL, 0x6000000007fffffL};
> 		internal static readonly ulong[] jjbitVec2 = new ulong[]{0x0L, 0xffffffffffffffffL, 0xffffffffffffffffL, 0xffffffffffffffffL};
> 		internal static readonly ulong[] jjbitVec3 = new ulong[]{0xffffffffffffffffL, 0xffffffffffffffffL, 0xffffL, 0x0L};
> 		internal static readonly ulong[] jjbitVec4 = new ulong[]{0xffffffffffffffffL, 0xffffffffffffffffL, 0x0L, 0x0L};
> 		internal static readonly ulong[] jjbitVec5 = new ulong[]{0x3fffffffffffL, 0x0L, 0x0L, 0x0L};
>-		internal static readonly ulong[] jjbitVec6 = new ulong[]{0x1600L, 0x0L, 0x0L, 0x0L};
>-		internal static readonly ulong[] jjbitVec7 = new ulong[]{0x0L, 0xffc000000000L, 0x0L, 0xffc000000000L};
>-		internal static readonly ulong[] jjbitVec8 = new ulong[]{0x0L, 0x3ff00000000L, 0x0L, 0x3ff000000000000L};
>-		internal static readonly ulong[] jjbitVec9 = new ulong[]{0x0L, 0xffc000000000L, 0x0L, 0xff8000000000L};
>-		internal static readonly ulong[] jjbitVec10 = new ulong[]{0x0L, 0xffc000000000L, 0x0L, 0x0L};
>-		internal static readonly ulong[] jjbitVec11 = new ulong[]{0x0L, 0x3ff0000L, 0x0L, 0x3ff0000L};
>-		internal static readonly ulong[] jjbitVec12 = new ulong[]{0x0L, 0x3ffL, 0x0L, 0x0L};
>-		internal static readonly ulong[] jjbitVec13 = new ulong[]{0xfffffffeL, 0x0L, 0x0L, 0x0L};
>-		internal static readonly ulong[] jjbitVec14 = new ulong[]{0x0L, 0x0L, 0x0L, 0xff7fffffff7fffffL};
>+        internal static readonly ulong[] jjbitVec6 = new ulong[]{0xffffffffffffffffL, 0xffffffffffffffffL, 0xfffffffffL, 0x0L};
>+		internal static readonly ulong[] jjbitVec7 = new ulong[]{0x1600L, 0x0L, 0x0L, 0x0L};
>+		internal static readonly ulong[] jjbitVec8 = new ulong[]{0x0L, 0xffc000000000L, 0x0L, 0xffc000000000L};
>+		internal static readonly ulong[] jjbitVec9 = new ulong[]{0x0L, 0x3ff00000000L, 0x0L, 0x3ff000000000000L};
>+		internal static readonly ulong[] jjbitVec10 = new ulong[]{0x0L, 0xffc000000000L, 0x0L, 0xff8000000000L};
>+		internal static readonly ulong[] jjbitVec11 = new ulong[]{0x0L, 0xffc000000000L, 0x0L, 0x0L};
>+		internal static readonly ulong[] jjbitVec12 = new ulong[]{0x0L, 0x3ff0000L, 0x0L, 0x3ff0000L};
>+		internal static readonly ulong[] jjbitVec13 = new ulong[]{0x0L, 0x3ffL, 0x0L, 0x0L};
>+		internal static readonly ulong[] jjbitVec14 = new ulong[]{0xfffffffeL, 0x0L, 0x0L, 0x0L};
>+		internal static readonly ulong[] jjbitVec15 = new ulong[]{0x0L, 0x0L, 0x0L, 0xff7fffffff7fffffL};
> 		private int JjMoveNfa_0(int startState, int curPos)
> 		{
> 			int startsAt = 0;
>@@ -1165,6 +1166,9 @@
> 				
> 				case 61: 
> 					return ((jjbitVec5[i2] & l2) != (ulong) 0L);
>+
>+                case 215:
>+                    return ((jjbitVec6[i2] & l2) != (ulong) 0L);
> 				
> 				default: 
> 					if ((jjbitVec0[i1] & l1) != (ulong) 0L)
>@@ -1179,23 +1183,23 @@
> 			{
> 				
> 				case 6: 
>-					return ((jjbitVec8[i2] & l2) != (ulong) 0L);
>-				
>-				case 11: 
> 					return ((jjbitVec9[i2] & l2) != (ulong) 0L);
> 				
>-				case 13: 
>+				case 11: 
> 					return ((jjbitVec10[i2] & l2) != (ulong) 0L);
> 				
>-				case 14: 
>+				case 13: 
> 					return ((jjbitVec11[i2] & l2) != (ulong) 0L);
> 				
>-				case 16: 
>+				case 14: 
> 					return ((jjbitVec12[i2] & l2) != (ulong) 0L);
>+
>+                case 16:
>+                    return ((jjbitVec13[i2] & l2) != (ulong) 0L);
> 				
> 				default: 
>-					if ((jjbitVec6[i1] & l1) != (ulong) 0L)
>-						if ((jjbitVec7[i2] & l2) == (ulong) 0L)
>+					if ((jjbitVec7[i1] & l1) != (ulong) 0L)
>+						if ((jjbitVec8[i2] & l2) == (ulong) 0L)
> 							return false;
> 						else
> 							return true;
>@@ -1209,10 +1213,10 @@
> 			{
> 				
> 				case 0: 
>-					return ((jjbitVec14[i2] & l2) != (ulong) 0L);
>+					return ((jjbitVec15[i2] & l2) != (ulong) 0L);
> 				
> 				default: 
>-					if ((jjbitVec13[i1] & l1) != (ulong) 0L)
>+					if ((jjbitVec14[i1] & l1) != (ulong) 0L)
> 						return true;
> 					return false;
>
Comment 5 Joe Shaw 2006-01-25 18:16:31 UTC
How did you generate this patch?  Did you patch the source directly, or did you get it from the Java Lucene, or what?  The code is basically impossible to follow, so I'm not sure how to test it comprehensively.
Comment 6 dittos 2006-01-26 00:48:44 UTC
(In reply to comment #5)
> How did you generate this patch?  Did you patch the source directly, or did you
> get it from the Java Lucene, or what?  The code is basically impossible to
> follow, so I'm not sure how to test it comprehensively.
> 

First, I added the Unicode Hangul range to StandardTokenizer.jj and generated Java code with JavaCC. Then I manually applied the resulting changes to StandardTokenizerTokenManager.cs.
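
For anyone trying to follow the patch in comment 4, here is a minimal Python sketch of how the generated bit vectors appear to be consulted. The vector values are copied from the diff. The function names and the `partial_pages` dictionary are hypothetical. The behavior of the `default` branch (return true when the high-byte bit is set) is an assumption based on typical JavaCC output, because the hunk is cut off there. Only the cases visible in the diff are modeled, so this is not a complete port of the token manager.

```python
# Bit vectors copied from the diff in comment 4 (before and after the patch).
JJBITVEC0_OLD = [0x1ff0000000000000, 0xffffffffffffc000, 0xffffffff, 0x600000000000000]
JJBITVEC0_NEW = [0x1ff0000000000000, 0xffffffffffffc000, 0xfffff000ffffffff, 0x6000000007fffff]
JJBITVEC5 = [0x3fffffffffff, 0x0, 0x0, 0x0]                               # case 61 (0x3D)
JJBITVEC6_NEW = [0xffffffffffffffff, 0xffffffffffffffff, 0xfffffffff, 0x0]  # case 215 (0xD7)


def can_move(ch, hi_vec, partial_pages):
    """Hypothetical model of the generated lookup for one UTF-16 code unit.

    hi_vec:        256-bit map indexed by the high byte (pages accepted whole).
    partial_pages: high byte -> 256-bit map indexed by the low byte.
    """
    code = ord(ch)
    hi_byte = code >> 8
    i1, l1 = hi_byte >> 6, 1 << (hi_byte & 0x3F)      # word and bit for the high byte
    i2, l2 = (code & 0xFF) >> 6, 1 << (code & 0x3F)   # word and bit for the low byte
    if hi_byte in partial_pages:                      # explicit "case N:" in the switch
        return (partial_pages[hi_byte][i2] & l2) != 0
    return (hi_vec[i1] & l1) != 0                     # assumed "default:" behavior


def before_patch(ch):
    return can_move(ch, JJBITVEC0_OLD, {61: JJBITVEC5})


def after_patch(ch):
    return can_move(ch, JJBITVEC0_NEW, {61: JJBITVEC5, 215: JJBITVEC6_NEW})


if __name__ == "__main__":
    for ch in ["\u4e2d", "\uac00", "\ud7a3", "\ud7a4", "\u1100"]:
        print(f"U+{ord(ch):04X}", before_patch(ch), after_patch(ch))
```

Read this way, the new bits in jjbitVec0 cover high bytes 0xAC through 0xD6 as whole pages, and the new `case 215` vector covers U+D700 through U+D7A3, which together match the Hangul syllable range. U+1100 is still rejected, which agrees with the jamo report in comment 7.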
Comment 7 Young-Ho Cha 2006-01-26 01:00:38 UTC
After applying dittos' patch, I can search for Hangul syllables ("\uac00"-"\ud7a3"), but I can't search for Hangul jamo (decomposed, "\u1100"-"\u11f9"). So I added the Hangul jamo range to StandardTokenizer.jj.
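
To illustrate the difference between the two ranges, here is a small Python sketch using the standard `unicodedata` module. The range bounds are the ones quoted above. The `classify` helper and the sample word are only for illustration. The same visible text can be stored either as precomposed syllables or as decomposed jamo, so a tokenizer that accepts only the syllable range would likely miss the decomposed form.

```python
import unicodedata

# Ranges as quoted in this comment.
SYLLABLES = (0xAC00, 0xD7A3)  # precomposed Hangul syllables
JAMO = (0x1100, 0x11F9)       # Hangul jamo (decomposed letters)


def classify(ch):
    """Hypothetical helper: report which Hangul range a character falls in."""
    code = ord(ch)
    if SYLLABLES[0] <= code <= SYLLABLES[1]:
        return "syllable"
    if JAMO[0] <= code <= JAMO[1]:
        return "jamo"
    return "other"


composed = "\ud55c\uae00"                            # "Hangul" as two precomposed syllables
decomposed = unicodedata.normalize("NFD", composed)  # the same text as conjoining jamo

if __name__ == "__main__":
    print([f"U+{ord(c):04X} {classify(c)}" for c in composed])
    print([f"U+{ord(c):04X} {classify(c)}" for c in decomposed])
```

Normalizing the decomposed string back with NFC gives the original syllables, so the two forms are equivalent as text even though they fall in different code point ranges.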

Comment 8 Young-Ho Cha 2006-01-26 01:02:38 UTC
Created attachment 58132
add Hangul Jamo and Syllable area in Lucene.Net
Comment 9 Joe Shaw 2006-01-26 19:14:58 UTC
I've committed this patch, thanks!
Comment 10 Debajyoti Bera 2006-10-05 17:48:28 UTC
Young-Ho Cha and dittos,
  Recently we merged Lucene.Net-1.9.1 into CVS head. The patch that was merged earlier did not apply cleanly, and it looked like it would not be necessary in 1.9.1. As of now, CVS head doesn't contain the patch. Could one of you test the current CVS to check that everything works as before?
Thanks,
- dBera
Comment 11 Young-Ho Cha 2006-10-08 05:01:28 UTC
There is Hangul syllable support in CVS Head, but no support for Hangul Jamo.

I'll attach a patch for Hangul Jamo support.

Comment 12 Young-Ho Cha 2006-10-08 05:02:25 UTC
Created attachment 74265
Hangul Jamo support patch for CVS Head
Comment 13 Joe Shaw 2006-10-20 16:23:58 UTC
Sorry, I missed this patch for the 0.2.11 release. I've just checked it in.

I'm also going to submit the .jj patch upstream to Lucene.