Isearch patch

Below is the emacs-devel thread about adding an isearch keybinding that yanks one character at a time from the buffer into the search string. If you're in a hurry, the last two messages, from Karl Fogel and Gerd Moellmann respectively, give a good summary.


From: Karl Fogel <kfogel@collab.net>
To: emacs-devel@gnu.org
Subject: [PATCH] better isearch support for complex input methods
Date: 02 Apr 2001 20:45:46 -0500
Reply-to: kfogel@red-bean.com
Message-ID: <87pueueoxh.fsf@collab.net>


This patch adds a new binding to isearch mode, C-o, which grabs one
letter at a time from the buffer and adds it to the search string.

Here's why this is useful:

Until recently, there was no point having a special keybinding for
this in isearch mode, because it would almost always have been just as
easy to type the letter itself, as to type a special binding to grab
the letter.

However, with modern input methods (such as `chinese-tonepy'), typing
a letter can be an involved process.  In fact, in Chinese language
environments, what Emacs thinks of as a single letter is often
conceptually a whole word, i.e., one ideograph.  It's very useful to
be able to grab one ideograph at a time from the buffer and add it to
the search string.

Currently, if one uses C-w in isearch in Chinese text, it grabs to the
next punctuation mark or end of line.  This is usually not what the
user wanted -- it's much more likely that they want to grab one
Chinese character at a time.

This patch binds C-o to a new function `isearch-yank-letter'.  C-o
seemed like a key unlikely to be often used for exiting isearch.
There may be a better binding I haven't thought of, though.

-Karl

2001-04-02  Karl Fogel  <kfogel@red-bean.com>

	* isearch.el (isearch-mode-map): Bind C-o in isearch to yank the
	next letter from the buffer into the search string.
	(isearch-yank-internal): New helper function, contains common
	internals of next three.
	(isearch-yank-letter): New function.
	(isearch-yank-word): Implement using isearch-yank-internal.
	(isearch-yank-line): Implement using isearch-yank-internal.

Index: isearch.el
===================================================================
RCS file: /cvs/emacs/lisp/isearch.el,v
retrieving revision 1.188
diff -u -r1.188 isearch.el
--- isearch.el	2001/02/05 17:13:28	1.188
+++ isearch.el	2001/04/03 00:20:57
@@ -286,6 +286,7 @@
     (define-key map " " 'isearch-whitespace-chars)
     (define-key map [?\S-\ ] 'isearch-whitespace-chars)
     
+    (define-key map "\C-o" 'isearch-yank-letter)
     (define-key map "\C-w" 'isearch-yank-word)
     (define-key map "\C-y" 'isearch-yank-line)
 
@@ -1057,24 +1058,32 @@
 	(isearch-yank-x-selection)
       (mouse-yank-at-click click arg))))
 
-(defun isearch-yank-word ()
-  "Pull next word from buffer into search string."
-  (interactive)
+(defun isearch-yank-internal (jumpform)
+  "Pull the text from point to the point reached by JUMPFORM.
+JUMPFORM is a lambda expression that takes no arguments and returns a
+buffer position, possibly having moved point to that position.  For
+example, it might move point forward by a word and return point, or it
+might return the position of the end of the line."
   (isearch-yank-string
    (save-excursion
      (and (not isearch-forward) isearch-other-end
 	  (goto-char isearch-other-end))
-     (buffer-substring-no-properties
-      (point) (progn (forward-word 1) (point))))))
+     (buffer-substring-no-properties (point) (funcall jumpform)))))
 
+(defun isearch-yank-letter ()
+  "Pull next letter from buffer into search string."
+  (interactive)
+  (isearch-yank-internal (lambda () (forward-char 1) (point))))
+
+(defun isearch-yank-word ()
+  "Pull next word from buffer into search string."
+  (interactive)
+  (isearch-yank-internal (lambda () (forward-word 1) (point))))
+
 (defun isearch-yank-line ()
   "Pull rest of line from buffer into search string."
   (interactive)
-  (isearch-yank-string
-   (save-excursion
-     (and (not isearch-forward) isearch-other-end
-	  (goto-char isearch-other-end))
-     (buffer-substring-no-properties (point) (line-end-position)))))
+  (isearch-yank-internal (lambda () (line-end-position))))
 
 
 (defun isearch-search-and-update ()


From: Kenichi Handa <handa@etl.go.jp>
To: kfogel@red-bean.com
CC: emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Message-Id: <200104030440.NAA02762@mule.m17n.org>


Karl Fogel <kfogel@collab.net> writes:
> This patch adds a new binding to isearch mode, C-o, which grabs one
> letter at a time from the buffer and adds it to the search string.

I surely like this feature.   Actually, it's in my todo list.


> Here's why this is useful:

> Until recently, there was no point having a special keybinding for
> this in isearch mode, because it would almost always have been just as
> easy to type the letter itself, as to type a special binding to grab
> the letter.

> However, with modern input methods (such as `chinese-tonepy'), typing
> a letter can be an involved process.  In fact, in Chinese language
> environments, what Emacs thinks of as a single letter is often
> conceptually a whole word, i.e., one ideograph.  It's very useful to
> be able to grab one ideograph at a time from the buffer and add it to
> the search string.

> Currently, if one uses C-w in isearch in Chinese text, it grabs to the
> next punctuation mark or end of line.  This is usually not what the
> user wanted -- it's much more likely that they want to grab one
> Chinese character at a time.

> This patch binds C-o to a new function `isearch-yank-letter'.  C-o
> seemed like a key unlikely to be often used for exiting isearch.
> There may be a better binding I haven't thought of, though.

> -Karl

> 2001-04-02  Karl Fogel  <kfogel@red-bean.com>

> 	* isearch.el (isearch-mode-map): Bind C-o in isearch to yank the
> 	next letter from the buffer into the search string.
> 	(isearch-yank-internal): New helper function, contains common
> 	internals of next three.
> 	(isearch-yank-letter): New function.
> 	(isearch-yank-word): Implement using isearch-yank-internal.
> 	(isearch-yank-line): Implement using isearch-yank-internal.

> Index: isearch.el
> ===================================================================
> RCS file: /cvs/emacs/lisp/isearch.el,v
> retrieving revision 1.188
> diff -u -r1.188 isearch.el
> --- isearch.el	2001/02/05 17:13:28	1.188
> +++ isearch.el	2001/04/03 00:20:57
> @@ -286,6 +286,7 @@
>      (define-key map " " 'isearch-whitespace-chars)
>      (define-key map [?\S-\ ] 'isearch-whitespace-chars)
     
> +    (define-key map "\C-o" 'isearch-yank-letter)
>      (define-key map "\C-w" 'isearch-yank-word)
>      (define-key map "\C-y" 'isearch-yank-line)
 
> @@ -1057,24 +1058,32 @@
>  	(isearch-yank-x-selection)
>        (mouse-yank-at-click click arg))))
 
> -(defun isearch-yank-word ()
> -  "Pull next word from buffer into search string."
> -  (interactive)
> +(defun isearch-yank-internal (jumpform)
> +  "Pull the text from point to the point reached by JUMPFORM.
> +JUMPFORM is a lambda expression that takes no arguments and returns a
> +buffer position, possibly having moved point to that position.  For
> +example, it might move point forward by a word and return point, or it
> +might return the position of the end of the line."
>    (isearch-yank-string
>     (save-excursion
>       (and (not isearch-forward) isearch-other-end
>  	  (goto-char isearch-other-end))
> -     (buffer-substring-no-properties
> -      (point) (progn (forward-word 1) (point))))))
> +     (buffer-substring-no-properties (point) (funcall jumpform)))))
 
> +(defun isearch-yank-letter ()
> +  "Pull next letter from buffer into search string."
> +  (interactive)
> +  (isearch-yank-internal (lambda () (forward-char 1) (point))))
> +
> +(defun isearch-yank-word ()
> +  "Pull next word from buffer into search string."
> +  (interactive)
> +  (isearch-yank-internal (lambda () (forward-word 1) (point))))
> +
>  (defun isearch-yank-line ()
>    "Pull rest of line from buffer into search string."
>    (interactive)
> -  (isearch-yank-string
> -   (save-excursion
> -     (and (not isearch-forward) isearch-other-end
> -	  (goto-char isearch-other-end))
> -     (buffer-substring-no-properties (point) (line-end-position)))))
> +  (isearch-yank-internal (lambda () (line-end-position))))
 
 
>  (defun isearch-search-and-update ()


From: Kenichi Handa <handa@etl.go.jp>
To: kfogel@red-bean.com
CC: emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: Tue, 3 Apr 2001 13:45:23 +0900 (JST)
Message-Id: <200104030445.NAA02860@mule.m17n.org>

Karl Fogel <kfogel@collab.net> writes:
> This patch adds a new binding to isearch mode, C-o, which grabs one
> letter at a time from the buffer and adds it to the search string.

I surely like this feature.  Actually, I'm using the similar
command personally, and committing it is in my todo list for
21.2.  But, the command name should be isearch-yank-char (or
isearch-yank-character) because it pulls not only letters
but also newline, tab, space, etc.  And, I think C-c is the
better binding.

---
Ken'ichi HANDA
handa@etl.go.jp


From: Andreas Schwab <schwab@suse.de>
To: Kenichi Handa <handa@etl.go.jp>
Cc: kfogel@red-bean.com, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: 03 Apr 2001 11:16:19 +0200
References: <200104030445.NAA02860@mule.m17n.org>
Message-ID: <je3dbqqr6k.fsf@hawking.suse.de>


Kenichi Handa <handa@etl.go.jp> writes:

|> And, I think C-c is the better binding.

I don't think so.  C-c has already a defined meaning as a prefix key and
should not be overloaded by isearch.

Andreas.

-- 
Andreas Schwab                                  "And now for something
SuSE Labs                                        completely different."
Andreas.Schwab@suse.de
SuSE GmbH, Schanzäckerstr. 10, D-90443 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5


From: "Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu>
To: Karl Fogel <kfogel@collab.net>
Cc: emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods 
Date: Tue, 03 Apr 2001 10:11:27 -0400
References: <87pueueoxh.fsf@collab.net> 
Message-Id: <200104031411.f33EBRO32119@rum.cs.yale.edu>


> Currently, if one uses C-w in isearch in Chinese text, it grabs to the
> next punctuation mark or end of line.  This is usually not what the
> user wanted -- it's much more likely that they want to grab one
> Chinese character at a time.

This suggests that instead of introducing a new binding C-o we should
just change C-w so that it yanks in either one word or one char
depending on the alphabet.


	Stefan

From: Karl Fogel <kfogel@collab.net>
To: Andreas Schwab <schwab@suse.de>
Cc: Kenichi Handa <handa@etl.go.jp>, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: 03 Apr 2001 13:04:46 -0500
Message-ID: <874rw5hnb5.fsf@collab.net>
References: <200104030445.NAA02860@mule.m17n.org>
	<je3dbqqr6k.fsf@hawking.suse.de>
Reply-to: kfogel@red-bean.com

Andreas Schwab <schwab@suse.de> writes:
> |> And, I think C-c is the better binding.
> 
> I don't think so.  C-c has already a defined meaning as a prefix key and
> should not be overloaded by isearch.

I agree with Andreas.

The important question is, which binding is _least_ likely to be used
by people as an "exit isearch and start doing something else" key.
C-o meets this test better than C-c, because C-c is the start of many
commands, whereas C-o is the start of just one command, `open-line'.

open-line is a very unlikely command to exit isearch mode with,
because you usually don't open a new line immediately on having
reached a search match.  After all, most of the time, you're at some
random point in the match, depending on how many characters you needed
to type to reach the desired match.  Therefore, the chances are low
that it would also be an appropriate spot to open a new line.

Kenichi Handa <handa@etl.go.jp> writes:
> 21.2.  But, the command name should be isearch-yank-char (or
> isearch-yank-character) because it pulls not only letters
> but also newline, tab, space, etc.

Good point, thanks.  I will change it to `isearch-yank-char'.

Does anyone have a problem with my committing this change, then?

-Karl

From: Karl Fogel <kfogel@collab.net>
To: "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu>
Cc: emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: 03 Apr 2001 13:48:34 -0500
Message-ID: <87elv9g6pp.fsf@collab.net>
References: <87pueueoxh.fsf@collab.net>
	<200104031411.f33EBRO32119@rum.cs.yale.edu>
Reply-to: kfogel@red-bean.com

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
> > Currently, if one uses C-w in isearch in Chinese text, it grabs to the
> > next punctuation mark or end of line.  This is usually not what the
> > user wanted -- it's much more likely that they want to grab one
> > Chinese character at a time.
> 
> This suggests that instead of introducing a new binding C-o we should
> just change C-w so that it yanks in either one word or one char
> depending on the alphabet.

I would prefer that too, but I'm not sure how to implement it.

Remember, the buffer may contain mixed-language text.  We still want
C-w to work right for multicharacter words.  For example, imagine this
text, where "X", "Y", and "Z" are Chinese characters:

   X Y is my name, and Z is my game

If point is on X, someone might want to search for the next occurrence of
"X Y is my name".  *Ideally*, they would type

   C-s C-w C-w C-w C-w C-w C-s

However, this can only work if C-w knows to behave differently when
looking at a multibyte Chinese character than when looking at some
other kind of text.  And of course, it can't just be for Chinese:
Emacs would have to recognize a general classification of "characters
that are themselves words" in various languages.  Can it do this?

I agree it would be a wonderfully usable solution, better than the C-o
solution.  I'm not sure how to implement it, though.

Any ideas?

[In the meantime, C-o is a vast improvement over what we have now, and
it is of general use, even in non-multibyte-char buffers -- I've
wanted to pull one char at a time into my search string before,
without having to find the chars on my keyboard.]

-Karl


From: "Eli Zaretskii" <eliz@is.elta.co.il>
To: kfogel@red-bean.com
CC: monnier+gnu/emacs@rum.cs.yale.edu, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Reply-to: Eli Zaretskii <eliz@is.elta.co.il>
Message-Id: <8011-Tue03Apr2001205856+0200-eliz@is.elta.co.il>
References: <87pueueoxh.fsf@collab.net>
	<200104031411.f33EBRO32119@rum.cs.yale.edu> <87elv9g6pp.fsf@collab.net>


> From: Karl Fogel <kfogel@collab.net>
> Date: 03 Apr 2001 13:48:34 -0500
> 
> However, this can only work if C-w knows to behave differently when
> looking at a multibyte Chinese character than when looking at some
> other kind of text.  And of course, it can't just be for Chinese:
> Emacs would have to recognize a general classification of "characters
> that are themselves words" in various languages.  Can it do this?

Well, how about if it always treated entire charsets in the same
manner?  For example, all Chinese characters would be treated as
words.  We can surely classify characters by the character set they
belong to, even when the internal character representation is changed
to be based on Unicode.
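
[Sketch, not part of the original thread: one way the "classify
characters by script" idea above might look.  It assumes a much later
Emacs than the one under discussion, one that provides the
`char-script-table' char-table, and it treats only the `han' script as
"characters that are themselves words".  The names `my-yank-unit-end'
and `my-isearch-yank-word-or-ideograph' are hypothetical.  It relies
on `isearch-yank-internal' taking a single JUMPFORM argument, as in
the patch in this thread.]

```elisp
;;; Sketch only: `my-yank-unit-end' and `my-isearch-yank-word-or-ideograph'
;;; are hypothetical names, not part of the patch in this thread.

(defun my-yank-unit-end (pos)
  "Return the position where a script-aware yank starting at POS would end.
If the character at POS is a Han ideograph, the unit is that single
character; otherwise it extends to the end of the current or next word."
  (save-excursion
    (goto-char pos)
    (let ((ch (char-after)))
      (cond
       ;; End of buffer: nothing to yank.
       ((null ch) pos)
       ;; `char-script-table' maps each character to a script symbol.
       ((eq (aref char-script-table ch) 'han)
        (forward-char 1)
        (point))
       ;; Anything else: fall back to the existing word-wise behavior.
       (t
        (forward-word 1)
        (point))))))

(defun my-isearch-yank-word-or-ideograph ()
  "Pull the next word, or the next single ideograph, into the search string."
  (interactive)
  (isearch-yank-internal (lambda () (my-yank-unit-end (point)))))
```

[Keeping the decision in a function that only computes a position makes
the heuristic easy to try in a scratch buffer before wiring it to a
key.  Other scripts could be added to the test if they turn out to
need the same treatment.]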


From: "Eli Zaretskii" <eliz@is.elta.co.il>
Date: Tue, 03 Apr 2001 21:03:55 +0200
To: kfogel@red-bean.com
CC: emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Reply-to: Eli Zaretskii <eliz@is.elta.co.il>
Message-Id: <1438-Tue03Apr2001210354+0200-eliz@is.elta.co.il>
References: <200104030445.NAA02860@mule.m17n.org>
	<je3dbqqr6k.fsf@hawking.suse.de> <874rw5hnb5.fsf@collab.net>


> From: Karl Fogel <kfogel@collab.net>
> Date: 03 Apr 2001 13:04:46 -0500
> 
> Does anyone have a problem with my committing this change, then?

If you intend to commit this now (i.e. before v21.1 is released), I
think you need Gerd's approval, since we are under a feature freeze
for quite some time.


From: "Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu>
To: "Eli Zaretskii" <eliz@is.elta.co.il>
Cc: kfogel@red-bean.com, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods 
Date: Tue, 03 Apr 2001 15:39:57 -0400
Message-Id: <200104031939.f33Jdvg01141@rum.cs.yale.edu>
References: <87pueueoxh.fsf@collab.net>
            <200104031411.f33EBRO32119@rum.cs.yale.edu>
            <87elv9g6pp.fsf@collab.net>
            <8011-Tue03Apr2001205856+0200-eliz@is.elta.co.il> 


> > From: Karl Fogel <kfogel@collab.net>
> > Date: 03 Apr 2001 13:48:34 -0500
> > 
> > However, this can only work if C-w knows to behave differently when
> > looking at a multibyte Chinese character than when looking at some
> > other kind of text.  And of course, it can't just be for Chinese:
> > Emacs would have to recognize a general classification of "characters
> > that are themselves words" in various languages.  Can it do this?
> 
> Well, how about if it always treated entire charsets in the same
> manner?  For example, all Chinese characters would be treated as
> words.  We can surely classify characters by the character set they
> belong to, even when the internal character representation is changed
> to be based on Unicode.

That's what I was thinking.  It's probably not 100% correct, but
should be fairly close.  Also it should be fairly easy to list
the relevant charsets since I would expect that all the ideogram-based
charsets are 2-byte charsets and we don't have thousands of those, do we ?


	Stefan

From: Miles Bader <miles@lsi.nec.co.jp>
To: Kenichi Handa <handa@etl.go.jp>
Cc: kfogel@red-bean.com, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: 03 Apr 2001 14:01:47 +0900
Reply-To: Miles Bader <miles@gnu.org>
Message-ID: <buoy9ti8tl0.fsf@mcspd15.ucom.lsi.nec.co.jp>
References: <200104030445.NAA02860@mule.m17n.org>

Kenichi Handa <handa@etl.go.jp> writes:
> And, I think C-c is the better binding.

However, it's also normally a prefix command, so using it will block a
much larger set of bindings from being used to exit isearch.

C-o is currently used by quail's japanese input method to extend the
translation region by one character, so there is _some_ precedent for
using C-o.

-Miles
-- 
Love is a snowmobile racing across the tundra.  Suddenly it flips over,
pinning you underneath.  At night the ice weasels come.  --Nietzsche


From: Miles Bader <miles@lsi.nec.co.jp>
To: "Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu>
Cc: Karl Fogel <kfogel@collab.net>, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: 04 Apr 2001 10:02:09 +0900
Reply-To: Miles Bader <miles@gnu.org>
Message-ID: <buohf05eaum.fsf@mcspd15.ucom.lsi.nec.co.jp>
References: <87pueueoxh.fsf@collab.net>
	<200104031411.f33EBRO32119@rum.cs.yale.edu>


"Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu> writes:
> This suggests that instead of introducing a new binding C-o we should
> just change C-w so that it yanks in either one word or one char
> depending on the alphabet.

Er, I'd like to use this functionality with ASCII too... (often C-w goes
too far, and it'd be nice to just hit C-o 10 times instead of carefully
typing in whatever happens to be at the match point)

-Miles
-- 
Love is a snowmobile racing across the tundra.  Suddenly it flips over,
pinning you underneath.  At night the ice weasels come.  --Nietzsche


From: "Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu>
To: Miles Bader <miles@lsi.nec.co.jp>
Cc: "Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu>,
   Karl Fogel <kfogel@collab.net>, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods 
Date: Tue, 03 Apr 2001 22:43:26 -0400
References: <87pueueoxh.fsf@collab.net>
            <200104031411.f33EBRO32119@rum.cs.yale.edu>
            <buohf05eaum.fsf@mcspd15.ucom.lsi.nec.co.jp>
Message-Id: <200104040243.f342hQq02180@rum.cs.yale.edu>


> "Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu> writes:
> > This suggests that instead of introducing a new binding C-o we should
> > just change C-w so that it yanks in either one word or one char
> > depending on the alphabet.
> 
> Er, I'd like to use this functionality with ASCII too... (often C-w goes
> too far, and it'd be nice to just hit C-o 10 times instead of carefully
> typing in whatever happens to be at the match point)

I use C-w M-e ... instead ;-)

In any case, I just realized that my earlier point about C-w moving
by either one word or one ideogram might actually apply not just
to C-w in isearch but to forward-word in general.

I'm not directly interested in such a change since I only ever use latin
charsets, but I'm still wondering...


	Stefan


From: Miles Bader <miles@lsi.nec.co.jp>
To: "Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu>
Cc: Karl Fogel <kfogel@collab.net>, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: 04 Apr 2001 11:55:05 +0900
Reply-To: Miles Bader <miles@gnu.org>
Message-ID: <buod7ate5me.fsf@mcspd15.ucom.lsi.nec.co.jp>
References: <87pueueoxh.fsf@collab.net>
	<200104031411.f33EBRO32119@rum.cs.yale.edu>
	<buohf05eaum.fsf@mcspd15.ucom.lsi.nec.co.jp>
	<200104040243.f342hQq02180@rum.cs.yale.edu>


"Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu> writes:
> In any case, I just realized that my earlier point about C-w moving
> by either one word or one ideogram might actually apply not just
> to C-w in isearch but to forward-word in general.

Er, we already have C-f for that ...

In the case of Japanese, at least, forward-word is still useful because
it stops at kana/kanji boundaries.

-Miles
-- 
Love is a snowmobile racing across the tundra.  Suddenly it flips over,
pinning you underneath.  At night the ice weasels come.  --Nietzsche


From: Eli Zaretskii <eliz@is.elta.co.il>
To: Miles Bader <miles@lsi.nec.co.jp>
cc: Stefan Monnier <monnier+gnu/emacs@RUM.cs.yale.edu>,
        Karl Fogel <kfogel@collab.net>, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: Wed, 4 Apr 2001 09:40:24 +0200 (IST)
Message-ID: <Pine.SUN.3.91.1010404094000.23448G-100000@is>


On 4 Apr 2001, Miles Bader wrote:

> "Stefan Monnier" <monnier+gnu/emacs@flint.cs.yale.edu> writes:
> > This suggests that instead of introducing a new binding C-o we should
> > just change C-w so that it yanks in either one word or one char
> > depending on the alphabet.
> 
> Er, I'd like to use this functionality with ASCII too...

We could have a user option for that.

From: Karl Fogel <kfogel@collab.net>
To: "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu>
Cc: "Eli Zaretskii" <eliz@is.elta.co.il>, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: 04 Apr 2001 11:41:41 -0500
Reply-to: kfogel@red-bean.com
Message-ID: <87y9tgsjlm.fsf@collab.net>
References: <87pueueoxh.fsf@collab.net>
	<200104031411.f33EBRO32119@rum.cs.yale.edu>
	<87elv9g6pp.fsf@collab.net>
	<8011-Tue03Apr2001205856+0200-eliz@is.elta.co.il>
	<200104031939.f33Jdvg01141@rum.cs.yale.edu>

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
> > Well, how about if it always treated entire charsets in the same
> > manner?  For example, all Chinese characters would be treated as
> > words.  We can surely classify characters by the character set they
> > belong to, even when the internal character representation is changed
> > to be based on Unicode.
> 
> That's what I was thinking.  It's probably not 100% correct, but
> should be fairly close.  Also it should be fairly easy to list
> the relevant charsets since I would expect that all the ideogram-based
> charsets are 2-byte charsets and we don't have thousands of those, do we ?

This sounds like a better plan than my original patch.

I'm reading up on some multibyte stuff, and will post a new patch
soon.  Ideally Emacs would be able to "get it right" 100% of the time,
instead of having a mostly-correct heuristic; will see if this is
possible.

Definitely won't commit it without asking first, thanks for reminding
me about the feature freeze.  It's no big deal if it doesn't make it
into this release, IMHO.

[Note: I abandoned this grander plan for lack of time, though it
certainly would be a good thing to implement.  See next message for
why the existing patch is at least a lossless workaround.  -Karl]

-K


From: Karl Fogel <kfogel@collab.net>
To: Eli Zaretskii <eliz@is.elta.co.il>
Cc: Miles Bader <miles@lsi.nec.co.jp>,
   Stefan Monnier <monnier+gnu/emacs@rum.cs.yale.edu>, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: 04 Apr 2001 14:54:36 -0500
Reply-to: kfogel@red-bean.com
Message-ID: <87u244sao3.fsf@collab.net>
References: <Pine.SUN.3.91.1010404094000.23448G-100000@is>


Based on the feedback so far, here's a summary:

   1. It would be useful if forward-word understood enough about the
      language environment to move forward by one "semantic word
      unit", even when that behavior would be the same as
      forward-char.  If forward-word did this perfectly, then isearch
      would need no changes to grab a character at a time in Chinese
      text (for example) -- you'd just use C-w.

   2. Independently of the above, it would be useful to have a binding
      in isearch-mode that grabs one letter at a time into one's
      search string (at least, Miles Bader <miles@lsi.nec.co.jp> has
      concurred that he's wanted this before too).

      Furthermore, if one had this binding, it would be a handy
      workaround solution for isearch in Chinese and Japanese (and
      possibly other) languages.  Of course, the best solution is
      still to have forward-word know what is a word and what isn't,
      but until we can teach it that, those of us who have to edit
      such text today will at least have an improvement over the
      current situation.

So, both #1 and #2 are useful.  We currently have a patch for #2, but
not yet for #1.  There is no reason for #2 to wait on #1, but maybe #2
should wait on the feature freeze.

Gerd, assuming there is consensus that #2 is a Good Thing, should it
be committed now, or after feature freeze?

Here is the patch for it:

2001-04-02  Karl Fogel  <kfogel@red-bean.com>

	* isearch.el (isearch-mode-map): Bind C-o in isearch to yank the
	next letter from the buffer into the search string.
	(isearch-yank-internal): New helper function, contains common
	internals of next three.
	(isearch-yank-char): New function.
	(isearch-yank-word): Rewrite to use isearch-yank-internal.
	(isearch-yank-line): Rewrite to use isearch-yank-internal.

Index: isearch.el
===================================================================
RCS file: /cvs/emacs/lisp/isearch.el,v
retrieving revision 1.188
diff -u -r1.188 isearch.el
--- isearch.el	2001/02/05 17:13:28	1.188
+++ isearch.el	2001/04/04 19:52:03
@@ -286,6 +286,7 @@
     (define-key map " " 'isearch-whitespace-chars)
     (define-key map [?\S-\ ] 'isearch-whitespace-chars)
     
+    (define-key map "\C-o" 'isearch-yank-char)
     (define-key map "\C-w" 'isearch-yank-word)
     (define-key map "\C-y" 'isearch-yank-line)
 
@@ -1057,24 +1058,32 @@
 	(isearch-yank-x-selection)
       (mouse-yank-at-click click arg))))
 
-(defun isearch-yank-word ()
-  "Pull next word from buffer into search string."
-  (interactive)
+(defun isearch-yank-internal (jumpform)
+  "Pull the text from point to the point reached by JUMPFORM.
+JUMPFORM is a lambda expression that takes no arguments and returns a
+buffer position, possibly having moved point to that position.  For
+example, it might move point forward by a word and return point, or it
+might return the position of the end of the line."
   (isearch-yank-string
    (save-excursion
      (and (not isearch-forward) isearch-other-end
 	  (goto-char isearch-other-end))
-     (buffer-substring-no-properties
-      (point) (progn (forward-word 1) (point))))))
+     (buffer-substring-no-properties (point) (funcall jumpform)))))
 
+(defun isearch-yank-char ()
+  "Pull next character from buffer into search string."
+  (interactive)
+  (isearch-yank-internal (lambda () (forward-char 1) (point))))
+
+(defun isearch-yank-word ()
+  "Pull next word from buffer into search string."
+  (interactive)
+  (isearch-yank-internal (lambda () (forward-word 1) (point))))
+
 (defun isearch-yank-line ()
   "Pull rest of line from buffer into search string."
   (interactive)
-  (isearch-yank-string
-   (save-excursion
-     (and (not isearch-forward) isearch-other-end
-	  (goto-char isearch-other-end))
-     (buffer-substring-no-properties (point) (line-end-position)))))
+  (isearch-yank-internal (lambda () (line-end-position))))
 
 
 (defun isearch-search-and-update ()
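
[Sketch, not part of the original message: the JUMPFORM contract
described in the docstring above (no arguments, returns a buffer
position, may move point) means further yank commands can be built
from the same helper.  The example below yanks the rest of a symbol.
`my-symbol-end' and `my-isearch-yank-symbol' are hypothetical names;
the only thing assumed from the patch is `isearch-yank-internal'.]

```elisp
;;; Sketch only: `my-symbol-end' and `my-isearch-yank-symbol' are
;;; hypothetical names used for illustration.

(defun my-symbol-end ()
  "Move point past word and symbol constituents; return the new point.
This has the shape JUMPFORM expects: it takes no arguments and
returns a buffer position."
  (skip-syntax-forward "w_")
  (point))

(defun my-isearch-yank-symbol ()
  "Pull the rest of the symbol at point into the search string."
  (interactive)
  ;; `isearch-yank-internal' calls the jump function with `funcall',
  ;; so a named function works as well as a lambda.
  (isearch-yank-internal #'my-symbol-end))
```

[Such a command would be bound in `isearch-mode-map' with `define-key',
the same way the patch binds C-o; which key to use is left open here.]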

From: Gerd Moellmann <gerd@gnu.org>
To: kfogel@red-bean.com
Cc: Eli Zaretskii <eliz@is.elta.co.il>, Miles Bader <miles@lsi.nec.co.jp>,
   Stefan Monnier <monnier+gnu/emacs@rum.cs.yale.edu>, emacs-devel@gnu.org
Subject: Re: [PATCH] better isearch support for complex input methods
Date: 05 Apr 2001 12:32:30 +0200
Reply-To: gerd@gnu.org
Message-ID: <86snjnzlfl.fsf@gerd.segv.de>
References: <Pine.SUN.3.91.1010404094000.23448G-100000@is>
	<87u244sao3.fsf@collab.net>

Karl Fogel <kfogel@collab.net> writes:

> Gerd, assuming there is consensus that #2 is a Good Thing, should it
> be committed now, or after feature freeze?

Thanks, Karl.  I think it should be installed in 21.2.  Alas, the FSF
also will need papers for it from you.  Would you please request papers
from the FSF with the email template below?

[Note: I filled out the template and mailed it in; the FSF now has
blanket assignment papers for Emacs from me for all my recent
employers. -Karl ]