                         THE WISHLIST (TM)
               (or, Our Plans For The Recent Future).

        About the organization of this file: there are two
considerations to take into account when prioritizing a wishlist.  The
first is the usefulness of some proposal, and the second is the ease
of implementing it.  There is not necessarily any direct relationship
between the two, but both affect the ordering of the list.  As in, "If
I can have better base colors in five minutes, that's worth more to me
than having secondary-structure browsing in five years."

        In an ideal world, it would be possible to somehow multiply
those two figures (usefulness and ease of implementation) together and
arrive at a single "priority index".  Then we could just start
implementing the items with the highest index first, and work our way
down the list.  However, it's not always clear exactly what the two
values should be, so I haven't organized the list quite like that.
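
        As a rough sketch of the "priority index" idea, assuming we
could agree on a 1-10 score for each figure (the function names and
the scores below are made up for illustration; nothing like this
exists in Ale):

```elisp
;; Hypothetical sketch.  Each item is (NAME USEFULNESS EASE), with
;; both scores guessed on a 1-10 scale.
(defun wishlist-priority-index (item)
  "Return usefulness times ease of implementation for ITEM."
  (* (nth 1 item) (nth 2 item)))

(defun wishlist-sort-by-priority (items)
  "Return a copy of ITEMS sorted by descending priority index."
  (sort (copy-sequence items)
        (lambda (a b)
          (> (wishlist-priority-index a)
             (wishlist-priority-index b)))))

;; With these invented scores, "better base colors" (4 * 10 = 40)
;; sorts ahead of "secondary-structure browsing" (9 * 1 = 9):
;; (wishlist-sort-by-priority
;;  '(("secondary-structure browsing" 9 1)
;;    ("better base colors" 4 10)))
```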

        Instead, it is divided into several sections:

KNOWN PRIORITIES: this is the top of the main stack.  It lists, in
                  order, the features we should be working on over the 
                  next few months.  They are large features, not
                  small, well-defined changes.

TRIVIAL STUFF: these are features which would take little or no time
               to implement.  It's okay to do one of them before a
               known priority, because so little time will be lost.
               Just play these by ear and implement one when your
               brain is telling you that today it wants a small, neat
               problem.  They should all be done fairly soon
               (i.e.: March).

SEMI-TRIVIAL STUFF: this is somewhat harder than the trivial stuff.  A
                    semi-trivial feature probably wants a day or two
                    of solid work to get it up and running.  Play
                    these by ear too... but they should all be taken
                    care of by the time the known priorities are
                    completed.

NON-TRIVIAL STUFF: Well.  Here we are.  These changes are just as
                   major as the known priorities, only we didn't find
                   them important enough to be put in that category.
                   So they went here instead.  I have a feeling we
                   won't be touching this stuff for a while.

THE DISTANT FUTURE: This is like above, only more so. :-)  No,
                    seriously, there's no time frame for these things.
                    They're here only so we don't forget about them --
                    this section is just a place to record them.
                    They'll all happen someday, no doubt, but not
                    right now.

UNORDERED: This is stuff I didn't have any ideas about how to order.
           Some of it might be important; I'd appreciate it if you
           guys could check it out and suggest revisions and/or
           good placements for the items here.


        We have also discussed the following things, although they
don't appear in the list in any particular order because they're too
general:
- An interface and semantics for ordering, phylogenetic and otherwise.
- External analysis programs in general
- phylo interface (what exactly do we mean?) } are these two
- ordering method                            } really one problem?
- subalignment editing



-*- -*- -*- -*- -*- -*- -*-  KNOWN PRIORITIES -*- -*- -*- -*- -*- -*- -*- 

0. Phylo browser features (Jim's current stack).
   * (Niels, Jim) Add the features to Phylo-browser/selector that
     Jim/I agreed on November 18 - main points in short:  Reestablish
     menu-bar; handle all cases where there is not a 1:1
     correspondence between sequences and taxa in phylo list; write
     visible part of phylo buffer to user-named file; import/export
     selections to user-named file; scrollbar; conserve some native
     Emacs functionality so edits are possible; incremental addition
     of sequences; more details.   


1. Hook up Ross' searching code (Pace Lab wants this too).
   (yes; for example:  Make a 'hard' selection, click an option which
   composes a motif (in ross-language), which is then fed to his
   pattern matcher, which returns the areas of match; see matches
   highlighted.  This can be taken much further, e.g. secondary
   structure; his program is more powerful than regexps.  /Niels)
   [ Pavan and I are working on this, I think.  -Karl ]


2. Hook up some treeing programs
   (Pace Lab, priority 4)
   * Ability to call FastDNAml.
     Operation takes days sometimes; the editor shouldn't wait for the
     program to finish.
     It would be nice to be able to run FastDNAml via rsh.
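
   A minimal sketch of running a long job without making the editor
   wait, using Emacs's asynchronous processes.  The helper name and
   the `my-callback' property are invented here, and the actual
   FastDNAml command line is left out because it is not specified
   above; running via rsh would just mean passing a command list
   that starts with "rsh" and a host name.

```elisp
;; Hypothetical helper; `my-run-in-background' is not part of Ale.
(defun my-run-in-background (name command callback)
  "Run COMMAND, a list of program and arguments, without blocking Emacs.
Output collects in a fresh buffer named after NAME.  When the
process exits, call CALLBACK with that buffer and the exit status."
  (let* ((buffer (generate-new-buffer (format "*%s output*" name)))
         (process (apply #'start-process name buffer command)))
    ;; Keep the callback on the process itself, so the sentinel does
    ;; not depend on closures (works with or without lexical binding).
    (process-put process 'my-callback callback)
    (set-process-sentinel
     process
     (lambda (proc _event)
       (when (memq (process-status proc) '(exit signal))
         (funcall (process-get proc 'my-callback)
                  (process-buffer proc)
                  (process-exit-status proc)))))
    process))
```

   The call returns immediately with the process object; the user can
   keep editing, and the callback decides what to do with the result
   (e.g. pop up the tree) whenever the program finishes.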


3. Hook up automatic alignment programs (Clustal, Pal (?), etc).


4. Other file formats.  Pace Lab mentions the following, in order of
   priority:
     a) Phylip --- very important
     b) Nexus
     c) GCG --- indexed formats, and multiple-sequence format.
  
     Jim sez: Perhaps the best thing to do would be to modify ReadSeq
     to support tabl format, and then have Ale use ReadSeq for all
     its conversions.  Niels concurs that ReadSeq is a good idea.


5. (All) masks
  [ Exactly what is meant by masks is not yet well-defined; what sort
    of interface do we want?  -Karl ]
  [ Ah -- it seems that masks are often handled directly by the
    receiving program, and all we have to do is pass it in correctly
    as a string along with the sequences in question.  -Karl ]

  (This is a place where I think it's important to make code general and
  ready for unexpected uses.  Masks ought to appear as highlighted bars
  where characters can be edited underneath; the bars can extend
  vertically through all sequences or be restricted to chosen groups,
  but should default to the sequences they were generated from.  They
  need to be generated automatically at least as a group operation, but
  manual addition/subtraction operations are needed too.  This kind of
  display is much better than old-style extra mask-lines: They can be
  switched on/off without cluttering the alignment; can be applied to
  sequences elsewhere in the alignment (by changing a group) to
  e.g. examine if alignment is solid throughout; could be used to
  display data read from file which apply to a set of sequences; one
  doesn't have to look away from the sequence of interest to see what is
  included.  It would be nice if masks were controlled from a frame similar
  to current group-frame, so that it is clear groups and masks can be
  combined freely so user can submit any intersection to any available
  relevant analysis.  Making new masks by logical operations on old ones
  is relevant.  (The column compositions could perhaps be
  pre-calculated, so masks and consensus sequences can be generated from
  them, rather than looking directly in the alignment).  /Niels)

  [ Agree that we need masks; think maybe the old-style mask line is
    actually a win.  It wouldn't require much extra support from us,
    and vertical-overlay-style masks would be difficult to implement,
    possibly slow down redisplay, and cause more visual confusion on
    an already confused display.  Also, people are already used to
    dealing with old-style mask lines.  -Karl ]
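
  Whichever display wins, generating masks from column contents and
  combining them by logical operations could be sketched with Emacs
  bool-vectors, one bit per column.  The `my-' names and the two
  example predicates are assumptions made for illustration, not an
  agreed design.

```elisp
;; Hypothetical sketch; none of these functions exist in Ale.
(defun my-mask-from-columns (sequences predicate)
  "Return a bool-vector with one bit per column of SEQUENCES.
SEQUENCES is a list of equal-length strings.  A bit is set when
PREDICATE returns non-nil for the list of characters in that column."
  (let* ((width (length (car sequences)))
         (mask (make-bool-vector width nil)))
    (dotimes (col width)
      (aset mask col
            (funcall predicate
                     (mapcar (lambda (seq) (aref seq col)) sequences))))
    mask))

(defun my-column-gapless-p (chars)
  "Return non-nil if CHARS, a list of characters, contains no gap `-'."
  (not (memq ?- chars)))

(defun my-column-conserved-p (chars)
  "Return non-nil if every character in CHARS is the same."
  (not (delq (car chars) (copy-sequence chars))))

;; New masks from old ones by logical operations:
;; (let* ((group '("ACG-U" "ACGAU" "AGG-U"))
;;        (gapless (my-mask-from-columns group #'my-column-gapless-p))
;;        (conserved (my-mask-from-columns group #'my-column-conserved-p)))
;;   (bool-vector-intersection gapless conserved)  ; columns 0, 2 and 4
;;   (bool-vector-union gapless conserved))        ; columns 0, 1, 2 and 4
```

  A mask in this form is also easy to hand to an external program as
  a string of 0s and 1s along with the sequences.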

6. I think the way Ale reads and writes files needs some serious
   re-thinking.  The current system is consistent in some ways which
   are useless, and inconsistent in other ways which would be useful.
   See my E-mail of Feb 6 1995 for the rationale for these suggestions.

   In this model, the alignment buffer corresponds to a single file;
   by default, all the sequences in the alignment buffer, and only the
   sequences in the alignment buffer, go into the file.  GDBM files
   are treated specially.

   Open
	   Read a file's contents into the alignment buffer, which must
	   be empty.  Mark the buffer to save to that file by default.
	   Like Emacs's C-x C-f, Macintosh's "Open".

   Save
	   Write the alignment buffer's contents back to its default
	   file, replacing the file's contents entirely.  Like Emacs's
	   C-x C-s, Macintosh's "Save".

   Save As
	   Solicit a filename from the user.  If the file exists, verify
	   that the user intends to overwrite it.  Write the alignment buffer
	   to that file, replacing the file's contents entirely.  Make
	   that file the buffer's default file.  Like Emacs's C-x C-w,
	   Macintosh's "Save As".

   Insert
	   Solicit a filename from the user.  Add the sequences in that
	   file to the current alignment buffer.  Leave the alignment
	   buffer's default file unchanged (i.e. future "Saves" will put
	   all the sequences in the same file).  Like Emacs's C-x i; the
	   Mac accomplishes this with copy and paste commands.

   Phylogenetic List
	   Solicit a filename from the user.  Display a phylogenetic
	   listing of that file's contents.  Let the user select
	   sequences in the phylogenetic listing.  Insert the selected
	   sequences from the file into the current alignment buffer, as for
	   "Insert".


   On the "group" menu, there should be two commands:

   Save Group To File
	   Solicit a filename from the user.  If the file exists, verify
	   that the user intends to overwrite it.  Write the sequences in
	   the group to that file, replacing the file's contents
	   entirely.  Like Emacs's M-x write-region; the Mac accomplishes
	   this with copy and paste commands.

   Merge Group With File
	   Solicit a filename from the user.  Write the sequences in the
	   group to that file, preserving those sequences in the file
	   which are not in the group.  No analog in Emacs or the Mac.
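
   To make the difference between the two group commands concrete,
   here is a sketch of the merge rule.  It assumes sequences are
   matched by ID and represents both the file and the group as
   alists of (ID . SEQUENCE); the function name is made up, and real
   code would read and write actual files.

```elisp
;; Hypothetical sketch of "Merge Group With File" on in-memory data.
(defun my-merge-group-with-file (file-seqs group-seqs)
  "Return the contents FILE-SEQS should have after merging GROUP-SEQS.
Both arguments are alists of (ID . SEQUENCE).  Sequences in the
file but not in the group are preserved; sequences in both take
the group's version; sequences only in the group are appended."
  (let ((merged (mapcar (lambda (entry)
                          (or (assoc (car entry) group-seqs) entry))
                        file-seqs)))
    (dolist (entry group-seqs)
      (unless (assoc (car entry) file-seqs)
        (setq merged (append merged (list entry)))))
    merged))

;; (my-merge-group-with-file '(("a" . "AC") ("b" . "GG"))
;;                           '(("b" . "GU") ("c" . "UU")))
;; keeps "a", updates "b", and adds "c".
;; "Save Group To File" would instead write GROUP-SEQS alone.
```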


   The above descriptions apply only to flat files.  GDBM files should be
   handled differently.  All saves to GDBM files preserve those
   sequences which are in the file but not in the buffer.  (What should
   happen when one cuts a sequence from a buffer visiting a GDBM file,
   and then saves?)
	

-*- -*- -*- -*- -*- -*- -*- TRIVIAL STUFF -*- -*- -*- -*- -*- -*- -*- -*- 

* Would it be good to have a Help menu in the browser's menu bar which
  can bring up the phylo-relevant section of the info file?

* It would be nice if the complement/reverse commands could be applied
  to groups as well as single sequences.

* use rect-mark.el to get rectangle dragging w/ selections
  (thank you, Rick Sladkey).  I have a copy in ~kfogel/elithp/.


* color thoughts:
  - put group colors under user control (fore & back *separately*)
  - give them ability to color character backgrounds as well as
    foregrounds.
  - maybe ability to color the green ID bar.
  - put fore/background under user control generally.


* From: Niels Larsen <niels@darwin.life.uiuc.edu>
  To: kfogel@floss.life.uiuc.edu
  Subject: for info ? 
  Date: Wed, 15 Mar 1995 12:06:30 -0600
  
  I did this for my own help button; I want to have a friendly 
  Emacs environment, which I can then give away to friends in a
  friendly way.  Maybe bad code, but maybe it could be used for
  our on-line help by changing Emacs to ALE ? 
  
  (copy-face 'default 'info-xref)
  (set-face-background 'info-xref "khaki1")
  (set-face-foreground 'info-xref "black")
  (copy-face 'default 'info-node)
  (set-face-background 'info-node "SteelBlue4")
  (set-face-foreground 'info-node "white")
  
  (defun nl-help-frame-create ()  ;; (namestr) howto?
    (interactive)
    (let ((curbuf (current-buffer))
          (nl-help-alist '((name . "Emacs On-Line Help")
                            (left . 5)
                            (top . 5)
                            (height . 60)
                            (width . 90)
                            (font . "fixed")
                            (foreground-color . "white")
                            (background-color . "grey30")
                            (internal-border-width . 4)
                            (mouse-color . "khaki1")
                            (cursor-color . "khaki1")
                            (menu-bar-lines . 1)
                            (visibility . t))))
      (Info-goto-node "(emacs)Top")
      (new-frame nl-help-alist)
      (switch-to-buffer curbuf)))
  
  
* (Niels) wants `unreadable' font available, for overview.


* lock/unlock all sequences


* go to organism regexp --> just use ID Find-all, then have group
  movement commands.

  [ Ah, umm.  Okay; must think carefully about keybindings for the
    group movement commands, though.  -Karl ]


* check out DCSE 3.0 (just to steal features, maybe)


* (Niels) We should ask for Jim's little program that compares
  sequences before and after exit.  Would be good to guarantee
  reliability at the sequence level.
  [ Yeah -- I think this is being documented and is ready to use now.
    -Karl ]
  [ This has been documented, but it isn't called automatically.  It
    could be most happily checked against the original file after we
    write out the tabl-format data, but before we convert it to the
    destination format.  - JimB ]


* (Niels) 0.1/0.2 documentation: List of what hardware/software is
  required; Intro (why Emacs, overall status); exact stepwise
  installation instructions; tutorial for computer dummies that
  quickly walks through best features (I know the dummy language and
  could write it); known problems section; key rebinding section.  (I
  find key binding important because each extra keystroke saved
  increases editing speed; we can benefit by receiving maps from
  different machines, keyboards or terminal emulators, including them
  in .gyppu, and redistributing them.)


-*- -*- -*- -*- -*- -*- SEMI-TRIVIAL STUFF -*- -*- -*- -*- -*- -*- -*- -*- 

* (Will Fischer) Guess a reasonable default color map by looking at
  the sequences to see if they're amino or nucleic.
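
  One possible heuristic, sketched below: count how many non-gap
  characters are nucleotide letters.  The 90% threshold, the set of
  gap characters, and the function name are all guesses for
  illustration and would need tuning against real data.

```elisp
;; Hypothetical sketch; the threshold and character sets are assumptions.
(defun my-guess-sequence-type (sequences)
  "Guess whether SEQUENCES, a list of strings, are `nucleic' or `amino'.
If at least 90% of the non-gap characters are one of A, C, G, T,
U or N (either case), return `nucleic'; otherwise return `amino'.
Return nil if there are no non-gap characters to look at."
  (let ((nucleic 0)
        (total 0))
    (dolist (seq sequences)
      (dotimes (i (length seq))
        (let ((ch (upcase (aref seq i))))
          (unless (memq ch '(?- ?. ?~ ?\s))
            (setq total (1+ total))
            (when (memq ch '(?A ?C ?G ?T ?U ?N))
              (setq nucleic (1+ nucleic)))))))
    (cond ((= total 0) nil)
          ((>= (* 10 nucleic) (* 9 total)) 'nucleic)
          (t 'amino))))
```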


* (Karl, Niels) named placeholders (bookmarks, essentially)


* (Karl, Niels) Double click in ID buffer to add/delete seq from
  group.  Single click should just move point.


* (Niels) Include number of seqs in each group in the group frame
  entry. 


* (Niels) Printing.  Look into psprint.el package.  Ask Terry
  Gaasterland if we can have Belmont code for PostScript
  'pretty-prints'.


-*- -*- -*- -*- -*- -*- UNDENIABLY NON-TRIVIAL STUFF -*- -*- -*- -*- -*- -*- 

        Wow, this section has been cleaned out.  It used to have two
things in it, but they're done (or at least moved somewhere else).


-*- -*- -*- -*- -*- -*- -*- THE DISTANT FUTURE -*- -*- -*- -*- -*- -*- -*- 


* consensus calculation.  (re: conversation w/ Terry Marsh)



* generating selections based on composition;
  in general, mechanical ways of generating selections
  "rectangles are nice, but with 3000 sequences..."



* (Terry Gaasterland) be able to *replace* files.  IOW: be able to
  take files/sequences out, load new ones in, have clean interface to
  doing so.  Just like C-x C-v.

  [ Need to discuss w/ Jim about easiest way to do this.  Mainly
    interface questions; right now we don't keep a record of all the
    files in the editor, although that information could be deduced
    from examining the buffer.  We want to offer a menu of files to be
    taken out: any sequences or sequence fragments in those files
    would then be removed from the editor.  Then there can be a
    `replace' command which simply calls the previously-described
    removal command, then `open'.  -Karl ]


* (Carl?) diff two sequences?  (Or is that a job for a subprocess?  Is
  it useful before they've been aligned?  What exactly does it mean?)
  (again, please use highlight, not an ae2-style extra line.  /Niels)
  [ Again, we'll have to decide whether it's worth the extra work. :-)
    -Karl ]


* (Niels) Helix-handling.  Pair checker.


  From: DAMBERGER <damberge@beagle.Colorado.EDU>
  To: kfogel@cyclic.com
  Subject: Re: any luck?
  Date: Tue, 21 Mar 1995 08:11:57 -0700 (MST)
  
  > 
  > >I'd almost be ready to use it if one could add helix information.
  > 
  >         Have you any thoughts about how the information would be presented?
  > 
  
  Hmmm...In ae2 we use reverse video to highlight helices.  Maybe the
  helices could be highlighted that way?  Probably, one would also want
  to be able to turn this option on and off.  Also, a little window to
  tell where the basepair partner is might be useful.  The problem with
  using colors to indicate helices, is that selections are already
  colored.
  
  Simon


* (Niels) Consistent search scopes.  There seem to be three possible
  search ranges (current sequence, current group, whole buffer), two
  directions (forward, reverse), three buffers (ids, alignment,
  annotation; be prepared for more), and two match counts (next,
  all).  These things are logically separate, and almost all
  combinations seem relevant.  This is another case where
  generality can have unexpected payoffs; for example, searching for a
  probe/primer target in the whole alignment and then cutting out the
  group of matches would give exactly the kind of overview we need.
  Combination of search range, direction, buffer, and match range
  should therefore be unrestricted and controlled by modifier keys
  (search will be a frequent operation).  We could list every option
  in a search menu, but it would be cluttered.  Or make it consistent
  with edits: If in selected group, range is that group (and menu
  could list group options only), otherwise current sequence.  Meta or
  Shift key could 'amplify' the search range as with movements.  I
  agree it's okay to make a group out of the sequence(s) that match (is
  it easy to put the number of members in the groups frame, next to
  name?  It's helpful to see number of matches).  Current 'Go to
  organism' would fall under this.  Please don't improve the regexps;
  Ross's pattern matcher has potential for powerful motif searches.
  (btw, where a given search string matches one could give each
  character in the match range the property of the search string itself,
  and that way have a quick way of coming back to the matches later).

  [ Agree -- this seems like a very sensible way to think about
    searches.  I will work on a consistent interface.  -Karl ]
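
  The idea of marking each match with the search string as a
  property might look roughly like this in a plain Emacs buffer.
  This sketch uses ordinary regexps only as a stand-in for whatever
  matcher is finally hooked up, and the function and property names
  are invented for illustration.

```elisp
;; Hypothetical sketch; `my-search-string' is an invented property name.
(defun my-tag-matches (regexp)
  "Tag every match of REGEXP in the current buffer with a text property.
Each character of a match gets the property `my-search-string'
with REGEXP as its value.  Return the number of matches."
  (let ((count 0))
    (save-excursion
      (goto-char (point-min))
      (while (and (not (eobp))
                  (re-search-forward regexp nil t))
        (put-text-property (match-beginning 0) (match-end 0)
                           'my-search-string regexp)
        (setq count (1+ count))
        ;; Step past empty matches so the loop always advances.
        (when (and (= (match-beginning 0) (match-end 0))
                   (not (eobp)))
          (forward-char 1))))
    count))
```

  Coming back to the matches later is then a matter of walking the
  buffer with `next-single-property-change' on that property, and
  the returned count is the number one would show in the groups
  frame.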



* (Niels) Window shrinking/growing commands.
  [ This refers to the split-alignment stuff.  -Karl ]


* version control for sequences (integrated with some database, no
  doubt).
  (yes, wait a while  /Niels)


* (Niels) A general 'repeat last command' function.
  [ I have put this as low priority because it's not as simple as I
    had hoped.  The problem is that many commands take
    arguments/inputs of some sort, and we'd have to record those too.
    Will think some more.  -Karl ]


* Better annotation menus, better annotation mode in general.
  [ Yes; what is useful in annotating?  -Karl ]


* (Niels) Specify some programmer's guidelines, so it's easier for
  others to help develop. 


* (Bonnie, ... everyone, really) mirrored helix movements (which
  highlight the helix-wise corresponding base to the current one).
  This probably wants the vertical overlays we were talking about; it
  may be far off in the future.
  (Yea, too early for this, need a maintainable secondary structure 
  description.  I may be able to interest some Michigan people to help
  with this part, including the alignment procedure.  /Niels).


* vertical overlays (?)  In exactly what areas would they be helpful?
  [ Difficult to implement, but useful when done.  -Karl ]  


-*- -*- -*- -*- -*- -*- -*- UNORDERED -*- -*- -*- -*- -*- -*- -*- -*- -*- -*- 


        Jim, which of the following things are already done by your
analysis code?

* (Pace Lab, priority 2) Similarity tables --- how different is each
  sequence from each other sequence?
  a) Over all columns in the sequence.
  b) Specify columns with masks.
  c) Specify columns with selections.
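
  A sketch of what such a table could compute, assuming
  "similarity" means the fraction of compared columns where two
  aligned sequences have the same character, with gap columns
  skipped.  That definition, the bool-vector mask, and the function
  names are assumptions; Jim's analysis code may already do this
  differently.

```elisp
;; Hypothetical sketch; not taken from the existing analysis code.
(defun my-sequence-similarity (seq-a seq-b &optional mask)
  "Return the fraction of compared columns where SEQ-A and SEQ-B agree.
MASK, if non-nil, is a bool-vector with one bit per column; only
columns whose bit is set are compared.  Columns where either
sequence has a gap `-' are skipped.  Return nil if no column can
be compared."
  (let ((same 0)
        (compared 0))
    (dotimes (col (min (length seq-a) (length seq-b)))
      (let ((a (aref seq-a col))
            (b (aref seq-b col)))
        (when (and (or (null mask) (aref mask col))
                   (/= a ?-)
                   (/= b ?-))
          (setq compared (1+ compared))
          (when (= a b)
            (setq same (1+ same))))))
    (and (> compared 0)
         (/ (float same) compared))))

(defun my-similarity-table (sequences &optional mask)
  "Return a list of rows of pairwise similarities for SEQUENCES."
  (mapcar (lambda (a)
            (mapcar (lambda (b) (my-sequence-similarity a b mask))
                    sequences))
          sequences))

;; (my-sequence-similarity "ACGU" "ACGA")  ; 3 of 4 columns agree
```

  Cases (b) and (c) differ only in where the mask comes from; a
  selection would first be turned into a bool-vector of its columns.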


* (Pace Lab, priority 3) Compute consensus line of a group of
  sequences, or of entire alignment.
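
  The simplest version of a consensus line is a per-column majority
  vote, sketched below.  The tie-breaking rule and the function name
  are assumptions, and a real consensus would probably want
  ambiguity codes or a threshold rather than a bare majority.

```elisp
;; Hypothetical sketch of a majority-vote consensus.
(defun my-consensus (sequences)
  "Return a consensus string for SEQUENCES, a list of equal-length strings.
Each column gets its most frequent character; ties go to the
character seen first in that column."
  (let* ((width (length (car sequences)))
         (result (make-string width ?-)))
    (dotimes (col width)
      (let ((counts nil)
            (best nil))
        ;; Tally the characters in this column, in order of appearance.
        (dolist (seq sequences)
          (let* ((ch (aref seq col))
                 (cell (assq ch counts)))
            (if cell
                (setcdr cell (1+ (cdr cell)))
              (setq counts (append counts (list (cons ch 1)))))))
        ;; Pick the first character with the highest count.
        (dolist (cell counts)
          (when (or (null best) (> (cdr cell) (cdr best)))
            (setq best cell)))
        (aset result col (car best))))
    result))

;; (my-consensus '("ACGU" "ACGA" "AGGA"))  ; => "ACGA"
```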


* (Pace Lab, priority 5) Amino <-> Nucleotide conversion
  The code to use should be a parameter.
  When doing the Amino->Nucleotide conversion, how to deal with
  multiple possible encodings?
  a) generate ambiguity codes
  b) generate "most probable"
  c) generate "least ambiguous" (most universal across organisms?)
  [ This is mostly done, though ambiguity characters could be handled
    better.  -Karl ]


* (Karl) use real dialog boxes?
  (What's currently possible?  /Niels)


* (Niels in a mail)
  It would be good to have a fast C function that returns the residue
  number of a given character, like the one the cursor is at.  I don't
  think ae2-style edge-numbering is needed.  I would like a special
  frame that lists 1) sequence number of cursor residue, 2) column
  number of cursor character, 3) reference sequence id, 4) sequence
  number of the residue in reference sequence that is at the same
  column.  With the number frame shown, numbers should update when
  cursor moves (like up/down moves of annotation).  This will also go
  in the wish list, just haven't done it yet.
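
  For reference, the residue-number calculation itself is small.
  This Lisp sketch (not the fast C function asked for) assumes `-'
  and `.' are the gap characters and that columns are counted from
  0; the function name is made up.

```elisp
;; Hypothetical sketch; a C primitive would do the same count faster.
(defun my-residue-number (sequence column)
  "Return the 1-based residue number of the character at COLUMN.
SEQUENCE is an aligned sequence string and COLUMN a 0-based
alignment column.  Gaps (`-' and `.') do not count as residues;
return nil if COLUMN itself holds a gap."
  (let ((count 0))
    (unless (memq (aref sequence column) '(?- ?.))
      (dotimes (i (1+ column))
        (unless (memq (aref sequence i) '(?- ?.))
          (setq count (1+ count))))
      count)))

;; (my-residue-number "AC--GU" 4)  ; => 3, the G is the third residue
;; (my-residue-number "AC--GU" 2)  ; => nil, that column is a gap
```

  Items 1 and 4 of the frame are then the same call, made once on
  the cursor's sequence and once on the reference sequence, with the
  same column.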


* From: Jim Blandy <jimb@totoro.bio.indiana.edu>
  To: Karl Fogel <kfogel@cyclic.com>, Niels Larsen <niels@darwin.life.uiuc.edu>
  Subject: interesting fact
  Date: Wed, 15 Feb 1995 23:01:26 -0500

  GenBank is available in a machine-readable format --- that is, all the
  fields carefully parsed out in a way that is easy for machines to
  grok.  It uses a generic standard print syntax called ASN.1.  Maybe
  this would be helpful when the RDP converts to a full database.
  

* From: Jim Blandy <jimb@totoro.bio.indiana.edu>
  To: kfogel@cyclic.com
  Subject: fixed
  Date: Sat, 18 Feb 1995 16:16:01 -0500


  I'll bet Richard wouldn't mind at all if we made x-pointer-shape a
  frame parameter.  That would fix this problem (frame parameters aren't
  inherited) and relieve us from the ugly mouse color kludge.
