Design and Implementation of an Advanced Text Editor

by Craig Bruce ( )


So far, we can load up a document and whiz around inside of it, but we can't actually change anything. This section describes the single-character modification operations of character input, rub, and delete, and the text "sloshing" algorithm that is needed to make sure that lines are always as full as they can be without going over the target length ("Come on down!").


Adding a single character to a document isn't really very difficult. There are two modes for single-character inputting: insert and overtype. Insert mode is generally more useful and more often used, but overtyping can be very useful when dealing with tabular or specially formatted text. Therefore, we must support both modes. In some other text editors, overtype mode is the natural mode because the cost of inserting a character can be so high, but not here.

Actually, there isn't a whole lot of difference in the implementations of the two modes. For overtype mode, you just fetch the current line, take the inputted character and store it into the line buffer at the current position, stash the line back into memory, and repaint the line. For insert mode, we do the same thing, except that we copy the line from the cursor to the end to one position beyond the cursor (backwards) and bump its length up by one before storing the new character on the line.

Rubbing out a character (Commodore-DEL) is done in quite the same way, except that we copy the rest of the line back one space and decrement the length. Oh, and when the length of the new line is different from the length of the old line, we have to deallocate the old line and allocate new memory for the new line, and surgically link it in with the rest of the document. I have a subroutine that does this.

The DEL key is handled as if you had typed Cursor Left then RUB, except when you press DEL on the first column of a line and the previous line ends with a hard return, the hard return of the previous line is removed instead. I decided to make the RUB (Co-DEL) key return an error instead of joining lines together like DEL, because sometimes it is convenient to just lean on the RUB key to delete to the end of a line.


But, we're not done with text modifications yet. If we just left the modifications as described in the previous sections, we would be end up with lines that are longer than the target length, with ragged lines, and with lines that don't join together like they should after pressing DEL in the first column.

After each of the modifications, the text-sloshing routine is called to straighten everything up and figure out how to redisplay the screen. Often, only one line needs to change, but sometimes, many lines or even the whole screen will have to be updated, as sloshing text can continue for many lines as line overflows and underflows cascade forward from the line that has just been modified. In fact, the text-sloshing routine is enormously complicated and has many, many special cases. (Although, for all of its complexity, it is only 400 lines long, although it still needs a few more features).

There are many more cases that could cause sloshing than you might think. There are the obvious inserting/deleting one-too-many characters in the current line, but there is also the case that an insertion or deletion causes a space character to move to the right position on a line to allow the line to be sloshed backward, or maybe you remove a space from the end of a line that creates a word that would be too long to be contained on one line and therefore needs to be sloshed forward.

The sloshing algorithm doesn't introduce or take away any characters from the body of the document; it just reorganizes the existing characters. All of the spaces are retained at the ends of wrapped lines. We don't want to delete spaces, since it is difficult to reconstruct them, since two spaces are normally between two sentences in text, but it is difficult for a computer to figure out where a sentence ends. In fact, keeping these spaces can sometimes cause an anomaly: if all of the spaces won't fit on the end of one line, then they will be displayed at the beginning of the next line.

The variables that are maintained by the algorithm are as follows:

sloshLinesAltered  = work2+0 ;(1) ;a simple count 
sloshRedisplayAll  = work2+1 ;(1) ;$80=redisplay to bottom, $ff=force all 
sloshRedisplayPrev = work2+2 ;(1) ;whether previous line needs to be repainted 
sloshMaxChars      = work2+3 ;(1) ;number of chars that can be sloshed 
sloshTailChar      = work2+4 ;(1) ;the last char of prev line 
sloshTheCr         = work2+5 ;(1) ;whether a CR should be sloshed 
sloshLinesInserted = work2+6 ;(1) ;number of new line records created 
sloshLinesDeleted  = work2+7 ;(1) ;number of existing line records deleted 
sloshCurAltered    = work2+8 ;(1) ;whether the current line has been altered 
sloshTerminate     = work2+9 ;(1) ;flag to terminate (hit a hard return) 
sloshCurTailChar   = work2+10 ;(1);last char of current line 
The algorithm has a main loop that is repeated for each line that can be sloshed. The loop exits when we either run into a hard return or we run into a line that does not need to be sloshed. We start scanning from the cursor line of the main document.

We first look at the previous line and see how many more characters it can accommodate before being full. Then, we scan the current line from this point (the number of characters that could potentially be sloshed backwards) back to the start of the line searching for spaces. If there are no spaces, then the current line cannot be sloshed back onto the previous line. If we do run into a space, then we stop searching and know that we can (and must) slosh back the current line. So, we remove the characters from the current line and write it back (maintaining links as appropriate) and then go back to the previous line and append these characters to it.

If the cursor happened to be on the line in a position that got sloshed back, then we must adjust the cursor position and move it back to the previous line. If the previous line is before the start of the screen, we must set the flag to redisplay the entire screen later. If the current line is the first line of the slosh area but the cursor didn't get moved back to the previous line, then we must set the flag to indicate that we must redisplay the previous line too when we repaint the screen.

If it turns out that we have sloshed back ALL of the characters on the current line, then we must remove the empty line record of the current line from the document and adjust the global line count. We also have to worry about sloshing back a hard return, and if we do, then we bail out of the algorithm since we are done. Oh, and we have to keep in mind whether we are displaying carriage returns or not in calculating line lengths.

After shoshing backward, we check if the current line needs to be sloshed forward. It needs to be sloshed forward if it is either longer than the target length or it ends in a non-space character and the first word on the next line cannot be sloshed back. The latter is a special case and needs to be checked for specially, even thought the functionality for doing so is redundant.

If we do need to slosh forward, we start scanning the current line at the target-line-length point and scan backwards until we hit the first space. If there is no space, then the current word is longer than the target length and must remain abruptly broken over two (or more) lines. We wrap it at the target length. If we do find a space earlier on the line, then we wrap the line right after that space. Oh, I forgot to mention: we need to do something special for lines that end with non-spaces for backward sloshing too. If the previous line ends in a non-space (presumably because it contains a single very long word), then we don't find any spaces to slosh back to the end of the word, then we slosh back as many characters as will fit, since the word is broken anyway, and we want it to have as many characters as possible on a single line.

To wrap the line, we set the length of the current line to the new length and replace the old version of the line in memory. Then, we create a new line record and store the characters that were wrapped in it and link this line record in with the rest of the document. And this is all that we do here; we don't actually insert the wrapped characters into the next line, since that would be more complicated, and since it might cause that line to overflow. If we just leave the wrapped characters, they will be joined with the next line (if necessary) on the next iteration of the main sloshing loop in a slosh-back operation. We must adjust the cursor location, like before, if the cursor was on the part of the line that got wrapped around.

At the end of a loop, we check to see if we have passed a hard return in the document, in which case, we exit. Otherwise, if either the current line has been modified or if the current line is the first line to be sloshed, then we go on to the next line and repeat the above procedure.

On our way out, we do a little fine tuning of the "sloshLinesAltered", "sloshRedisplayAll", and "sloshRedisplayPrev" variables, which were described earlier. These variables will be used to repaint the changed portions of the screen display. Part of the fine adjustment includes comparing the number of lines that have been inserted and deleted from the document. If these two numbers match, then the bottom portion of the screen doesn't have to be repainted, only the altered lines themselves; otherwise, we need to repaint from the current line all the way to the bottom of the screen.

The sloshing algorithm currently does not handle non-wrap mode; lines will always be word wrapped. Later, all lines will be broken at the N-th column if you are not in word-wrap mode. Also, the algorithm can produce an anomalous wrapping in one case involving lines that end with non-spaces that I can't seem to remember that this algorithm will fail in (but, the document will still be interally consistent). And finally, if you change the target line length while editing a document (which you can't currently do), then the algorithm may not be able to give optimal word wrapping in all cases (though, again, the document will always be interally consistent).

On to Chapter 5.

Last Updated: 1995-12-06 Rev A