HexInput

A Better Method of Screen-Based Text Entry

Many devices involve input by indicating characters on a screen, with no full physical keyboard available. Examples include palmtop computers, cell phones, handheld game systems, and video game consoles.

There are several common approaches to this problem:

  1. Present all the needed characters on-screen, arranged either in alphabetical order or in a visual QWERTY keyboard. Users select a character by tapping it (or by highlighting it with a cursor and then pressing a button to select).
  2. Phone input: letters are mapped to the numbers 2-9, which means that you have to press each button up to four times to select the letter you want.
  3. Handwriting recognition, sometimes using a custom shorthand (such as Palm's "Graffiti" input system).

None of these methods of text input is very fast or convenient. Here's an idea for something that works much better, with speeds as good as or better than typing easily reached with only a little practice.
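
To put a rough number on the phone-input cost mentioned above, here is a small Python sketch that counts key presses for classic multi-tap entry. It assumes the usual letter groups on keys 2-9 and skips everything that isn't a letter (spaces and punctuation are handled differently from phone to phone), so treat the result as a lower bound.

```python
# Standard letter groups printed on phone keys 2-9.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}

# Presses needed per letter: its 1-based position within its key's group.
PRESSES = {letter: i + 1
           for group in KEYPAD.values()
           for i, letter in enumerate(group)}

def multitap_presses(text):
    """Count key presses for the letters in `text`; other characters are skipped."""
    return sum(PRESSES.get(ch, 0) for ch in text.lower())

print(multitap_presses("this is a test"))  # 26 presses for 11 letters
```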

The HexInput Method

I call this method "HexInput" even though some variations of it might not actually use a hex grid. But for starters, let's suppose you're using some pen-based device such as a Palm Pilot, a tablet PC, or a Nintendo DS. In this case, input will be done via a sort of onscreen keyboard, like the one shown at right.

Three key ideas work together to make this input method better than a traditional onscreen keyboard:

  1. You can input a letter by tapping it, or by dragging to it from a neighboring letter. More on this in a moment.
  2. Letters are arranged in a hexagonal grid so that every one has six neighbors, and all neighbors are equally easy to reach.
  3. Letters are arranged to optimize text input in a particular language (English, in the examples shown here). Note that this may mean that some common letters appear twice.

The combination of these features means that you can type multiple characters -- sometimes even multiple words -- with a single stroke of the pen. For example, it takes only three strokes to enter "this is a test." That includes the final period, as shown here:

Note that the keyboard layout wasn't optimized for this particular sentence, but for English text in general; many other words and phrases are similarly efficient. Being able to enter common letter sequences in a single stroke makes the HexInput method amazingly fast.
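
The tap-or-drag rule is easy to model in code. The Python sketch below splits a piece of text into pen strokes on a hex grid: a stroke continues as long as the next letter sits on a key neighboring the current one, and otherwise the pen lifts and taps. The three-row layout is a made-up toy (it is NOT the optimized HexInput layout), and the names `ROWS`, `neighbors`, and `split_into_strokes` are just illustrative. Because a letter may appear on more than one key, the code tracks every key the pen could currently be on.

```python
# Toy layout for illustration only. Odd-numbered rows sit half a key to the right.
ROWS = [
    "ing.a",
    "the s",
    "qu ti",
]

def neighbors(r, c):
    """Return the six hex neighbors of key (r, c) in the row-shifted grid."""
    shift = 0 if r % 2 else -1  # odd rows touch columns c, c+1; even rows c-1, c
    cells = {(r, c - 1), (r, c + 1)}
    for rr in (r - 1, r + 1):
        cells |= {(rr, c + shift), (rr, c + shift + 1)}
    return cells

def positions(ch):
    """All keys showing character `ch` (a letter may appear more than once)."""
    return {(r, c) for r, row in enumerate(ROWS)
            for c, key in enumerate(row) if key == ch}

def split_into_strokes(text):
    """Split `text` into the pen strokes needed to enter it."""
    strokes, here = [], set()  # `here` = keys the pen might be resting on
    for ch in text:
        keys = positions(ch)
        if not keys:
            raise ValueError(f"{ch!r} is not on the layout")
        reachable = {k for k in keys if neighbors(*k) & here}
        if reachable:            # drag onward: same stroke
            strokes[-1] += ch
            here = reachable
        else:                    # lift the pen and tap: new stroke
            strokes.append(ch)
            here = keys
    return strokes

print(split_into_strokes("the "))  # ['the ']
print(split_into_strokes("quit"))  # ['qu', 'it']
```

On a layout tuned for real English text, far more letter pairs end up adjacent, which is where the long multi-letter strokes come from.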

How It Feels

I prototyped this system on a Palm Pilot to get a feel for how well it works in practice. Entering text with HexInput takes a little getting used to; just like typing, you have to learn where the letters are, so you don't have to hunt for them.

But with a little practice, your hand just knows what to do. Whole words and phrases are remembered not as individual letters, but as multi-letter strokes. For example, "the " is a little triangular arc in the middle of the keyboard. The common suffix "ing" is a horizontal stroke at the left end of the second row. The letter "q" is available by itself, but you'll almost always stroke up and to the right so that you get "qu". Similarly, whenever you hit the period, you'll probably continue down and to the right to follow it with a space.

If you've never used HexInput, or are only starting out, these sequences are unfamiliar. But after a week or two, they'll be ingrained to the point where all but the oddest of words simply spill out of your hand in just a few strokes.

Reducing Errors

Because the layout of the letters mirrors the statistical properties of the target language (e.g. English), HexInput helps reduce typos too. With a keyboard, a common mistake is inverting the order of two letters. But when those two letters are part of a longer HexInput stroke, inverting them is often impossible. For example, consider "ing" -- there's no way you would accidentally enter this as "ign" or "nig" since those would require multiple strokes; only the correct spelling is a single stroke.

A similar principle applies when two letters are parts of different strokes. Consider "fish" for example. This is two strokes, "fis" and "h". Again, there's no way you'd misenter this as "fihs" since "fis" is a single stroke.

Of course, there are some kinds of errors that are still quite possible. Whenever you have a triangular stroke, like "the", it's possible to invert the triangle and swap two letters (e.g. "teh"). However, for common words and letter combinations, you'll have learned these by the shape of the stroke rather than the sequence, and in practice I find that such errors are quite rare.
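
Both cases can be checked mechanically. The following Python sketch places six keys at hypothetical positions (axial hex coordinates, chosen only for illustration, not taken from the real layout): "i", "n", "g" in a straight row and "t", "h", "e" in a tight triangle. It then asks which adjacent-letter swaps of a word could still be drawn as one stroke.

```python
# Hypothetical key positions in axial hex coordinates (q, r).
KEYS = {"i": (0, 0), "n": (1, 0), "g": (2, 0),   # three keys in a straight row
        "t": (0, 3), "h": (1, 3), "e": (1, 2)}   # three mutually adjacent keys

# The six steps from a hex cell to its neighbors in axial coordinates.
HEX_STEPS = {(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)}

def adjacent(a, b):
    """True if the keys for letters `a` and `b` touch."""
    (q1, r1), (q2, r2) = KEYS[a], KEYS[b]
    return (q2 - q1, r2 - r1) in HEX_STEPS

def is_single_stroke(word):
    """True if every consecutive pair of letters lies on neighboring keys."""
    return all(adjacent(a, b) for a, b in zip(word, word[1:]))

def transpositions(word):
    """All variants of `word` with one pair of adjacent letters swapped."""
    return [word[:i] + word[i + 1] + word[i] + word[i + 2:]
            for i in range(len(word) - 1)]

def risky_transpositions(word):
    """Swapped variants that are still a single stroke, so easy to enter by mistake."""
    return [w for w in transpositions(word) if is_single_stroke(w)]

print(risky_transpositions("ing"))  # []
print(risky_transpositions("the"))  # ['hte', 'teh']
```

The straight-line stroke has no single-stroke misspellings of this kind, while the triangle admits them, which matches the "teh" caveat above.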

The one error I do commonly see is either omitting the space between words, or inserting two spaces. This happens because many words end next to a space on the hex layout, and many other words begin next to a space. So it sometimes takes a bit of thought to keep track of whether or not you've already entered a space when beginning the next word.

Variations

The layout shown is best for entering English text on a pen display, such as a Palm device, a Nintendo DS, or a tablet PC. But what if you're using some other device?

In the case of a game system, where your input elements are a directional pad and buttons, you could adapt the HexInput concepts as follows:

  1. Arrange the letters in a square grid. (Note that the image at right is just an example, and the letter placement has not been optimized.)
  2. Add a cursor that indicates the "pen" position. The user moves this cursor around using the d-pad, in either four or eight directions, depending on whether the directional pad allows reliable movement in the diagonal directions.
  3. Use a button to control the "pen" state, either up or down. (The state would be reflected in the appearance of the cursor.)

In all other respects, it's just like the system described above: you enter a letter either by tapping the button while the cursor is on it, or by moving the cursor into it while holding the button down from the previous entry. So, again, you could enter multiple letters at a time by holding the button down while moving the cursor in the appropriate sequence.
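
Here is a minimal Python sketch of that cursor-and-button logic, untested on real hardware. The square grid, the event names ("press", "release", "up", "down", "left", "right"), and the choice to ignore moves that would leave the grid are all assumptions made for the example.

```python
# Hypothetical, unoptimized square layout.
GRID = [
    "the ",
    "ing.",
    "qu s",
]

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def type_with_dpad(events, start=(0, 0)):
    """Replay d-pad and button events; return the text they would enter."""
    row, col = start
    pen_down = False
    out = []
    for ev in events:
        if ev == "press":          # button goes down: enter the key under the cursor
            pen_down = True
            out.append(GRID[row][col])
        elif ev == "release":      # button comes up: later moves enter nothing
            pen_down = False
        else:
            dr, dc = MOVES[ev]
            nr, nc = row + dr, col + dc
            if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[nr]):
                row, col = nr, nc
                if pen_down:       # moving onto a key with the button held enters it
                    out.append(GRID[row][col])
            # moves off the edge of the grid are ignored in this sketch
    return "".join(out)

# One press, three moves, one release: four characters.
print(type_with_dpad(["press", "right", "right", "right", "release"]))  # "the "
```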

I haven't prototyped this variation, but my guess is that while this would be slower than using a pen, it would still be substantially faster than any other game controller-based method of text entry.

What about text entry on a telephone? That's actually quite similar to the game controller case, but even better since you can reliably indicate all eight directions with the numbers 1-9 (excluding 5). All that's required is some other button, which can be held down with the hand holding the phone while pressing number keys with the other hand. Again, I would expect this to be less efficient than pen input, but quite a bit better than other methods of phone text entry.
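
The keypad-to-direction mapping falls straight out of the 3x3 arrangement of the digits. As a rough Python sketch (the grid is again a hypothetical square layout, and `trace` assumes the extra button is held for the whole sequence):

```python
def digit_to_step(digit):
    """Map keypad digits 1-9 (except 5) to a (row, column) step.

    The keypad is laid out as 1 2 3 / 4 5 6 / 7 8 9, so 1 is up-left,
    2 is up, 6 is right, 9 is down-right, and so on.
    """
    if digit not in range(1, 10) or digit == 5:
        raise ValueError(f"{digit} is not a direction key")
    return (digit - 1) // 3 - 1, (digit - 1) % 3 - 1

# Hypothetical, unoptimized square layout.
GRID = [
    "the ",
    "ing.",
    "qu s",
]

def trace(start, digits):
    """Letters entered by starting on `start` and moving once per digit."""
    row, col = start
    out = [GRID[row][col]]
    for d in digits:
        dr, dc = digit_to_step(d)
        nr, nc = row + dr, col + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[nr]):
            row, col = nr, nc
            out.append(GRID[row][col])
        # steps that would leave the grid are ignored in this sketch
    return "".join(out)

print(trace((2, 0), [6]))  # "qu"
print(trace((0, 1), [7]))  # "hi" (a diagonal move, down and to the left)
```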

Note that telephone makers have developed some fancy methods of speeding text entry by guessing what the user is desperately trying to enter, and allowing the user to press another button when the guess is correct. This would be less necessary, but perhaps still useful, when using a HexInput variation. Assuming HexInput is faster than standard phone text entry without the guessing, then HexInput plus guessing would also be faster than the standard method plus guessing.

Related Work

SHARK is a similar system developed by IBM researchers. It allows the user to stroke out one word at a time, even if that means dragging over letters not involved in the word, and then guesses what actual word the user intended.

Dasher is a unique text entry system that involves navigating a moving landscape of characters.

Both SHARK and Dasher rely on the system learning the writing habits of the user, as the user learns the methods of the system. I feel that HexInput is substantially simpler and benefits from being deterministic. This is useful especially when you need to enter something odd that the system doesn't expect (e.g., an email address). However, these are both interesting alternatives to a standard onscreen keyboard, and merit further study.

Please use this idea!

I haven't the time to actually apply this idea, beyond the simple PalmOS prototype I've already made. But I really do think that this is a far superior method of getting text into any device without a keyboard. If you are a software developer, I urge you to consider adding this functionality to your product. Feel free to contact me to discuss design or implementation details.

If you're not a software developer, you can still help by writing to the makers of your favorite keyboardless systems and asking them to look at this page.

Note that end-users will require very little explanation of how to use HexInput; in both the pen-based and cursor-based variations, a user will at first assume (correctly) that they can enter letters by tapping them. At some point, through luck or stumbling across a manual, they'll discover that they can hold the pen or button down to enter text more quickly. Their input speed will improve substantially after that. But a key selling point is that even a naive user will be able to enter text successfully.

My hope is that ten years from now, we won't have to laboriously tap out messages letter by letter, but instead will be able to zip them out quickly and efficiently with something like HexInput. Please help make this dream a reality, by telling everyone you know!

Update: the QUONG layout

Since writing the above, I discovered a utility for my Palm T5 PDA called myKbd, by Alex Pruss. This utility lets you choose from a variety of custom input methods, and even (with the help of a free additional Windows utility) create your own. It came with a hexagonal layout called ATOMIK. The ATOMIK layout was developed by some researchers at IBM, and they used many of the same principles as those described above.

So I immediately installed myKbd, selected the ATOMIK layout, and started using it. It was far better than anything I'd ever used before -- except for the HexInput prototype. It seemed that the letters in the ATOMIK layout were not as optimized as they could be, and upon reviewing the research, this is not surprising: the researchers optimized not just for efficiency, but also for having the letters arranged quasi-alphabetically. Many other users on the mailing list agreed that ATOMIK didn't seem as efficient as it could be.

So, I ran my optimization algorithm on the ATOMIK keyboard to see if it could do any better, and it did indeed improve it quite a bit, most likely because I gave no value to having the keys in alphabetical order. (In my experience, the quasi-ordering of the letters provided no benefit in learning the layout, and was certainly not worth being hindered for the rest of my life.) After several days of optimization, we got the modified layout shown here:

We noted that the first five letters in this layout happen to be pronounceable, so in the tradition of QWERTY, we call this the QUONG layout.

With QUONG, you can enter text with substantially fewer strokes than with ATOMIK -- almost as few as with the HexInput layout above, even though QUONG does not repeat any letters. You can enter "this is a test" in four strokes (including one fluid multi-word stroke in the middle!).
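
The optimization algorithm itself isn't described on this page, so the following Python sketch is only one plausible shape for such a search, not the method actually used to produce QUONG. It scores a layout by the number of strokes needed for a sample text and does simple hill climbing: swap two keys at random and keep the swap unless it makes the score worse. Every name here (`stroke_count`, `optimize`, the tiny sample text) is an assumption for illustration, each letter appears on exactly one key, and a real run would need a large English corpus and far more iterations.

```python
import random

def adjacent(a, b):
    """True if cells a and b (row, col) touch on a hex grid with odd rows shifted right."""
    (r1, c1), (r2, c2) = a, b
    dx = abs((2 * c1 + r1 % 2) - (2 * c2 + r2 % 2))  # doubled x avoids half steps
    dy = abs(r1 - r2)
    return (dy, dx) in {(0, 2), (1, 1)}

def stroke_count(layout, text):
    """Strokes needed for `text`: a new stroke starts whenever the next key isn't adjacent."""
    strokes, prev = 0, None
    for ch in text:
        if prev is None or not adjacent(layout[prev], layout[ch]):
            strokes += 1
        prev = ch
    return strokes

def initial_layout(letters, cols):
    """Place the letters row by row, in the order given."""
    return {ch: (i // cols, i % cols) for i, ch in enumerate(letters)}

def optimize(letters, cols, text, steps=2000, seed=0):
    """Hill-climb by swapping pairs of keys; returns (layout, stroke count)."""
    rng = random.Random(seed)
    layout = initial_layout(letters, cols)
    best = stroke_count(layout, text)
    for _ in range(steps):
        a, b = rng.sample(letters, 2)
        layout[a], layout[b] = layout[b], layout[a]
        score = stroke_count(layout, text)
        if score <= best:
            best = score                                   # keep the swap
        else:
            layout[a], layout[b] = layout[b], layout[a]    # undo it
    return layout, best

text = "this is a test of the thing"
letters = sorted(set(text))
start = stroke_count(initial_layout(letters, cols=4), text)
layout, best = optimize(letters, cols=4, text=text)
print(start, "->", best)
```

Note that this score gives no credit for alphabetical ordering, which is the same trade-off described above.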

I've been using QUONG for real-world text entry for about a month now, and am up to a peak of about 30 words per minute, with plenty of room still for improvement as I gain proficiency. I'm satisfied that this is the best layout available for entering text with a stylus, and I encourage everyone with a pen-based device to give it a try. MyKbd users can download the QUONG layout file, including source code.


- Joe Strout (joe@strout.net)
Last Updated: September 2006