Tools that understand user intent

Some aspects of language learning never get any easier.

In the beginning you make crude mistakes and people can’t understand you. Later on, you make more subtle mistakes and people don’t bother to correct you.

Testing your assumptions about the grammar of a language can also be difficult because people are often unable to articulate how some words, idioms, or patterns are used. Ask someone to explain the usage of the phrase “as opposed to”, for example, and see what kind of answer you get - then think about how much that explanation would actually help someone trying to learn English.

When I was studying linguistics at UBC, much of what we were taught was informed by the faculty’s studies of First Nations languages across Canada. I never studied the methods of linguistic fieldwork, but our instructors would sometimes mention techniques they used to collect research data from their consultants.

One that I found really interesting was the storyboard technique. As mentioned earlier, asking questions directly is not always useful for understanding how certain aspects of a language work. So, if researchers want to test a hypothesis about the usage of some chunk of grammar, they’ll provide consultants with a specific series of images and ask them to narrate a story based on those images in their own native language. The idea is that these images are specifically designed to force the story to make use of the relevant grammar, thereby allowing researchers to hear it used in context.

When one is learning a complex system, being able to reliably test one’s mental model of it is extremely important.

Natural languages aren’t designed the way programming languages are. Linguists take great pains to develop ways of testing their mental models of languages because sometimes there is no spec to consult.

But since programming languages are designed, their implementations can communicate feedback in a way that native speakers of natural languages often cannot.

This is something I’m learning about as I make my way through Crafting Interpreters to write my own interpreter for the Lox programming language.

An interpreter typically consists of a few distinct components - a lexer, a parser, and an evaluator. The lexer takes some source code and converts the raw characters into tokens, e.g. converting 3 into { type: 'INTEGER', value: 3 }. The parser takes a stream of tokens from the lexer and builds them up into a tree structure that represents the program as a series of expressions, statements, data, and the relationships between them. The evaluator traverses the tree and evaluates what it finds at each node, thereby running the program.
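To make that concrete, here’s a rough sketch of what a lexer for simple arithmetic could look like. This is my own toy TypeScript, not the book’s code, and the token shapes and names are made up for illustration.

type Token =
  | { type: 'INTEGER'; value: number }
  | { type: 'OPERATOR'; value: string }
  | { type: 'LPAREN' }
  | { type: 'RPAREN' };

function lex(source: string): Token[] {
  const tokens: Token[] = [];
  let i = 0;
  while (i < source.length) {
    const c = source[i];
    if (/\s/.test(c)) {
      i++; // skip whitespace
    } else if (/[0-9]/.test(c)) {
      // group consecutive digits into a single INTEGER token
      let j = i;
      while (j < source.length && /[0-9]/.test(source[j])) j++;
      tokens.push({ type: 'INTEGER', value: Number(source.slice(i, j)) });
      i = j;
    } else if ('+-*/'.includes(c)) {
      tokens.push({ type: 'OPERATOR', value: c });
      i++;
    } else if (c === '(') {
      tokens.push({ type: 'LPAREN' });
      i++;
    } else if (c === ')') {
      tokens.push({ type: 'RPAREN' });
      i++;
    } else {
      throw new Error(`Unexpected character '${c}' at position ${i}`);
    }
  }
  return tokens;
}

Calling lex('3 + 4') would produce [{ type: 'INTEGER', value: 3 }, { type: 'OPERATOR', value: '+' }, { type: 'INTEGER', value: 4 }].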

To parse tokens into a tree, you need some rules, called productions, that define how tokens can be grouped together to form valid expressions. These rules, along with the alphabet of the language, make up a grammar. Grammars come in many forms, but here’s a simple example of a grammar that can generate any valid arithmetic expression using the four basic operations:

1.  <expression> --> number
2.  <expression> --> ( <expression> )
3.  <expression> --> <expression> + <expression>
4.  <expression> --> <expression> - <expression>
5.  <expression> --> <expression> * <expression>
6.  <expression> --> <expression> / <expression>

All programming languages begin with a grammar like the one above. Consider the fact that programs are just a series of valid applications of the productions of a grammar. The program (20 - 5) / (6 * (4 - 3)) can be generated by the productions above, so it’s a valid program in the language defined by that grammar. If your program contains a sequence of tokens that cannot be generated by applying the grammar’s productions, then you’ve got yourself an invalid program. Uh-oh. Do not pass go and do not collect $200.
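As a sketch of how a parser might consume those productions (again, my own toy code rather than anything from the book): one common way to parse a grammar like this top-down is recursive descent, with the expression rule split into precedence levels so that * and / bind tighter than + and -. To keep things short, this version skips building an explicit tree and evaluates as it parses.

function evaluate(tokens: Token[]): number {
  let pos = 0;

  // <expression> --> term (("+" | "-") term)*
  function expression(): number {
    let value = term();
    while (pos < tokens.length) {
      const t = tokens[pos];
      if (t.type !== 'OPERATOR' || !'+-'.includes(t.value)) break;
      pos++;
      value = t.value === '+' ? value + term() : value - term();
    }
    return value;
  }

  // term --> primary (("*" | "/") primary)*
  function term(): number {
    let value = primary();
    while (pos < tokens.length) {
      const t = tokens[pos];
      if (t.type !== 'OPERATOR' || !'*/'.includes(t.value)) break;
      pos++;
      value = t.value === '*' ? value * primary() : value / primary();
    }
    return value;
  }

  // primary --> number | "(" <expression> ")"
  function primary(): number {
    if (pos >= tokens.length) throw new Error('Unexpected end of input');
    const t = tokens[pos++];
    if (t.type === 'INTEGER') return t.value;
    if (t.type === 'LPAREN') {
      const value = expression();
      if (pos >= tokens.length || tokens[pos].type !== 'RPAREN') throw new Error('Expected ")"');
      pos++;
      return value;
    }
    throw new Error('Expected a number or "("');
  }

  return expression();
}

Running evaluate(lex('(20 - 5) / (6 * (4 - 3))')) works through those productions and returns 2.5.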

This situation is identical to what happens when we try to speak a foreign language with the wrong mental model of the grammar of that language - we generate invalid utterances. And native speakers either understand what we’re saying or they don’t.

What’s really fascinating to me, though, is the idea that parsers can serve as a guide and a practice environment for learning the language and testing our assumptions about it, instead of merely being the tools that assemble valid token sequences into parse trees.

As the author, Robert Nystrom, states:

Syntax errors are a fact of life, and language tools have to be robust in the face of them. Segfaulting or getting stuck in an infinite loop isn’t allowed. While the source may not be valid code, it’s still a valid input to the parser because users use the parser to learn what syntax is allowed.

Then the author mentions a parsing technique which I think is so profound that its usefulness extends far beyond the domain of programming language design - error productions.

The idea here is that you add a production rule to your grammar that matches a known syntax error, making that error part of the grammar itself. When the parser hits such erroneous input, instead of choking and crashing it can recognize the mistake and report valuable feedback to the user. By recognizing erroneous inputs at the level of the grammar, the parser can be made to understand what the user meant to do and then use that understanding to show them the right way to achieve their goal.
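For a rough sense of what this could look like, here’s a sketch layered onto the hypothetical toy parser above (my own illustration, not the book’s implementation). One classic error production matches a binary operator that shows up with no left-hand operand. In a real parser the rule would live inside the grammar, wherever an operand is expected; here it’s pulled out into a check at the entry point just to keep the sketch small.

// Error production sketch: recognize input like "* 3 + 4", where a binary
// operator appears with nothing on its left, and explain what the user
// probably meant instead of reporting a generic syntax error.
function parseWithErrorProduction(tokens: Token[]): number {
  const first = tokens[0];
  if (first && first.type === 'OPERATOR') {
    throw new Error(
      `'${first.value}' is a binary operator and needs a value on its left-hand side. ` +
        `Did you mean something like "1 ${first.value} 2"?`
    );
  }
  return evaluate(tokens);
}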

This is a much better user experience than throwing a syntax error and calling it a day.

I’m thinking about how this idea can be applied to other contexts. When I search for existing examples, not much comes to mind. Writing assistants like Grammarly are kind of similar, but they typically layer AI-powered insights on top of your writing experience; they’re not quite the same as tools with error recognition built into their core functionality. Although perhaps this is not an important distinction.

Beyond that, there are things like the git CLI, which nudges you when you mistype a command by suggesting similar commands you may have intended to type.

$ git stats

git: 'stats' is not a git command. See 'git --help'.

The most similar command is
        status

I’m sure there are other examples but they don’t spring to mind readily.

Not every application will lend itself to this style of feedback. For applications whose usage doesn’t involve adhering to a formal structure like a language grammar, understanding user intent will most likely require non-trivial data gathering and analytics.

But I’m thinking of a SQL editor that tells you how to write what you meant to write (I could sure use one of those). Or a web API that returns well-formed request snippets instead of just a 400 and a generic error message. Or an IaC config that helps write itself.
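For the web API case, the error payload itself could carry the correction. Something like this entirely hypothetical 400 response body:

{
  "error": "Missing required field 'email'.",
  "hint": "A well-formed request looks like this:",
  "example": {
    "name": "Ada Lovelace",
    "email": "ada@example.com"
  }
}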

Dieter Rams’s famous commandment states that the best design is as little design as possible, but design that talks back to you may be the next best thing.



References

Nelson, Randal C. “Context-Free Grammars.” Department of Computer Science, University of Rochester, https://www.cs.rochester.edu/u/nelson/courses/csc_173/grammars/cfg.html.

Nystrom, Robert. “Crafting Interpreters.” Crafting Interpreters, https://www.craftinginterpreters.com/.

Rams, Dieter. “Ten Principles for Good Design.” Reading Design, https://www.readingdesign.org/ten-principles.

“Totem Field Storyboards.” Storyboards for Language Documentation, Totem Field Storyboards, http://totemfieldstoryboards.org/.


© 2024. Ilya Meerovich