+++
title = "Iterators in Rust"
author = ["Tushar"]
description = "Quick intro to the Rust iterators"
date = 2019-03-14T13:33:28
tags = ["rust", "technical"]
draft = false
type = "post"
+++
I am using Rust to write an interpreter for a small subset of Lisp to restart my Rust journey. One of the pieces required for the interpreter is a tokenizer, which converts the source string into tokens. Since Lisp is mostly parens, characters, whitespace and dots, it was an easy implementation – although made difficult because … Rust.
Anyhow, my approach was to create a tokenizer that hands back one token at a time, which I can then convert into an AST. I made it a word-based tokenizer to save myself the time of writing a character-based one.
It turned out to be less helpful than I expected, and I will have to implement a character-based tokenizer after all, but it was a good exercise because I got to implement the iterator pattern.
So let’s try to understand how iterators work in Rust.
An iterator produces a sequence of values lazily, giving one at a time and stopping when either the values run out or some other condition is met. Since it's lazy, the next value is not calculated until it is asked for. In the meantime the iterator just rests and does nothing, picking up where it left off when the next value is requested.
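To make the laziness concrete, here is a small sketch that uses only the standard library. The helper name `first_squares` is made up for this illustration. It counts how often the closure inside `map` actually runs when we take a few values from an infinite range.

```rust
use std::cell::Cell;

/// Takes the first `n` squares from an infinite, lazy iterator and reports
/// how many times the squaring closure actually ran.
/// (`first_squares` is a hypothetical helper, not part of the tokenizer.)
fn first_squares(n: usize) -> (Vec<u64>, usize) {
    let calls = Cell::new(0usize);
    let squares: Vec<u64> = (1u64..) // infinite range: 1, 2, 3, ...
        .map(|x| {
            calls.set(calls.get() + 1); // record that a value was computed
            x * x
        })
        .take(n) // stop asking after `n` values
        .collect();
    (squares, calls.get())
}

fn main() {
    let (squares, calls) = first_squares(3);
    println!("{:?} computed with {} closure calls", squares, calls);
}
```

Even though the range never ends, the closure runs only as many times as values are requested. Nothing is computed ahead of time.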
Since it has to start and stop so many times, it needs a place to save its state. The implementation needs a `struct` which holds this state, and each iteration must either return the next value and update the state, or stop when the values are exhausted.
We start by creating the `struct`. Its fields depend on what you need in order to produce a value on each iteration. This is the state of the iterator, and it is updated on every iteration.
We also add a constructor in the `impl` block of this `struct`, which lets us create an instance. It just seeds the state with the string that we pass in.
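As a rough sketch of these two steps, the state can be a single owned `String` holding whatever has not been tokenized yet. The field name `remaining` and the constructor name `new` are my assumptions for illustration; the original code may have used different names.

```rust
/// The iterator's state: the part of the input that has not been
/// tokenized yet. (`remaining` is an assumed field name.)
struct Tokenizer {
    remaining: String,
}

impl Tokenizer {
    /// Seeds the state with the full input string.
    fn new(input: &str) -> Tokenizer {
        Tokenizer {
            remaining: input.to_string(),
        }
    }
}

fn main() {
    let tokenizer = Tokenizer::new("(+ 1 2)");
    // Nothing has been consumed yet, so the state is the whole input.
    println!("{}", tokenizer.remaining);
}
```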
We now implement the actual functionality, i.e. implement the `Iterator` trait for `Tokenizer`. The heart of this implementation is a function called `next`, which returns the next token from the string. It uses another function called `parse_token`, which returns the next token and the remaining string. The implementation can rely on any other functions it needs.
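Here is one way the trait implementation could look. This is a sketch under assumptions, not the original code: the `remaining` field, the `Item = String` choice and the body of `parse_token` (a simple whitespace split, included only so the block compiles on its own) are all mine.

```rust
struct Tokenizer {
    remaining: String, // assumed field name: input not yet consumed
}

impl Tokenizer {
    fn new(input: &str) -> Tokenizer {
        Tokenizer { remaining: input.to_string() }
    }
}

/// Stand-in helper: returns the first whitespace-delimited word and the
/// rest of the input, or `None` when only whitespace is left.
fn parse_token(input: &str) -> Option<(&str, &str)> {
    let trimmed = input.trim_start();
    if trimmed.is_empty() {
        return None;
    }
    let end = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
    Some(trimmed.split_at(end))
}

impl Iterator for Tokenizer {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // `?` returns `None` as soon as `parse_token` finds nothing,
        // which is how the iterator signals that it is exhausted.
        let (token, rest) = parse_token(&self.remaining)?;
        // Copy both pieces out so the borrow of `self.remaining` ends
        // before the state is overwritten.
        let (token, rest) = (token.to_string(), rest.to_string());
        self.remaining = rest; // update the state for the next call
        Some(token)
    }
}

fn main() {
    let mut tokenizer = Tokenizer::new("(+ 1 2)");
    while let Some(token) = tokenizer.next() {
        println!("token = {:?}, remaining = {:?}", token, tokenizer.remaining);
    }
}
```

Each call to `next` does exactly one unit of work: it peels off one token, stores the leftover string, and returns. Because the split is word-based, parens stay attached to their neighbours (`"(+"`, `"2)"`), which is the limitation that calls for a character-based tokenizer.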
And finally, here's the `parse_token` function. I am passing the state as a `&str` instead of a `String` because that is how I had written the function in the original implementation. It hardly matters one way or the other. I must point out, though, that using references in a `struct` forces you to add lifetime constraints, so this example goes light on them.
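A possible shape for `parse_token`, assuming tokens are simply whitespace-separated words, is sketched below. To illustrate the lifetime remark, the block also includes a hypothetical `BorrowedTokenizer` that stores a `&str` instead of a `String`. It is not part of the original post; it only shows the annotations that become necessary once a struct holds a reference.

```rust
/// Returns the next word and the rest of the input, or `None` when only
/// whitespace is left. With a single reference parameter the compiler
/// infers the lifetimes, so no annotations are needed here.
fn parse_token(input: &str) -> Option<(&str, &str)> {
    let trimmed = input.trim_start(); // skip leading whitespace
    if trimmed.is_empty() {
        return None;
    }
    // The token ends at the next whitespace character or at end of input.
    let end = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
    Some(trimmed.split_at(end))
}

/// Hypothetical variant that borrows its input. Storing a reference in a
/// struct requires an explicit lifetime parameter (`'a`).
struct BorrowedTokenizer<'a> {
    remaining: &'a str,
}

impl<'a> Iterator for BorrowedTokenizer<'a> {
    type Item = &'a str; // tokens are slices of the original input

    fn next(&mut self) -> Option<&'a str> {
        let (token, rest) = parse_token(self.remaining)?;
        self.remaining = rest;
        Some(token)
    }
}

fn main() {
    println!("{:?}", parse_token("  define x 10"));

    let source = String::from("(+ 1 2)");
    for token in (BorrowedTokenizer { remaining: &source }) {
        println!("{}", token);
    }
}
```

The borrowed version avoids allocating a new `String` per token, at the cost of tying every token's lifetime to the input. The owned version is easier to write and move around.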
Here’s how it works:
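A minimal end-to-end sketch follows, under the same assumptions as before: a word-based split, a `remaining` field and owned `String` tokens. The definitions are repeated so the block compiles on its own.

```rust
struct Tokenizer {
    remaining: String, // assumed field name
}

impl Tokenizer {
    fn new(input: &str) -> Tokenizer {
        Tokenizer { remaining: input.to_string() }
    }
}

/// First whitespace-delimited word plus the rest, or `None` if nothing is left.
fn parse_token(input: &str) -> Option<(&str, &str)> {
    let trimmed = input.trim_start();
    if trimmed.is_empty() {
        return None;
    }
    let end = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
    Some(trimmed.split_at(end))
}

impl Iterator for Tokenizer {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        let (token, rest) = parse_token(&self.remaining)?;
        let (token, rest) = (token.to_string(), rest.to_string());
        self.remaining = rest;
        Some(token)
    }
}

fn main() {
    // A `for` loop keeps calling `next` until it returns `None`.
    for token in Tokenizer::new("( define x 10 )") {
        println!("{}", token);
    }

    // Implementing `next` is enough to get the standard adapters for free.
    let long_tokens: Vec<String> = Tokenizer::new("( + 12 345 )")
        .filter(|t| t.len() > 1)
        .collect();
    println!("{:?}", long_tokens); // prints ["12", "345"]
}
```

Only `next` had to be written by hand. The `for` loop, `filter`, `collect` and the rest of the `Iterator` methods build on it.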