type = "post"
+++
title = "Iterators in Rust"
author = ["Tushar"]
description = "Quick intro to the Rust iterators"
date = 2019-03-14T13:33:28
tags = ["rust", "technical"]
draft = false
+++
Iterators
To restart my Rust journey, I am writing an interpreter for a small subset of Lisp. One of the pieces the interpreter needs is a tokenizer, which converts the source string into tokens. Since Lisp is mostly parens, characters, whitespace and dots, it was an easy implementation, although made difficult because … Rust.
Anyhow, my approach was to create a tokenizer that hands back one token at a time, which I can then convert into an AST. I went with a word-based tokenizer to save myself some time and avoid writing a character-based one.
It turned out to be less helpful than I had hoped, and I will have to implement a character-based tokenizer after all, but it was a good exercise because I got to implement the iterator pattern.
So let’s try to understand how iterators work in Rust.
An iterator produces a sequence of values lazily: it hands out one value at a time and stops when either the values run out or some other condition is met. Because it is lazy, the next value is not calculated until it is asked for. In the meantime the iterator just rests and does nothing, picking up where it left off when the next value is requested.
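To make the laziness concrete, here is a small sketch using only standard library iterators. The function name computed_after and the sample words are made up for illustration. It counts how many values were actually computed after asking for n of them.

```rust
use std::cell::Cell;

/// Returns how many values were actually computed after asking for `n` of them.
/// (`computed_after` is a hypothetical helper, used only for this illustration.)
fn computed_after(n: usize) -> usize {
    let computed = Cell::new(0);
    let words = ["define", "x", "10"];

    // Building the iterator does no work yet; the closure has not run.
    let mut lengths = words.iter().map(|w| {
        computed.set(computed.get() + 1);
        w.len()
    });

    // Each call to `next` computes exactly one value, or returns `None`
    // without running the closure once the words are exhausted.
    for _ in 0..n {
        lengths.next();
    }
    computed.get()
}

fn main() {
    assert_eq!(computed_after(0), 0); // nothing asked for, nothing computed
    assert_eq!(computed_after(2), 2); // two values asked for, two computed
    assert_eq!(computed_after(5), 3); // only three values exist
}
```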
Since it has to start and stop so many times, it needs a place to save its state. The implementation demands a struct which holds this state, and each iteration needs to either return the next value and update the state, or stop in case the values are exhausted.
We start with creating the struct. What goes into the state depends on what you need in order to produce the value for each iteration. This is the state of the iterator and is updated in each iteration.
We also write a constructor function for this struct, which allows us to create the instance. This just seeds the state with the string that we pass.
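A minimal sketch of what such a struct and its constructor could look like. The field name remaining and the constructor name new are assumptions, not necessarily the names used in the original code.

```rust
/// Iterator state: the part of the input that has not been tokenized yet.
/// The field name `remaining` is an assumption made for this sketch.
struct Tokenizer {
    remaining: String,
}

impl Tokenizer {
    /// Seeds the state with the string we pass in.
    /// The struct owns a `String`, so no lifetime annotations are needed.
    fn new(input: &str) -> Tokenizer {
        Tokenizer {
            remaining: input.to_string(),
        }
    }
}

fn main() {
    let tokenizer = Tokenizer::new("(define x 10)");
    assert_eq!(tokenizer.remaining, "(define x 10)");
}
```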
We now implement the actual functionality, i.e. implement the Iterator trait for Tokenizer. The heart of this implementation is a function called next, which returns the next token from the string. It uses another function called parse_token, which returns the next token and the remaining string. The implementation of next is free to call any other function it needs.
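Here is a hedged sketch of how the trait implementation might look. The Item type (String), the field name remaining, and the exact signature of parse_token are assumptions. The helper here is a stand-in that splits on whitespace, in line with the word-based approach described earlier.

```rust
struct Tokenizer {
    remaining: String, // assumed field name
}

impl Tokenizer {
    fn new(input: &str) -> Tokenizer {
        Tokenizer { remaining: input.to_string() }
    }
}

/// Stand-in helper (assumed signature): returns the next whitespace-separated
/// word, if any, together with the rest of the string.
fn parse_token(input: &str) -> (Option<String>, String) {
    let trimmed = input.trim_start();
    if trimmed.is_empty() {
        return (None, String::new());
    }
    match trimmed.find(char::is_whitespace) {
        Some(end) => (Some(trimmed[..end].to_string()), trimmed[end..].to_string()),
        None => (Some(trimmed.to_string()), String::new()),
    }
}

impl Iterator for Tokenizer {
    type Item = String;

    /// Returns the next token and updates the state, or `None` when exhausted.
    fn next(&mut self) -> Option<String> {
        let (token, rest) = parse_token(&self.remaining);
        self.remaining = rest; // save the state for the following call
        token
    }
}

fn main() {
    let mut tokenizer = Tokenizer::new("(+ 1 2)");
    assert_eq!(tokenizer.next(), Some("(+".to_string()));
    assert_eq!(tokenizer.next(), Some("1".to_string()));
    assert_eq!(tokenizer.next(), Some("2)".to_string()));
    assert_eq!(tokenizer.next(), None);
}
```

Note that next only has to produce one value and update the state. Rust builds everything else (for loops, map, collect and so on) on top of that single method.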
And finally here’s the parse_token function. I am passing the state as a string reference &str instead of String because that is how I had written the function in the original implementation. It hardly matters one way or the other. I must point out, though, that storing references inside a struct forces you to add lifetime annotations, so this example, which avoids them, is a bit light on the brain.
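A sketch of a word-based parse_token under stated assumptions: the return type (an optional token plus the remaining string, both owned) and the whitespace-splitting rule are guesses at the original, based only on the description above.

```rust
/// Takes the not-yet-tokenized input by reference and returns
/// (next token if any, remaining string).
/// The return type is an assumption made for this sketch.
fn parse_token(input: &str) -> (Option<String>, String) {
    // Skip leading whitespace so a token never starts with a space.
    let trimmed = input.trim_start();
    if trimmed.is_empty() {
        // Nothing left: signal exhaustion with `None`.
        return (None, String::new());
    }
    // A token ends at the first whitespace character, or at the end of input.
    match trimmed.find(char::is_whitespace) {
        Some(end) => (Some(trimmed[..end].to_string()), trimmed[end..].to_string()),
        None => (Some(trimmed.to_string()), String::new()),
    }
}

fn main() {
    let (token, rest) = parse_token("  (define x 10)");
    assert_eq!(token, Some("(define".to_string()));
    assert_eq!(rest, " x 10)");

    // Whitespace-only input yields no token.
    assert_eq!(parse_token("   "), (None, String::new()));
}
```

Because the function only borrows its input and returns owned values, it has no lifetime parameters of its own.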
Here’s how it works:
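The following is a compact, self-contained sketch of the whole thing in use. The names and signatures (remaining, new, the return type of parse_token) are assumptions made so that the snippet compiles on its own, not the original code.

```rust
struct Tokenizer {
    remaining: String, // assumed field name
}

impl Tokenizer {
    fn new(input: &str) -> Tokenizer {
        Tokenizer { remaining: input.to_string() }
    }
}

// Assumed helper: next whitespace-separated word plus the rest of the string.
fn parse_token(input: &str) -> (Option<String>, String) {
    let trimmed = input.trim_start();
    if trimmed.is_empty() {
        return (None, String::new());
    }
    match trimmed.find(char::is_whitespace) {
        Some(end) => (Some(trimmed[..end].to_string()), trimmed[end..].to_string()),
        None => (Some(trimmed.to_string()), String::new()),
    }
}

impl Iterator for Tokenizer {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        let (token, rest) = parse_token(&self.remaining);
        self.remaining = rest;
        token
    }
}

fn main() {
    // A `for` loop keeps calling `next` until it returns `None`.
    for token in Tokenizer::new("(define x 10)") {
        println!("{}", token);
    }
    // prints:
    // (define
    // x
    // 10)

    // Iterator adaptors such as `map` and `collect` come for free.
    let lengths: Vec<usize> = Tokenizer::new("(+ 1 2)").map(|t| t.len()).collect();
    assert_eq!(lengths, vec![2, 1, 2]);
}
```

In this sketch the parens stay glued to the neighbouring words, which shows the limitation of a word-based tokenizer for Lisp mentioned at the start.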