Diving into Rust with a CLI
A blog post titled "Diving into Go by Building a CLI Application" has been making its rounds on the internet. It uses a small XKCD downloader as the subject. I thought it was small and self-contained enough that it'd be interesting to see the same example in Rust!
Updates: Updated on 2021-10-09 to clap v3.0-beta.4
NOTE: This article assumes the reader is at least mildly familiar with Rust.
This will be an almost one-to-one comparison; however, there are slight differences that I will call out as we go along.
We'll call this grab-xkcd, since we're dropping the go.
... ( ••)>⌐■-■ (⌐■■) ...
$ cargo new grab-xkcd
Overview
The basic structure will be this:
- A main() function will drive the application, and exit with any error messages
- An Args struct will hold and control our CLI
- An XkcdClient struct will hold our logic for making requests and running the application
- A Comic struct will hold our representation of a single comic
- A ComicResponse will represent the JSON returned by the XKCD API
CLI
With those components in mind, I like to start all CLI applications by building, or at least stubbing the CLI. This allows me to get a feel for what it's like to use the tool, and often leads to small changes for a better user experience (UX). Granted, we're talking about the CLI here, so UX is on a relative scale, but there is nothing that says a CLI has to be terrible!
As per the other article, we'll accept four arguments:
- --num: which allows picking a certain comic, or defaults to 0, which the XKCD API defines as the most recent comic
- --save: to fetch and save the actual comic image to our current directory
- --output: to allow selecting between JSON or Text representations of our Comic (defaulting to text)
- --timeout: which allows customizing the request timeout, defaulting to 30 seconds
We can do this all in just a few lines of Rust via the clap crate. For those familiar with the structopt crate, which lets one define a Rust struct that contains all the CLI logic: as of clap 3.0, that functionality has been merged into clap itself. We will use the 3.0.0-beta.4 release of clap to demonstrate.
We'll use cargo-edit to add our dependencies:
$ cargo add clap --allow-prerelease
Updating 'https://github.com/rust-lang/crates.io-index' index
Adding clap v3.0.0-beta.4 to dependencies
Here is our CLI:
use clap::{ArgEnum, Clap};

/// A utility to grab XKCD comics
#[derive(Clap)]
pub struct Args {
    /// Set a connection timeout
    #[clap(long, short, default_value = "30")]
    pub timeout: u64,

    /// Print output in a format
    #[clap(long, short, arg_enum, default_value = "text")]
    pub output: OutFormat,

    /// The comic to load
    #[clap(long, short, default_value = "0")]
    pub num: usize,

    /// Save image file to current directory
    #[clap(long, short)]
    pub save: bool,
}

#[derive(ArgEnum, Copy, Clone)]
pub enum OutFormat {
    Json,
    Text,
}
I'll do a quick walkthrough of the highlights to explain a few differences from the first article.
- #[derive(Clap)] is where all the magic happens; it tells clap to use the struct Args as the CLI.
- /// ... is a documentation comment, which gets translated into the "about" section of a CLI or argument.
- long, short tells clap to create a long --foo and short -f switch automatically based off the field name.
- arg_enum tells clap to use the field's enum as the value variants.
- #[derive(ArgEnum)] allows clap to use the enum variants as values for a particular argument. It does things like auto-derive std::str::FromStr, handles ASCII upper/lowercase differences, and compares the supplied value at runtime to the variants.
Notice we included an enum OutFormat with two variants. This allows us to limit the possible values provided on the CLI to known values. It also means we don't have to use Strings to represent it, reducing errors and typos!
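To get a feel for what the derive saves us from writing, here is a rough, std-only sketch of a hand-written equivalent. This is not the code clap generates; the FromStr impl and its error message are assumptions made purely for illustration.

```rust
use std::str::FromStr;

/// Same two variants as the enum above.
#[derive(Debug, Copy, Clone, PartialEq)]
pub enum OutFormat {
    Json,
    Text,
}

impl FromStr for OutFormat {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Compare case-insensitively (ASCII only) against the known variants.
        if s.eq_ignore_ascii_case("json") {
            Ok(OutFormat::Json)
        } else if s.eq_ignore_ascii_case("text") {
            Ok(OutFormat::Text)
        } else {
            // Hypothetical message; clap's real error output is richer.
            Err(format!("'{}' isn't a valid value [possible values: json, text]", s))
        }
    }
}
```

With this in place, "json".parse::<OutFormat>() yields a variant, while an unknown value such as "yaml" yields an Err instead of silently passing a bad string around.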
If we provide some value not expected, clap informs the user and exits:
$ grab-xkcd --output yaml
error: 'yaml' isn't a valid value for '--output <FMT>'
[possible values: json, text]
USAGE:
grab-xkcd --num <num> --output <FMT> --timeout <timeout>
For more information try --help
Nice!
Ok, so the CLI is pretty much complete. Let's drive our main() function with what we've got so far:
fn main() {
    let args = Args::parse();
    // ... todo
}
Since clap handles the errors on invalid CLI use, we don't need to worry about any sort of validation or exiting on errors.
NOTE: clap does provide the ability to handle errors manually. But for this quick example it's out of scope.
Let's also add some of those constants from the other article:
const BASE_URL: &str = "https://xkcd.com";
const LATEST_COMIC: usize = 0;
As noted in the comments of the other article, 0 represents the latest comic according to the API.
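As a small sketch of how these two constants might be used together, the helper below builds the JSON endpoint for a given comic number. The function name comic_url is hypothetical and not part of the article's code; the info.0.json paths are the ones used by the client later on.

```rust
const BASE_URL: &str = "https://xkcd.com";
const LATEST_COMIC: usize = 0;

/// Hypothetical helper (not part of the article's code): builds the JSON
/// endpoint for a comic number, treating 0 as "the latest comic".
fn comic_url(num: usize) -> String {
    if num == LATEST_COMIC {
        // No number in the path means "most recent".
        format!("{}/info.0.json", BASE_URL)
    } else {
        format!("{}/{}/info.0.json", BASE_URL, num)
    }
}
```

Keeping the "0 means latest" rule in one place like this avoids sprinkling the magic number throughout the code.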
XkcdClient
Alright, now we can start on the client to actually fetch the data from the server.
We'll use this client to hold the logic for our application, which means it needs to do a few things:
- Make a GET request against the XKCD API
- Parse the response JSON into a Comic
- Print the Comic in the format specified by --output
- If the user requested to save the image:
  - Fetch the image from the server via another HTTP GET request
  - Write the data to a file
- Return any errors that happen along the way
Since we'll need to know a few items from our CLI, we can make our XkcdClient contain a field holding our Args struct, along with a way to instantiate a new instance of this client:
struct XkcdClient {
    args: Args,
}

impl XkcdClient {
    fn new(args: Args) -> Self {
        XkcdClient { args }
    }
}
Now we can add that to our main() function:
fn main() {
    let args = Args::parse();
    let client = XkcdClient::new(args);
}
At this point, I usually stub out the rest of the main function:
fn main() {
    let args = Args::parse();
    let client = XkcdClient::new(args);
    client.run();
}
Hmm... but we already said errors can happen along the way, so XkcdClient::run will most likely return some form of Result.
We have some options: either match on the return of run() and print the error and exit the process ourselves, or just let main() handle it. For this simple example, we'll let main() handle it. This comes at the expense that some error messages may be a little cryptic (e.g. Error: OS error 1), but for a small tool like this it's fine. We can leave it as an exercise for the reader to implement real error handling.
To let main() handle our errors, it needs to return a Result as well. Since we'll be dealing with all kinds of different errors from different crates, it's helpful to have some convenience methods and a common representation for handling all this. The anyhow crate does just this!
$ cargo add anyhow
Updating 'https://github.com/rust-lang/crates.io-index' index
Adding anyhow v1.0.31 to dependencies
We can now use the anyhow::Result type, whose E generic defaults to anyhow::Error, which is a lot like a Box<dyn std::error::Error> under the covers.
So we update our main() function:
use anyhow::Result;

fn main() -> Result<()> {
    let args = Args::parse();
    let client = XkcdClient::new(args);
    client.run()
}
Now if our run() method returns an Err(..), Rust will print a message along the lines of:
Error: foo
when it exits. This also allows us to bubble up all the various errors with a simple ? operator.
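To illustrate the bubbling without pulling in any crates, here is a minimal sketch that uses Box<dyn std::error::Error> as a stand-in for anyhow::Error. The function timeout_from_bytes is hypothetical and exists only to show two unrelated error types flowing through the same ? operator.

```rust
use std::error::Error;

// Stand-in for `anyhow::Result`; an assumption for this std-only sketch.
type Result<T> = std::result::Result<T, Box<dyn Error>>;

/// Hypothetical helper: turns raw bytes such as b"30" into a timeout in seconds.
fn timeout_from_bytes(raw: &[u8]) -> Result<u64> {
    // May fail with `std::str::Utf8Error`.
    let text = std::str::from_utf8(raw)?;
    // May fail with `std::num::ParseIntError`.
    let secs = text.trim().parse::<u64>()?;
    Ok(secs)
}
```

Each ? converts its specific error into the boxed error type automatically, which is the same convenience anyhow gives us (plus extras such as attaching context).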
Now it looks like we should implement XkcdClient::run; however, since that will drive the entire application, it will utilize a bunch of stuff we haven't created yet. So let's hold off and come back to it later.
ComicResponse
First, we know we'll make an HTTP GET request against the API, so let's create a struct to hold the response:
pub struct ComicResponse {
    month: String,
    num: usize,
    link: String,
    year: String,
    news: String,
    safe_title: String,
    transcript: String,
    alt: String,
    img: String,
    title: String,
    day: String,
}
Not too bad. But the API will return JSON, so we'll need to unmarshal the data into our struct. Luckily, the serde crate exists for serializing and deserializing data with ease.
I can't stress this enough: serde is, bar none, the best serialization library I've used in any language. Much like the clap crate for CLIs (but I'm biased about that, since I wrote clap ᕕ(⌐■_■)ᕗ ♪♬).
We can add some crates used for this serialization now:
$ cargo add serde serde_derive serde_json
Updating 'https://github.com/rust-lang/crates.io-index' index
Adding serde v1.0.110 to dependencies
Adding serde_derive v1.0.110 to dependencies
Adding serde_json v1.0.53 to dependencies
Why three crates?
- serde provides the traits
- serde_derive provides the procedural macros so we can just tag our structs with a #[derive(..)]
- serde_json uses these to convert to/from JSON strings
So all we do is update our ComicResponse struct with the attribute:
use serde_derive::Deserialize;

#[derive(Deserialize)]
pub struct ComicResponse {
    // .. same as before
}
We only care about deserialization (from JSON to Rust struct), but if we cared about going the other direction we could have just as easily added Serialize to the list of #[derive]d traits.
Comic
So we have our response, which is the Rust representation of the raw JSON, but we only care about a few elements. Likewise, we'd like to display the date in a more friendly format.
Let's first create our Comic struct to hold the data we care about:
struct Comic {
    title: String,
    num: usize,
    date: String,
    desc: String,
    img_url: String,
}
The most idiomatic Rust way to get from one type to another is the From trait. So we can now implement the From<ComicResponse> trait for our Comic, which will give us some sweet .into() abilities.
impl From<ComicResponse> for Comic {
    fn from(cr: ComicResponse) -> Self {
        Comic {
            title: cr.title,
            num: cr.num,
            date: format!("{}-{}-{}", cr.day, cr.month, cr.year),
            desc: cr.alt,
            img_url: cr.img,
        }
    }
}
Pretty self-explanatory.
Ok, so we have a Comic and a ComicResponse, and the ability to convert from one to the other. What else do we need to implement run?
Although we derived the ability to go from a JSON-formatted string to a ComicResponse, we haven't actually utilized that code anywhere yet. So let's wire that up with another From impl:
impl From<String> for ComicResponse {
    fn from(json: String) -> Self {
        serde_json::from_str(&json) // ...uh oh, returns a Result
    }
}
Hmm. Ok, so serde_json::from_str returns a Result, which makes sense. A given string may not be valid JSON, let alone valid for whatever struct we're trying to create.
So we can either expect that the JSON string we're given is always valid and panic otherwise, or find a way for From to express fallibility.
Turns out there is a version of From for exactly this: std::convert::TryFrom!
With a quick update, we're back at it:
use std::convert::TryFrom;

impl TryFrom<String> for ComicResponse {
    type Error = anyhow::Error;

    fn try_from(json: String) -> Result<Self, Self::Error> {
        serde_json::from_str(&json).map_err(|e| e.into())
    }
}
Notice we used anyhow::Error as the error type, but that meant we needed to map the error from the serde_json error to the anyhow error. Not hard, but something to note.
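For comparison, here is a tiny std-only sketch of the same TryFrom pattern where the associated Error type is the concrete error produced by the fallible call, so no mapping is needed. The ComicNum newtype is hypothetical and used only for this illustration.

```rust
use std::convert::TryFrom;
use std::num::ParseIntError;

/// Hypothetical newtype, used only for this illustration.
#[derive(Debug, PartialEq)]
struct ComicNum(usize);

impl TryFrom<&str> for ComicNum {
    // Using the concrete error type means no `map_err` is required.
    type Error = ParseIntError;

    fn try_from(s: &str) -> Result<Self, Self::Error> {
        // `parse` already returns a Result, so we only wrap the success value.
        s.trim().parse::<usize>().map(ComicNum)
    }
}
```

The trade-off is that callers now see a ParseIntError specifically; picking anyhow::Error as in the article costs one map_err but lets every fallible step in the application share a single error type.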
Alright! We finally have enough of the scaffolding to write run with some minor stubs!
XkcdClient::run
First, we know we'll be making an HTTP request, so we'll need some crate to do that for us. There are a ton of options out there. One I've used before is reqwest, which supports both async and blocking I/O. We'll be using the blocking I/O version, so let's add that crate now:
$ cargo add reqwest
Updating 'https://github.com/rust-lang/crates.io-index' index
Adding reqwest v0.11.5 to dependencies
But since we'll be using the blocking I/O, we need to manually edit the features array.
# Cargo.toml
[dependencies]
# .. snip
reqwest = { version = "0.11.5", features = ["blocking"]}
Now we can write the actual method! Here goes!
use std::{convert::TryInto, time::Duration};
impl XkcdClient {
    // ... same as above
    fn run(&self) -> Result<()> {
        // `num` is a plain usize; 0 (LATEST_COMIC) means "most recent comic"
        let url = if self.args.num == LATEST_COMIC {
            format!("{}/info.0.json", BASE_URL)
        } else {
            format!("{}/{}/info.0.json", BASE_URL, self.args.num)
        };
        let http_client = reqwest::blocking::ClientBuilder::new()
            .timeout(Duration::from_secs(self.args.timeout))
            .build()?;
        let resp: ComicResponse = http_client.get(&url).send()?.text()?.try_into()?;
        let comic: Comic = resp.into();
        if self.args.save {
            comic.save()?;
        }
        comic.print(self.args.output)?;
        Ok(())
    }
}
Ok, that's a lot to take in.
Let's go over the highlights:
- Lines 5-9: We build a request URL based on whether the user requested a particular comic; if not, we get the latest comic
- Lines 10-12: We build an HTTP client with a custom timeout based on --timeout or the default
- Line 13: We make the GET request, convert it to text (JSON), then attempt to convert to a ComicResponse
- Line 14: We convert the ComicResponse into a Comic
- Lines 15-17: If the user wants to save the image, we call the (still stubbed) save method
- Line 18: Prints out the Comic representation in the format requested, or the default
- Line 19: Returns no errors
Notice all the ? points! Turns out errors can happen everywhere, but thanks to anyhow::{Result, Error} we can easily convert between them. A more proper solution would be to create our own error representation that contains the original errors and knows how to print errors and exit appropriately. But that is out of scope for this article.
If we didn't care about custom timeouts, lines 10-13 could have been reduced to a single line using reqwest::blocking::get.
Oh well.
We've also gone ahead and added fallibility to the save and print functions, because both could realistically fail for many reasons, so it's a safe assumption that we'll be returning some kind of Result.
So let's implement those now.
Comic::print
This should be a simple addition; all we need to do is match on the OutFormat and print the Comic representation appropriately.
Let's stub that out:
impl Comic {
    // .. snip
    fn print(&self, of: OutFormat) -> Result<()> {
        match of {
            OutFormat::Text => println!("{}", todo!("print self as Text")),
            OutFormat::Json => println!("{}", todo!("print self as JSON")),
        }
        Ok(())
    }
}
NOTE: The signature of print takes OutFormat by value, which only works because we #[derive]d Copy. It's a small enum that fits into a register, so this is fine. If we had larger variants that caused OutFormat to be a large enum, we could have taken it by reference (&OutFormat) instead.
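A quick, self-contained sketch of why Copy matters here: the same value can be handed to a by-value function more than once. The helpers label and label_twice are hypothetical names used only for this example.

```rust
#[derive(Debug, Copy, Clone)]
enum OutFormat {
    Json,
    Text,
}

/// Hypothetical helper: takes the enum by value, like `print` does.
fn label(of: OutFormat) -> &'static str {
    match of {
        OutFormat::Json => "json",
        OutFormat::Text => "text",
    }
}

/// Passes the same value by value twice. This compiles only because
/// `OutFormat` is `Copy`; without it, the first call would move `of`.
fn label_twice(of: OutFormat) -> (&'static str, &'static str) {
    (label(of), label(of))
}
```

This is the same reason run can pass self.args.output to print even though it only has a shared reference to self.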
NOTE: todo!() is like unimplemented!() and will allow code to compile, but you'll receive a nice todo message if this code is reached during execution.
OutFormat::Text
Ok, so first, what does it mean to print the Comic struct as text? Well, we can implement std::fmt::Display for Comic, which will allow an instance of Comic to be printed using things like println! directly.
So if we implement Display for the OutFormat::Text variant, it would look something like this:
use std::fmt;

impl fmt::Display for Comic {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Title: {}\n\
             Comic No: {}\n\
             Date: {}\n\
             Description: {}\n\
             Image: {}\n",
            self.title, self.num, self.date, self.desc, self.img_url
        )
    }
}
Now instances of Comic can be passed directly to println! and will be formatted correctly. So our updated print method now looks like:
impl Comic {
    // .. snip
    fn print(&self, of: OutFormat) -> Result<()> {
        match of {
            OutFormat::Text => println!("{}", self), // <-- New
            OutFormat::Json => println!("{}", todo!("print self as JSON")),
        }
        Ok(())
    }
}
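One nice side effect of implementing Display, sketched below with a hypothetical cut-down MiniComic type and sample values: the type also gets to_string() for free and works with format!, which is handy for testing the output without printing.

```rust
use std::fmt;

/// Hypothetical cut-down comic with just two fields.
struct MiniComic {
    title: String,
    num: usize,
}

impl fmt::Display for MiniComic {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Title: {}\nComic No: {}", self.title, self.num)
    }
}

/// Anything that implements `Display` can be rendered into a `String`.
fn render(comic: &MiniComic) -> String {
    comic.to_string()
}
```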
OutFormat::Json
Now for JSON!
What if we could just use serde_json again, but in the reverse of before (Rust struct to JSON)? #[derive(Serialize)]!
use serde_derive::Serialize;

#[derive(Serialize)]
struct Comic {
    // .. snip
}
And we update our print method:
fn print(&self, of: OutFormat) -> Result<()> {
    match of {
        OutFormat::Text => println!("{}", self),
        OutFormat::Json => println!("{}", serde_json::to_string(self)?), // <-- New
    }
    Ok(())
}
Yep, we knew it could fail somehow! serde_json::to_string returns a Result in case it can't turn the struct into JSON for some reason.
Comic::save
It's debatable whether the logic for save should live in Comic or in XkcdClient, so I arbitrarily picked Comic. In a larger application it would become more apparent which location is more appropriate.
The logic will be similar to that of run, where we need to:
- Determine our current working directory (which can be done with std::env::current_dir)
- Determine the image file name
- Add the two together to form a file path for saving
- Perform an HTTP GET on the image URL
- Save the bytes to a file
We already have the image URL from the earlier request, but in order to get the file name, we'll need to split the URL into parts and take the final segment. There are many ways to do this, but the url crate makes it pretty easy with iterators.
First we add the url crate:
$ cargo add url
Updating 'https://github.com/rust-lang/crates.io-index' index
Adding url v2.1.1 to dependencies
Next we will need to create an instance of Url from the String we've been storing, and iterate its "segments" to get the last one, which will be the file name.
So let's do that much so far:
impl Comic {
    fn save(&self) -> Result<()> {
        let url = url::Url::parse(&self.img_url)?;
        let img_name = url.path_segments().unwrap().last().unwrap();
        let p = std::env::current_dir()?;
        let p = p.join(img_name);
        let mut file = std::fs::File::create(p)?;
        todo!("do HTTP GET and save the file!");
    }
}
There's a little bit to unpack here:
- Line 3: Parses the string URL into an instance of Url with fallibility
- Line 4: Iterates the segments, and takes the last one
- Line 5: Gets the current directory
- Line 6: Joins the current directory and the image file name to create a file path
- Line 7: Creates a file from the given file path for writing
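Since there are many ways to pull the file name out of a URL, here is one std-only alternative as a rough sketch. The helper file_name_from_url is hypothetical, handles only simple URLs, and is not a substitute for the proper parsing the url crate performs; the example URL in the tests is illustrative.

```rust
/// Hypothetical helper: returns the last path segment of a simple URL,
/// or None if there is no path or the path ends in '/'.
fn file_name_from_url(url: &str) -> Option<&str> {
    // Drop the scheme ("https://") if present.
    let rest = url.split_once("://").map_or(url, |(_, rest)| rest);
    // Everything after the first '/' is the path; no '/' means no path.
    let (_host, path) = rest.split_once('/')?;
    // Ignore any query string or fragment.
    let path = path.split(|c| c == '?' || c == '#').next()?;
    // Take the final segment, rejecting an empty one (trailing slash).
    path.rsplit('/').next().filter(|name| !name.is_empty())
}
```

Unlike the unwrap() calls in the snippet above, this returns an Option, leaving it to the caller to decide what a missing file name should mean.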
Since we already know how to perform a GET request, this next part should be a breeze. The only added part is writing the bytes of the response to a file, which can be done with the std::io::Write trait. It provides the write_all method for writing arbitrary bytes, while the bytes() method of the response turns the response into raw bytes, which can be dereferenced into the &[u8] that write_all accepts.
fn save(&self) -> Result<()> {
    use std::io::Write;
    // .. snip, same as before
    let body = reqwest::blocking::get(&self.img_url)?;
    file.write_all(&*body.bytes()?).map_err(|e| e.into())
}
We once again have to map the error type so it fits our anyhow::Error, but not too much drama.
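Stripped of the HTTP part, the file-writing step can be sketched with just the standard library. The helper write_bytes is a hypothetical name, and it returns std::io::Result rather than anyhow::Result so the sketch stays dependency-free.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: creates (or truncates) `path` and writes all of `bytes`.
fn write_bytes(path: &Path, bytes: &[u8]) -> io::Result<()> {
    let mut file = File::create(path)?;
    // `write_all` comes from the `std::io::Write` trait and keeps writing
    // until the whole slice has been written or an error occurs.
    file.write_all(bytes)
}
```

Because the error here is already an io::Error, no mapping is needed; the map_err in save exists only to convert into anyhow::Error.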
And we're done!
Notes on Differences
Even though the point of this article isn't to compare Rust and Go, it's still interesting to look at the two side by side.
We can see that the Rust code is a little more verbose than the Go code in some areas, while the Go code is more verbose in others. This, in my opinion, stems from the Rust code being much more explicit about the errors that are possible at all levels.
We've essentially ignored most errors and let them bubble up to main() for a simple Error: foo message, which we've already acknowledged may include some errors that seem mysterious to users. Again, if this were a real application, we could create our own error types and wrap (or map) the originating errors into more friendly messages.
The Rust code also relies on more external libraries. Reasonable people can disagree on if this is good, bad, or a non-issue. I personally like the power that these libraries bring, and their quality is astounding in most cases.
clap handles a ton for us, which accounts for some of the differences between the Go and Rust versions at the cost of almost no boilerplate. We've already seen what happens if we pass an invalid option to --output, but consider what happens if we typo --timeout and instead type --timeotu:
$ grab-xkcd --timeotu
error: Found argument '--timeotu' which wasn't expected, or isn't valid in this context
Did you mean '--timeout'?
If you tried to supply `--timeotu` as a PATTERN use `-- --timeotu`
USAGE:
grab-xkcd --timeout <timeout>
For more information try --help
We can also handle any of the following permutations of options (only using --output as an example):
-o text
-otext
-o=text
--output=text
--output text
We can even "stack" short flags (in our case only --save exists), making any of these valid as well ("stacking" --save with --output):
-so text
-so=text
-sotext
Another note about serde is we could have forgone the ComicResponse struct altogether and just used Comic to even further reduce the code. This works because serde can be told to #[serde(skip)] fields that we don't care about. However, I left it in to be closer to the Go version.
Conclusion
Hopefully this article shed some light on building a tiny CLI program in Rust.
Rust is an amazing language for building CLI utilities! In fact, it replaced Python as my go-to language for CLIs of any size quite some time ago.
In the next post we will add CLI Shell completions to our program!
The full code from this article can be found at github.com/kbknapp/grab-xkcd
Discuss this post on DEV.to!