Accepting Command Line Arguments
As always, let’s create a new project with cargo new. We’ll call our project minigrep to distinguish it from the grep tool that you might already have on your system:
$ cargo new minigrep
Created binary (application) `minigrep` project
$ cd minigrep
The first task is to make minigrep accept its two command line arguments: a string to search for and the path of the file to search in. That is, we want to be able to run our program with cargo run, followed by two hyphens to indicate that the arguments after them are for our program rather than for cargo, then the search string, then the file path, like so:
$ cargo run -- searchstring example-filename.txt
Right now, the program generated by cargo new cannot process the arguments we give it. Some existing libraries on crates.io can help with writing a program that accepts command line arguments, but because you’re just learning this concept, let’s implement this capability ourselves.
Reading the Argument Values
To enable minigrep to read the values of the command line arguments we pass to it, we’ll need the std::env::args function provided in Rust’s standard library. This function returns an iterator of the command line arguments passed to minigrep. We’ll cover iterators fully in Chapter 13. For now, you only need to know two details about iterators: iterators produce a series of values, and we can call the collect method on an iterator to turn it into a collection, such as a vector, that contains all the elements the iterator produces.
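As a small illustration of those two details, the sketch below first walks an iterator one value at a time and then gathers an iterator’s output with collect. The helper name shout and the sample words are made up for this example and are not part of minigrep.

```rust
// Hypothetical helper: uppercases each word and collects the results into a vector.
fn shout(words: &[&str]) -> Vec<String> {
    words.iter().map(|w| w.to_uppercase()).collect()
}

fn main() {
    let words = ["needle", "haystack"];

    // An iterator produces its values one at a time.
    for word in words.iter() {
        println!("{word}");
    }

    // collect gathers every value the iterator produces into a collection.
    let shouted = shout(&words);
    println!("{shouted:?}");
}
```

Here the return type of shout tells collect which collection to build, so no further annotation is needed inside the function.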
The code in Listing 12-1 allows your minigrep program to read any command line arguments passed to it and then collect the values into a vector.
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    dbg!(args);
}
First, we bring the std::env module into scope with a use statement so that we can use its args function. Notice that the std::env::args function is nested in two levels of modules. As we discussed in Chapter 7, in cases where the desired function is nested in more than one module, we’ve chosen to bring the parent module into scope rather than the function. By doing so, we can easily use other functions from std::env. It’s also less ambiguous than adding use std::env::args and then calling the function with just args, because args might easily be mistaken for a function that’s defined in the current module.
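To make that convention concrete, here is a minimal sketch that brings only the parent module into scope and then calls two of its functions through the env:: prefix. The helper arg_count is a hypothetical name used only for this illustration; env::current_dir is another standard library function in the same module.

```rust
use std::env;

// Hypothetical helper: counts the arguments, including the program name.
fn arg_count() -> usize {
    env::args().count()
}

fn main() {
    // The env:: prefix makes it clear that args comes from std::env.
    println!("argument count: {}", arg_count());

    // Other functions in the same module are available without another use line.
    match env::current_dir() {
        Ok(dir) => println!("current directory: {}", dir.display()),
        Err(e) => println!("could not read the current directory: {e}"),
    }
}
```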
The args Function and Invalid Unicode
Note that std::env::args will panic if any argument contains invalid Unicode. If your program needs to accept arguments containing invalid Unicode, use std::env::args_os instead. That function returns an iterator that produces OsString values instead of String values. We’ve chosen to use std::env::args here for simplicity because OsString values differ per platform and are more complex to work with than String values.
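For readers who do need to tolerate invalid Unicode, the following is one possible sketch built on std::env::args_os. It assumes that replacing invalid sequences with the Unicode replacement character is acceptable, which will not suit every program. The helper lossy_args is a hypothetical name, not something minigrep uses.

```rust
use std::env;
use std::ffi::OsString;

// Hypothetical helper: converts OsString values to String values,
// replacing any invalid Unicode with U+FFFD instead of panicking.
fn lossy_args<I: Iterator<Item = OsString>>(args: I) -> Vec<String> {
    args.map(|arg| arg.to_string_lossy().into_owned()).collect()
}

fn main() {
    // args_os yields OsString values, so invalid Unicode does not cause a panic here.
    let args = lossy_args(env::args_os());
    println!("{args:?}");
}
```

A program that must pass file paths through unchanged would more likely keep the OsString values as they are rather than convert them.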
On the first line of main, we call env::args, and we immediately use collect to turn the iterator into a vector containing all the values produced by the iterator. We can use the collect function to create many kinds of collections, so we explicitly annotate the type of args to specify that we want a vector of strings. Although you very rarely need to annotate types in Rust, collect is one function you do often need to annotate because Rust isn’t able to infer the kind of collection you want.
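The sketch below shows why the annotation matters: the same iterator chain builds a vector, a set, or a single string depending only on the type requested. The helpers unique_count and joined, and the sample words, are invented for this illustration.

```rust
use std::collections::HashSet;

// Hypothetical helper: collects into a HashSet, which drops duplicates.
fn unique_count(words: &[&str]) -> usize {
    let set: HashSet<&str> = words.iter().copied().collect();
    set.len()
}

// Hypothetical helper: collects into a String using the "turbofish" syntax,
// an alternative to annotating the variable.
fn joined(words: &[&str]) -> String {
    words.iter().copied().collect::<String>()
}

fn main() {
    let words = ["needle", "haystack", "needle"];

    // The annotation on the variable tells collect to build a vector.
    let as_vec: Vec<&str> = words.iter().copied().collect();

    println!("{} items in the vector", as_vec.len());
    println!("{} unique items", unique_count(&words));
    println!("{}", joined(&words));
}
```

Removing the Vec<&str> annotation from as_vec would produce a compile error, because nothing else in the statement says which collection to create.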
Finally, we print the vector using the debug macro, dbg!. Let’s try running the code first with no arguments and then with two arguments:
$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.61s
     Running `target/debug/minigrep`
[src/main.rs:5:5] args = [
    "target/debug/minigrep",
]

$ cargo run -- needle haystack
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.57s
     Running `target/debug/minigrep needle haystack`
[src/main.rs:5:5] args = [
    "target/debug/minigrep",
    "needle",
    "haystack",
]
Notice that the first value in the vector is "target/debug/minigrep", which is the name of our binary. This matches the behavior of the arguments list in C, letting programs use the name by which they were invoked in their execution. It’s often convenient to have access to the program name in case you want to print it in messages or change the behavior of the program based on what command line alias was used to invoke the program. But for the purposes of this chapter, we’ll ignore it and save only the two arguments we need.
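As a sketch of both uses of that first value, the example below reads the program name for a message and separately skips it to keep only the remaining arguments. The helper user_args and the fallback name are assumptions made for this illustration. The chapter itself simply indexes past the first element.

```rust
use std::env;

// Hypothetical helper: drops the first value (the program name) and keeps the rest.
fn user_args<I: Iterator<Item = String>>(args: I) -> Vec<String> {
    args.skip(1).collect()
}

fn main() {
    // Read the program name from the front of the list, falling back to a
    // placeholder if the platform supplies no name.
    let program = env::args().next().unwrap_or_else(|| String::from("minigrep"));

    let rest = user_args(env::args());
    println!("{program} received {} argument(s): {rest:?}", rest.len());
}
```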
Saving the Argument Values in Variables
The program is currently able to access the values specified as command line arguments. Now we need to save the values of the two arguments in variables so that we can use the values throughout the rest of the program. We do that in Listing 12-2.
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();

    let query = &args[1];
    let file_path = &args[2];

    println!("Searching for {query}");
    println!("In file {file_path}");
}
As we saw when we printed the vector, the program’s name takes up the first value in the vector at args[0], so the arguments we care about start at index 1. The first argument minigrep takes is the string we’re searching for, so we put a reference to the first argument in the variable query. The second argument will be the file path, so we put a reference to the second argument in the variable file_path.
We temporarily print the values of these variables to prove that the code is working as we intend. Let’s run this program again with the arguments test and sample.txt:
$ cargo run -- test sample.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep test sample.txt`
Searching for test
In file sample.txt
Great, the program is working! The values of the arguments we need are being saved into the right variables. Later we’ll add some error handling to deal with certain potential erroneous situations, such as when the user provides no arguments; for now, we’ll ignore that situation and work on adding file-reading capabilities instead.
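To see why that error handling matters, note that indexing with args[1] panics when the user supplies too few arguments. The sketch below is only a rough preview under stated assumptions, not the approach the chapter develops later: it uses the slice method get, which returns an Option instead of panicking. The helper parse_args and the message text are hypothetical.

```rust
use std::env;

// Hypothetical helper: returns the query and file path,
// or None if either argument is missing.
fn parse_args(args: &[String]) -> Option<(&str, &str)> {
    let query = args.get(1)?;
    let file_path = args.get(2)?;
    Some((query.as_str(), file_path.as_str()))
}

fn main() {
    let args: Vec<String> = env::args().collect();

    match parse_args(&args) {
        Some((query, file_path)) => {
            println!("Searching for {query}");
            println!("In file {file_path}");
        }
        // Reached when fewer than two arguments follow the program name.
        None => println!("expected two arguments: a search string and a file path"),
    }
}
```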