Refactoring to Improve Modularity and Error Handling

To improve our program, we’ll fix four problems that have to do with the program’s structure and how it’s handling potential errors. First, our main function now performs two tasks: It parses arguments and reads files. As our program grows, the number of separate tasks the main function handles will increase. As a function gains responsibilities, it becomes more difficult to reason about, harder to test, and harder to change without breaking one of its parts. It’s best to separate functionality so that each function is responsible for one task.

This issue also ties into the second problem: Although query and file_path are configuration variables to our program, variables like contents are used to perform the program’s logic. The longer main becomes, the more variables we’ll need to bring into scope; the more variables we have in scope, the harder it will be to keep track of the purpose of each. It’s best to group the configuration variables into one structure to make their purpose clear.

The third problem is that we’ve used expect to print an error message when reading the file fails, but the error message just prints Should have been able to read the file. Reading a file can fail in a number of ways: For example, the file could be missing, or we might not have permission to open it. Right now, regardless of the situation, we’d print the same error message for everything, which wouldn’t give the user any information!

Fourth, we use expect to handle an error, and if the user runs our program without specifying enough arguments, they’ll get an index out of bounds error from Rust that doesn’t clearly explain the problem. It would be best if all the error-handling code were in one place so that future maintainers had only one place to consult the code if the error-handling logic needed to change. Having all the error-handling code in one place will also ensure that we’re printing messages that will be meaningful to our end users.

Let’s address these four problems by refactoring our project.

Separating Concerns in Binary Projects

The organizational problem of allocating responsibility for multiple tasks to the main function is common to many binary projects. As a result, many Rust programmers find it useful to split up the separate concerns of a binary program when the main function starts getting large. This process has the following steps:

  • Split your program into a main.rs file and a lib.rs file and move your program’s logic to lib.rs.

  • As long as your command line parsing logic is small, it can remain in the main function.

  • When the command line parsing logic starts getting complicated, extract it from the main function into other functions or types.

The responsibilities that remain in the main function after this process should be limited to the following:

  • Calling the command line parsing logic with the argument values

  • Setting up any other configuration

  • Calling a run function in lib.rs

  • Handling the error if run returns an error

This pattern is about separating concerns: main.rs handles running the program and lib.rs handles all the logic of the task at hand. Because you can’t test the main function directly, this structure lets you test all of your program’s logic by moving it out of the main function. The code that remains in the main function will be small enough to verify its correctness by reading it. Let’s rework our program by following this process.

Extracting the Argument Parser

We’ll extract the functionality for parsing arguments into a function that main will call. Listing 12-5 shows the new start of the main function that calls a new function parse_config, which we’ll define in src/main.rs.

use std::env;
use std::fs;

fn main() {
    let args: Vec<String> = env::args().collect();

    let (query, file_path) = parse_config(&args);

    // --snip--

    println!("Searching for {query}");
    println!("In file {file_path}");

    let contents = fs::read_to_string(file_path)
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}");
}

fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let file_path = &args[2];

    (query, file_path)
}

We’re still collecting the command line arguments into a vector, but instead of assigning the argument value at index 1 to the variable query and the argument value at index 2 to the variable file_path within the main function, we pass the whole vector to the parse_config function. The parse_config function then holds the logic that determines which argument goes in which variable and passes the values back to main. We still create the query and file_path variables in main, but main no longer has the responsibility of determining how the command line arguments and variables correspond.

This rework may seem like overkill for our small program, but we’re refactoring in small, incremental steps. After making this change, run the program again to verify that the argument parsing still works. It’s good to check your progress often, to help identify the cause of problems when they occur.
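One way to check that the extracted function still works, without going through the command line, is to call it with a hand-built vector. The following is a minimal sketch: parse_config has the same shape as in Listing 12-5, while the argument values ("minigrep", "needle", "haystack.txt") are hypothetical stand-ins for what env::args() would supply.

```rust
// Same shape as the parse_config function in Listing 12-5.
fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let file_path = &args[2];

    (query, file_path)
}

fn main() {
    // Hypothetical arguments standing in for env::args().collect();
    // index 0 is the program name, as on a real command line.
    let args: Vec<String> = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    let (query, file_path) = parse_config(&args);

    println!("Searching for {query}");
    println!("In file {file_path}");
}
```

Because parse_config only borrows a slice, any Vec<String> can be passed to it, which is what makes this kind of quick check possible.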

Grouping Configuration Values

We can take another small step to improve the parse_config function further. At the moment, we’re returning a tuple, but then we immediately break that tuple into individual parts again. This is a sign that perhaps we don’t have the right abstraction yet.

Another indicator that shows there’s room for improvement is the config part of parse_config, which implies that the two values we return are related and are both part of one configuration value. We’re not currently conveying this meaning in the structure of the data other than by grouping the two values into a tuple; we’ll instead put the two values into one struct and give each of the struct fields a meaningful name. Doing so will make it easier for future maintainers of this code to understand how the different values relate to each other and what their purpose is.

Listing 12-6 shows the improvements to the parse_config function.

use std::env;
use std::fs;

fn main() {
    let args: Vec<String> = env::args().collect();

    let config = parse_config(&args);

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    let contents = fs::read_to_string(config.file_path)
        .expect("Should have been able to read the file");

    // --snip--

    println!("With text:\n{contents}");
}

struct Config {
    query: String,
    file_path: String,
}

fn parse_config(args: &[String]) -> Config {
    let query = args[1].clone();
    let file_path = args[2].clone();

    Config { query, file_path }
}

We’ve added a struct named Config defined to have fields named query and file_path. The signature of parse_config now indicates that it returns a Config value. In the body of parse_config, where we used to return string slices that reference String values in args, we now define Config to contain owned String values. The args variable in main is the owner of the argument values and is only letting the parse_config function borrow them, which means we’d violate Rust’s borrowing rules if Config tried to take ownership of the values in args.

There are a number of ways we could manage the String data; the easiest, though somewhat inefficient, route is to call the clone method on the values. This will make a full copy of the data for the Config instance to own, which takes more time and memory than storing a reference to the string data. However, cloning the data also makes our code very straightforward because we don’t have to manage the lifetimes of the references; in this circumstance, giving up a little performance to gain simplicity is a worthwhile trade-off.
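To make the trade-off concrete, the sketch below places the cloning version next to a borrowing alternative. Config and parse_config follow Listing 12-6; BorrowedConfig and parse_config_borrowed are hypothetical names introduced here only for comparison and are not part of the chapter's code.

```rust
// Owned version, as in Listing 12-6: each field is a full copy of the data.
struct Config {
    query: String,
    file_path: String,
}

// Hypothetical borrowed alternative (not from the text): the fields are
// string slices tied to the lifetime 'a of the args vector.
struct BorrowedConfig<'a> {
    query: &'a str,
    file_path: &'a str,
}

fn parse_config(args: &[String]) -> Config {
    Config {
        query: args[1].clone(),
        file_path: args[2].clone(),
    }
}

fn parse_config_borrowed(args: &[String]) -> BorrowedConfig<'_> {
    BorrowedConfig {
        query: &args[1],
        file_path: &args[2],
    }
}

fn main() {
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    let borrowed = parse_config_borrowed(&args);
    let owned = parse_config(&args);
    println!("{} {}", borrowed.query, borrowed.file_path);

    // The owned Config can outlive args; the borrowed one could not be
    // used after this point because its slices point into args.
    drop(args);
    println!("{} {}", owned.query, owned.file_path);
}
```

The borrowed version avoids the copies but brings a lifetime parameter into every signature that mentions it, which is the complexity the cloning version sidesteps.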

The Trade-Offs of Using clone

There’s a tendency among many Rustaceans to avoid using clone to fix ownership problems because of its runtime cost. In Chapter 13, you’ll learn how to use more efficient methods in this type of situation. But for now, it’s okay to copy a few strings to continue making progress because you’ll make these copies only once and your file path and query string are very small. It’s better to have a working program that’s a bit inefficient than to try to hyperoptimize code on your first pass. As you become more experienced with Rust, it’ll be easier to start with the most efficient solution, but for now, it’s perfectly acceptable to call clone.

We’ve updated main so that it places the instance of Config returned by parse_config into a variable named config, and we updated the code that previously used the separate query and file_path variables so that it now uses the fields on the Config struct instead.

Now our code more clearly conveys that query and file_path are related and that their purpose is to configure how the program will work. Any code that uses these values knows to find them in the config instance in the fields named for their purpose.

Creating a Constructor for Config

So far, we’ve extracted the logic responsible for parsing the command line arguments from main and placed it in the parse_config function. Doing so helped us see that the query and file_path values were related, and that relationship should be conveyed in our code. We then added a Config struct to name the related purpose of query and file_path and to be able to return the values’ names as struct field names from the parse_config function.

So, now that the purpose of the parse_config function is to create a Config instance, we can change parse_config from a plain function to a function named new that is associated with the Config struct. Making this change will make the code more idiomatic. We can create instances of types in the standard library, such as String, by calling String::new. Similarly, by changing parse_config into a new function associated with Config, we’ll be able to create instances of Config by calling Config::new. Listing 12-7 shows the changes we need to make.

use std::env;
use std::fs;

fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::new(&args);

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    let contents = fs::read_to_string(config.file_path)
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}");

    // --snip--
}

// --snip--

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn new(args: &[String]) -> Config {
        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}

We’ve updated main where we were calling parse_config to instead call Config::new. We’ve changed the name of parse_config to new and moved it within an impl block, which associates the new function with Config. Try compiling this code again to make sure it works.

Fixing the Error Handling

Now we’ll work on fixing our error handling. Recall that attempting to access the values in the args vector at index 1 or index 2 will cause the program to panic if the vector contains fewer than three items. Try running the program without any arguments; it will look like this:

$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep`

thread 'main' panicked at src/main.rs:27:21:
index out of bounds: the len is 1 but the index is 1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The line index out of bounds: the len is 1 but the index is 1 is an error message intended for programmers. It won’t help our end users understand what they should do instead. Let’s fix that now.

Improving the Error Message

In Listing 12-8, we add a check in the new function that will verify that the slice is long enough before accessing index 1 and index 2. If the slice isn’t long enough, the program panics and displays a better error message.

use std::env;
use std::fs;

fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::new(&args);

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    let contents = fs::read_to_string(config.file_path)
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}");
}

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    // --snip--
    fn new(args: &[String]) -> Config {
        if args.len() < 3 {
            panic!("not enough arguments");
        }
        // --snip--

        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}

This code is similar to the Guess::new function we wrote in Listing 9-13, where we called panic! when the value argument was out of the range of valid values. Instead of checking for a range of values here, we’re checking that the length of args is at least 3, so the rest of the function can operate under the assumption that this requirement has been met. If args has fewer than three items, the condition args.len() < 3 will be true, and we call the panic! macro to end the program immediately.

With these extra few lines of code in new, let’s run the program without any arguments again to see what the error looks like now:

$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep`

thread 'main' panicked at src/main.rs:26:13:
not enough arguments
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This output is better: We now have a reasonable error message. However, we also have extraneous information we don’t want to give to our users. Perhaps the technique we used in Listing 9-13 isn’t the best one to use here: A call to panic! is more appropriate for a programming problem than a usage problem, as discussed in Chapter 9. Instead, we’ll use the other technique you learned about in Chapter 9—returning a Result that indicates either success or an error.

Returning a Result Instead of Calling panic!

We can instead return a Result value that will contain a Config instance in the successful case and will describe the problem in the error case. We’re also going to change the function name from new to build because many programmers expect new functions to never fail. When Config::build is communicating to main, we can use the Result type to signal there was a problem. Then, we can change main to convert an Err variant into a more practical error for our users without the surrounding text about thread 'main' and RUST_BACKTRACE that a call to panic! causes.

Listing 12-9 shows the changes we need to make to the return value of the function we’re now calling Config::build and the body of the function needed to return a Result. Note that this won’t compile until we update main as well, which we’ll do in the next listing.

use std::env;
use std::fs;

fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::new(&args);

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    let contents = fs::read_to_string(config.file_path)
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}");
}

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}

Our build function returns a Result with a Config instance in the success case and a string literal in the error case. Our error values will always be string literals that have the 'static lifetime.

We’ve made two changes in the body of the function: Instead of calling panic! when the user doesn’t pass enough arguments, we now return an Err value, and we’ve wrapped the Config return value in an Ok. These changes make the function conform to its new type signature.

Returning an Err value from Config::build allows the main function to handle the Result value returned from the build function and exit the process more cleanly in the error case.
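As a rough illustration of what the caller gains, the sketch below calls Config::build with too few and with enough arguments and matches on the outcome. The build function is the one from Listing 12-9; the derive attribute and the sample argument values are assumptions added here so the two results can be printed and compared.

```rust
// The derive is an addition for this sketch; the chapter's Config has none.
#[derive(Debug, PartialEq)]
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    // Same body as Listing 12-9.
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}

fn main() {
    // Hypothetical argument lists standing in for env::args().collect().
    let too_few = vec![String::from("minigrep")];
    let enough = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    for args in [too_few, enough] {
        // The caller decides what to do with each variant; nothing panics.
        match Config::build(&args) {
            Ok(config) => {
                println!("query = {}, file_path = {}", config.query, config.file_path)
            }
            Err(message) => println!("Problem parsing arguments: {message}"),
        }
    }
}
```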

Calling Config::build and Handling Errors

To handle the error case and print a user-friendly message, we need to update main to handle the Result being returned by Config::build, as shown in Listing 12-10. We’ll also take the responsibility of exiting the command line tool with a nonzero error code away from panic! and instead implement it by hand. A nonzero exit status is a convention to signal to the process that called our program that the program exited with an error state.

use std::env;
use std::fs;
use std::process;

fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::build(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    // --snip--

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    let contents = fs::read_to_string(config.file_path)
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}");
}

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}

In this listing, we’ve used a method we haven’t covered in detail yet: unwrap_or_else, which is defined on Result<T, E> by the standard library. Using unwrap_or_else allows us to define custom, non-panic! error handling. If the Result is an Ok value, this method’s behavior is similar to unwrap: It returns the inner value that Ok is wrapping. However, if the value is an Err value, this method calls the code in the closure, which is an anonymous function we define and pass as an argument to unwrap_or_else. We’ll cover closures in more detail in Chapter 13. For now, you just need to know that unwrap_or_else will pass the inner value of the Err, which in this case is the static string "not enough arguments" that we added in Listing 12-9, to our closure in the argument err that appears between the vertical pipes. The code in the closure can then use the err value when it runs.

We’ve added a new use line to bring process from the standard library into scope. The code in the closure that will be run in the error case is only two lines: We print the err value and then call process::exit. The process::exit function will stop the program immediately and return the number that was passed as the exit status code. This is similar to the panic!-based handling we used in Listing 12-8, but we no longer get all the extra output. Let’s try it:

$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.48s
     Running `target/debug/minigrep`
Problem parsing arguments: not enough arguments

Great! This output is much friendlier for our users.
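To see the two branches of unwrap_or_else in isolation, here is a small sketch that returns a fallback value from the closure instead of exiting the process. The helpers query_arg and query_or_fallback, and the "<none>" placeholder, are hypothetical names made up for this illustration; only unwrap_or_else itself comes from the standard library.

```rust
// Hypothetical helper: returns the query argument or a static error message.
fn query_arg(args: &[String]) -> Result<&str, &'static str> {
    match args.get(1) {
        Some(query) => Ok(query.as_str()),
        None => Err("not enough arguments"),
    }
}

// On Ok, unwrap_or_else hands back the inner value untouched.
// On Err, it runs the closure with the error and uses the closure's result.
fn query_or_fallback(args: &[String]) -> &str {
    query_arg(args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {err}");
        "<none>"
    })
}

fn main() {
    let with_query = vec![String::from("minigrep"), String::from("needle")];
    let without_query = vec![String::from("minigrep")];

    println!("query: {}", query_or_fallback(&with_query));
    println!("query: {}", query_or_fallback(&without_query));
}
```

In Listing 12-10 the closure never produces a value because process::exit ends the program first; this sketch returns a value only so that both paths can be observed in one run.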

Extracting Logic from main

Now that we’ve finished refactoring the configuration parsing, let’s turn to the program’s logic. As we stated in “Separating Concerns in Binary Projects”, we’ll extract a function named run that will hold all the logic currently in the main function that isn’t involved with setting up configuration or handling errors. When we’re done, the main function will be concise and easy to verify by inspection, and we’ll be able to write tests for all the other logic.

Listing 12-11 shows the small, incremental improvement of extracting a run function.

use std::env;
use std::fs;
use std::process;

fn main() {
    // --snip--

    let args: Vec<String> = env::args().collect();

    let config = Config::build(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    run(config);
}

fn run(config: Config) {
    let contents = fs::read_to_string(config.file_path)
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}");
}

// --snip--

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}

The run function now contains all the remaining logic from main, starting from reading the file. The run function takes the Config instance as an argument.

Returning Errors from run

With the remaining program logic separated into the run function, we can improve the error handling, as we did with Config::build in Listing 12-9. Instead of allowing the program to panic by calling expect, the run function will return a Result<T, E> when something goes wrong. This will let us further consolidate the logic around handling errors into main in a user-friendly way. Listing 12-12 shows the changes we need to make to the signature and body of run.

use std::env;
use std::fs;
use std::process;
use std::error::Error;

// --snip--


fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::build(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    run(config);
}

fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?;

    println!("With text:\n{contents}");

    Ok(())
}

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}

We’ve made three significant changes here. First, we changed the return type of the run function to Result<(), Box<dyn Error>>. This function previously returned the unit type, (), and we keep that as the value returned in the Ok case.

For the error type, we used the trait object Box<dyn Error> (and we brought std::error::Error into scope with a use statement at the top). We’ll cover trait objects in Chapter 18. For now, just know that Box<dyn Error> means the function will return a type that implements the Error trait, but we don’t have to specify what particular type the return value will be. This gives us flexibility to return error values that may be of different types in different error cases. The dyn keyword is short for dynamic.
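The flexibility described above can be sketched with a function whose failures come from three different sources. The parse_pair helper and its "count,ratio" input format are hypothetical and unrelated to minigrep; they are used here only because they fail with different standard library error types without touching the file system.

```rust
use std::error::Error;

// Hypothetical helper (not part of minigrep): parses input like "3,0.5".
fn parse_pair(input: &str) -> Result<(i32, f64), Box<dyn Error>> {
    // A plain &str message converts into Box<dyn Error> through `?`.
    let (left, right) = input
        .split_once(',')
        .ok_or("expected two comma-separated fields")?;

    // This `?` may return a ParseIntError.
    let count = left.trim().parse::<i32>()?;
    // This `?` may return a ParseFloatError.
    let ratio = right.trim().parse::<f64>()?;

    Ok((count, ratio))
}

fn main() {
    for input in ["3, 0.5", "three, 0.5", "3, half", "3"] {
        match parse_pair(input) {
            Ok((count, ratio)) => println!("{input:?} -> count {count}, ratio {ratio}"),
            Err(e) => println!("{input:?} -> error: {e}"),
        }
    }
}
```

All three error kinds leave the function through the same Box<dyn Error> return type, so the signature does not need to change as new failure cases are added.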

Second, we’ve removed the call to expect in favor of the ? operator, as we talked about in Chapter 9. Rather than panic! on an error, ? will return the error value from the current function for the caller to handle.
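As an approximate picture of what ? does, the sketch below writes the same file read twice: once with ? and once with an explicit match that returns early on error. The function names and the missing file path are hypothetical, and the match is a simplified model of the operator's behavior rather than its exact expansion.

```rust
use std::error::Error;
use std::fs;

// Uses `?`: on Err, the error is converted and returned to the caller.
fn read_with_question_mark(path: &str) -> Result<String, Box<dyn Error>> {
    let contents = fs::read_to_string(path)?;
    Ok(contents)
}

// Roughly equivalent hand-written form: unwrap the Ok value, or convert
// the error with From::from and return early.
fn read_with_match(path: &str) -> Result<String, Box<dyn Error>> {
    let contents = match fs::read_to_string(path) {
        Ok(contents) => contents,
        Err(e) => return Err(From::from(e)),
    };
    Ok(contents)
}

fn main() {
    // Hypothetical path that is assumed not to exist.
    let path = "no/such/dir/poem.txt";

    if let Err(e) = read_with_question_mark(path) {
        println!("with ?:     {e}");
    }
    if let Err(e) = read_with_match(path) {
        println!("with match: {e}");
    }
}
```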

Third, the run function now returns an Ok value in the success case. We’ve declared the run function’s success type as () in the signature, which means we need to wrap the unit type value in the Ok value. This Ok(()) syntax might look a bit strange at first. But using () like this is the idiomatic way to indicate that we’re calling run for its side effects only; it doesn’t return a value we need.

When you run this code, it will compile but will display a warning:

$ cargo run -- the poem.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
warning: unused `Result` that must be used
  --> src/main.rs:19:5
   |
19 |     run(config);
   |     ^^^^^^^^^^^
   |
   = note: this `Result` may be an `Err` variant, which should be handled
   = note: `#[warn(unused_must_use)]` on by default
help: use `let _ = ...` to ignore the resulting value
   |
19 |     let _ = run(config);
   |     +++++++

warning: `minigrep` (bin "minigrep") generated 1 warning
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.71s
     Running `target/debug/minigrep the poem.txt`
Searching for the
In file poem.txt
With text:
I'm nobody! Who are you?
Are you nobody, too?
Then there's a pair of us - don't tell!
They'd banish us, you know.

How dreary to be somebody!
How public, like a frog
To tell your name the livelong day
To an admiring bog!

Rust tells us that our code ignored the Result value and the Result value might indicate that an error occurred. But we’re not checking to see whether or not there was an error, and the compiler reminds us that we probably meant to have some error-handling code here! Let’s rectify that problem now.

Handling Errors Returned from run in main

We’ll check for errors and handle them using a technique similar to one we used with Config::build in Listing 12-10, but with a slight difference:

Filename: src/main.rs

use std::env;
use std::error::Error;
use std::fs;
use std::process;

fn main() {
    // --snip--

    let args: Vec<String> = env::args().collect();

    let config = Config::build(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    if let Err(e) = run(config) {
        println!("Application error: {e}");
        process::exit(1);
    }
}

fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?;

    println!("With text:\n{contents}");

    Ok(())
}

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}

We use if let rather than unwrap_or_else to check whether run returns an Err value and to call process::exit(1) if it does. The run function doesn’t return a value that we want to unwrap in the same way that Config::build returns the Config instance. Because run returns () in the success case, we only care about detecting an error, so we don’t need unwrap_or_else to return the unwrapped value, which would only be ().

The bodies of the if let and the unwrap_or_else functions are the same in both cases: We print the error and exit.
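To see the two forms side by side, here is a small sketch under stated assumptions: query_from and validate are hypothetical helpers invented for this example, standing in for a function whose success value we need and a function that succeeds with ().

```rust
use std::process;

// Hypothetical helper: success carries a value the caller needs.
fn query_from(args: &[String]) -> Result<&str, &'static str> {
    match args.get(1) {
        Some(q) => Ok(q.as_str()),
        None => Err("not enough arguments"),
    }
}

// Hypothetical helper: success is (), so only the error case is interesting.
fn validate(query: &str) -> Result<(), &'static str> {
    if query.is_empty() {
        return Err("query must not be empty");
    }
    Ok(())
}

fn main() {
    let args = vec![String::from("minigrep"), String::from("the")];

    // unwrap_or_else hands back the Ok value, or runs the closure on Err.
    let query = query_from(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    // There is no value to unwrap here, so if let only inspects the Err case.
    if let Err(e) = validate(query) {
        println!("Application error: {e}");
        process::exit(1);
    }

    println!("Searching for {query}");
}
```

Using unwrap_or_else on validate would also compile, but the value it handed back would only be (), which is why if let reads more directly in that position.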

Splitting Code into a Library Crate

Our minigrep project is looking good so far! Now we’ll split the src/main.rs file and put some code into the src/lib.rs file. That way, we can test the code and have a src/main.rs file with fewer responsibilities.

Let’s define the code responsible for searching text in src/lib.rs rather than in src/main.rs, which will let us (or anyone else using our minigrep library) call the searching function from more contexts than our minigrep binary.

First, let’s define the search function signature in src/lib.rs as shown in Listing 12-13, with a body that calls the unimplemented! macro. We’ll explain the signature in more detail when we fill in the implementation.

pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    unimplemented!();
}

We’ve used the pub keyword on the function definition to designate search as part of our library crate’s public API. We now have a library crate that we can use from our binary crate and that we can test!

Now we need to bring the code defined in src/lib.rs into the scope of the binary crate in src/main.rs and call it, as shown in Listing 12-14.

use std::env;
use std::error::Error;
use std::fs;
use std::process;

// --snip--
use minigrep::search;

fn main() {
    // --snip--
    let args: Vec<String> = env::args().collect();

    let config = Config::build(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    if let Err(e) = run(config) {
        println!("Application error: {e}");
        process::exit(1);
    }
}

// --snip--


struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}

fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?;

    for line in search(&config.query, &contents) {
        println!("{line}");
    }

    Ok(())
}

We add a use minigrep::search line to bring the search function from the library crate into the binary crate’s scope. Then, in the run function, rather than printing out the contents of the file, we call the search function and pass the config.query value and contents as arguments. Then, run will use a for loop to print each line returned from search that matched the query. This is also a good time to remove the println! calls in the main function that displayed the query and the file path so that our program only prints the search results (if no errors occur).

Note that the search function will be collecting all the results into a vector it returns before any printing happens. This implementation could be slow to display results when searching large files, because results aren’t printed as they’re found; we’ll discuss a possible way to fix this using iterators in Chapter 13.
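To make the trade-off concrete, here is one way a lazy variant might look. This is only a sketch: the name search_lazy, the shared lifetime on both parameters, and the case-sensitive contains matching are assumptions for illustration, not the implementation this chapter builds or the one Chapter 13 arrives at.

```rust
// Hypothetical lazy variant of search; not the book's implementation.
// It returns an iterator instead of a Vec, so each matching line is
// produced only when the caller asks for the next item.
fn search_lazy<'a>(query: &'a str, contents: &'a str) -> impl Iterator<Item = &'a str> {
    contents
        .lines()
        .filter(move |line| line.contains(query))
}

fn main() {
    let contents = "Rust:\nsafe, fast, productive.\nPick three.";

    // Each match is printed as soon as it is found; no vector is built first.
    for line in search_lazy("duct", contents) {
        println!("{line}");
    }
}
```

A caller that still wants all matches at once can call collect on the iterator, so the lazy form does not remove the Vec option; it only stops forcing it.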

Whew! That was a lot of work, but we’ve set ourselves up for success in the future. Now it’s much easier to handle errors, and we’ve made the code more modular. Almost all of our work will be done in src/lib.rs from here on out.

Let’s take advantage of this newfound modularity by doing something that would have been difficult with the old code but is easy with the new code: We’ll write some tests!