Using Threads to Run Code Simultaneously
In most current operating systems, an executed program’s code is run in a process, and the operating system will manage multiple processes at once. Within a program, you can also have independent parts that run simultaneously. The features that run these independent parts are called threads. For example, a web server could have multiple threads so that it can respond to more than one request at the same time.
Splitting the computation in your program into multiple threads to run multiple tasks at the same time can improve performance, but it also adds complexity. Because threads can run simultaneously, there’s no inherent guarantee about the order in which parts of your code on different threads will run. This can lead to problems, such as:
- Race conditions, in which threads are accessing data or resources in an inconsistent order
- Deadlocks, in which two threads are waiting for each other, preventing both threads from continuing
- Bugs that only happen in certain situations and are hard to reproduce and fix reliably
Rust attempts to mitigate the negative effects of using threads, but programming in a multithreaded context still takes careful thought and requires a code structure that is different from that in programs running in a single thread.
Programming languages implement threads in a few different ways, and many operating systems provide an API the programming language can call for creating new threads. The Rust standard library uses a 1:1 model of thread implementation, whereby a program uses one operating system thread per one language thread. There are crates that implement other models of threading that make different trade-offs to the 1:1 model. (Rust’s async system, which we will see in the next chapter, provides another approach to concurrency as well.)
Creating a New Thread with spawn
To create a new thread, we call the thread::spawn function and pass it a
closure (we talked about closures in Chapter 13) containing the code we want to
run in the new thread. The example in Listing 16-1 prints some text from a main
thread and other text from a new thread.
use std::thread;
use std::time::Duration;

fn main() {
    thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {i} from the spawned thread!");
            thread::sleep(Duration::from_millis(1));
        }
    });

    for i in 1..5 {
        println!("hi number {i} from the main thread!");
        thread::sleep(Duration::from_millis(1));
    }
}
Note that when the main thread of a Rust program completes, all spawned threads are shut down, whether or not they have finished running. The output from this program might be a little different every time, but it will look similar to the following:
hi number 1 from the main thread!
hi number 1 from the spawned thread!
hi number 2 from the main thread!
hi number 2 from the spawned thread!
hi number 3 from the main thread!
hi number 3 from the spawned thread!
hi number 4 from the main thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
The calls to thread::sleep force a thread to stop its execution for a short
duration, allowing a different thread to run. The threads will probably take
turns, but that isn’t guaranteed: It depends on how your operating system
schedules the threads. In this run, the main thread printed first, even though
the print statement from the spawned thread appears first in the code. And even
though we told the spawned thread to print until i is 9, it only got to 5
before the main thread shut down.
If you run this code and only see output from the main thread, or don’t see any overlap, try increasing the numbers in the ranges to create more opportunities for the operating system to switch between the threads.
Waiting for All Threads to Finish
The code in Listing 16-1 not only stops the spawned thread prematurely most of the time due to the main thread ending, but because there is no guarantee on the order in which threads run, we also can’t guarantee that the spawned thread will get to run at all!
We can fix the problem of the spawned thread not running or of it ending
prematurely by saving the return value of thread::spawn in a variable. The
return type of thread::spawn is JoinHandle<T>. A JoinHandle<T> is an
owned value that, when we call the join method on it, will wait for its
thread to finish. Listing 16-2 shows how to use the JoinHandle<T> of the
thread we created in Listing 16-1 and how to call join to make sure the
spawned thread finishes before main exits.
use std::thread;
use std::time::Duration;

fn main() {
    let handle = thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {i} from the spawned thread!");
            thread::sleep(Duration::from_millis(1));
        }
    });

    for i in 1..5 {
        println!("hi number {i} from the main thread!");
        thread::sleep(Duration::from_millis(1));
    }

    handle.join().unwrap();
}
Calling join on the handle blocks the thread currently running until the
thread represented by the handle terminates. Blocking a thread means that
thread is prevented from performing work or exiting. Because we’ve put the call
to join after the main thread’s for loop, running Listing 16-2 should
produce output similar to this:
hi number 1 from the main thread!
hi number 2 from the main thread!
hi number 1 from the spawned thread!
hi number 3 from the main thread!
hi number 2 from the spawned thread!
hi number 4 from the main thread!
hi number 3 from the spawned thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
hi number 6 from the spawned thread!
hi number 7 from the spawned thread!
hi number 8 from the spawned thread!
hi number 9 from the spawned thread!
The two threads continue alternating, but the main thread waits because of the
call to handle.join() and does not end until the spawned thread is finished.
But let’s see what happens when we instead move handle.join() before the
for loop in main, like this:
use std::thread;
use std::time::Duration;

fn main() {
    let handle = thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {i} from the spawned thread!");
            thread::sleep(Duration::from_millis(1));
        }
    });

    handle.join().unwrap();

    for i in 1..5 {
        println!("hi number {i} from the main thread!");
        thread::sleep(Duration::from_millis(1));
    }
}
The main thread will wait for the spawned thread to finish and then run its
for loop, so the output won’t be interleaved anymore, as shown here:
hi number 1 from the spawned thread!
hi number 2 from the spawned thread!
hi number 3 from the spawned thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
hi number 6 from the spawned thread!
hi number 7 from the spawned thread!
hi number 8 from the spawned thread!
hi number 9 from the spawned thread!
hi number 1 from the main thread!
hi number 2 from the main thread!
hi number 3 from the main thread!
hi number 4 from the main thread!
Small details, such as where join is called, can affect whether or not your
threads run at the same time.
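When a program spawns more than one thread, the same pattern extends naturally: keep every JoinHandle<T> and call join on each. The following is a minimal sketch rather than one of the numbered listings; the name spawn_workers and the squaring workload are made up for illustration. It also shows that the T in JoinHandle<T> is the closure's return type, which join hands back wrapped in a Result. (The move keyword on the closure is covered in the next section.)

```rust
use std::thread;

// Hypothetical helper: spawns `count` threads and waits for all of them.
// Each thread returns the square of its index.
fn spawn_workers(count: u32) -> Vec<u32> {
    // Spawn every thread first so they can all run at the same time.
    let handles: Vec<thread::JoinHandle<u32>> = (0..count)
        .map(|i| thread::spawn(move || i * i))
        .collect();

    // Then join them in spawn order; each `join` blocks until that
    // particular thread has finished and yields its return value.
    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .collect()
}

fn main() {
    let results = spawn_workers(4);
    println!("results: {results:?}");
}
```

Because the handles are joined in the order they were created, the results come back in that order even if the threads finish in a different one. Note that join returns an Err if the thread panicked, so the unwrap here passes such a panic on to the calling thread.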
Using move Closures with Threads
We’ll often use the move keyword with closures passed to thread::spawn
because the closure will then take ownership of the values it uses from the
environment, thus transferring ownership of those values from one thread to
another. In “Capturing References or Moving Ownership” in Chapter 13, we discussed move in the context of closures. Now we’ll
concentrate more on the interaction between move and thread::spawn.
Notice in Listing 16-1 that the closure we pass to thread::spawn takes no
arguments: We’re not using any data from the main thread in the spawned
thread’s code. To use data from the main thread in the spawned thread, the
spawned thread’s closure must capture the values it needs. Listing 16-3 shows
an attempt to create a vector in the main thread and use it in the spawned
thread. However, this won’t work yet, as you’ll see in a moment.
use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(|| {
        println!("Here's a vector: {v:?}");
    });

    handle.join().unwrap();
}
The closure uses v, so it will capture v and make it part of the closure’s
environment. Because thread::spawn runs this closure in a new thread, we
should be able to access v inside that new thread. But when we compile this
example, we get the following error:
$ cargo run
   Compiling threads v0.1.0 (file:///projects/threads)
error[E0373]: closure may outlive the current function, but it borrows `v`, which is owned by the current function
 --> src/main.rs:6:32
  |
6 |     let handle = thread::spawn(|| {
  |                                ^^ may outlive borrowed value `v`
7 |         println!("Here's a vector: {v:?}");
  |                                     - `v` is borrowed here
  |
note: function requires argument type to outlive `'static`
 --> src/main.rs:6:18
  |
6 |       let handle = thread::spawn(|| {
  |  __________________^
7 | |         println!("Here's a vector: {v:?}");
8 | |     });
  | |______^
help: to force the closure to take ownership of `v` (and any other referenced variables), use the `move` keyword
  |
6 |     let handle = thread::spawn(move || {
  |                                ++++

For more information about this error, try `rustc --explain E0373`.
error: could not compile `threads` (bin "threads") due to 1 previous error
Rust infers how to capture v, and because println! only needs a reference
to v, the closure tries to borrow v. However, there’s a problem: Rust can’t
tell how long the spawned thread will run, so it doesn’t know whether the
reference to v will always be valid.
Listing 16-4 provides a scenario that’s more likely to have a reference to v
that won’t be valid.
use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(|| {
        println!("Here's a vector: {v:?}");
    });

    drop(v); // oh no!

    handle.join().unwrap();
}
If Rust allowed us to run this code, there’s a possibility that the spawned
thread would be immediately put in the background without running at all. The
spawned thread has a reference to v inside, but the main thread immediately
drops v, using the drop function we discussed in Chapter 15. Then, when the
spawned thread starts to execute, v is no longer valid, so a reference to it
is also invalid. Oh no!
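As an aside that goes beyond what this section covers: the standard library also offers scoped threads through thread::scope (stable since Rust 1.63). Every thread spawned inside a scope is joined before the scope returns, which is exactly the guarantee thread::spawn lacks, so borrowing from the enclosing function is allowed there. The sketch below is only an illustration of that idea; sum_with_scoped_thread is a made-up name.

```rust
use std::thread;

// Hypothetical helper: sums a borrowed slice on a scoped thread.
fn sum_with_scoped_thread(v: &[i32]) -> i32 {
    thread::scope(|s| {
        // The closure only borrows `v`. This compiles because the scope
        // guarantees the thread is finished before `scope` returns.
        let handle = s.spawn(|| v.iter().sum::<i32>());
        handle.join().unwrap()
    })
}

fn main() {
    let v = vec![1, 2, 3];
    let total = sum_with_scoped_thread(&v);
    // `v` is still owned by `main` after the scope has ended.
    println!("sum of {v:?} is {total}");
}
```

With thread::spawn there is no such scope, so the rest of this section sticks to transferring ownership instead.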
To fix the compiler error in Listing 16-3, we can use the error message’s advice:
help: to force the closure to take ownership of `v` (and any other referenced variables), use the `move` keyword
  |
6 |     let handle = thread::spawn(move || {
  |                                ++++
By adding the move keyword before the closure, we force the closure to take
ownership of the values it’s using rather than allowing Rust to infer that it
should borrow the values. The modification to Listing 16-3 shown in Listing
16-5 will compile and run as we intend.
use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(move || {
        println!("Here's a vector: {v:?}");
    });

    handle.join().unwrap();
}
We might be tempted to try the same thing to fix the code in Listing 16-4 where
the main thread called drop by using a move closure. However, this fix will
not work because what Listing 16-4 is trying to do is disallowed for a
different reason. If we added move to the closure, we would move v into the
closure’s environment, and we could no longer call drop on it in the main
thread. We would get this compiler error instead:
$ cargo run
   Compiling threads v0.1.0 (file:///projects/threads)
error[E0382]: use of moved value: `v`
  --> src/main.rs:10:10
   |
4  |     let v = vec![1, 2, 3];
   |         - move occurs because `v` has type `Vec<i32>`, which does not implement the `Copy` trait
5  |
6  |     let handle = thread::spawn(move || {
   |                                ------- value moved into closure here
7  |         println!("Here's a vector: {v:?}");
   |                                     - variable moved due to use in closure
...
10 |     drop(v); // oh no!
   |          ^ value used here after move
   |
help: consider cloning the value before moving it into the closure
   |
6  ~     let value = v.clone();
7  ~     let handle = thread::spawn(move || {
8  ~         println!("Here's a vector: {value:?}");
   |

For more information about this error, try `rustc --explain E0382`.
error: could not compile `threads` (bin "threads") due to 1 previous error
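The help text in this error points at one way forward when the main thread really does need the data afterward: clone the value and move the clone into the closure. Here is a minimal sketch along those lines, assuming the extra allocation is acceptable; the function name describe_on_thread is hypothetical, and the thread returns its message instead of printing it so that the result is easy to check.

```rust
use std::thread;

// Hypothetical helper: formats a copy of `v` on another thread while the
// caller keeps ownership of the original.
fn describe_on_thread(v: &Vec<i32>) -> String {
    // Clone first; the clone, not the caller's vector, is moved below.
    let value = v.clone();

    let handle = thread::spawn(move || format!("Here's a vector: {value:?}"));

    handle.join().unwrap()
}

fn main() {
    let v = vec![1, 2, 3];
    let message = describe_on_thread(&v);
    println!("{message}");

    // `v` was never moved, so the main thread may still use and drop it.
    drop(v);
}
```

Each thread now owns its own vector, so neither can invalidate the other's data.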
Rust’s ownership rules have saved us again! We got an error from the code in
Listing 16-3 because Rust was being conservative and only borrowing v for the
thread, which meant the main thread could theoretically invalidate the spawned
thread’s reference. By telling Rust to move ownership of v to the spawned
thread, we’re guaranteeing to Rust that the main thread won’t use v anymore.
If we change Listing 16-4 in the same way, we’re then violating the ownership
rules when we try to use v in the main thread. The move keyword overrides
Rust’s conservative default of borrowing; it doesn’t let us violate the
ownership rules.
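Because move transfers ownership rather than copying, a spawned thread can also hand a value back when it finishes: whatever the closure returns comes out of join. The sketch below is an illustration, not one of the numbered listings, and push_on_thread is a made-up name.

```rust
use std::thread;

// Hypothetical helper: moves `v` into a new thread, lets that thread
// modify it, and takes ownership back through `join`.
fn push_on_thread(mut v: Vec<i32>, extra: i32) -> Vec<i32> {
    let handle = thread::spawn(move || {
        // The spawned thread owns `v` here, so it may mutate it freely.
        v.push(extra);
        v // returning `v` moves it out of the thread
    });

    // `join` yields the closure's return value, so ownership of the
    // vector ends up back with the caller.
    handle.join().unwrap()
}

fn main() {
    let v = vec![1, 2, 3];
    let v = push_on_thread(v, 4);
    println!("Back in the main thread: {v:?}");
}
```

At every point exactly one thread owns the vector, which is why no borrow across threads is ever needed.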
Now that we’ve covered what threads are and the methods supplied by the thread API, let’s look at some situations in which we can use threads.