Unsafe Rust

All the code we’ve discussed so far has had Rust’s memory safety guarantees enforced at compile time. However, Rust has a second language hidden inside it that doesn’t enforce these memory safety guarantees: It’s called unsafe Rust and works just like regular Rust but gives us extra superpowers.

Unsafe Rust exists because, by nature, static analysis is conservative. When the compiler tries to determine whether or not code upholds the guarantees, it’s better for it to reject some valid programs than to accept some invalid programs. Although the code might be okay, if the Rust compiler doesn’t have enough information to be confident, it will reject the code. In these cases, you can use unsafe code to tell the compiler, “Trust me, I know what I’m doing.” Be warned, however, that you use unsafe Rust at your own risk: If you use unsafe code incorrectly, problems can occur due to memory unsafety, such as null pointer dereferencing.

Another reason Rust has an unsafe alter ego is that the underlying computer hardware is inherently unsafe. If Rust didn’t let you do unsafe operations, you couldn’t do certain tasks. Rust needs to allow you to do low-level systems programming, such as directly interacting with the operating system or even writing your own operating system. Working with low-level systems programming is one of the goals of the language. Let’s explore what we can do with unsafe Rust and how to do it.

Performing Unsafe Superpowers

To switch to unsafe Rust, use the unsafe keyword and then start a new block that holds the unsafe code. You can take five actions in unsafe Rust that you can’t in safe Rust, which we call unsafe superpowers. Those superpowers include the ability to:

  1. Dereference a raw pointer.
  2. Call an unsafe function or method.
  3. Access or modify a mutable static variable.
  4. Implement an unsafe trait.
  5. Access fields of unions.

It’s important to understand that unsafe doesn’t turn off the borrow checker or disable any of Rust’s other safety checks: If you use a reference in unsafe code, it will still be checked. The unsafe keyword only gives you access to these five features that are then not checked by the compiler for memory safety. You’ll still get some degree of safety inside an unsafe block.

In addition, unsafe does not mean the code inside the block is necessarily dangerous or that it will definitely have memory safety problems: The intent is that as the programmer, you’ll ensure that the code inside an unsafe block will access memory in a valid way.

People are fallible and mistakes will happen, but by requiring these five unsafe operations to be inside blocks annotated with unsafe, you’ll know that any errors related to memory safety must be within an unsafe block. Keep unsafe blocks small; you’ll be thankful later when you investigate memory bugs.

To isolate unsafe code as much as possible, it’s best to enclose such code within a safe abstraction and provide a safe API, which we’ll discuss later in the chapter when we examine unsafe functions and methods. Parts of the standard library are implemented as safe abstractions over unsafe code that has been audited. Wrapping unsafe code in a safe abstraction prevents uses of unsafe from leaking out into all the places that you or your users might want to use the functionality implemented with unsafe code, because using a safe abstraction is safe.

Let’s look at each of the five unsafe superpowers in turn. We’ll also look at some abstractions that provide a safe interface to unsafe code.

Dereferencing a Raw Pointer

In Chapter 4, in the “Dangling References” section, we mentioned that the compiler ensures that references are always valid. Unsafe Rust has two new types called raw pointers that are similar to references. As with references, raw pointers can be immutable or mutable and are written as *const T and *mut T, respectively. The asterisk isn’t the dereference operator; it’s part of the type name. In the context of raw pointers, immutable means that the pointer can’t be directly assigned to after being dereferenced.

Different from references and smart pointers, raw pointers:

  • Are allowed to ignore the borrowing rules by having both immutable and mutable pointers or multiple mutable pointers to the same location
  • Aren’t guaranteed to point to valid memory
  • Are allowed to be null
  • Don’t implement any automatic cleanup

By opting out of having Rust enforce these guarantees, you can give up guaranteed safety in exchange for greater performance or the ability to interface with another language or hardware where Rust’s guarantees don’t apply.

Listing 20-1 shows how to create an immutable and a mutable raw pointer.

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-01/src/main.rs:here}}
}

Notice that we don’t include the unsafe keyword in this code. We can create raw pointers in safe code; we just can’t dereference raw pointers outside an unsafe block, as you’ll see in a bit.

We’ve created raw pointers by using the raw borrow operators: &raw const num creates a *const i32 immutable raw pointer, and &raw mut num creates a *mut i32 mutable raw pointer. Because we created them directly from a local variable, we know these particular raw pointers are valid, but we can’t make that assumption about just any raw pointer.
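As a minimal sketch of the raw borrow operators described above (assuming Rust 1.82 or later, where the &raw syntax is available; the function name raw_pointers_alias is hypothetical), the following creates both kinds of pointer in safe code and compares their addresses without dereferencing either one:

```rust
/// Creates an immutable and a mutable raw pointer to the same local variable
/// and reports whether they hold the same address. No `unsafe` is needed
/// because the pointers are only created and compared, never dereferenced.
fn raw_pointers_alias() -> bool {
    let mut num = 5;

    // `&raw const` yields `*const i32`; `&raw mut` yields `*mut i32`.
    let r1: *const i32 = &raw const num;
    let r2: *mut i32 = &raw mut num;

    // Comparing raw pointers compares addresses, which is a safe operation.
    r1 == r2 as *const i32
}
```

Both pointers refer to num, so the comparison should hold; references with the same aliasing would be rejected by the borrow checker.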

To demonstrate this, next we’ll create a raw pointer whose validity we can’t be so certain of, using the keyword as to cast a value instead of using the raw borrow operator. Listing 20-2 shows how to create a raw pointer to an arbitrary location in memory. Trying to use arbitrary memory is undefined: There might be data at that address or there might not, the compiler might optimize the code so that there is no memory access, or the program might terminate with a segmentation fault. Usually, there is no good reason to write code like this, especially in cases where you can use a raw borrow operator instead, but it is possible.

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-02/src/main.rs:here}}
}
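A minimal sketch of the as cast described above might look like the following. The helper name pointer_from_address and any address passed to it are hypothetical, and the pointer is deliberately never dereferenced, because nothing guarantees that the address holds a valid i32:

```rust
/// Builds a raw pointer from an arbitrary integer address.
/// This is safe code: only the pointer value is produced. Dereferencing the
/// result would require `unsafe` and would be undefined behavior unless the
/// address really holds a valid, aligned `i32`.
fn pointer_from_address(address: usize) -> *const i32 {
    address as *const i32
}
```

Casting the pointer back to usize returns the original number, which shows that the cast only reinterprets the value and performs no validity check.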

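Recall that we can create raw pointers in safe code, but we can’t dereference raw pointers and read the data being pointed to. In Listing 20-3, we use the dereference operator * on a raw pointer that requires an unsafe block.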

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-03/src/main.rs:here}}
}

Creating a pointer does no harm; it’s only when we try to access the value that it points at that we might end up dealing with an invalid value.
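To make the split between creating and dereferencing concrete, here is a small sketch (assuming Rust 1.82 or later; the function name read_through_raw_pointers is hypothetical). The pointers are created in safe code, and only the reads through them sit inside the unsafe block:

```rust
/// Reads the same local variable through an immutable and a mutable raw
/// pointer and returns both values.
fn read_through_raw_pointers() -> (i32, i32) {
    let mut num = 5;

    // Creating the pointers is safe.
    let r1 = &raw const num;
    let r2 = &raw mut num;

    // SAFETY: both pointers were just created from a live, initialized local
    // variable, so they are non-null, aligned, and point to a valid `i32`.
    unsafe { (*r1, *r2) }
}
```

Removing the unsafe block around the two dereferences should make the compiler reject the function.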

Note also that in Listings 20-1 and 20-3, we created *const i32 and *mut i32 raw pointers that both pointed to the same memory location, where num is stored. If we instead tried to create an immutable and a mutable reference to num, the code would not have compiled because Rust’s ownership rules don’t allow a mutable reference at the same time as any immutable references. With raw pointers, we can create a mutable pointer and an immutable pointer to the same location and change data through the mutable pointer, potentially creating a data race. Be careful!

With all of these dangers, why would you ever use raw pointers? One major use case is when interfacing with C code, as you’ll see in the next section. Another case is when building up safe abstractions that the borrow checker doesn’t understand. We’ll introduce unsafe functions and then look at an example of a safe abstraction that uses unsafe code.

Calling an Unsafe Function or Method

The second type of operation you can perform in an unsafe block is calling unsafe functions. Unsafe functions and methods look exactly like regular functions and methods, but they have an extra unsafe before the rest of the definition. The unsafe keyword in this context indicates the function has requirements we need to uphold when we call this function, because Rust can’t guarantee we’ve met these requirements. By calling an unsafe function within an unsafe block, we’re saying that we’ve read this function’s documentation and we take responsibility for upholding the function’s contracts.

Here is an unsafe function named dangerous that doesn’t do anything in its body:

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/no-listing-01-unsafe-fn/src/main.rs:here}}
}

We must call the dangerous function within a separate unsafe block. If we try to call dangerous without the unsafe block, we’ll get an error:

{{#include ../listings/ch20-advanced-features/output-only-01-missing-unsafe/output.txt}}

With the unsafe block, we’re asserting to Rust that we’ve read the function’s documentation, we understand how to use it properly, and we’ve verified that we’re fulfilling the contract of the function.

To perform unsafe operations in the body of an unsafe function, you still need to use an unsafe block, just as within a regular function, and the compiler will warn you if you forget. This helps us keep unsafe blocks as small as possible, as unsafe operations may not be needed across the whole function body.
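The following sketch illustrates both halves of that contract with a hypothetical unsafe function, read_value, which (unlike dangerous) actually has a requirement to uphold. The names and the documented contract are assumptions made for illustration, and the &raw syntax assumes Rust 1.82 or later:

```rust
/// Reads the `i32` behind `ptr`.
///
/// # Safety
/// `ptr` must be non-null, properly aligned, and point to an initialized `i32`.
unsafe fn read_value(ptr: *const i32) -> i32 {
    // The unsafe operation still gets its own small `unsafe` block.
    // SAFETY: the caller promises the requirements documented above.
    unsafe { *ptr }
}

/// A safe caller that takes responsibility for the contract.
fn call_read_value() -> i32 {
    let num = 42;
    // SAFETY: the pointer comes from a live, initialized local variable.
    unsafe { read_value(&raw const num) }
}
```

The unsafe block in call_read_value is where the caller asserts that it has met the documented requirements; the unsafe block inside read_value marks the one operation the compiler cannot check.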

Creating a Safe Abstraction over Unsafe Code

Just because a function contains unsafe code doesn’t mean we need to mark the entire function as unsafe. In fact, wrapping unsafe code in a safe function is a common abstraction. As an example, let’s study the split_at_mut function from the standard library, which requires some unsafe code. We’ll explore how we might implement it. This safe method is defined on mutable slices: It takes one slice and makes it two by splitting the slice at the index given as an argument. Listing 20-4 shows how to use split_at_mut.

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-04/src/main.rs:here}}
}
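As a brief sketch of how the standard library method behaves (the wrapper function split_example and its sample data are hypothetical), splitting at index 3 yields two non-overlapping mutable halves that can be modified independently:

```rust
/// Splits a six-element vector at index 3, writes to each half, and returns
/// copies of the two halves.
fn split_example() -> (Vec<i32>, Vec<i32>) {
    let mut v = vec![1, 2, 3, 4, 5, 6];
    let r = &mut v[..];

    // `a` covers indices 0..3 and `b` covers indices 3..6.
    let (a, b) = r.split_at_mut(3);

    // Both halves are mutable at the same time.
    a[0] = 10;
    b[0] = 40;

    (a.to_vec(), b.to_vec())
}
```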

We can’t implement this function using only safe Rust. An attempt might look something like Listing 20-5, which won’t compile. For simplicity, we’ll implement split_at_mut as a function rather than a method and only for slices of i32 values rather than for a generic type T.

{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-05/src/main.rs:here}}

This function first gets the total length of the slice. Then, it asserts that the index given as a parameter is within the slice by checking whether it’s less than or equal to the length. The assertion means that if we pass an index that is greater than the length to split the slice at, the function will panic before it attempts to use that index.

Then, we return two mutable slices in a tuple: one from the start of the original slice to the mid index and another from mid to the end of the slice.

When we try to compile the code in Listing 20-5, we’ll get an error:

{{#include ../listings/ch20-advanced-features/listing-20-05/output.txt}}

Rust’s borrow checker can’t understand that we’re borrowing different parts of the slice; it only knows that we’re borrowing from the same slice twice. Borrowing different parts of a slice is fundamentally okay because the two slices aren’t overlapping, but Rust isn’t smart enough to know this. When we know code is okay, but Rust doesn’t, it’s time to reach for unsafe code.

Listing 20-6 shows how to use an unsafe block, a raw pointer, and some calls to unsafe functions to make the implementation of split_at_mut work.

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-06/src/main.rs:here}}
}

Recall from “The Slice Type” section in Chapter 4 that a slice is a pointer to some data and the length of the slice. We use the len method to get the length of a slice and the as_mut_ptr method to access the raw pointer of a slice. In this case, because we have a mutable slice to i32 values, as_mut_ptr returns a raw pointer with the type *mut i32, which we’ve stored in the variable ptr.

We keep the assertion that the mid index is within the slice. Then, we get to the unsafe code: The slice::from_raw_parts_mut function takes a raw pointer and a length, and it creates a slice. We use this function to create a slice that starts from ptr and is mid items long. Then, we call the add method on ptr with mid as an argument to get a raw pointer that starts at mid, and we create a slice using that pointer and the remaining number of items after mid as the length.

The function slice::from_raw_parts_mut is unsafe because it takes a raw pointer and must trust that this pointer is valid. The add method on raw pointers is also unsafe because it must trust that the offset location is also a valid pointer. Therefore, we had to put an unsafe block around our calls to slice::from_raw_parts_mut and add so that we could call them. By looking at the code and by adding the assertion that mid must be less than or equal to len, we can tell that all the raw pointers used within the unsafe block will be valid pointers to data within the slice. This is an acceptable and appropriate use of unsafe.
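The steps described above can be sketched as follows. This is an illustrative free function for i32 slices that follows the description in the text, not the standard library's actual implementation:

```rust
/// Splits `values` into two non-overlapping mutable slices at `mid`.
/// Panics if `mid` is greater than the slice length.
fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    // Checked before any pointer arithmetic, so the offsets below stay in bounds.
    assert!(mid <= len);

    // SAFETY: `ptr` is valid for `len` elements and `mid <= len`, so
    // `ptr..ptr+mid` and `ptr+mid..ptr+len` both lie inside the original
    // slice and do not overlap.
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

Callers use this function from entirely safe code; the assertion is what lets the unsafe block justify its pointer arithmetic.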

Note that we don’t need to mark the resultant split_at_mut function as unsafe, and we can call this function from safe Rust. We’ve created a safe abstraction to the unsafe code with an implementation of the function that uses unsafe code in a safe way, because it creates only valid pointers from the data this function has access to.

In contrast, the use of slice::from_raw_parts_mut in Listing 20-7 would likely crash when the slice is used. This code takes an arbitrary memory location and creates a slice 10,000 items long.

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-07/src/main.rs:here}}
}

We don’t own the memory at this arbitrary location, and there is no guarantee that the slice this code creates contains valid i32 values. Attempting to use values as though it’s a valid slice results in undefined behavior.

Using extern Functions to Call External Code

Sometimes your Rust code might need to interact with code written in another language. For this, Rust has the keyword extern that facilitates the creation and use of a Foreign Function Interface (FFI), which is a way for a programming language to define functions and enable a different (foreign) programming language to call those functions.

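Listing 20-8 demonstrates how to set up an integration with the abs function from the C standard library. Functions declared within extern blocks are generally unsafe to call from Rust code, so extern blocks must also be marked unsafe. The reason is that other languages don’t enforce Rust’s rules and guarantees, and Rust can’t check them, so responsibility falls on the programmer to ensure safety.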

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-08/src/main.rs}}
}

Within the unsafe extern "C" block, we list the names and signatures of external functions from another language we want to call. The "C" part defines which application binary interface (ABI) the external function uses: The ABI defines how to call the function at the assembly level. The "C" ABI is the most common and follows the C programming language’s ABI. Information about all the ABIs Rust supports is available in the Rust Reference.

Every item declared within an unsafe extern block is implicitly unsafe. However, some FFI functions are safe to call. For example, the abs function from C’s standard library does not have any memory safety considerations, and we know it can be called with any i32. In cases like this, we can use the safe keyword to say that this specific function is safe to call even though it is in an unsafe extern block. Once we make that change, calling it no longer requires an unsafe block, as shown in Listing 20-9.

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-09/src/main.rs}}
}

Marking a function as safe does not inherently make it safe! Instead, it is like a promise you are making to Rust that it is safe. It is still your responsibility to make sure that promise is kept!
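A minimal sketch of such a declaration, assuming Rust 1.82 or later (for unsafe extern blocks and the safe qualifier) and a target that links the C standard library by default; the wrapper name absolute is hypothetical:

```rust
unsafe extern "C" {
    // Declared `safe` on the assumption, stated in the text, that C's `abs`
    // has no memory-safety requirements for the caller to uphold.
    safe fn abs(input: i32) -> i32;
}

/// Calls the C function from ordinary safe code; no `unsafe` block is needed
/// because the declaration above vouches for it.
fn absolute(input: i32) -> i32 {
    abs(input)
}
```

Without the safe qualifier, the call inside absolute would need to be wrapped in an unsafe block.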

Calling Rust Functions from Other Languages

We can also use extern to create an interface that allows other languages to call Rust functions. Instead of creating a whole extern block, we add the extern keyword and specify the ABI to use just before the fn keyword for the relevant function. We also need to add an #[unsafe(no_mangle)] annotation to tell the Rust compiler not to mangle the name of this function. Mangling is when a compiler changes the name we’ve given a function to a different name that contains more information for other parts of the compilation process to consume but is less human readable. Every programming language compiler mangles names slightly differently, so for a Rust function to be nameable by other languages, we must disable the Rust compiler’s name mangling. This is unsafe because there might be name collisions across libraries without the built-in mangling, so it is our responsibility to make sure the name we choose is safe to export without mangling.

In the following example, we make the call_from_c function accessible from C code, after it’s compiled to a shared library and linked from C:

#[unsafe(no_mangle)]
pub extern "C" fn call_from_c() {
    println!("Just called a Rust function from C!");
}

This usage of extern requires unsafe only in the attribute, not on the extern block.

Accessing or Modifying a Mutable Static Variable

In this book, we’ve not yet talked about global variables, which Rust does support but which can be problematic with Rust’s ownership rules. If two threads are accessing the same mutable global variable, it can cause a data race.

In Rust, global variables are called static variables. Listing 20-10 shows an example declaration and use of a static variable with a string slice as a value.

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-10/src/main.rs}}
}

Static variables are similar to constants, which we discussed in the “Declaring Constants” section in Chapter 3. The names of static variables are in SCREAMING_SNAKE_CASE by convention. Static variables can only store references with the 'static lifetime, which means the Rust compiler can figure out the lifetime and we aren’t required to annotate it explicitly. Accessing an immutable static variable is safe.

A subtle difference between constants and immutable static variables is that values in a static variable have a fixed address in memory. Using the value will always access the same data. Constants, on the other hand, are allowed to duplicate their data whenever they’re used. Another difference is that static variables can be mutable. Accessing and modifying mutable static variables is unsafe. Listing 20-11 shows how to declare, access, and modify a mutable static variable named COUNTER.

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-11/src/main.rs}}
}

As with regular variables, we specify mutability using the mut keyword. Any code that reads or writes from COUNTER must be within an unsafe block. The code in Listing 20-11 compiles and prints COUNTER: 3 as we would expect because it’s single threaded. Having multiple threads access COUNTER would likely result in data races, so it is undefined behavior. Therefore, we need to mark the entire function as unsafe and document the safety limitation so that anyone calling the function knows what they are and are not allowed to do safely.

Whenever we write an unsafe function, it is idiomatic to write a comment starting with SAFETY and explaining what the caller needs to do to call the function safely. Likewise, whenever we perform an unsafe operation, it is idiomatic to write a comment starting with SAFETY to explain how the safety rules are upheld.

Additionally, the compiler will deny by default any attempt to create references to a mutable static variable through a compiler lint. You must either explicitly opt out of that lint’s protections by adding an #[allow(static_mut_refs)] annotation or access the mutable static variable via a raw pointer created with one of the raw borrow operators. That includes cases where the reference is created invisibly, as when it is used in the println! in this code listing. Requiring references to static mutable variables to be created via raw pointers helps make the safety requirements for using them more obvious.
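Putting these points together, a sketch of a mutable static with SAFETY comments and a read through a raw pointer might look like this (assuming Rust 1.82 or later; the helper counter_demo is hypothetical, and the sketch is only sound when run from a single thread):

```rust
static mut COUNTER: u32 = 0;

/// SAFETY: calling this from more than one thread at a time is undefined
/// behavior, so the caller must guarantee single-threaded access.
unsafe fn add_to_count(inc: u32) {
    // SAFETY: the caller guarantees that no other thread touches `COUNTER`.
    unsafe {
        COUNTER += inc;
    }
}

/// Increments the counter by 3 and returns its current value.
fn counter_demo() -> u32 {
    // SAFETY: assumed to be called from a single thread only.
    unsafe {
        add_to_count(3);
        // Read through a raw pointer rather than creating a reference,
        // which the `static_mut_refs` lint would flag.
        *(&raw const COUNTER)
    }
}
```

Reading through &raw const COUNTER avoids ever forming a reference to the mutable static, so no #[allow(static_mut_refs)] annotation is needed.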

With mutable data that is globally accessible, it’s difficult to ensure that there are no data races, which is why Rust considers mutable static variables to be unsafe. Where possible, it’s preferable to use the concurrency techniques and thread-safe smart pointers we discussed in Chapter 16 so that the compiler checks that data access from different threads is done safely.

Implementing an Unsafe Trait

We can use unsafe to implement an unsafe trait. A trait is unsafe when at least one of its methods has some invariant that the compiler can’t verify. We declare that a trait is unsafe by adding the unsafe keyword before trait and marking the implementation of the trait as unsafe too, as shown in Listing 20-12.

#![allow(unused)]
fn main() {
{{#rustdoc_include ../listings/ch20-advanced-features/listing-20-12/src/main.rs:here}}
}

By using unsafe impl, we’re promising that we’ll uphold the invariants that the compiler can’t verify.

As an example, recall the Send and Sync marker traits we discussed in the “Extensible Concurrency with Send and Sync” section in Chapter 16: The compiler implements these traits automatically if our types are composed entirely of other types that implement Send and Sync. If we implement a type that contains a type that does not implement Send or Sync, such as raw pointers, and we want to mark that type as Send or Sync, we must use unsafe. Rust can’t verify that our type upholds the guarantees that it can be safely sent across threads or accessed from multiple threads; therefore, we need to do those checks manually and indicate as such with unsafe.
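As an illustration, here is a sketch of a hypothetical wrapper type, StaticPtr, that holds a raw pointer and is manually marked Send. The type, the static VALUE, and the invariant named in the SAFETY comments are all assumptions made for this example, and the &raw syntax assumes Rust 1.82 or later:

```rust
static VALUE: i32 = 7;

/// Wraps a raw pointer, so the compiler does not implement `Send` for it.
struct StaticPtr(*const i32);

// SAFETY: (assumed invariant) every `StaticPtr` in this sketch points at
// `VALUE`, an immutable static that lives for the whole program, so moving
// the pointer to another thread cannot cause a data race or a dangling read.
unsafe impl Send for StaticPtr {}

impl StaticPtr {
    fn read(&self) -> i32 {
        // SAFETY: relies on the same invariant as the `Send` impl.
        unsafe { *self.0 }
    }
}

/// Moves the wrapper into a new thread and reads through it there.
fn read_on_other_thread() -> i32 {
    let p = StaticPtr(&raw const VALUE);
    std::thread::spawn(move || p.read()).join().unwrap()
}
```

Without the unsafe impl, the call to thread::spawn should fail to compile, because a raw pointer is not Send.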

Accessing Fields of a Union

The final action that works only with unsafe is accessing fields of a union. A union is similar to a struct, but only one declared field is used in a particular instance at one time. Unions are primarily used to interface with unions in C code. Accessing union fields is unsafe because Rust can’t guarantee the type of the data currently being stored in the union instance. You can learn more about unions in the Rust Reference.
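A small sketch of a union and an unsafe field access follows. The union IntOrFloat and the function float_bits are hypothetical names chosen for illustration:

```rust
/// A C-style union whose two fields share the same four bytes.
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

/// Returns the raw bit pattern of `x` by writing one field and reading the other.
fn float_bits(x: f32) -> u32 {
    // Writing a field while constructing the union is safe.
    let u = IntOrFloat { f: x };

    // SAFETY: both fields are four-byte plain-data types and every bit
    // pattern is a valid `u32`, so reading `i` after writing `f` is sound.
    unsafe { u.i }
}
```

Reading u.i is the step that requires unsafe: the compiler does not track which field was written last. For this particular conversion, the safe method f32::to_bits gives the same result.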

Using Miri to Check Unsafe Code

When writing unsafe code, you might want to check that what you have written actually is safe and correct. One of the best ways to do that is to use Miri, an official Rust tool for detecting undefined behavior. Whereas the borrow checker is a static tool that works at compile time, Miri is a dynamic tool that works at runtime. It checks your code by running your program, or its test suite, and detecting when you violate the rules it understands about how Rust should work.

Using Miri requires a nightly build of Rust (which we talk about more in Appendix G: How Rust is Made and “Nightly Rust”). You can install both a nightly version of Rust and the Miri tool by typing rustup +nightly component add miri. This does not change what version of Rust your project uses; it only adds the tool to your system so you can use it when you want to. You can run Miri on a project by typing cargo +nightly miri run or cargo +nightly miri test.

For an example of how helpful this can be, consider what happens when we run it against Listing 20-7.

{{#include ../listings/ch20-advanced-features/listing-20-07/output.txt}}

Miri correctly warns us that we’re casting an integer to a pointer, which might be a problem, but Miri can’t determine whether a problem exists because it doesn’t know how the pointer originated. Then, Miri returns an error where Listing 20-7 has undefined behavior because we have a dangling pointer. Thanks to Miri, we now know there is a risk of undefined behavior, and we can think about how to make the code safe. In some cases, Miri can even make recommendations about how to fix errors.

Miri doesn’t catch everything you might get wrong when writing unsafe code. Miri is a dynamic analysis tool, so it only catches problems with code that actually gets run. That means you will need to use it in conjunction with good testing techniques to increase your confidence about the unsafe code you have written. Miri also does not cover every possible way your code can be unsound.

Put another way: If Miri does catch a problem, you know there’s a bug, but just because Miri doesn’t catch a bug doesn’t mean there isn’t a problem. It can catch a lot, though. Try running it on the other examples of unsafe code in this chapter and see what it says!

You can learn more about Miri at its GitHub repository.

Using Unsafe Code Correctly

Using unsafe to use one of the five superpowers just discussed isn’t wrong or even frowned upon, but it is trickier to get unsafe code correct because the compiler can’t help uphold memory safety. When you have a reason to use unsafe code, you can do so, and having the explicit unsafe annotation makes it easier to track down the source of problems when they occur. Whenever you write unsafe code, you can use Miri to help you be more confident that the code you have written upholds Rust’s rules.

For a much deeper exploration of how to work effectively with unsafe Rust, read Rust’s official guide for unsafe, The Rustonomicon.