Reputation: 1081
From the standard library:
Both pointers must be derived from a pointer to the same object. (See below for an example.)
let ptr1 = Box::into_raw(Box::new(0u8)); let ptr2 = Box::into_raw(Box::new(1u8)); let diff = (ptr2 as isize).wrapping_sub(ptr1 as isize); // Make ptr2_other an "alias" of ptr2, but derived from ptr1. let ptr2_other = (ptr1 as *mut u8).wrapping_offset(diff); assert_eq!(ptr2 as usize, ptr2_other as usize); // Since ptr2_other and ptr2 are derived from pointers to different // objects, computing their offset is undefined behavior, even though // they point to the same address! unsafe { let zero = ptr2_other.offset_from(ptr2); // Undefined Behavior }
I do not understand why this must be the case.
Upvotes: 6
Views: 393
Reputation: 60447
This has to do with a concept called "provenance" meaning "the place of origin". The Rust Unsafe Code Guidelines has a section on Pointer Provenance. Its a pretty abstract rule but it explains that its an extra bit of information that is used during compilation that helps guide what pointer transformations are well defined.
// Let's assume the two allocations here have base addresses 0x100 and 0x200.
// We write pointer provenance as `@N` where `N` is some kind of ID uniquely
// identifying the allocation.
let raw1 = Box::into_raw(Box::new(13u8));
let raw2 = Box::into_raw(Box::new(42u8));
let raw2_wrong = raw1.wrapping_add(raw2.wrapping_sub(raw1 as usize) as usize);
// These pointers now have the following values:
// raw1 points to address 0x100 and has provenance @1.
// raw2 points to address 0x200 and has provenance @2.
// raw2_wrong points to address 0x200 and has provenance @1.
// In other words, raw2 and raw2_wrong have same *address*...
assert_eq!(raw2 as usize, raw2_wrong as usize);
// ...but it would be UB to dereference raw2_wrong, as it has the wrong *provenance*:
// it points to address 0x200, which is in allocation @2, but the pointer
// has provenance @1.
The guidelines link to a good article: Pointers Are Complicated and its follow up Pointers Are Complicated II that go into more detail and coined the phrase:
Just because two pointers point to the same address, does not mean they are equal and can be used interchangeably.
Essentially, it is invalid to read a value via a pointer that is outside that pointer's original "allocation" even if you can guarantee a valid object exists there. Allowing such behavior could wreak havoc on the language's aliasing rules and possible optimizations. And there's pretty much never a good reason to do it.
This concept is mostly inherited from C and C++.
If you're curious if you've written code that violates this rule. Running it through miri, the undefined behavior analysis tool, can often find it.
fn main() {
let ptr1 = Box::into_raw(Box::new(0u8));
let ptr2 = Box::into_raw(Box::new(1u8));
let diff = (ptr2 as isize).wrapping_sub(ptr1 as isize);
let ptr2_other = (ptr1 as *mut u8).wrapping_offset(diff);
assert_eq!(ptr2 as usize, ptr2_other as usize);
unsafe { println!("{} {} {}", *ptr1, *ptr2, *ptr2_other) };
}
error: Undefined Behavior: memory access failed: pointer must be in-bounds at offset 1200, but is outside bounds of alloc1444 which has size 1
--> src/main.rs:7:49
|
7 | unsafe { println!("{} {} {}", *ptr1, *ptr2, *ptr2_other) };
| ^^^^^^^^^^^ memory access failed: pointer must be in-bounds at offset 1200, but is outside bounds of alloc1444 which has size 1
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
Upvotes: 6