[personal profile] vak
There is no better way to understand a programming language than to look at its implementation. Much of Rust resembles C++, yet it is still "a wee bit different". Let's start with the Box<T> construct and ask grok.com. See the result in the file box.cpp.

Question: Imagine I need to re-implement Box<T> in C++. How would it look like?

Re-implementing Rust’s Box<T> in C++ is a fun challenge that lets us peek into Rust’s memory management internals and translate them into C++’s world. In Rust, Box<T> is a smart pointer that provides heap-allocated ownership with automatic cleanup, backed by the global allocator (usually malloc/free via Rust’s alloc crate). It’s simple but powerful: it allocates memory for a T, moves the value onto the heap, and deallocates it when the Box goes out of scope via Drop. In C++, we don’t have Rust’s ownership model or Drop trait, so we’ll use RAII and a custom class to mimic it.

Let’s design a C++ equivalent, breaking down the essentials of Box<T> and handling the nuances of C++’s memory model.

What Box<T> Does in Rust

  • Allocation: Allocates memory on the heap for a T using the global allocator.
  • Ownership: Sole owner of the heap memory; no sharing or borrowing by default.
  • Move Semantics: Transfers ownership when assigned or passed. A Box is never copied implicitly; an explicit .clone() is available only when T: Clone.
  • Deallocation: Automatically frees the memory when dropped, calling T’s destructor if it has one.
  • Dereferencing: Provides access to T via * or . (through Deref).

C++ Implementation

We’ll create a Box class in C++ that:
  • Uses new and delete for allocation/deallocation (the C++ counterpart of Rust's alloc/dealloc, typically backed by malloc/free).
  • Implements RAII for automatic cleanup.
  • Supports move semantics (C++11 and later) to mimic Rust’s ownership transfer.
  • Prevents copying to enforce single ownership (like Rust).
  • Provides dereference operators (* and ->).
Here’s the code: box.cpp.
#include <utility> // std::forward, std::move

template <typename T>
class Box {
private:
    T *ptr; // Raw pointer to heap-allocated memory

    // Private helper to deallocate
    void deallocate()
    {
        if (ptr) {
            delete ptr; // Calls T's destructor, then frees memory
            ptr = nullptr;
        }
    }

public:
    // Constructor: Allocate and construct T in-place
    template <typename... Args>
    explicit Box(Args &&...args) : ptr(new T(std::forward<Args>(args)...))
    {
    }

    // Destructor: Automatically clean up
    ~Box() { deallocate(); }

    // Disable copying (Rust Box doesn't allow implicit copies)
    Box(const Box &) = delete;
    Box &operator=(const Box &) = delete;

    // Move constructor: Transfer ownership
    Box(Box &&other) noexcept : ptr(other.ptr)
    {
        other.ptr = nullptr; // Leave other in a valid, empty state
    }

    // Move assignment: Transfer ownership
    Box &operator=(Box &&other) noexcept
    {
        if (this != &other) {
            deallocate();        // Free current memory
            ptr = other.ptr;     // Take ownership
            other.ptr = nullptr; // Leave other empty
        }
        return *this;
    }

    // Dereference operators (undefined behavior on an empty, moved-from Box)
    T &operator*() { return *ptr; }
    const T &operator*() const { return *ptr; }

    T *operator->() { return ptr; }
    const T *operator->() const { return ptr; }

    // Optional: Non-owning raw pointer access (like std::unique_ptr::get;
    // unlike Box::into_raw in Rust, ownership stays with this Box)
    T *as_raw() { return ptr; }

    // Optional: Move the value out and free the heap slot
    // (like moving out of a Rust Box with *b); leaves this Box empty
    T into_inner()
    {
        T value = std::move(*ptr);
        deallocate();
        return value;
    }
};

// Factory function to simplify creation (like Box::new in Rust)
template <typename T, typename... Args>
Box<T> make_box(Args &&...args)
{
    return Box<T>(std::forward<Args>(args)...);
}

Breaking It Down

1. Allocation
  • Rust: Box::new(x) allocates via alloc::alloc (typically malloc) and moves x into the heap.
  • C++: Box(Args&&...) uses new T(std::forward<Args>(args)...) to allocate and construct T in-place with perfect forwarding for variadic arguments.
  • Why: C++’s new combines allocation and construction, mimicking Rust’s heap placement.
2. Ownership
  • Rust: Single ownership, enforced by the borrow checker.
  • C++: We disable copying (= delete) and implement move semantics (&&) to transfer ownership. ptr is set to nullptr in the moved-from object, leaving it safe but empty.
  • Why: C++ doesn’t have a borrow checker, so we enforce ownership manually.
3. Deallocation
  • Rust: Drop trait calls alloc::dealloc when the Box goes out of scope.
  • C++: RAII via ~Box() calls delete ptr, which runs T’s destructor and frees the memory.
  • Why: C++’s RAII is the closest analog to Rust’s Drop.
4. Dereferencing
  • Rust: Implements Deref and DerefMut for *box and box.field.
  • C++: Overloads operator* and operator-> to access T.
  • Why: Provides ergonomic access similar to Rust.
5. Extras
  • make_box: Like Box::new, simplifies creation with forwarding.
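  • into_inner: Moves the inner T out and frees the heap slot, similar to moving out of a Rust Box with *b. Unlike Rust, the emptied Box object still exists afterwards.
  • as_raw: Exposes the raw pointer without giving up ownership. This is closer to std::unique_ptr::get() than to Box::into_raw, which consumes the Box and leaves the caller responsible for freeing the memory.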
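
The ownership and deallocation points above can be checked with a small experiment. The sketch below uses a trimmed-down copy of the Box class (only the members needed here) and a hypothetical Tracked payload that counts live instances; is_empty, Tracked, g_alive and alive_after_move are illustration-only names, not part of box.cpp.

```cpp
#include <cassert>
#include <utility>

// Trimmed-down copy of the Box class: just enough to show ownership transfer.
template <typename T>
class Box {
    T *ptr;

public:
    template <typename... Args>
    explicit Box(Args &&...args) : ptr(new T(std::forward<Args>(args)...)) {}
    ~Box() { delete ptr; } // delete on nullptr is a no-op
    Box(const Box &) = delete;
    Box &operator=(const Box &) = delete;
    Box(Box &&other) noexcept : ptr(other.ptr) { other.ptr = nullptr; }
    T *operator->() { return ptr; }
    bool is_empty() const { return ptr == nullptr; } // hypothetical helper
};

int g_alive = 0; // number of Tracked objects currently alive

// Hypothetical payload type that counts its live instances.
struct Tracked {
    int value;
    explicit Tracked(int v) : value(v) { ++g_alive; }
    ~Tracked() { --g_alive; }
};

// Returns how many Tracked objects exist while both boxes are in scope.
int alive_after_move()
{
    Box<Tracked> a(42);
    Box<Tracked> b(std::move(a)); // ownership moves; no second Tracked is created
    assert(a.is_empty());         // the moved-from box holds nullptr
    assert(b->value == 42);
    return g_alive;
}
```

Calling alive_after_move() should report a single live object, and the counter should be back at zero once the function returns, because only b still owns a value when the destructors run.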

Differences from Rust’s Box<T>

Allocator:
  • Rust uses a global allocator (customizable via #[global_allocator]).
  • C++ uses new/delete, which calls the default allocator (usually malloc/free). You could use std::allocator or a custom allocator, but that’s more complex.
Safety:
  • Rust guarantees no use-after-free via the borrow checker.
  • C++ relies on the programmer not to dereference a moved-from Box (we set ptr to nullptr to help, but it’s not foolproof).
Metadata:
  • Rust’s Box can handle fat pointers (e.g., trait objects with vtables).
  • This C++ version is simpler, assuming a single pointer. For trait-like behavior, you’d need virtual functions or std::unique_ptr with a custom deleter.
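Move Semantics:
  • Rust moves by default, and the compiler rejects any later use of the moved-from variable.
  • C++ requires std::move explicitly, and the moved-from Box remains a live (empty) object that can still be named and dereferenced.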
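
To make the safety difference concrete, here is a minimal sketch that uses std::unique_ptr as a stand-in for the Box class. checked_deref and moved_from_is_detected are hypothetical names; the point is only that C++ can detect a moved-from pointer at run time at best, while Rust refuses to compile the equivalent code.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical checked accessor: throws instead of dereferencing a null pointer.
template <typename T>
T &checked_deref(const std::unique_ptr<T> &p)
{
    if (!p) {
        throw std::logic_error("use of moved-from pointer");
    }
    return *p;
}

// Returns true if the use of the moved-from pointer was caught at run time.
bool moved_from_is_detected()
{
    auto a = std::make_unique<int>(7);
    auto b = std::move(a); // a is now null; Rust would reject later uses of a
    try {
        checked_deref(a); // compiles fine in C++, fails only when executed
    } catch (const std::logic_error &) {
        return *b == 7; // the value itself is intact in b
    }
    return false;
}
```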

Rust Internals Peek

In Rust, Box<T> is roughly:
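use std::alloc::{self, Layout};
use std::ptr;

// Simplified sketch: the real Box also accepts T: ?Sized and an allocator parameter.
// Zero-sized types and allocation failure are ignored here.
pub struct Box<T> {
    ptr: *mut T,
}

impl<T> Box<T> {
    pub fn new(x: T) -> Box<T> {
        let ptr = unsafe { alloc::alloc(Layout::new::<T>()) as *mut T };
        unsafe { ptr.write(x); }
        Box { ptr }
    }
}

impl<T> Drop for Box<T> {
    fn drop(&mut self) {
        unsafe {
            ptr::drop_in_place(self.ptr); // run T's destructor first
            alloc::dealloc(self.ptr as *mut u8, Layout::new::<T>());
        }
    }
}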
Our C++ version mirrors this, but relies on new/delete and the class destructor where Rust uses alloc/dealloc and Drop.
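
In Rust, allocation and value placement are separate steps, and on drop the value's destructor runs (via ptr::drop_in_place) before the memory is released. C++ new/delete fuse these steps, but they can be split apart as well. The sketch below shows one way to do it; raw_box_new, raw_box_drop and roundtrip_length are hypothetical names, and the code assumes T has no extended alignment requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Allocate raw storage, then construct T in it
// (roughly Rust's alloc::alloc followed by ptr.write).
template <typename T, typename... Args>
T *raw_box_new(Args &&...args)
{
    void *mem = ::operator new(sizeof(T)); // throws std::bad_alloc on failure
    try {
        return ::new (mem) T(std::forward<Args>(args)...);
    } catch (...) {
        ::operator delete(mem); // constructor threw: release the storage
        throw;
    }
}

// Destroy the value, then release the storage
// (roughly Rust's ptr::drop_in_place followed by alloc::dealloc).
template <typename T>
void raw_box_drop(T *p)
{
    p->~T();
    ::operator delete(p);
}

// Round trip: place a string on the heap, read it, destroy it.
std::size_t roundtrip_length()
{
    std::string *s = raw_box_new<std::string>("heap");
    std::size_t n = s->size();
    raw_box_drop(s);
    return n;
}
```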

Enhancements

  • Custom Allocator: Add a template parameter for an allocator, like template <typename T, typename Alloc = std::allocator<T>>.
  • Fat Pointers: Support Box<dyn Trait> by storing a vtable or using std::unique_ptr with a deleter.
  • Debug Checks: Add assertions to catch misuse of moved-from objects.
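
As a rough illustration of the custom allocator idea, the sketch below routes all memory operations through std::allocator_traits. AllocBox, CountingAlloc, g_outstanding and outstanding_while_alive are hypothetical names, and the code assumes the allocator's pointer type is a plain T* (true for std::allocator, not for allocators with fancy pointers).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

int g_outstanding = 0; // allocations not yet released

// Hypothetical allocator that counts outstanding allocations.
template <typename T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <typename U>
    CountingAlloc(const CountingAlloc<U> &) {}
    T *allocate(std::size_t n)
    {
        ++g_outstanding;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T *p, std::size_t n)
    {
        --g_outstanding;
        std::allocator<T>().deallocate(p, n);
    }
};

template <typename T, typename Alloc = std::allocator<T>>
class AllocBox {
    using Traits = std::allocator_traits<Alloc>;
    Alloc alloc; // declared before ptr, so it is initialized first
    T *ptr;

public:
    template <typename... Args>
    explicit AllocBox(Args &&...args) : alloc(), ptr(Traits::allocate(alloc, 1))
    {
        try {
            Traits::construct(alloc, ptr, std::forward<Args>(args)...);
        } catch (...) {
            Traits::deallocate(alloc, ptr, 1); // do not leak if T's constructor throws
            throw;
        }
    }
    ~AllocBox()
    {
        if (ptr) {
            Traits::destroy(alloc, ptr);       // run T's destructor
            Traits::deallocate(alloc, ptr, 1); // then release the storage
        }
    }
    AllocBox(const AllocBox &) = delete;
    AllocBox &operator=(const AllocBox &) = delete;
    AllocBox(AllocBox &&other) noexcept
        : alloc(std::move(other.alloc)), ptr(other.ptr)
    {
        other.ptr = nullptr;
    }
    T &operator*() { return *ptr; }
};

// Returns the number of outstanding allocations while the box is alive.
int outstanding_while_alive()
{
    AllocBox<int, CountingAlloc<int>> b(5);
    assert(*b == 5);
    return g_outstanding;
}
```

While the box is alive the counter should read one, and it should drop back to zero after the destructor has run.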

Date: 2025-02-26 23:17 (UTC)
From: [personal profile] juan_gandhi
This is cool. A good sane C++ code, as I remember it from the old days.

Date: 2025-02-26 23:52 (UTC)
From: [personal profile] juan_gandhi
Cool. And it looks so neat.

Date: 2025-02-27 07:25 (UTC)
From: [personal profile] fenikso
I wonder: did the stochastic parrot generate this, was there a suitable article somewhere, or did some custom case with a Python implementation on the backend get triggered :)

Date: 2025-02-27 08:00 (UTC)
From: [personal profile] fenikso
>Look how dismissively you put it: stochastic parrot. 😀
:) It's a reference to https://en.wikipedia.org/wiki/Stochastic_parrot - no disdain intended, just stating a fact.

>And meanwhile exactly the same stuff goes on in our jelly-like brains
>as in the silicon head of this Grok.
I'm not so sure about that :) But I'm not an AI expert, so I won't argue.

>I'll show more later.
Looking forward! :)

Date: 2025-02-27 20:35 (UTC)
From: [personal profile] nz

TL;DR: AI invented unique_ptr.

Date: 2025-02-27 23:32 (UTC)
From: [personal profile] nz

Watch closely:

auto a = std::move(box);
auto b = *box; // segmentation fault

Date: 2025-02-28 00:04 (UTC)
From: [personal profile] nz

Of course.
I'm not saying it's a full analog of Rust's Box.
My point is that the correct answer to "Imagine I need to re-implement Box in C++. How would it look like?" is "it would look like unique_ptr".
There's little point in reinventing it and writing all of that by hand, because it won't come out any better anyway, given the limitations of the language.

Date: 2025-02-28 00:51 (UTC)
From: [personal profile] dedekha
I don't really see much difference from std::unique_ptr; its main virtue is that it has no get() function.

And in any case, C++ lets you do

Box x(...);

delete &*x;