vak | Entries tagged with rust

Why not fool around on April 1st? A parser for the C grammar seems like a suitably silly idea, especially since I'm not going to strain myself: I'll put an AI to work. I have mentioned ASDL before; there is also Ungrammar, another way to describe syntax trees. I asked Grok to knock together a grammar of the C language for me. It offered a simplified version:

// Top-level program structure
TranslationUnit = Declaration*

// Declarations
Declaration = Type Identifier ';'
            | Type Identifier '(' ParameterList ')' CompoundStmt
ParameterList = (Type Identifier (',' Type Identifier)*)?

// Types
Type = 'int'
     | 'float'
     | 'char'
     | 'void'

// Statements
Stmt = CompoundStmt
     | ExprStmt
     | IfStmt
     | WhileStmt
     | ReturnStmt

CompoundStmt = '{' Declaration* Stmt* '}'
ExprStmt = Expr ';'
IfStmt = 'if' '(' Expr ')' Stmt ('else' Stmt)?
WhileStmt = 'while' '(' Expr ')' Stmt
ReturnStmt = 'return' Expr? ';'

// Expressions
Expr = AssignmentExpr
AssignmentExpr = EqualityExpr ('=' EqualityExpr)?
EqualityExpr = RelationalExpr (('==' | '!=') RelationalExpr)?
RelationalExpr = AddExpr (('<' | '>' | '<=' | '>=') AddExpr)?
AddExpr = MultExpr (('+' | '-') MultExpr)*
MultExpr = UnaryExpr (('*' | '/') UnaryExpr)*
UnaryExpr = PrimaryExpr
          | '-' UnaryExpr
          | '!' UnaryExpr
PrimaryExpr = Identifier
            | Number
            | String
            | '(' Expr ')'

// Basic tokens
Identifier = <token>
Number = <token>
String = <token>

Not bad, quite usable. Next I asked it to write a parser for this grammar in Rust. On the third attempt I got working code: c_parser/src/main.rs. Half a thousand lines of nontrivial code knocked out in one go.
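To give a feel for what such a parser does, here is a rough C++ sketch of recursive descent for a few of the expression rules above (AddExpr, MultExpr, UnaryExpr, PrimaryExpr). It is not the generated Rust code: the names ExprParser and eval_expr are made up for illustration, only numbers and parentheses are accepted as primaries, and instead of building a tree it computes the value on the fly.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: one method per grammar rule, evaluating as it parses.
class ExprParser {
    std::string src;
    std::size_t pos = 0;

    void skip_spaces() {
        while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
    }

    // Consume `c` if it is the next non-space character.
    bool accept(char c) {
        skip_spaces();
        if (pos < src.size() && src[pos] == c) { ++pos; return true; }
        return false;
    }

    // PrimaryExpr = Number | '(' Expr ')'   (Identifier and String are left out)
    long primary() {
        if (accept('(')) {
            long v = add();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return v;
        }
        if (pos >= src.size() || !std::isdigit(static_cast<unsigned char>(src[pos])))
            throw std::runtime_error("expected a number");
        long v = 0;
        while (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])))
            v = v * 10 + (src[pos++] - '0');
        return v;
    }

    // UnaryExpr = PrimaryExpr | '-' UnaryExpr   ('!' is left out)
    long unary() {
        if (accept('-')) return -unary();
        return primary();
    }

    // MultExpr = UnaryExpr (('*' | '/') UnaryExpr)*
    long mult() {
        long v = unary();
        for (;;) {
            if (accept('*')) {
                v *= unary();
            } else if (accept('/')) {
                long d = unary();
                if (d == 0) throw std::runtime_error("division by zero");
                v /= d;
            } else {
                return v;
            }
        }
    }

    // AddExpr = MultExpr (('+' | '-') MultExpr)*
    long add() {
        long v = mult();
        for (;;) {
            if (accept('+')) v += mult();
            else if (accept('-')) v -= mult();
            else return v;
        }
    }

public:
    explicit ExprParser(std::string text) : src(std::move(text)) {}

    // Parse the whole input; anything left over is an error.
    long parse() {
        long v = add();
        skip_spaces();
        if (pos != src.size()) throw std::runtime_error("trailing input");
        return v;
    }
};

long eval_expr(const std::string& text) { return ExprParser(text).parse(); }
```

Operator precedence falls out of the call chain: add() calls mult(), which calls unary(), so multiplication binds tighter than addition without any precedence table.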

As a test, let's process a C program:

int x = 42;
int main(int argc) {
    int y = 3 + 5 * 2;
    if (y > x) {
        return 1;
    } else {
        return 0;
    }
}

Compile and run:

$ cargo run
...
     Running `/home/vak/c_parser/target/debug/c_parser`
( output )

All that's left is to add pointers, arrays, structs, typedefs and other features, and the C language is in the bag. 😀

At last a sane description of the language has appeared.

spec.ferrocene.dev

This document is not official yet, but it will become so in time. Details here.

Grammar of the C11 standard:

Useful articles:

A couple of C compilers written in Rust following the book mentioned above:

A beautiful implementation of computing pi in Rust.

https://github.com/elkasztano/piday25

The copy-on-write concept is not present in traditional programming languages, so people find it difficult. In Rust, the Cow<T> type is often used with strings to avoid unnecessary cloning. Grok will show us a simplified implementation.

Question: Please show a possible implementation of Rust's type std::borrow::Cow in C++.

I'll show you how to implement a simplified version of Rust's std::borrow::Cow (Copy-on-Write) in C++. Rust's Cow is an enum that can hold either a borrowed reference (&'a T) or an owned value (T), typically used to avoid unnecessary cloning. In C++, we can approximate this using a combination of templates, variants, and smart pointers.

Here's a possible implementation: cow.cpp.

#include <variant>
#include <memory>
#include <string>
#include <stdexcept>
#include <utility>   // std::move, std::swap

// A simplified Cow implementation in C++
template<typename T>
class Cow {
private:
    // Using variant to hold either a borrowed pointer or an owned value
    std::variant<const T*, std::unique_ptr<T>> data;

public:
    // Constructor for borrowed value
    explicit Cow(const T* borrowed) : data(borrowed) {
        if (!borrowed) {
            throw std::invalid_argument("Borrowed pointer cannot be null");
        }
    }

    // Constructor for owned value
    explicit Cow(T&& owned) : data(std::make_unique<T>(std::move(owned))) {}

    // Copy constructor
    Cow(const Cow& other) {
        if (std::holds_alternative<const T*>(other.data)) {
            data = std::get<const T*>(other.data);
        } else {
            const auto& owned = std::get<std::unique_ptr<T>>(other.data);
            data = std::make_unique<T>(*owned);
        }
    }

    // Move constructor
    Cow(Cow&& other) noexcept = default;

    // Assignment operators
    Cow& operator=(const Cow& other) {
        if (this != &other) {
            Cow temp(other);
            swap(*this, temp);
        }
        return *this;
    }

    Cow& operator=(Cow&& other) noexcept = default;

    // Access the underlying value (non-mutating)
    const T& operator*() const {
        if (std::holds_alternative<const T*>(data)) {
            return *std::get<const T*>(data);
        }
        return *std::get<std::unique_ptr<T>>(data);
    }

    const T* operator->() const {
        return &(**this);
    }

    // Check if the Cow holds a borrowed value
    bool is_borrowed() const {
        return std::holds_alternative<const T*>(data);
    }

    // Check if the Cow holds an owned value
    bool is_owned() const {
        return std::holds_alternative<std::unique_ptr<T>>(data);
    }

    // Convert to owned value if borrowed (clone on write)
    void to_owned() {
        if (is_borrowed()) {
            data = std::make_unique<T>(*std::get<const T*>(data));
        }
    }

    // Swap implementation
    friend void swap(Cow& lhs, Cow& rhs) noexcept {
        using std::swap;
        swap(lhs.data, rhs.data);
    }
};

( explanations and example )
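The typical string use case can be sketched like this. This is my own illustration rather than part of Grok's answer: CowStr and remove_spaces are hypothetical names, and the class is cut down to strings only. The function borrows its input when there is nothing to change and makes an owned copy only when it actually has to write.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <variant>

// Hypothetical string-only Cow: either points at someone else's string
// or owns a modified copy.
class CowStr {
    std::variant<const std::string*, std::string> data;

public:
    explicit CowStr(const std::string& borrowed) : data(&borrowed) {}
    explicit CowStr(std::string&& owned) : data(std::move(owned)) {}

    bool is_borrowed() const {
        return std::holds_alternative<const std::string*>(data);
    }

    const std::string& get() const {
        if (is_borrowed()) return *std::get<const std::string*>(data);
        return std::get<std::string>(data);
    }
};

// Returns the input untouched (no allocation) when it has no spaces,
// and an owned, cleaned-up copy otherwise.
// Caveat: a borrowed result must not outlive `input`; C++ does not check this.
CowStr remove_spaces(const std::string& input) {
    if (input.find(' ') == std::string::npos) {
        return CowStr(input);               // borrow: nothing to change
    }
    std::string copy = input;               // copy only when we must write
    copy.erase(std::remove(copy.begin(), copy.end(), ' '), copy.end());
    return CowStr(std::move(copy));
}
```

In Rust the borrow checker guarantees that the borrowed variant does not outlive its source; in C++ that part stays the caller's responsibility.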

The Mutex<T> type is good, but it lets threads access the data only one at a time. What if we mostly need to read the object and modify it only occasionally? RwLock<T> allows any number of simultaneous readers.

Question: Please show a meaningful C++ implementation of type RwLock<T>.

I'll create a C++ implementation of RwLock<T> similar to Rust's RwLock, which provides a reader-writer lock allowing multiple readers or one writer. We'll use C++'s std::shared_mutex (available since C++17) as the underlying synchronization primitive.

Here's the implementation: rwlock.cpp.

#include <shared_mutex>
#include <memory>
#include <utility>   // std::move

template<typename T>
class RwLock {
private:
    T data;
    mutable std::shared_mutex mtx;  // mutable to allow locking in const methods

public:
    // Constructor
    explicit RwLock(T value) : data(std::move(value)) {}

    // Read guard class (multiple readers allowed)
    class ReadGuard {
    private:
        const T& data_ref;
        std::shared_lock<std::shared_mutex> lock;

    public:
        ReadGuard(const T& d, std::shared_mutex& m)
            : data_ref(d), lock(m) {}

        const T& operator*() const { return data_ref; }
        const T* operator->() const { return &data_ref; }
    };

    // Write guard class (exclusive writer)
    class WriteGuard {
    private:
        T& data_ref;
        std::unique_lock<std::shared_mutex> lock;

    public:
        WriteGuard(T& d, std::shared_mutex& m)
            : data_ref(d), lock(m) {}

        T& operator*() { return data_ref; }
        T* operator->() { return &data_ref; }
    };

    // Lock methods
    ReadGuard read_lock() const {
        return ReadGuard(data, mtx);
    }

    WriteGuard write_lock() {
        return WriteGuard(data, mtx);
    }

    // Prevent copying
    RwLock(const RwLock&) = delete;
    RwLock& operator=(const RwLock&) = delete;

    // Prevent moving as well: std::shared_mutex is not movable, so defaulted
    // move operations would be implicitly deleted anyway
    RwLock(RwLock&&) = delete;
    RwLock& operator=(RwLock&&) = delete;
};

( explanations )

We continue studying the primitives of the Rust language. Grok is a decent helper here. As it turns out, the Mutex<T> type is a convenient combination of the C++ classes std::mutex and std::unique_lock.

Question: Please show me a possible implementation of type Mutex<T> in C++.

I'll provide an implementation of a Mutex<T> type in C++ that mimics Rust's Mutex. Rust's Mutex provides mutual exclusion with a lock guard pattern, so we'll create a similar construct in C++ using the standard library's threading primitives.

Here's a possible implementation: mutex.cpp.

#include <mutex>
#include <memory>
#include <utility>   // std::move

template<typename T>
class Mutex {
private:
    T data;
    std::mutex mtx;

public:
    // Constructor
    explicit Mutex(T value) : data(std::move(value)) {}

    // Lock guard class (similar to Rust's MutexGuard)
    class LockGuard {
    private:
        T& data_ref;
        std::unique_lock<std::mutex> lock;

    public:
        LockGuard(T& d, std::mutex& m)
            : data_ref(d), lock(m) {}

        // Access the underlying data
        T& operator*() { return data_ref; }
        T* operator->() { return &data_ref; }
    };

    // Lock method that returns the guard
    LockGuard lock() {
        return LockGuard(data, mtx);
    }

    // Prevent copying
    Mutex(const Mutex&) = delete;
    Mutex& operator=(const Mutex&) = delete;

    // Prevent moving as well: std::mutex is not movable, so defaulted
    // move operations would be implicitly deleted anyway
    Mutex(Mutex&&) = delete;
    Mutex& operator=(Mutex&&) = delete;
};

( explanations and example )
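For comparison, this is roughly what the same protection looks like in plain C++, where the mutex and the data are separate objects tied together only by convention. It is a sketch under my own assumptions: Account, deposit and run_deposits are hypothetical names.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Plain C++ style: nothing but discipline says that `mtx` guards `balance`.
struct Account {
    std::mutex mtx;
    long balance = 0;
};

void deposit(Account& acc, long amount) {
    std::unique_lock<std::mutex> lock(acc.mtx);   // released at end of scope
    acc.balance += amount;
}

// Several threads deposit 1 unit `per_thread` times each; returns the total.
long run_deposits(int thread_count, int per_thread) {
    Account acc;
    std::vector<std::thread> workers;
    for (int i = 0; i < thread_count; ++i) {
        workers.emplace_back([&acc, per_thread] {
            for (int k = 0; k < per_thread; ++k) deposit(acc, 1);
        });
    }
    for (auto& w : workers) w.join();

    std::unique_lock<std::mutex> lock(acc.mtx);
    return acc.balance;
}
```

Nothing here stops a careless caller from touching acc.balance without taking the lock. Wrapping the data inside the mutex, as Mutex<T> does, makes the guard the only way in.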

Tired of complex concepts, let's look at something simple, say the Option<T> type. It is a container for a single value of arbitrary type, which may be absent. C++17 introduced a practically identical class, std::optional. But we will ask Grok to cook up a solution that is as close to Rust as possible.

Question: Please implement type Option in C++, making it as close to Rust as possible, but still convenient for C++.

Let's create a C++ implementation of Rust's Option type. In Rust, Option is an enum that represents either Some(value) or None, used for handling cases where a value might be absent. Here's a C++ version that mimics this behavior while adapting to C++ idioms:

#include <new>        // placement new
#include <stdexcept>
#include <utility>

template <typename T>
class Option {
private:
    bool has_value;
    union {
        T value;
    };

public:
    // Constructors
    Option() : has_value(false) {}

    explicit Option(const T& val) : has_value(true), value(val) {}

    explicit Option(T&& val) : has_value(true), value(std::move(val)) {}

    // Copy constructor
    Option(const Option& other) : has_value(other.has_value) {
        if (has_value) {
            new(&value) T(other.value);
        }
    }

    // Move constructor: the source is left empty (None)
    Option(Option&& other) noexcept : has_value(other.has_value) {
        if (has_value) {
            new(&value) T(std::move(other.value));
            other.value.~T();          // destroy the moved-from object
            other.has_value = false;
        }
    }

    // Destructor
    ~Option() {
        if (has_value) {
            value.~T();
        }
    }

    // Assignment operators
    Option& operator=(const Option& other) {
        if (this != &other) {
            if (has_value) {
                value.~T();
            }
            has_value = other.has_value;
            if (has_value) {
                new(&value) T(other.value);
            }
        }
        return *this;
    }

    Option& operator=(Option&& other) noexcept {
        if (this != &other) {
            if (has_value) {
                value.~T();
            }
            has_value = other.has_value;
            if (has_value) {
                new(&value) T(std::move(other.value));
                other.value.~T();          // destroy the moved-from object
                other.has_value = false;
            }
        }
        return *this;
    }

    // Static methods to create Some and None
    static Option Some(const T& val) { return Option(val); }
    static Option None() { return Option(); }

    // Check if contains value
    bool is_some() const { return has_value; }
    bool is_none() const { return !has_value; }

    // Access value (unsafe - throws if None)
    T& unwrap() {
        if (!has_value) {
            throw std::runtime_error("Called unwrap on None");
        }
        return value;
    }

    const T& unwrap() const {
        if (!has_value) {
            throw std::runtime_error("Called unwrap on None");
        }
        return value;
    }

    // Access with default value
    T unwrap_or(const T& default_value) const {
        return has_value ? value : default_value;
    }

    // Get value or compute default
    template <typename F>
    T unwrap_or_else(F&& func) const {
        return has_value ? value : func();
    }
};

// Helper function to make creation more convenient
template <typename T>
Option<T> some(const T& value) {
    return Option<T>::Some(value);
}

template <typename T>
Option<T> none() {
    return Option<T>::None();
}

( explanations and example )
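Since std::optional was mentioned as the near-identical standard class, here is a short sketch of how the Rust vocabulary maps onto it. The function find_index is a hypothetical example of mine, not part of Grok's answer.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical lookup: Some(index) when the item is found, None otherwise.
std::optional<std::size_t> find_index(const std::vector<std::string>& items,
                                      const std::string& wanted) {
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (items[i] == wanted) return i;   // Some(i)
    }
    return std::nullopt;                    // None
}
```

On the result, has_value() plays the role of is_some(), value() corresponds to unwrap() (it throws std::bad_optional_access on an empty optional), and value_or(x) corresponds to unwrap_or(x).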

The RefCell<T> type, like Cell<T>, is also a container for a single value of arbitrary type. But the value cannot be reached directly: you first have to obtain a reference to it with the borrow() or borrow_mut() methods.

The borrow() method gives a constant reference. The value can be read with the get() method but cannot be changed. You can create as many such constant references as you like, provided that no mutable reference exists.

The borrow_mut() method gives a mutable reference. The value can be changed with the set() method. But only one mutable reference can exist at a time, and only when there are no constant references.

Question: Please show me implementation of type RefCell<T> in C++.

Let’s tackle implementing Rust’s RefCell<T> in C++. This is an exciting challenge because RefCell<T> takes interior mutability a step further than Cell<T>. While Cell<T> allows mutation through a shared reference with a simple copy or swap, RefCell<T> provides borrowed access—either immutable (shared) or mutable (exclusive)—with runtime checks to enforce Rust’s borrowing rules: you can have many immutable borrows or one mutable borrow, but never both at once. In Rust, this is all about safety without locks, and we’ll aim to replicate that dynamic enforcement in C++.

In C++, we don’t have a borrow checker, so we’ll use a runtime mechanism to track borrows, similar to how RefCell uses a counter in Rust. The implementation will involve:

A container for the value.
Counters for active immutable and mutable borrows.
Methods to borrow the value immutably (borrow) or mutably (borrow_mut), returning smart pointers that enforce the borrowing rules.
Runtime checks to panic (or throw, in C++ terms) if the rules are violated.

Here’s the implementation: refcell.cpp.

#include <iostream>
#include <memory>
#include <stdexcept>
#include <utility>

template <typename T>
class RefCell {
private:
    T value;
    mutable size_t borrow_count;  // Tracks immutable borrows
    mutable bool borrowed_mut;    // Tracks if there's a mutable borrow

public:
    explicit RefCell(T initial) : value(std::move(initial)), borrow_count(0), borrowed_mut(false) {}

    class Ref {
    private:
        const RefCell<T>* cell;
        friend class RefCell<T>;

        Ref(const RefCell<T>* c) : cell(c) {
            if (cell->borrowed_mut) {
                throw std::runtime_error("Already mutably borrowed");
            }
            cell->borrow_count++;
        }

    public:
        ~Ref() {
            if (cell) {
                cell->borrow_count--;
            }
        }

        Ref(const Ref&) = delete;
        Ref& operator=(const Ref&) = delete;
        Ref(Ref&& other) noexcept : cell(other.cell) { other.cell = nullptr; }

        const T& get() const { return cell->value; }
    };

    class RefMut {
    private:
        RefCell<T>* cell;
        friend class RefCell<T>;

        RefMut(RefCell<T>* c) : cell(c) {
            if (c->borrow_count > 0 || c->borrowed_mut) {
                throw std::runtime_error("Already borrowed");
            }
            c->borrowed_mut = true;
        }

    public:
        ~RefMut() {
            if (cell) {
                cell->borrowed_mut = false;
            }
        }

        RefMut(const RefMut&) = delete;
        RefMut& operator=(const RefMut&) = delete;
        RefMut(RefMut&& other) noexcept : cell(other.cell) { other.cell = nullptr; }

        T& get() { return cell->value; }

        void set(T new_value) {
            cell->value = std::move(new_value);
        }
    };

    Ref borrow() const { return Ref(this); }
    RefMut borrow_mut() { return RefMut(this); }
};

( explanations )
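To watch the borrowing rules fire at run time, here is a condensed sketch of the same idea, shrunk to an int payload so that it compiles on its own. TinyRefCell and the two demo functions are hypothetical names of mine; the counters work the same way as in the class above.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical condensed RefCell<int>: same counters, minimal guards.
class TinyRefCell {
    int value;
    mutable std::size_t readers = 0;   // live shared borrows
    mutable bool writer = false;       // is there a live exclusive borrow?

public:
    explicit TinyRefCell(int v) : value(v) {}

    struct Ref {
        const TinyRefCell& cell;
        explicit Ref(const TinyRefCell& c) : cell(c) {
            if (c.writer) throw std::runtime_error("already mutably borrowed");
            ++c.readers;
        }
        ~Ref() { --cell.readers; }
        Ref(const Ref&) = delete;
        Ref& operator=(const Ref&) = delete;
        int get() const { return cell.value; }
    };

    struct RefMut {
        TinyRefCell& cell;
        explicit RefMut(TinyRefCell& c) : cell(c) {
            if (c.writer || c.readers > 0) throw std::runtime_error("already borrowed");
            c.writer = true;
        }
        ~RefMut() { cell.writer = false; }
        RefMut(const RefMut&) = delete;
        RefMut& operator=(const RefMut&) = delete;
        void set(int v) { cell.value = v; }
    };

    // Guards are returned by value; this relies on C++17 copy elision.
    Ref borrow() const { return Ref(*this); }
    RefMut borrow_mut() { return RefMut(*this); }
};

// True when borrow_mut() is refused while a shared borrow is still alive.
bool mut_borrow_refused_while_shared() {
    TinyRefCell cell(1);
    auto r = cell.borrow();
    try {
        auto w = cell.borrow_mut();    // breaks the rule: throws
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}

// Write through an exclusive borrow, then read through two shared ones.
int write_then_read_twice() {
    TinyRefCell cell(1);
    {
        auto w = cell.borrow_mut();
        w.set(42);
    }                                  // exclusive borrow ends here
    auto a = cell.borrow();
    auto b = cell.borrow();            // any number of shared borrows is fine
    return a.get() + b.get();
}
```

The scope braces around the mutable borrow matter: the guard's destructor is what clears the flag, just as dropping a RefMut does in Rust.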

I keep moving down the list, studying the basic primitives of the Rust language. Today we look at Cell<T>, RefCell<T> and, for dessert, Option<T>.

Cell is a simple container for a single value of arbitrary type, with one clever feature: if the value is immutable (that is, const-qualified), Cell lets you get around this restriction and modify it anyway. This is similar to the "mutable" keyword in C++. Let's ask grok.com.

Question: Please show me how type Cell<T> can be implemented in C++.

Let’s dive into how we might implement Rust’s Cell<T> in C++. Rust’s Cell<T> is a fascinating type - it provides interior mutability, allowing you to mutate its contents even when you only have a shared reference, all while being safe in a single-threaded context. It’s lightweight, with no runtime overhead beyond the mutation itself, and it doesn’t involve locks since it’s not thread-safe (that’s what RefCell or atomics are for in Rust). In C++, we don’t have the exact same ownership and borrowing rules, so we’ll need to adapt the concept while preserving its spirit.

The core idea of Cell<T> is simple: it wraps a value of type T and provides methods to get and set that value, bypassing C++’s usual const-correctness restrictions in a controlled way. Rust uses it to allow mutation through shared references, so in C++, we’ll aim for something that lets us modify a value even when we might otherwise be constrained, like through a const reference, while keeping the implementation straightforward.

Here’s a basic implementation in C++: cell.cpp.

#include <utility>
#include <iostream>

template <typename T>
class Cell {
private:
    mutable T value; // Mutable to allow mutation even when Cell is const

public:
    // Constructor
    explicit Cell(T initial) : value(std::move(initial)) {}

    // Get a copy of the value
    T get() const {
        return value;
    }

    // Set a new value
    void set(T new_value) const {
        value = std::move(new_value);
    }

    // Replace the value and return the old one
    T replace(T new_value) const {
        T old_value = std::move(value);
        value = std::move(new_value);
        return old_value;
    }
};

( explanations and example )
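A typical place where this kind of interior mutability shows up is bookkeeping inside a method that is logically read-only. A minimal sketch of my own, using the "mutable" keyword directly; Greeter is a hypothetical class, not part of Grok's answer.

```cpp
#include <string>
#include <utility>

// Hypothetical class: a read-only greeting that still counts its calls.
class Greeter {
    std::string name;
    mutable int calls = 0;        // plays the role of a Cell<i32> field

public:
    explicit Greeter(std::string n) : name(std::move(n)) {}

    std::string greet() const {   // const, like taking &self in Rust
        ++calls;                  // allowed only because `calls` is mutable
        return "Hello, " + name;
    }

    int call_count() const { return calls; }
};
```

Even a const Greeter can update its counter, which is the same trade that Cell<T> makes: mutation through a shared reference, acceptable as long as only one thread is involved.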

The implementation of the weak pointer Weak for Arc differs from Weak for Rc. The idea is the same, but atomic counters with atomic increments and decrements are used so that it works in a multithreaded environment.

The point of a weak pointer is that it does not prevent the object from being deleted. On an Arc pointer you can call the create_weak() method (in Rust itself the operation is called Arc::downgrade), and it returns a "weak" pointer of type Weak to the same object. There is little you can do with a weak pointer other than call upgrade() to get a strong pointer again. Or not get one, if the object is already gone by then.

Question: Weak for Rc was good. Now please do the same for Weak for Arc. Please show me how it could look in C++.

Let’s implement a version of Rust’s Weak for Arc (Atomic Reference Counting) in C++. In Rust, Arc is similar to Rc but is thread-safe, using atomic operations for reference counting. This allows Arc to be shared across threads, and its companion Weak provides a non-owning reference that can be upgraded to an Arc if the data still exists. The key difference from Rc is that Arc uses atomic counters (std::sync::atomic::AtomicUsize in Rust) instead of plain integers, ensuring safe concurrent access.

In C++, we’ll use std::atomic<size_t> for the counters to mimic this thread-safety. The structure will be similar to our Rc and Weak implementation, but we’ll adjust the reference counting to be atomic. Here’s how it could look: weak-arc.cpp.

( C++ source code )

Key Differences from Rc/Weak

1. Atomic Counters:

ArcControlBlock uses std::atomic<size_t> for strong_count and weak_count.
Operations like fetch_add, fetch_sub, and compare_exchange_weak ensure thread-safe updates.

2. Memory Ordering:

std::memory_order_relaxed is used for non-critical reads/writes where ordering isn’t needed.
std::memory_order_acq_rel (acquire-release) is used for decrements and upgrades to ensure proper synchronization:
- Acquire ensures we see all prior writes when reading the count.
- Release ensures our writes are visible before the count drops.
upgrade() uses compare_exchange_weak to atomically increment the strong count only if it’s still non-zero, preventing race conditions.

3. Thread-Safety:

The reference counting is safe across threads, but accessing *data isn’t inherently thread-safe. In a real application, you’d need to synchronize data access (e.g., with a mutex) if T is modified concurrently.

Notes

Comparison to std::shared_ptr: C++’s std::shared_ptr with std::weak_ptr is similar but uses a control block with atomic counters internally. Our version makes this explicit for clarity.

This implementation captures the essence of Weak for Arc in a thread-safe way.
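The core of upgrade() as described above, incrementing the strong count only while it is still non-zero, can be sketched as follows. This is my own minimal illustration, not the code from weak-arc.cpp: ArcCounts and try_upgrade are hypothetical names, and only the strong counter is modelled.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical control block; only the strong counter matters here.
struct ArcCounts {
    std::atomic<std::size_t> strong{1};
};

// Try to take one more strong reference. Succeeds only while the count
// is non-zero, that is, while the object is still alive.
bool try_upgrade(ArcCounts& counts) {
    std::size_t current = counts.strong.load(std::memory_order_relaxed);
    while (current != 0) {
        // On failure (including a spurious one) `current` is refreshed with
        // the value actually stored, and the loop re-checks it against zero.
        if (counts.strong.compare_exchange_weak(current, current + 1,
                                                std::memory_order_acq_rel,
                                                std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;   // the count already hit zero: the object is gone
}
```

A plain fetch_add would not do here: it could bump the count from 0 to 1 and revive an object whose destructor has already run.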

Let's continue looking into smart pointers in Rust. Yesterday I showed Rc and Arc. They resemble shared_ptr. Now let's look at the Weak types for them, starting with Weak for Rc.

The trick of the Weak pointer is that it does not prevent the object from being deleted. On an Rc<T> pointer you can call the create_weak() method (in Rust itself the operation is called Rc::downgrade), and it returns a "weak" pointer of type Weak<T> to the same object. There is little you can do with a weak pointer other than call upgrade() to get a strong pointer again. Or not get one, if the object is already gone by then.

Question: Let's talk about Weak for Rc for now. Please show me how it can be implemented in C++.

Let’s dive into implementing a Weak pointer analogous to Rust’s Weak for Rc in C++. In Rust, Rc (Reference Counting) provides shared ownership of data, and Weak is a non-owning reference that prevents reference cycles and allows access to the data only if it still exists (i.e., if the strong reference count hasn’t dropped to zero). We’ll replicate this behavior in C++ using a combination of smart pointers and manual reference counting.

In C++, the closest equivalent to Rc is std::shared_ptr, which manages a strong reference count. For Weak, we’ll mimic Rust’s Weak by using std::weak_ptr, which is designed to work with std::shared_ptr and provides a non-owning reference. However, to make this a clear learning exercise and align it closely with Rust’s semantics, I’ll first show a simplified manual implementation before tying it to the standard library.

Manual Implementation Concept

Rust’s Rc and Weak rely on two counters:

Strong count: Number of owning Rc references.
Weak count: Number of non-owning Weak references.

The data is deallocated when the strong count reaches zero, and the control block (holding the counters) is deallocated when both strong and weak counts reach zero. Let’s implement this in C++.

C++ Implementation

Here’s a basic implementation weak-rc.cpp:

( C++ source code )

Explanation

1. RcControlBlock: Manages the strong and weak reference counts.
2. Rc<T>:

Owns the data and increments the strong count on copy.
Deletes the data when the strong count hits zero.
Provides a method to create a Weak pointer.

3. Weak<T>:

Non-owning reference; increments the weak count on creation.
Provides upgrade(), which returns an Rc if the strong count is greater than zero, or std::nullopt if the data is gone.
Cleans up the control block when both counts reach zero.

Key Differences

Rust’s Weak returns an Option<Rc<T>> on upgrade(), while std::weak_ptr::lock() returns a std::shared_ptr or null.
The manual C++ version uses std::optional to mimic Rust’s Option.
Rust enforces thread-safety with Arc separately, while std::shared_ptr is not thread-safe for modification without synchronization (akin to Rc).
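The standard-library counterpart mentioned above is short enough to show in full. A small sketch of my own: describe is a hypothetical helper, and std::weak_ptr::lock() plays the part of upgrade().

```cpp
#include <memory>
#include <string>

// Returns the pointed-to string while it is alive, or a placeholder once gone.
std::string describe(const std::weak_ptr<std::string>& weak) {
    // lock() yields an empty shared_ptr when the object has been destroyed,
    // much like upgrade() yielding None.
    if (std::shared_ptr<std::string> strong = weak.lock()) {
        return *strong;
    }
    return "<gone>";
}
```

The temporary strong pointer lives only inside the if statement, so the call keeps the object alive just long enough to copy the string out.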

Let's find out how the Arc<T> type differs from Rc<T>. The name stands for Atomically Reference Counted. As grok.com explains, this thing is almost equivalent to std::shared_ptr from C++20 and later. See the generated source in the file arc.cpp.

Question: How Arc<T> would look if implemented in C++?

Let’s dive into how we might implement something like Rust’s Arc<T> in C++. For those unfamiliar, Arc<T> in Rust is a thread-safe, reference-counted smart pointer that allows multiple owners of some data, incrementing a counter when cloned and decrementing it when dropped, freeing the data when the count hits zero. It’s built for concurrent use, so it uses atomic operations to manage the count safely across threads. We’ll aim to replicate that behavior in C++.

In C++, we don’t have a direct equivalent in the standard library, but std::shared_ptr comes close. However, std::shared_ptr isn’t guaranteed to use atomic reference counting for the control block in all implementations (pre-C++20, at least), and its atomic operations were only standardized with std::atomic<std::shared_ptr> in C++20. To truly mirror Arc<T>, we’ll roll our own simplified version using atomic counters, focusing on thread-safety and the core mechanics.

Here’s how it could look:

#include <atomic>
#include <memory>
#include <utility>
#include <type_traits>

// Define Inner as a standalone template struct
template <typename T>
struct ArcInner {
    T data;
    std::atomic<int> ref_count;

    template <typename... Args>
    ArcInner(Args&&... args) : data(std::forward<Args>(args)...), ref_count(1) {}
};

template <typename T>
class Arc {
private:
    ArcInner<T>* ptr;

    void increment_count() {
        if (ptr) {
            ptr->ref_count.fetch_add(1, std::memory_order_relaxed);
        }
    }

    void decrement_count() {
        if (ptr) {
            if (ptr->ref_count.fetch_sub(1, std::memory_order_acq_rel) == 1) {
                delete ptr;
                ptr = nullptr;
            }
        }
    }

public:
    // Constructor with constraint: Args must be convertible to T
    template <typename... Args, typename = std::enable_if_t<std::is_constructible_v<T, Args...>>>
    explicit Arc(Args&&... args) : ptr(new ArcInner<T>(std::forward<Args>(args)...)) {}

    // Copy constructor
    Arc(const Arc& other) : ptr(other.ptr) {
        increment_count();
    }

    // Move constructor
    Arc(Arc&& other) noexcept : ptr(other.ptr) {
        other.ptr = nullptr;
    }

    // Assignment operators
    Arc& operator=(const Arc& other) {
        if (this != &other) {
            decrement_count();
            ptr = other.ptr;
            increment_count();
        }
        return *this;
    }

    Arc& operator=(Arc&& other) noexcept {
        if (this != &other) {
            decrement_count();
            ptr = other.ptr;
            other.ptr = nullptr;
        }
        return *this;
    }

    // Destructor
    ~Arc() {
        decrement_count();
    }

    // Dereference operators
    T& operator*() const { return ptr->data; }
    T* operator->() const { return &ptr->data; }

    // Get the reference count
    int use_count() const { return ptr ? ptr->ref_count.load(std::memory_order_relaxed) : 0; }
};

// Helper function to create an Arc
template <typename T, typename... Args>
Arc<T> make_arc(Args&&... args) {
    return Arc<T>(std::forward<Args>(args)...);
}

( explanations and example )

Let's continue looking into smart pointers in Rust. Take the Rc<T> type. The name stands for Reference Counted, which is a hint. Let's ask grok.com. It turns out to be almost the same as shared_ptr, with two differences: it cannot be NULL, and the counter is a plain integer rather than std::atomic. See the source it generated in the file rc.cpp.

Question: How Rc<T> would look if implemented in C++?

Let’s re-implement Rust’s Rc<T> in C++ - a fascinating exercise since Rc<T> introduces reference-counted shared ownership, contrasting with the unique ownership of Box<T>. In Rust, Rc<T> (short for "Reference Counted") allows multiple owners of a heap-allocated T, tracking the number of references via a counter. When the last Rc is dropped, the T is deallocated. This is akin to C++’s std::shared_ptr<T>, but we’ll craft a custom version to mirror Rust’s Rc<T> behavior, focusing on its single-threaded nature (unlike Arc<T> for multi-threading).

What Rc<T> Does in Rust

Allocation: Places a T on the heap with a reference count.
Shared Ownership: Multiple Rc<T> instances can point to the same T, incrementing the count.
Cloning: Rc::clone creates a new reference, bumping the count.
Deallocation: When the last Rc is dropped (count hits 0), T is destroyed and memory freed.
No Mutability: Rc<T> alone doesn’t allow mutation (use RefCell<T> or Cell<T> for that).
Non-Atomic: Single-threaded, unlike Arc<T>.

C++ Implementation

We’ll create an Rc<T> class in C++ that:

Uses a control block to store the reference count and T together.
Supports copying (to increment the count) and RAII cleanup.
Mimics Rust’s immutability by providing const access (mutation would need a separate mechanism).
Avoids atomic operations, matching Rc’s single-threaded design.

Here’s the code:

#include <cassert>
#include <cstddef>
#include <iostream>
#include <type_traits>
#include <utility>

// Control block to hold the reference count and the value
template <typename T>
struct RcControlBlock {
    size_t count;
    T value;

    template <typename... Args>
    RcControlBlock(Args&&... args) : count(1), value(std::forward<Args>(args)...) {}
};

template <typename T>
class Rc {
private:
    RcControlBlock<T>* block;

    // Private constructor from raw block (for internal use)
    explicit Rc(RcControlBlock<T>* b) : block(b) {}

    void increment() {
        if (block) {
            ++block->count;
        }
    }

    void decrement() {
        if (block && --block->count == 0) {
            delete block;
            block = nullptr;
        }
    }

public:
    // Constructor: Allocate and construct T.
    // Constrained so that copying a non-const Rc selects the copy constructor
    // rather than this template.
    template <typename... Args,
              typename = std::enable_if_t<std::is_constructible_v<T, Args...>>>
    explicit Rc(Args&&... args) : block(new RcControlBlock<T>(std::forward<Args>(args)...)) {}

    // Destructor: Decrease count, cleanup if last reference
    ~Rc() {
        decrement();
    }

    // Copy constructor: Share ownership
    Rc(const Rc& other) : block(other.block) {
        increment();
    }

    // Copy assignment: Share ownership
    Rc& operator=(const Rc& other) {
        if (this != &other) {
            decrement();           // Drop old reference
            block = other.block;   // Take new reference
            increment();
        }
        return *this;
    }

    // Move constructor: Transfer pointer, no count change
    Rc(Rc&& other) noexcept : block(other.block) {
        other.block = nullptr; // Leave moved-from empty
    }

    // Move assignment: Transfer pointer, no count change
    Rc& operator=(Rc&& other) noexcept {
        if (this != &other) {
            decrement();           // Drop old reference
            block = other.block;   // Take new reference
            other.block = nullptr; // Leave moved-from empty
        }
        return *this;
    }

    // Accessors (const only, mimicking Rust's immutability)
    const T& operator*() const {
        assert(block != nullptr);
        return block->value;
    }

    const T* operator->() const {
        assert(block != nullptr);
        return &block->value;
    }

    // Clone method (like Rc::clone)
    Rc clone() const {
        return Rc(*this); // Uses copy constructor
    }

    // Get reference count (for debugging)
    size_t use_count() const {
        return block ? block->count : 0;
    }
};

// Factory function (like Rc::new)
template <typename T, typename... Args>
Rc<T> make_rc(Args&&... args) {
    return Rc<T>(std::forward<Args>(args)...);
}

This Rc<T> captures Rust’s essence - shared, immutable, single-threaded ownership.

( explanations )

Rust has a popular idiom: Result<(), String>. It is used for functions that return nothing on success and a text string on error. For example, here is a function from a real project that copies one file into another:

use std::fs;
use std::io;

fn copy_file_contents(mut dest_file: &fs::File, src_filename: &str) -> Result<(), String> {
    // Open the source; turn the io::Error into a readable message
    let mut src_file = fs::File::open(src_filename)
        .map_err(|e| format!("Failed to open source file '{}': {}", src_filename, e))?;
    // Stream the contents into the destination file
    io::copy(&mut src_file, &mut dest_file)
        .map_err(|e| format!("Failed to copy to destination: {}", e))?;
    Ok(())
}

Let's ask grok.com to make an analogous class in C++.

Question: There is a popular idiom: Result<(), String>. I would like to see how it could be implemented in C++.

Let’s dive into translating the Rust idiom Result<(), String> into C++. In Rust, Result is an enum used for error handling, with two variants: Ok(T) for success and Err(E) for failure. Here, Result<(), String> means a function returns either "success with no meaningful value" (Ok(()), where () is the unit type) or "failure with a string error message" (Err(String)). This is a common pattern in Rust for operations that either succeed without producing a value or fail with a descriptive error.

In C++, the standard library had no direct equivalent to Rust’s Result before C++23, which adds std::expected. With earlier standards we can approximate it using modern C++ features. The closest analogs are std::optional, std::variant, or a custom implementation.
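To make the simplest of those options concrete, here is a minimal sketch of the std::optional route, where an empty optional means success and a contained string is the error message. The names MaybeError, check_port and expect_ok are invented for illustration and do not come from the original answer.

```cpp
#include <cassert>
#include <optional>
#include <string>

// An empty optional means success; a value carries the error message.
using MaybeError = std::optional<std::string>;

// Hypothetical example: validate a TCP port number.
inline MaybeError check_port(int port) {
    if (port < 1 || port > 65535) {
        return "Port out of range: " + std::to_string(port);
    }
    return std::nullopt;   // success, nothing to report
}

// Rough analog of calling .unwrap() on a Result<(), String>:
// aborts (in builds with assertions enabled) if an error is present.
inline void expect_ok(const MaybeError& err) {
    assert(!err.has_value() && "operation failed");
}
```

The drawback is that the meaning is inverted: "has a value" signals failure, which reads oddly at the call site. That is one reason to prefer a dedicated type built on std::variant.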

Let’s define a reusable Result class with a Rust-like interface, using std::variant under the hood. This adds convenience methods like is_ok(), is_err(), unwrap(), and unwrap_err().

#include <variant>
#include <string>
#include <stdexcept>
#include <iostream>
#include <utility>

// Note: T and E must be distinct types, otherwise the two constructors
// and the type-based std::holds_alternative / std::get lookups collide.
template<typename T, typename E>
class Result {
public:
    Result(T value) : data(std::move(value)) {}   // Ok variant
    Result(E error) : data(std::move(error)) {}   // Err variant

    bool is_ok() const { return std::holds_alternative<T>(data); }
    bool is_err() const { return std::holds_alternative<E>(data); }

    // Returns a copy of the Ok value; throws if this is an Err
    T unwrap() const {
        if (is_err()) throw std::runtime_error("Called unwrap() on an Err value");
        return std::get<T>(data);
    }

    // Returns a copy of the Err value; throws if this is an Ok
    E unwrap_err() const {
        if (is_ok()) throw std::runtime_error("Called unwrap_err() on an Ok value");
        return std::get<E>(data);
    }

private:
    std::variant<T, E> data;
};

// Alias for the case of an empty Ok value and a std::string error
struct Void {};
using VoidResult = Result<Void, std::string>;

This version is the most Rust-like. VoidResult directly corresponds to Result<(), String>. You can call is_ok() or is_err() to check the state, and unwrap() or unwrap_err() to extract values (with error checking). It’s reusable for other T and E types too.

And here is an example of using this class, also generated by Grok.

( C++ )

The result is quite reasonable. It can be used in all sorts of C++ projects. The full source is here: result.cpp
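For comparison with the Rust copy_file_contents function shown earlier, here is a sketch of how the same operation might look in C++ with a Result-like return type. This is not the contents of result.cpp: the compact VoidResult class below is a stand-in written for this example, and read_file is a hypothetical convenience wrapper.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>
#include <utility>
#include <variant>

struct Void {};

// Compact stand-in for Result<Void, std::string>.
class VoidResult {
public:
    static VoidResult ok() { return VoidResult(Void{}); }
    static VoidResult err(std::string msg) { return VoidResult(std::move(msg)); }

    bool is_ok() const { return std::holds_alternative<Void>(data); }
    bool is_err() const { return !is_ok(); }

    // Only valid on an Err value.
    const std::string& unwrap_err() const {
        assert(is_err());
        return std::get<std::string>(data);
    }

private:
    explicit VoidResult(Void v) : data(v) {}
    explicit VoidResult(std::string e) : data(std::move(e)) {}
    std::variant<Void, std::string> data;
};

// Copy the contents of the named file into an already opened stream.
inline VoidResult copy_file_contents(std::ostream& dest, const std::string& src_filename) {
    std::ifstream src(src_filename, std::ios::binary);
    if (!src) {
        return VoidResult::err("Failed to open source file '" + src_filename + "'");
    }
    // operator<<(streambuf*) sets failbit when nothing was inserted,
    // so an empty source file is handled separately.
    if (src.peek() == std::ifstream::traits_type::eof()) {
        return VoidResult::ok();
    }
    dest << src.rdbuf();
    if (!dest) {
        return VoidResult::err("Failed to copy to destination");
    }
    return VoidResult::ok();
}

// Hypothetical wrapper: collect the file into a string via an in-memory stream.
inline VoidResult read_file(const std::string& src_filename, std::string& out) {
    std::ostringstream buffer;
    VoidResult r = copy_file_contents(buffer, src_filename);
    if (r.is_ok()) out = buffer.str();
    return r;
}
```

Unlike the Rust version, the error text here does not include the operating system's reason for the failure, because iostreams do not expose it portably. The caller also has to check is_err() by hand, since C++ has no counterpart to the ? operator.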

There is no better way to understand a programming language than to look at its implementation. Much of Rust resembles C++, yet it is still "just a wee bit different". Let's start with the Box<T> construct and ask grok.com. The result is in the file box.cpp.

Question: Imagine I need to re-implement Box<T> in C++. What would it look like?

Re-implementing Rust’s Box<T> in C++ is a fun challenge that lets us peek into Rust’s memory management internals and translate them into C++’s world. In Rust, Box<T> is a smart pointer that provides heap-allocated ownership with automatic cleanup, backed by the global allocator (usually malloc/free via Rust’s alloc crate). It’s simple but powerful: it allocates memory for a T, moves the value onto the heap, and deallocates it when the Box goes out of scope via Drop. In C++, we don’t have Rust’s ownership model or Drop trait, so we’ll use RAII and a custom class to mimic it.

Let’s design a C++ equivalent, breaking down the essentials of Box<T> and handling the nuances of C++’s memory model.

What Box<T> Does in Rust

Allocation: Allocates memory on the heap for a T using the global allocator.
Ownership: Sole owner of the heap memory; no sharing or borrowing by default.
Move Semantics: Transfers ownership when assigned or passed (no copying unless T: Clone).
Deallocation: Automatically frees the memory when dropped, calling T’s destructor if it has one.
Dereferencing: Provides access to T via * or . (through Deref).
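Before building a custom class, it may help to see the five properties above on the closest standard C++ type, std::unique_ptr. This is only a sketch: the Tracked type, consume() and demo_box_like_ownership() are made-up names used to make the destructor call observable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical payload type that records when its destructor runs.
struct Tracked {
    std::string name;
    bool* destroyed;
    Tracked(std::string n, bool* flag) : name(std::move(n)), destroyed(flag) {}
    ~Tracked() { *destroyed = true; }
};

// Takes ownership, like passing a Box<T> by value in Rust.
// The payload is destroyed and freed once the call is finished.
inline std::size_t consume(std::unique_ptr<Tracked> boxed) {
    return boxed->name.size();            // dereferencing through ->
}

inline bool demo_box_like_ownership() {
    bool destroyed = false;
    auto b = std::make_unique<Tracked>("payload", &destroyed);  // heap allocation
    // auto copy = b;                     // would not compile: copying is disabled
    std::size_t len = consume(std::move(b));                    // ownership moves
    assert(b == nullptr);                 // the moved-from pointer is empty
    return destroyed && len == 7;         // deallocation happened automatically
}
```

The notable difference is what happens after the move. Rust rejects any later use of b at compile time, while C++ leaves behind a null pointer that the programmer must remember not to dereference.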

C++ Implementation

We’ll create a Box class in C++ that:

Uses new and delete for allocation/deallocation (mirroring malloc/free).
Implements RAII for automatic cleanup.
Supports move semantics (C++11 and later) to mimic Rust’s ownership transfer.
Prevents copying to enforce single ownership (like Rust).
Provides dereference operators (* and ->).

Here’s the code: box.cpp.

#include <utility> // std::forward, std::move

template <typename T>
class Box {
private:
    T *ptr; // Raw pointer to heap-allocated memory

    // Private helper to deallocate
    void deallocate()
    {
        if (ptr) {
            delete ptr; // Calls T's destructor, then frees memory
            ptr = nullptr;
        }
    }

public:
    // Constructor: Allocate and construct T in-place
    template <typename... Args>
    explicit Box(Args &&...args) : ptr(new T(std::forward<Args>(args)...))
    {
    }

    // Destructor: Automatically clean up
    ~Box() { deallocate(); }

    // Disable copying (Rust Box doesn't allow this)
    Box(const Box &) = delete;
    Box &operator=(const Box &) = delete;

    // Move constructor: Transfer ownership
    Box(Box &&other) noexcept : ptr(other.ptr)
    {
        other.ptr = nullptr; // Leave other in a valid, empty state
    }

    // Move assignment: Transfer ownership
    Box &operator=(Box &&other) noexcept
    {
        if (this != &other) {
            deallocate();        // Free current memory
            ptr = other.ptr;     // Take ownership
            other.ptr = nullptr; // Leave other empty
        }
        return *this;
    }

    // Dereference operators
    T &operator*() { return *ptr; }
    const T &operator*() const { return *ptr; }

    T *operator->() { return ptr; }
    const T *operator->() const { return ptr; }

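    // Optional: Non-owning raw pointer access; the Box keeps ownership
    // (unlike Box::into_raw in Rust, which releases it)
    T *as_raw() { return ptr; }

    // Optional: Move the value out and free the heap memory (like `*boxed`
    // in Rust). The Box is left empty; it must hold a value when called.
    T into_inner()
    {
        T value = std::move(*ptr);
        deallocate();
        return value;
    }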
};

// Factory function to simplify creation (like Box::new in Rust)
template <typename T, typename... Args>
Box<T> make_box(Args &&...args)
{
    return Box<T>(std::forward<Args>(args)...);
}

( explanations )

I started trying grok.com instead of gemini.google.com, and I must say the progress is noticeable. I mostly pester AI about programming, and here Grok is head and shoulders above Gemini, like a straight-A student next to a C student. Let me share one example. I asked about smart pointers in Rust, and Grok gave me a whole lecture. The first five in the list really are smart pointers, while the rest are more like smart containers. Still, it's good.

Question: How many smart pointers like Box and Rc are present in Rust?

Rust’s standard library offers a rich set of smart pointers beyond Box<T> and Rc<T>, each tailored to specific ownership semantics, threading needs, and use cases. These smart pointers leverage Rust’s ownership model and type system to provide safe, ergonomic memory management. Let’s count and explore them, focusing on the main ones in std (and one notable exception from alloc). I’ll cover their purposes and how they fit into Rust’s ecosystem.

Smart Pointers in Rust’s Standard Library

Here are the primary smart pointers in Rust, all from std unless noted:

1. Box<T>

Module: std::boxed
Purpose: Unique ownership of a heap-allocated T. The simplest smart pointer, providing a single owner with automatic cleanup.
Use Case: When you need heap allocation without sharing (e.g., recursive data structures, large objects).
Threading: Safe to send across threads if T: Send.

2. Rc<T>

Module: std::rc
Purpose: Reference-counted shared ownership for single-threaded use. Multiple Rc<T> instances share the same T, deallocating when the last reference drops.
Use Case: Shared immutable data in a single thread (e.g., graph structures).
Threading: Not thread-safe (!Send, !Sync).
Notes: Paired with Weak<T> for non-owning refs.

( also Arc, Weak, Cell, RefCell, Mutex, RwLock, Cow )
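The note about pairing Rc<T> with Weak<T> can be illustrated on the C++ side with std::shared_ptr and std::weak_ptr, which behave analogously for this purpose. This is a sketch with an invented function name, demo_weak(); lock() plays the role of Rust's Weak::upgrade().

```cpp
#include <cassert>
#include <memory>
#include <string>

inline bool demo_weak() {
    std::weak_ptr<std::string> weak;
    {
        auto strong = std::make_shared<std::string>("node");
        weak = strong;                      // non-owning: the strong count stays 1
        assert(strong.use_count() == 1);

        auto upgraded = weak.lock();        // like upgrade() returning Some(rc)
        assert(upgraded && *upgraded == "node");
    }                                       // last strong reference dropped here

    // The value is gone; lock() now yields an empty pointer,
    // like upgrade() returning None.
    return weak.expired() && weak.lock() == nullptr;
}
```

A weak reference never keeps the value alive, which is what makes it suitable for back-pointers in trees and graphs where strong references in both directions would leak.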

Comparison to C++

C++ Equivalents:
- Box<T> ≈ std::unique_ptr<T>
- Rc<T> ≈ std::shared_ptr<T> (closest match, although shared_ptr's count is atomic; the standard has no non-atomic variant)
- Arc<T> ≈ std::shared_ptr<T> (atomic reference count)
- Weak<T> ≈ std::weak_ptr<T>
- Mutex<T> ≈ std::mutex + std::unique_lock
- RwLock<T> ≈ std::shared_mutex (C++17)
- No direct Cell/RefCell - custom runtime checks needed.
- Cow ≈ custom variant-like type or manual ownership toggling.
Rust Unique: Cell and RefCell provide interior mutability that stays within Rust's borrowing rules (RefCell enforces them at runtime), offering safety not easily replicated in C++.
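To give an idea of what "custom runtime checks" for a RefCell-like type could mean in C++, here is a rough sketch. It is an assumption-laden illustration, not an existing library: the class counts active borrows and throws std::runtime_error where Rust's RefCell would panic. It is single-threaded only and relies on C++17 copy elision to return the guard objects.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

template <typename T>
class RefCell {
public:
    explicit RefCell(T value) : value_(std::move(value)) {}

    // Guard for a shared (read-only) borrow.
    class Ref {
    public:
        explicit Ref(const RefCell& c) : cell_(c) { ++cell_.readers_; }
        ~Ref() { assert(cell_.readers_ > 0); --cell_.readers_; }
        Ref(const Ref&) = delete;
        Ref& operator=(const Ref&) = delete;
        const T& operator*() const { return cell_.value_; }
    private:
        const RefCell& cell_;
    };

    // Guard for an exclusive (mutable) borrow.
    class RefMut {
    public:
        explicit RefMut(const RefCell& c) : cell_(c) { cell_.writer_ = true; }
        ~RefMut() { cell_.writer_ = false; }
        RefMut(const RefMut&) = delete;
        RefMut& operator=(const RefMut&) = delete;
        T& operator*() const { return cell_.value_; }
    private:
        const RefCell& cell_;
    };

    // Any number of readers, as long as there is no writer.
    Ref borrow() const {
        if (writer_) throw std::runtime_error("already mutably borrowed");
        return Ref(*this);
    }

    // A single writer, and only when nobody else is borrowing.
    RefMut borrow_mut() const {
        if (writer_ || readers_ > 0) throw std::runtime_error("already borrowed");
        return RefMut(*this);
    }

private:
    mutable T value_;          // interior mutability: writable through const
    mutable int readers_ = 0;  // number of live Ref guards
    mutable bool writer_ = false;
};

inline bool demo_refcell() {
    RefCell<int> cell(1);
    {
        auto w = cell.borrow_mut();
        *w += 41;
    }                              // the writer is released here
    auto r1 = cell.borrow();
    auto r2 = cell.borrow();       // several readers may coexist
    bool rejected = false;
    try {
        auto w = cell.borrow_mut();  // refused while readers are alive
        (void)w;
    } catch (const std::runtime_error&) {
        rejected = true;
    }
    return rejected && *r1 == 42 && *r2 == 42;
}
```

Nothing stops a caller from keeping the reference returned by operator* after the guard is destroyed, so this only approximates the guarantee; in Rust the lifetime system closes that gap.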

Programming with the help of AI is getting to be fun. I went to x.ai and asked it to write a simple emacs-like text editor. The code came out quite decent. Create the file src/main.rs:

( Sources )

Compile and run it, and it works!

cargo run

That is some text I pasted into it. Exit with ^C. With a helper like this, learning a language is easy.

github.com/pretzelhammer/rust-blog/blob/master/posts/learning-rust-in-2024.md

“After a short 19 to 30 hours you'll go from being a Rust beginner to a Rust advanced beginner.” 😀

The Initiative Vision: In collaboration with the Rust Foundation, Rust Project, and appropriate external stakeholders, make C++ and Rust interoperability easily accessible and approachable to the widest possible audience.

https://github.com/rustfoundation/interop-initiative/blob/main/problem-statement.md

There is some good momentum building between Rust and C++! Maybe something useful will come of it.
