Serge Vakulenko ([personal profile] vak) wrote 2025-03-02 08:51 pm

The Cow<T> type: clone on write

The copy-on-write concept is not represented in traditional programming languages, which is why people struggle with it. In Rust, the Cow<T> type is often used with strings to avoid unnecessary cloning. Grok will show us a simplified implementation.

Question: Please show a possible implementation of Rust's type std::borrow::Cow in C++.

I'll show you how to implement a simplified version of Rust's std::borrow::Cow (Copy-on-Write) in C++. Rust's Cow is an enum that can hold either a borrowed reference (&'a T) or an owned value (T), typically used to avoid unnecessary cloning. In C++, we can approximate this using a combination of templates, variants, and smart pointers.

Here's a possible implementation: cow.cpp.
#include <variant>
#include <memory>
#include <string>
#include <stdexcept>
#include <utility>   // std::move, std::swap

// A simplified Cow implementation in C++
template<typename T>
class Cow {
private:
    // Using variant to hold either a borrowed pointer or an owned value
    std::variant<const T*, std::unique_ptr<T>> data;

public:
    // Constructor for borrowed value
    explicit Cow(const T* borrowed) : data(borrowed) {
        if (!borrowed) {
            throw std::invalid_argument("Borrowed pointer cannot be null");
        }
    }

    // Constructor for owned value
    explicit Cow(T&& owned) : data(std::make_unique<T>(std::move(owned))) {}

    // Copy constructor: shallow copy if borrowed, deep copy if owned
    Cow(const Cow& other) {
        if (std::holds_alternative<const T*>(other.data)) {
            data = std::get<const T*>(other.data);
        } else {
            const auto& owned = std::get<std::unique_ptr<T>>(other.data);
            data = std::make_unique<T>(*owned);
        }
    }

    // Move constructor
    Cow(Cow&& other) noexcept = default;

    // Assignment operators (copy-and-swap for the copying one)
    Cow& operator=(const Cow& other) {
        if (this != &other) {
            Cow temp(other);
            swap(*this, temp);
        }
        return *this;
    }

    Cow& operator=(Cow&& other) noexcept = default;

    // Access the underlying value (non-mutating)
    const T& operator*() const {
        if (std::holds_alternative<const T*>(data)) {
            return *std::get<const T*>(data);
        }
        return *std::get<std::unique_ptr<T>>(data);
    }

    const T* operator->() const {
        return &(**this);
    }

    // Check if the Cow holds a borrowed value
    bool is_borrowed() const {
        return std::holds_alternative<const T*>(data);
    }

    // Check if the Cow holds an owned value
    bool is_owned() const {
        return std::holds_alternative<std::unique_ptr<T>>(data);
    }

    // Convert to owned value if borrowed (clone on write)
    void to_owned() {
        if (is_borrowed()) {
            data = std::make_unique<T>(*std::get<const T*>(data));
        }
    }

    // Swap implementation
    friend void swap(Cow& lhs, Cow& rhs) noexcept {
        using std::swap;
        swap(lhs.data, rhs.data);
    }
};

Key Features and Explanation:

1. Core Design:
  • Uses std::variant to store either a const T* (borrowed) or a std::unique_ptr<T> (owned).
  • std::unique_ptr ensures proper ownership semantics and automatic cleanup, similar to Rust's owned variant.
2. Constructors:
  • Cow(const T*): Creates a borrowed instance.
  • Cow(T&&): Creates an owned instance by moving a value into a unique_ptr.
  • Copy constructor handles deep copying of owned values and shallow copying of borrowed pointers.
3. Accessors:
  • operator* and operator-> provide read-only access to the underlying value, regardless of whether it's borrowed or owned.
  • No mutable access is provided at all: to_owned() makes a private copy, but the class still exposes it only as const T&. In Rust, Cow::to_mut() is the method that returns a mutable reference.
4. State Checking:
  • is_borrowed() and is_owned() let you inspect the current state.
5. Copy-on-Write:
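  • to_owned() converts a borrowed value to an owned one by cloning the data. The closest Rust counterparts are Cow::to_mut(), which clones on demand and returns a mutable reference, and Cow::into_owned(), which consumes the Cow and returns the owned value.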
6. Differences from Rust:
  • Rust's Cow uses a lifetime parameter to ensure the borrowed reference remains valid. C++ can't enforce this at compile time; the null check in the constructor catches only null pointers, not dangling ones, so the rest is left to programmer discipline.
  • This implementation holds the owned value in a unique_ptr instead of storing T directly in the variant. That costs a heap allocation per owned value; cleanup is automatic either way.
  • Rust's Cow is typically used with Borrow and ToOwned traits; here, we assume T is copyable for simplicity.
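
The unique_ptr trade-off from item 6 can be avoided by keeping the owned value inside the variant itself. The following is only a sketch under C++17; the names InlineCow and demo_inline_cow are hypothetical and not part of the implementation above, and the null check and copy/assignment plumbing are left out for brevity.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical alternative to the class above: the owned T is stored
// inside the variant, so no unique_ptr is involved.
template <typename T>
class InlineCow {
    std::variant<const T*, T> data;  // index 0: borrowed, index 1: owned

public:
    // Borrow an existing object (null check omitted in this sketch).
    explicit InlineCow(const T* borrowed)
        : data(std::in_place_index<0>, borrowed) {}

    // Take ownership of a value by moving it into the variant.
    explicit InlineCow(T&& owned)
        : data(std::in_place_index<1>, std::move(owned)) {}

    bool is_borrowed() const { return data.index() == 0; }

    const T& operator*() const {
        if (auto p = std::get_if<0>(&data)) return **p;
        return std::get<1>(data);
    }

    // Clone the borrowed value into the variant; no-op if already owned.
    void to_owned() {
        if (is_borrowed()) {
            T copy(*std::get<0>(data));  // copy first, then overwrite
            data.template emplace<1>(std::move(copy));
        }
    }
};

void demo_inline_cow() {
    std::string original = "Hello";
    InlineCow<std::string> cow(&original);
    assert(cow.is_borrowed());
    assert(&*cow == &original);   // still refers to the caller's string

    cow.to_owned();
    assert(!cow.is_borrowed());
    assert(*cow == "Hello");
    assert(&*cow != &original);   // now a private copy
}
```

The price of this layout is that the variant is at least as large as T even while it only borrows.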

Example usage

#include <iostream>

int main() {
    // Create a string we'll borrow from
    std::string original = "Hello";

    // Borrowed Cow
    Cow<std::string> borrowed(&original);
    std::cout << "Borrowed: " << *borrowed << ", Is borrowed: " << borrowed.is_borrowed() << "\n";

    // Owned Cow
    Cow<std::string> owned(std::string("World"));
    std::cout << "Owned: " << *owned << ", Is owned: " << owned.is_owned() << "\n";

    // Copy a borrowed Cow
    Cow<std::string> borrowed_copy = borrowed;
    std::cout << "Borrowed copy: " << *borrowed_copy << "\n";

    // Convert borrowed to owned
    borrowed_copy.to_owned();
    std::cout << "After to_owned: " << *borrowed_copy << ", Is owned: " << borrowed_copy.is_owned() << "\n";

    return 0;
}

Output of Example:

Borrowed: Hello, Is borrowed: 1
Owned: World, Is owned: 1
Borrowed copy: Hello
After to_owned: Hello, Is owned: 1
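
The demo above exercises the class mechanics only. The use case named at the start, avoiding an unnecessary clone of a string, might look roughly like the sketch below. It assumes C++17 and uses std::string_view in the role of Rust's borrowed &str; StrCow, tabs_to_spaces, as_view, and demo_tabs are hypothetical names invented for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

// Hypothetical string-only Cow: a borrowed view or an owned string.
using StrCow = std::variant<std::string_view, std::string>;

// Returns a borrowed view when the input contains no tabs, and
// allocates an owned, modified copy only when a change is required.
StrCow tabs_to_spaces(std::string_view input) {
    if (input.find('\t') == std::string_view::npos) {
        return StrCow(std::in_place_type<std::string_view>, input);
    }
    std::string owned(input);
    std::replace(owned.begin(), owned.end(), '\t', ' ');
    return StrCow(std::in_place_type<std::string>, std::move(owned));
}

// Read-only access that works for both alternatives.
// The returned view is valid only while `cow` (and, when borrowed,
// the original buffer) is alive.
std::string_view as_view(const StrCow& cow) {
    return std::visit([](const auto& s) { return std::string_view(s); }, cow);
}

void demo_tabs() {
    const std::string clean = "no tabs here";
    StrCow a = tabs_to_spaces(clean);
    assert(std::holds_alternative<std::string_view>(a));
    assert(as_view(a).data() == clean.data());  // same buffer, nothing copied

    StrCow b = tabs_to_spaces("a\tb");
    assert(std::holds_alternative<std::string>(b));
    assert(as_view(b) == "a b");
}
```

As with the pointer-based class, nothing stops the borrowed view from outliving the string it points into; that part stays the caller's responsibility.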

Limitations:

  • No compile-time lifetime checking (a fundamental difference between Rust and C++).
  • Only provides read-only access (could be extended with mutable access if needed).
  • Assumes T is copy-constructible for to_owned().
This is a simplified version but captures the essence of Cow.
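
The read-only limitation is the one that keeps this class from being copy-on-write in the full sense, since nothing can actually be written. A sketch of the missing piece, modeled loosely on Rust's Cow::to_mut(), is given below. The names WritableCow and demo_to_mut are hypothetical; the null check is omitted, and because of the unique_ptr member this stripped-down class is move-only.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <variant>

// Hypothetical reduced version of the class above with mutable access.
template <typename T>
class WritableCow {
    std::variant<const T*, std::unique_ptr<T>> data;

public:
    explicit WritableCow(const T* borrowed) : data(borrowed) {}
    explicit WritableCow(T&& owned)
        : data(std::make_unique<T>(std::move(owned))) {}

    bool is_borrowed() const {
        return std::holds_alternative<const T*>(data);
    }

    // Reads never copy.
    const T& operator*() const {
        if (is_borrowed()) return *std::get<const T*>(data);
        return *std::get<std::unique_ptr<T>>(data);
    }

    // Clone on first write: copy the borrowed value once, then hand
    // out a mutable reference to the private copy.
    T& to_mut() {
        if (is_borrowed()) {
            data = std::make_unique<T>(*std::get<const T*>(data));
        }
        return *std::get<std::unique_ptr<T>>(data);
    }
};

void demo_to_mut() {
    const std::string original = "Hello";
    WritableCow<std::string> cow(&original);

    assert(*cow == "Hello");       // reading leaves the Cow borrowed
    assert(cow.is_borrowed());

    cow.to_mut() += ", world";     // first write triggers the clone
    assert(!cow.is_borrowed());
    assert(*cow == "Hello, world");
    assert(original == "Hello");   // the borrowed string is untouched
}
```

Later calls to to_mut() reuse the same private copy, so the clone happens at most once per WritableCow.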

[personal profile] kondybas 2025-03-03 07:24 am (UTC)
It's inconvenient to scroll through walls of text like this on a phone. I think the cut could start as early as the second paragraph.

[personal profile] lxe 2025-03-03 09:14 am (UTC)
I wonder whether cows were mentioned in the foundational paper on the design of copy-on-write.

[personal profile] juan_gandhi 2025-03-03 09:55 am (UTC)
How fucking complicated!

Thank you.

[personal profile] chabapok 2025-03-03 10:29 am (UTC)
With Rust everything looks nice while you are learning it, reading the docs and doing simple exercises.
Now try building something at least somewhat real in Rust. Something not entirely trivial.

[personal profile] chabapok 2025-03-05 11:12 am (UTC)
No, that is a small and simple project.

What I meant is that there should be at least two or three megabytes of source code, not counting version control data.
Without artificial bloat, of course.
unsafe may be used only in limited cases.

That is where the fun often starts: you need either cyclic references, or mutable and immutable pointers at the same time.