Qqwy
Qqwy

Reputation: 5629

How to create a subrange of a BTreeSet<(String, String, String)>? How to turn a tuple of bounds into a bound of a tuple?

I am trying to use a BTreeSet<(String, String, String)> as a way to create a simple in-memory 'triple store'.

To be precise:

type Entity = String;
type Attribute = String;
type Value = String;
type EAV = (Entity, Attribute, Value);
type EAVSet = BTreeSet<EAV>;


pub fn example_db() -> EAVSet {
    let mut example: EAVSet = BTreeSet::new();
    insert_strtup(&mut example, ("1", "type", "user"));
    insert_strtup(&mut example, ("1", "user/name", "Arthur Dent"));
    insert_strtup(&mut example, ("1", "user/age", "33"));

    insert_strtup(&mut example, ("2", "type", "user"));
    insert_strtup(&mut example, ("2", "user/name", "Ford Prefect"));
    insert_strtup(&mut example, ("2", "user/age", "42"));

    return example;
}


fn insert_strtup(db: &mut EAVSet, val: (&str, &str, &str)) -> () {
    db.insert((val.0.to_string(), val.1.to_string(), val.2.to_string()));
}

pub fn example()  {
    let db = example_db();

    // How to customize this?
    let range: (Bound<EAV>, Bound<EAV>) = (Bound::Unbounded, Bound::Unbounded);

    for elem in eavt.range(range) {
        println!("{:?}", elem);
    }
}

The problem I am facing, is that I want people to be able to iterate over a subrange of values in the set. However, a simple usage of std::ops::Bound is not possible because we store a tuples with multiple fields.

I'd like to be able to build range-queries for all of the following:

The only idea which came to mind so far, is to use a string key which we know for a fact compares lower resp. higher than what we're looking for for the 'placeholder' fields. But this feels very hackish/error-prone and like reinventing the wheel.

Is there maybe a way to turn a (Bound<String>, Bound<String>, Bound<String>) into a Bound<(String, String, String)> maybe? Or is there another approach to take here?

EDIT: Filtering/querying a multi-key btree index in Rust shows one solution by wrapping all values in an ordered Enum (Min, Exact(String), Max), but this solution requires altering what kind of values are stored inside the BTreeSet. It also feels like adding a memory overhead as we're in actuality never storing anything other than Exact(some_string) inside. Is there another approach that does not require altering the type of values stored in the BTreeSet?

Upvotes: 5

Views: 507

Answers (2)

Solomon Ucko
Solomon Ucko

Reputation: 6109

Since Borrow always returns a reference (grrrrrrrr), and Borrowed isn't necessarily Copy, you might be able to rely on a sentinel memory address?

Note that since associated static items aren't allowed, you probably need a copy of this code for each type you want to use.

use std::borrow::Borrow;
use std::cmp::Ordering;

#[repr(transparent)]
pub struct StringWithMinMaxSentinel(String);

// must be static, not const, to ensure a constant memory address
pub static STRING_MIN_SENTINEL: StringWithMinMaxSentinel = StringWithMinMaxSentinel(String::new());
pub static STRING_MAX_SENTINEL: StringWithMinMaxSentinel = StringWithMinMaxSentinel(String::new());

impl Borrow<StringWithMinMaxSentinel> for String {
    fn borrow(self: &String) -> &StringWithMinMaxSentinel {
        unsafe { &*(self as *const String as *const StringWithMinMaxSentinel) }
    }
}

impl PartialEq for StringWithMinMaxSentinel {
    fn eq(&self, other: &Self) -> bool {
        std::ptr::eq(self, other) || (!std::ptr::eq(self, &STRING_MIN_SENTINEL) && !std::ptr::eq(other, &STRING_MAX_SENTINEL) && !std::ptr::eq(other, &STRING_MIN_SENTINEL) && !std::ptr::eq(self, &STRING_MAX_SENTINEL) && self.0.eq(&other.0))
    }
}

impl Eq for StringWithMinMaxSentinel {}

impl PartialOrd for StringWithMinMaxSentinel {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for StringWithMinMaxSentinel {
    fn cmp(&self, other: &Self) -> Ordering {
        if std::ptr::eq(self, other) {
            Ordering::Equal
        } else if std::ptr::eq(self, &STRING_MIN_SENTINEL) || std::ptr::eq(other, &STRING_MAX_SENTINEL) {
            Ordering::Less
        } else if std::ptr::eq(self, &STRING_MAX_SENTINEL) || std::ptr::eq(other, &STRING_MIN_SENTINEL) {
            Ordering::Greater
        } else {
            self.0.cmp(&other.0)
        }
    }
}

Upvotes: 1

hkBst
hkBst

Reputation: 3360

I'd like to be able to build range-queries for all of the following:

all entities;
all entities with an ID in range x..y;
all fields of entity 1;
the current value of entity 1's "user/age" field).

Is there an[other] approach that does not require altering the type of values stored in the BTreeSet?

Given the above constraints, the following works. Since everything is a string, ranges use string comparison, meaning "a".."b" represents all strings starting with "a". The empty string is a natural minimal string value, but there is no ready maximal string value, so for that we use a large static string. This is of course not nice at all. This could perhaps be improved by using Option instead of String, with the None value standing for the maximum instead, Some("") would be the minimum. You would then also have to implement your own comparison...

use std::collections::BTreeSet;
use std::ops::Bound;

type Entity = String;
type Attribute = String;
type Value = String;
type EAV = (Entity, Attribute, Value);
type EAVSet = BTreeSet<EAV>;

pub fn example_db() -> EAVSet {
    let mut example: EAVSet = BTreeSet::new();
    insert_strtup(&mut example, ("1", "type", "user"));
    insert_strtup(&mut example, ("1", "user/name", "Arthur Dent"));
    insert_strtup(&mut example, ("1", "user/age", "33"));

    insert_strtup(&mut example, ("2", "type", "user"));
    insert_strtup(&mut example, ("2", "user/name", "Ford Prefect"));
    insert_strtup(&mut example, ("2", "user/age", "42"));

    insert_strtup(&mut example, ("11", "type", "user"));
    insert_strtup(&mut example, ("11", "user/name", "Arthur Dent"));
    insert_strtup(&mut example, ("11", "user/age", "33"));

    insert_strtup(&mut example, ("12", "type", "user"));
    insert_strtup(&mut example, ("12", "user/name", "Ford Prefect"));
    insert_strtup(&mut example, ("12", "user/age", "42"));
    return example;
}

fn insert_strtup(db: &mut EAVSet, val: (&str, &str, &str)) -> () {
    db.insert((val.0.to_string(), val.1.to_string(), val.2.to_string()));
}

static MAX_STRING: &str = "ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ";

pub fn main() {
    let db = example_db();

    // How to customize this?
    let range: (Bound<EAV>, Bound<EAV>) = (Bound::Unbounded, Bound::Unbounded);

    for elem in db.range(range) {
        println!("{:?}", elem);
    }

    println!("\tall entities with an ID in range \"11\"..=\"12\":");

    let range = (
        Bound::Included(("11".to_string(), "".to_string(), "".to_string())),
        Bound::Excluded(("120".to_string(), "".to_string(), "".to_string())),
    );

    for elem in db.range(range) {
        println!("{:?}", elem);
    }

    println!("\tall fields of entity 1:");

    let range = (
        Bound::Included(("1".to_string(), "".to_string(), "".to_string())),
        Bound::Excluded(("10".to_string(), "".to_string(), "".to_string())),
    );
    
    for elem in db.range(range) {
        println!("{:?}", elem);
    }

    println!("\tthe current value of entity 1's \"user/age\" field:");

    let range = (
        Bound::Included(("1".to_string(), "user/age".to_string(), "".to_string())),
        Bound::Excluded(("1".to_string(), "user/age".to_string(), MAX_STRING.to_string())),
    );
    
    for elem in db.range(range) {
        println!("{:?}", elem);
    }
}

Upvotes: 0

Related Questions