Edward

Reputation: 7100

Is there no standard hash for `std::filesystem::path`?

I have a simple program that's intended to store a set of C++17 std::filesystem::path objects. Since there is a std::filesystem::hash_value that's part of the standard, why doesn't this code compile without me having to supply my own std::hash?

When I compile and link using gcc 8.1.1 as g++ -std=c++17 -DNO_HASH=1 hashtest.cpp -o hashtest -lstdc++fs my hash function is included and everything operates perfectly. However, if I change it to -DNO_HASH=0, I get a very long list of error messages, the key one of which is this:

/usr/include/c++/8/bits/hashtable.h:195:21: error: static assertion failed: hash function must be invocable with an argument of key type
       static_assert(__is_invocable<const _H1&, const _Key&>{},

Here's a live Coliru version if you'd like to play.

Is there really no defined std::hash<std::filesystem::path>? What am I missing?

For those who are interested in why I'd want such a thing, it's this: https://codereview.stackexchange.com/questions/124307/from-new-q-to-compiler-in-30-seconds

hashtest.cpp

#include <optional>
#include <unordered_set>
#include <filesystem>
#include <string>
#include <iostream>

namespace fs = std::filesystem;

#if NO_HASH
namespace std {
    template <>
    struct hash<fs::path> {
        std::size_t operator()(const fs::path &path) const {
            return hash_value(path);
        }
    };
}
#endif
int main()
{
    using namespace std::literals;
    std::unordered_set< std::optional<fs::path> >  paths = {
            "/usr/bin"s, std::nullopt, "/usr//bin"s, "/var/log"s
    };

    for(const auto& p : paths)
        std::cout << p.value_or("(no path)") << ' ';
}

Upvotes: 7

Views: 3976

Answers (3)

Edward

Reputation: 7100

The resolution was to write my own hash function object and pass it to the container explicitly.

#include <optional>
#include <unordered_set>
#include <filesystem>
#include <string>
#include <iostream>

namespace fs = std::filesystem;

struct opt_path_hash {
    std::size_t operator()(const std::optional<fs::path>& path) const {
        return path ? hash_value(path.value()) : 0;
    }
};

int main()
{
    using namespace std::literals;
    std::unordered_set< std::optional<fs::path>, opt_path_hash >  paths = {
            "/usr/bin"s, std::nullopt, "/usr//bin"s, "/var/log"s
    };

    for(const auto& p : paths)
        std::cout << p.value_or("(no path)") << '\n';
}

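This produces the following output, correctly collapsing the two versions of "/usr/bin" (the iteration order of an unordered_set is unspecified, so the lines may appear in a different order):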

"/var/log"
"(no path)"
"/usr/bin"

Upvotes: 2

Barry

Reputation: 303517

Since there is a std::filesystem::hash_value that's part of the standard, why doesn't this code compile without me having to supply my own std::hash?

Right, there is a fs::hash_value() but there is no specialization of std::hash<fs::path>, which is what you would need. That's why it doesn't compile. As to why the library provides the former function but not the latter, I'll quote from Billy O'Neal (implementer for MSVC's standard library):

Looks like a defect.

However, putting paths as keys in a hash table is almost certainly incorrect; you need to test for path equivalence in most such scenarios. That is, "/foo/bar/../baz" and "/foo/baz" are the same target but are not the same path. Similarly, "./bar" and "./bar" may be different paths, depending on the value of current_path in the first context vs. in the second.

If what you want is canonically unique paths, then simply std::unordered_set<fs::path> wouldn't do what you want anyway. So perhaps it failing to compile isn't a bad thing? I don't know enough about filesystem to say one way or the other.
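The distinction drawn above between "same path" and "same target" can be sketched as follows. This is only an illustration: normalized_key and demo_normalization are hypothetical names, and lexically_normal() is a purely textual cleanup that never touches the filesystem, so it does not resolve symlinks or the current_path issue. For those, fs::weakly_canonical or fs::equivalent would be the functions to look at.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: purely lexical cleanup, no filesystem access, so
// symlinks are NOT resolved and relative paths stay relative.
inline fs::path normalized_key(const fs::path& p) {
    return p.lexically_normal();
}

// Returns true when the two spellings agree after normalization.
inline bool demo_normalization() {
    const fs::path a = "/foo/bar/../baz";
    const fs::path b = "/foo/baz";
    assert(a != b);  // different paths as written, although the target is the same
    return normalized_key(a) == normalized_key(b);
}
```

Storing normalized_key(p) rather than p as the key is one way to keep such spellings from ending up as separate entries.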


Note that providing your own specialization of std::hash for fs::path is not allowed: you can only add specializations to std for types you control, which the standard refers to as "program-defined types." fs::path is not a type you control, so you can't specialize std::hash for it.
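A way to stay within that rule is to leave std::hash alone and hand the container a hasher of your own. The following is a minimal sketch, assuming C++17; path_hash, path_set and demo_path_set are hypothetical names, and the only library call involved is the standard fs::hash_value.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <unordered_set>

namespace fs = std::filesystem;

// Hypothetical functor name; it just forwards to the standard fs::hash_value.
struct path_hash {
    std::size_t operator()(const fs::path& p) const noexcept {
        return fs::hash_value(p);
    }
};

// Hypothetical alias: the hasher is passed as a template argument,
// so nothing is added to namespace std.
using path_set = std::unordered_set<fs::path, path_hash>;

// Inserts three paths, one of them twice, and returns how many were kept.
inline std::size_t demo_path_set() {
    path_set paths;
    paths.insert(fs::path("/usr/bin"));
    paths.insert(fs::path("/var/log"));
    const bool inserted_again = paths.insert(fs::path("/usr/bin")).second;
    assert(!inserted_again);  // an equal path is stored only once
    return paths.size();
}
```

The same functor works for std::unordered_map keys, since the hasher is just another template argument there as well.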

Upvotes: 14

Yakk - Adam Nevraumont

Reputation: 275790

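#include <functional>   // std::hash
#include <type_traits>  // std::invoke_result_t

namespace hashing {
  namespace adl {
    // Fallback: forwards to std::hash<T> and drops out of overload
    // resolution (SFINAE) when std::hash<T> is not usable. The unused
    // parameter pack is meant to rank it below a hash_value found by ADL.
    template<class T, class...Ts>
    auto hash_value( T const& t, Ts&&... )
    -> std::invoke_result_t< std::hash<T>, T const& >  // std::result_of_t is gone in C++20
    {
      return std::hash<T>{}(t);
    }
    // Unqualified call: considers the fallback above plus any hash_value
    // found by ADL in the namespaces associated with T.
    template<class T>
    auto hasher_private( T const& t )
    -> decltype( hash_value( t ) )
    { return hash_value(t); }
  }

  struct smart_hasher {
    template<class T>
    auto operator()( T const& t ) const
    ->decltype( adl::hasher_private( t ) )
    {    return adl::hasher_private( t ); }
  };
}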

So hashing::smart_hasher is a hash object that will look for hash_value(T const&) in the namespace of T; if that fails it will use std::hash<T> if available, and otherwise it will generate a compiler error.
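To make the dispatch described here concrete, the sketch below plugs smart_hasher into two containers: one keyed by fs::path (which should pick up fs::hash_value through ADL) and one keyed by int (which should fall back to std::hash<int>). It repeats the hasher so that it compiles on its own, uses std::invoke_result_t for the fallback's return type, and demo_smart_hasher is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <functional>
#include <type_traits>
#include <unordered_set>

namespace fs = std::filesystem;

namespace hashing {
  namespace adl {
    // Fallback to std::hash<T>; removed by SFINAE when std::hash<T> is unusable.
    template<class T, class... Ts>
    auto hash_value(T const& t, Ts&&...)
    -> std::invoke_result_t<std::hash<T>, T const&>
    { return std::hash<T>{}(t); }

    // Unqualified call, so ADL can contribute hash_value overloads for T.
    template<class T>
    auto hasher_private(T const& t) -> decltype(hash_value(t))
    { return hash_value(t); }
  }

  struct smart_hasher {
    template<class T>
    auto operator()(T const& t) const -> decltype(adl::hasher_private(t))
    { return adl::hasher_private(t); }
  };
}

// Builds one set per key type and returns the combined number of elements.
inline std::size_t demo_smart_hasher() {
  std::unordered_set<fs::path, hashing::smart_hasher> paths;
  paths.insert(fs::path("/usr/bin"));
  paths.insert(fs::path("/var/log"));
  paths.insert(fs::path("/usr/bin"));  // duplicate, not stored again

  std::unordered_set<int, hashing::smart_hasher> numbers{1, 2, 2, 3};
  assert(numbers.count(2) == 1);
  return paths.size() + numbers.size();
}
```

The non-template fs::hash_value wins over the fallback template even on library versions where std::hash<fs::path> also exists, because a non-template function is preferred when both match equally well.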

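If you want to write additional hashers for std types, create a hash_value function overload in hashing::adl. It has to be declared before hasher_private, because an overload in hashing::adl is found by ordinary lookup at the template's definition rather than by ADL. For other types, create it in their associated namespace. For example, if you want to support hashing of tuples: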

namespace hashing::adl {
  template<class...Ts>
  std::size_t hash_value( std::tuple<Ts...> const& tup ) {
    // get hash values and combine them here
    // use `smart_hasher{}( elem )` to hash each element for
    // recursive smart hashing
  }
}

and now anyone using smart_hasher automatically picks up the hasher for anything that provides that customization.
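One possible way to fill in the tuple overload is sketched below. Several things here are assumptions rather than part of the answer: the combining formula is the common boost-style mix, demo_tuple_hash is a hypothetical name, and the overload is declared ahead of hasher_private so that ordinary lookup can see it on compilers that implement two-phase lookup (std::tuple lives in std, so ADL never searches hashing::adl).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <tuple>
#include <type_traits>

namespace hashing {
  namespace adl {
    // Fallback to std::hash<T>; removed by SFINAE when std::hash<T> is unusable.
    template<class T, class... Ts>
    auto hash_value(T const& t, Ts&&...)
    -> std::invoke_result_t<std::hash<T>, T const&>
    { return std::hash<T>{}(t); }

    // Declared BEFORE hasher_private so the unqualified call below finds it.
    template<class... Ts>
    std::size_t hash_value(std::tuple<Ts...> const& tup);

    template<class T>
    auto hasher_private(T const& t) -> decltype(hash_value(t))
    { return hash_value(t); }
  }

  struct smart_hasher {
    template<class T>
    auto operator()(T const& t) const -> decltype(adl::hasher_private(t))
    { return adl::hasher_private(t); }
  };

  namespace adl {
    // Defined after smart_hasher so each element can be hashed recursively.
    template<class... Ts>
    std::size_t hash_value(std::tuple<Ts...> const& tup) {
      return std::apply([](auto const&... elems) {
        std::size_t seed = 0;
        // Assumed boost-style combine, applied once per element.
        ((seed ^= smart_hasher{}(elems) + 0x9e3779b9 + (seed << 6) + (seed >> 2)), ...);
        return seed;
      }, tup);
    }
  }
}

// Hashes a nested tuple twice and reports whether the results agree.
inline bool demo_tuple_hash() {
  hashing::smart_hasher h;
  auto a = std::make_tuple(1, std::string("bin"), std::make_tuple(2L, 'x'));
  auto b = a;
  static_assert(std::is_same_v<decltype(h(a)), std::size_t>);
  return h(a) == h(b);
}
```

Splitting the declaration from the definition is what lets the overload be visible to hasher_private while still calling smart_hasher on the elements.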

Upvotes: 1
