JustAMartin

Reputation: 13753

Traversing a graph of unknown object types and mutating some object properties

The specific requirement is to replace some values in some MVC model properties if a user doesn't have specific permission.

The solution should be universal to apply to any graph of any model, and also reasonably efficient because it will be used to mask values in large lists of objects.

The assumptions are:

These all seem like logical and common requirements, so I thought there should be some generic, well-tested solution that I could adjust to my case, just passing a callback func and maybe some filter to define the properties of interest before even starting the traversal.

However, so far, everything I have found is limited to a single Node-like type, or the implementation doesn't allow mutating properties of my choice, or it goes into deep recursion with reflection and without any performance considerations.

I could implement it myself, but that might turn out to be reinventing the wheel with messy recursion and reflection. Isn't there anything already existing and well-known that "just works"?

Also, I've heard that the reflection SetValue and GetValue methods are slow and that I would be better off caching setters and getters as delegates and reusing them whenever I encounter the same type again. And I will encounter the same types again because it's an ASP.NET Core web application. So it should be possible to gain a noticeable performance boost over naive reflection solutions if I cache every setter/getter of interest for future reuse.
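To make the caching idea concrete, here is a minimal sketch of what "cache setters and getters as delegates" could look like using expression trees. The names PropertyDelegateCache and Person are hypothetical and exist only for illustration; the sketch assumes the properties belong to classes (not structs) and have public getters and setters.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper (not part of any library): compiles one getter and one
// setter delegate per property and reuses them on every later call.
public static class PropertyDelegateCache
{
    private static readonly ConcurrentDictionary<PropertyInfo, Func<object, object>> Getters =
        new ConcurrentDictionary<PropertyInfo, Func<object, object>>();

    private static readonly ConcurrentDictionary<PropertyInfo, Action<object, object>> Setters =
        new ConcurrentDictionary<PropertyInfo, Action<object, object>>();

    public static Func<object, object> GetGetter(PropertyInfo prop) =>
        Getters.GetOrAdd(prop, p => BuildGetter(p));

    public static Action<object, object> GetSetter(PropertyInfo prop) =>
        Setters.GetOrAdd(prop, p => BuildSetter(p));

    private static Func<object, object> BuildGetter(PropertyInfo prop)
    {
        // (object target) => (object)((TDeclaring)target).Prop
        var target = Expression.Parameter(typeof(object), "target");
        var body = Expression.Convert(
            Expression.Property(Expression.Convert(target, prop.DeclaringType), prop),
            typeof(object));
        return Expression.Lambda<Func<object, object>>(body, target).Compile();
    }

    private static Action<object, object> BuildSetter(PropertyInfo prop)
    {
        // (object target, object value) => ((TDeclaring)target).Prop = (TProp)value
        var target = Expression.Parameter(typeof(object), "target");
        var value = Expression.Parameter(typeof(object), "value");
        var body = Expression.Assign(
            Expression.Property(Expression.Convert(target, prop.DeclaringType), prop),
            Expression.Convert(value, prop.PropertyType));
        return Expression.Lambda<Action<object, object>>(body, target, value).Compile();
    }
}

// Hypothetical model used only to exercise the cache.
public class Person
{
    public string Name { get; set; }
}
```

Usage would be along the lines of PropertyDelegateCache.GetSetter(prop)(person, "******"), where prop comes from typeof(Person).GetProperty("Name"). Compiling the expression is the expensive step, so it only pays off when the same property is accessed many times.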

Upvotes: 2

Views: 837

Answers (1)

JustAMartin

Reputation: 13753

It took some struggling, but with the awesome graph traversal example from Eric Lippert and the FastMember library I have something that works:

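using System;
using System.Collections;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using FastMember;

[AttributeUsage(AttributeTargets.Property, AllowMultiple = false, Inherited = true)]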
public class SensitiveDataAttribute : Attribute
{
}

public abstract class PocoGraphPropertyWalker
{
    private enum TypeKind
    {
        Primitive,
        IterablePrimitive,
        Poco,
        IterablePoco
    }

    private class TypeAccessDescriptor
    {
        public TypeAccessor accessor;
        public List<PropertyInfo> primitives;
        public List<PropertyInfo> iterables;
        public List<PropertyInfo> singles;
    }

    private static readonly ConcurrentDictionary<Type, TypeAccessDescriptor> _accessorCache =
        new ConcurrentDictionary<Type, TypeAccessDescriptor>();

    public IEnumerable<object> TraversePocoList(IEnumerable<object> pocos)
    {
        if (pocos == null)
            return null;

        foreach (var poco in pocos)
            TraversePoco(poco);

        return pocos;
    }

    public object TraversePoco(object poco)
    {
        var unwound = Traversal(poco, ChildrenSelector).ToList();

        foreach(var unw in unwound)
            VisitPoco(unw);

        return poco;
    }

    public object VisitPoco(object poco)
    {
        if (poco == null)
            return poco;

        var t = poco.GetType();

        // the registry ignores types that are not POCOs
        var typeDesc = TryGetOrRegisterForType(t);

        if (typeDesc == null)
            return poco;

        // do not attempt to parse Keys and Values as primitives,
        // even if they were specified as such
        if (IsKeyValuePair(t))
            return poco;

        foreach (var prop in typeDesc.primitives)
        {
            var oldValue = typeDesc.accessor[poco, prop.Name];
            var newValue = VisitProperty(poco, oldValue, prop);
            typeDesc.accessor[poco, prop.Name] = newValue;
        }

        return poco;
    }

    protected virtual object VisitProperty(object model,
        object currentValue, PropertyInfo prop)
    {
        return currentValue;
    }

    private IEnumerable<object> Traversal(
            object item,
            Func<object, IEnumerable<object>> children)
    {
        var seen = new HashSet<object>();
        var stack = new Stack<object>();

        seen.Add(item);
        stack.Push(item);
        yield return item;

        while (stack.Count > 0)
        {
            object current = stack.Pop();
            foreach (object newItem in children(current))
            {
                // protect against cyclic refs: Add returns false for items already seen
                if (seen.Add(newItem))
                {
                    stack.Push(newItem);
                    yield return newItem;
                }
            }
        }
    }

    private IEnumerable<object> ChildrenSelector(object poco)
    {
        if (poco == null)
            yield break;

        var t = poco.GetType();

        // the registry ignores types that are not POCOs
        var typeDesc = TryGetOrRegisterForType(t);

        if (typeDesc == null)
            yield break;

        // special hack for KeyValuePair - FastMember fails to access its Key and Value
        // maybe because it's a struct, not class?
        // and now we have prop accessors stored in singles / primitives
        // so we extract it manually
        if (IsKeyValuePair(t))
        {
            // reverting to good old slow reflection
            var k = t.GetProperty("Key").GetValue(poco, null);
            var v = t.GetProperty("Value").GetValue(poco, null);

            if (k != null)
            {
                foreach (var yp in YieldIfPoco(k))
                    yield return yp;
            }

            if (v != null)
            {
                foreach(var yp in YieldIfPoco(v))
                    yield return yp;
            }
            yield break;
        }

        // registration method should have registered correct singles
        foreach (var single in typeDesc.singles)
        {
             yield return typeDesc.accessor[poco, single.Name];
        }

        // registration method should have registered correct IEnumerables,
        // so strings and collections of primitives are not enumerated here
        foreach (var iterable in typeDesc.iterables)
        {
            if (!(typeDesc.accessor[poco, iterable.Name] is IEnumerable iterVals))
                continue;

            foreach (var iterval in iterVals)
                yield return iterval;
        }
    }

    private IEnumerable<object> YieldIfPoco(object v)
    {
        var myKind = GetKindOfType(v.GetType());
        if (myKind == TypeKind.Poco)
        {
            foreach (var d in YieldDeeper(v))
                yield return d;
        }
        else if (myKind == TypeKind.IterablePoco && v is IEnumerable iterVals)
        {
            foreach (var i in iterVals)
                foreach (var d in YieldDeeper(i))
                    yield return d;
        }
    }

    private IEnumerable<object> YieldDeeper(object o)
    {
        yield return o;

        // going slightly recursive here - might have IEnumerable<IEnumerable<IEnumerable<POCO>>>...
        var chs = Traversal(o, ChildrenSelector);
        foreach (var c in chs)
            yield return c;
    }

    private TypeAccessDescriptor TryGetOrRegisterForType(Type t)
    {
        if (!_accessorCache.TryGetValue(t, out var typeAccessorsDescriptor))
        {
            // blacklist - cannot process dictionary KeyValues
            if (IsBlacklisted(t))
                return null;

            // check if I myself am a real Poco before registering my properties
            var myKind = GetKindOfType(t);

            if (myKind != TypeKind.Poco)
                return null;

            var properties = t.GetProperties(BindingFlags.Public | BindingFlags.Instance);
            var accessor = TypeAccessor.Create(t);

            var primitiveProps = new List<PropertyInfo>();
            var singlePocos = new List<PropertyInfo>();
            var iterablePocos = new List<PropertyInfo>();

            // now sort all props in subtypes:
            // 1) a primitive value or nullable primitive or string
            // 2) an iterable with primitives (including strings and nullable primitives)
            // 3) a subpoco
            // 4) an iterable with subpocos
            // for our purposes, 1 and 2 are the same - just properties
            // that do not need traversal

            // ignoring non-generic IEnumerable - can't know its inner types
            // and it is not expected to be used in our POCOs anyway
            foreach (var prop in properties)
            {
                var pt = prop.PropertyType;
                var propKind = GetKindOfType(pt);

                // 1) and 2)
                if (propKind == TypeKind.Primitive || propKind == TypeKind.IterablePrimitive)
                    primitiveProps.Add(prop);
                else
                if (propKind == TypeKind.IterablePoco)
                    iterablePocos.Add(prop); //4)
                else
                    singlePocos.Add(prop); // 3)
            }

            typeAccessorsDescriptor = new TypeAccessDescriptor {
                accessor = accessor,
                primitives = primitiveProps,
                singles = singlePocos,
                iterables = iterablePocos
            };

            // if a parallel request registered the type first, reuse its descriptor
            typeAccessorsDescriptor = _accessorCache.GetOrAdd(t, typeAccessorsDescriptor);
        }

        return typeAccessorsDescriptor;
    }

    private static TypeKind GetKindOfType(Type type)
    {
        // 1) a primitive value or nullable primitive or string
        // 2) an iterable with primitives (including strings and nullable primitives)
        // 3) a subpoco
        // 4) an iterable with subpocos

        // ignoring non-generic IEnumerable - can't know its inner types
        // and it is not expected to be used in our POCOs anyway

        // 1)
        if (IsSimpleType(type))
            return TypeKind.Primitive;

        var ienumerableInterfaces = type.GetInterfaces()
            .Where(x => x.IsGenericType && x.GetGenericTypeDefinition() ==
            typeof(IEnumerable<>)).ToList();

        // add itself, if the property is defined as IEnumerable<x>
        if (type.IsGenericType && type.GetGenericTypeDefinition() ==
            typeof(IEnumerable<>))
            ienumerableInterfaces.Add(type);

        if (ienumerableInterfaces.Any(x =>
                IsSimpleType(x.GenericTypeArguments[0])))
            return TypeKind.IterablePrimitive;

        if (ienumerableInterfaces.Count != 0)
            // 4) - it was enumerable, but not primitive - maybe POCOs
            return TypeKind.IterablePoco;

        return TypeKind.Poco;
    }

    private static bool IsBlacklisted(Type type)
    {
        return false;
    }

    public static bool IsKeyValuePair(Type type)
    {
        return type.IsGenericType && 
            type.GetGenericTypeDefinition() == typeof(KeyValuePair<,>);
    }

    public static bool IsSimpleType(Type type)
    {
        return
            type.IsPrimitive ||
            new Type[] {
                typeof(string),
                typeof(decimal),
                typeof(DateTime),
                typeof(DateTimeOffset),
                typeof(TimeSpan),
                typeof(Guid)
            }.Contains(type) ||
            type.IsEnum ||
            // Type.GetTypeCode inspects the type itself; Convert.GetTypeCode(type)
            // would inspect the Type object and always return TypeCode.Object
            Type.GetTypeCode(type) != TypeCode.Object ||
            (type.IsGenericType &&
                type.GetGenericTypeDefinition() == typeof(Nullable<>) &&
                IsSimpleType(type.GetGenericArguments()[0]));
    }
}

public class ProjectSpecificDataFilter : PocoGraphPropertyWalker
{
    const string MASK = "******";

    protected override object VisitProperty(object model,
            object currentValue, PropertyInfo prop)
    {
        // only string properties can hold the mask; assigning it to any other
        // property type would fail when the value is written back
        if (prop.PropertyType != typeof(string) ||
            !Attribute.IsDefined(prop, typeof(SensitiveDataAttribute)))
            return currentValue;

        // nothing to hide in null, empty or whitespace-only strings
        if (string.IsNullOrWhiteSpace((string)currentValue))
            return currentValue;

        return MASK;
    }
}

For testing:

enum MyEnum
{
    One = 1,
    Two = 2
}

class A
{
    [SensitiveData]
    public string S { get; set; }
    public int I { get; set; }
    public int? I2 { get; set; }
    public MyEnum Enm { get; set; }
    public MyEnum? Enm1 { get; set; }
    public List<MyEnum> Enm2 { get; set; }
    public List<int> IL1 { get; set; }
    public int?[] IL2 { get; set; }
    public decimal Dc { get; set; }
    public decimal? Dc1 { get; set; }
    public IEnumerable<decimal> Dc3 { get; set; }
    public IEnumerable<decimal?> Dc4 { get; set; }
    public IList<decimal> Dc5 { get; set; }
    public DateTime D { get; set; }
    public DateTime? D2 { get; set; }
    public B Child { get; set; }
    public B[] Children { get; set; }
    public List<B> Children2 { get; set; }
    public IEnumerable<B> Children3 { get; set; }
    public IDictionary<int, int?> PrimDict { get; set; }
    public Dictionary<int, B> PocoDict { get; set; }
    public IDictionary<B, int?> PocoKeyDict { get; set; }
    public Dictionary<int, IEnumerable<B>> PocoDeepDict { get; set; }
}

class B
{
    [SensitiveData]
    public string S { get; set; }
    public int I { get; set; }
    public int? I2 { get; set; }
    public DateTime D { get; set; }
    public DateTime? D2 { get; set; }
    public A Parent { get; set; }
}

class Program
{

    static A root;

    static void Main(string[] args)
    {
        root = new A
        {
            D = DateTime.Now,
            D2 = DateTime.Now,
            I = 10,
            I2 = 20,
            S = "stringy",
            Child = new B
            {
                D = DateTime.Now,
                D2 = DateTime.Now,
                I = 10,
                I2 = 20,
                S = "stringy"
            },
            Children = new B[] {
                new B {
                    D = DateTime.Now,
                    D2 = DateTime.Now,
                    I = 10,
                    I2 = 20,
                    S = "stringy" },
                new B {
                    D = DateTime.Now,
                    D2 = DateTime.Now,
                    I = 10,
                    I2 = 20,
                    S = "stringy" },
            },
            Children2 = new List<B> {
                new B {
                    D = DateTime.Now,
                    D2 = DateTime.Now,
                    I = 10,
                    I2 = 20,
                    S = "stringy" },
                new B {
                    D = DateTime.Now,
                    D2 = DateTime.Now,
                    I = 10,
                    I2 = 20,
                    S = "stringy" },
            },
            PrimDict = new Dictionary<int, int?> {
                { 1, 2 },
                { 3, 4 }
            },
            PocoDict = new Dictionary<int, B> {
                { 1,  new B {
                    D = DateTime.Now,
                    D2 = DateTime.Now,
                    I = 10,
                    I2 = 20,
                    S = "stringy" } },
                { 3, new B {
                    D = DateTime.Now,
                    D2 = DateTime.Now,
                    I = 10,
                    I2 = 20,
                    S = "stringy" } }
            },
            PocoKeyDict = new Dictionary<B, int?> {
                { new B {
                    D = DateTime.Now,
                    D2 = DateTime.Now,
                    I = 10,
                    I2 = 20,
                    S = "stringy" }, 1 },
                { new B {
                    D = DateTime.Now,
                    D2 = DateTime.Now,
                    I = 10,
                    I2 = 20,
                    S = "stringy" }, 3 }
            },

            PocoDeepDict = new Dictionary<int, IEnumerable<B>>
            {
                { 1, new [] { new B {D = DateTime.Now,
                    D2 = DateTime.Now,
                    I = 10,
                    I2 = 20,
                    S = "stringy" } } }
            }
        };

        // add cyclic ref for test
        root.Child.Parent = root;

        var f = new ProjectSpecificDataFilter();
        var r = f.TraversePoco(root);
    }
}

It replaces marked strings, no matter how deep inside POCOs they are. Also I can use the walker for every other property access / mutation case I can imagine. Still not sure if I reinvented a wheel here...
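One detail that may be worth checking in the Traversal method above: a plain HashSet<object> compares items with Equals and GetHashCode, so two distinct POCOs that override equality could be treated as one and the second skipped. Below is a minimal sketch of a cycle-safe walk that tracks instances by reference instead. ReferenceComparer, GraphWalk and Node are hypothetical names used only for this illustration, and the children callback stands in for ChildrenSelector.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical comparer: two objects are equal only if they are the same instance.
public sealed class ReferenceComparer : IEqualityComparer<object>
{
    public static readonly ReferenceComparer Instance = new ReferenceComparer();

    bool IEqualityComparer<object>.Equals(object x, object y) => ReferenceEquals(x, y);

    int IEqualityComparer<object>.GetHashCode(object obj) => RuntimeHelpers.GetHashCode(obj);
}

public static class GraphWalk
{
    // Depth-first walk that yields every reachable instance exactly once.
    public static IEnumerable<object> Traverse(
        object root, Func<object, IEnumerable<object>> children)
    {
        if (root == null)
            yield break;

        var seen = new HashSet<object>(ReferenceComparer.Instance);
        var stack = new Stack<object>();
        seen.Add(root);
        stack.Push(root);

        while (stack.Count > 0)
        {
            var current = stack.Pop();
            yield return current;

            foreach (var child in children(current))
            {
                // Add returns false for instances that were already visited
                if (child != null && seen.Add(child))
                    stack.Push(child);
            }
        }
    }
}

// Hypothetical node whose Equals deliberately says "all nodes are equal",
// which would confuse a HashSet that uses the default comparer.
public class Node
{
    public string Name;
    public Node Next;
    public override bool Equals(object obj) => obj is Node;
    public override int GetHashCode() => 0;
}
```

With two Node instances that point at each other, this walk visits both and then stops. On .NET 5 or later, the built-in ReferenceEqualityComparer could replace the hand-written comparer.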

Upvotes: 3
