Reputation: 2470

Should I return a collection when the reference to the collection is not changed?

I have a method that accepts a collection, as below:

public IList<CountryDto> ApplyDefaults(IList<CountryDto> dtos)
{
    // Iterates the collection
    // Validates the items in the collection
    // If items are invalid,
    // removes them, e.g. dtos.Remove(currentCountryDto)

    return dtos; // Do I need to do this?
}

My question is: since the reference to the collection is not changed, should I return the collection again from the method?

For: By returning the collection, I make it explicit in the signature, so the caller is aware that the items in the collection could differ from the original source. It sort of avoids ambiguity.
Against: Since the validation doesn't change the reference to the collection, it doesn't make sense technically to return it.

What is the best approach in this case?
Note: I am not sure whether this question is opinion-based. I think I am probably missing something on the design side.

Upvotes: 2

Answers (7)

Chris Sinclair

Reputation: 23198

Given your comments/requirements:

Does not need to report if defaults are applied.
ApplyDefaults is complicated, invokes other services, and is not intended to produce a fluent API.
ApplyDefaults is a "black box"; validation logic is injected so the calling code doesn't know/care about the validation

I think based on these, this method definitely should not return the reference to the incoming list, even if no validation is applied. Firstly, unless the API is clearly built around method chaining (which you indicated you do not want), returning a List<T> type usually indicates a new List is being created. Secondly, if a new list is not created, users may find themselves modifying the list in ways they didn't expect.

Consider:

IList<CountryDto> originalCountries = Service.GetCountries();
IList<CountryDto> validatedCountries = ApplyDefaults(originalCountries);
validatedCountries.Add(mySpecialCountry);

OutputOriginalCountries(originalCountries);
OutputValidatedCountries(validatedCountries);

This code isn't very special; it's a fairly common pattern. If ApplyDefaults returned a reference to the same originalCountries collection, then mySpecialCountry would also be added to originalCountries. This would violate the Principle of Least Astonishment.
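To make the surprise concrete, here is a minimal sketch. The CountryDto.Code property and the "empty code is invalid" rule are assumptions made up for illustration; the real validation logic is not shown in the question.

```csharp
using System;
using System.Collections.Generic;

public class CountryDto
{
    public string Code { get; set; } = "";
}

public static class AliasingDemo
{
    // Hypothetical stand-in for the question's method: it filters in place
    // and then hands back the very same list instance it was given.
    public static IList<CountryDto> ApplyDefaults(IList<CountryDto> dtos)
    {
        for (int i = dtos.Count - 1; i >= 0; i--)
        {
            if (string.IsNullOrEmpty(dtos[i].Code))
            {
                dtos.RemoveAt(i);
            }
        }
        return dtos;
    }

    public static void Main()
    {
        IList<CountryDto> originalCountries = new List<CountryDto>
        {
            new CountryDto { Code = "FR" },
            new CountryDto { Code = "" }
        };

        IList<CountryDto> validatedCountries = ApplyDefaults(originalCountries);
        validatedCountries.Add(new CountryDto { Code = "XX" });

        // Both variables refer to one list, so the addition shows up in both.
        Console.WriteLine(object.ReferenceEquals(originalCountries, validatedCountries)); // True
        Console.WriteLine(originalCountries.Count); // 2
    }
}
```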

This would be exacerbated if this behaviour changed depending on whether or not items were validated/filtered. Since the validation logic is a black-box of behaviour that the caller doesn't know or care about, the API consumer could not depend on whether or not it returned the same reference. They would either have to do their own reference check (e.g., if (myValidatedCountries == myInputCountries)), or simply make a copy every time. Regardless, this becomes another weird behaviour that the programmer has to juggle when working with the API.

I think that the method should either:

A) always return a copied list with the items filtered out (public IList<CountryDto> ApplyDefaults(IEnumerable<CountryDto> dtos))

B) modify the incoming list in-place (public void ApplyDefaults(IList<CountryDto> dtos))

For option A, depending on the size of your list, this incurs the possibly unnecessary work of creating a copied list every time, even if no filtering is performed. However, the validation/filtering logic might be simpler. You might be able to use LINQ queries to apply the filtering nicely. Additionally, removing items from a list is generally costly, as each removal has to shift the remaining elements of the internal array. So it might actually be faster to build a new list. You may even simplify the return type here to IEnumerable<CountryDto>; this allows for wider usage and makes it extremely obvious that you're creating a new collection.
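A rough sketch of option A using LINQ might look like the following. The Code property and the IsValid rule are hypothetical placeholders, since the actual validation is injected and not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CountryDto
{
    public string Code { get; set; } = "";
}

public static class CopyingFilter
{
    // Hypothetical validation rule, for illustration only.
    private static bool IsValid(CountryDto dto)
    {
        return !string.IsNullOrWhiteSpace(dto.Code);
    }

    // Option A: never touches the input; always returns a fresh list.
    public static IList<CountryDto> GetValidCountries(IEnumerable<CountryDto> dtos)
    {
        return dtos.Where(IsValid).ToList();
    }

    public static void Main()
    {
        var original = new List<CountryDto>
        {
            new CountryDto { Code = "FR" },
            new CountryDto { Code = "" }
        };

        IList<CountryDto> valid = GetValidCountries(original);

        Console.WriteLine(original.Count); // 2 (input untouched)
        Console.WriteLine(valid.Count);    // 1
    }
}
```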

For option B, if no validation is required, then no work is done and the method is essentially "free" (no array rebuilding, no copying, no reference changes). But if there is significant validation, the removal aspect may be costly. Since you're not method chaining, this version should have a void return type as it's much more obvious to the developer that this is modifying the list in-place. This follows other commonly known methods like List<T>.Sort. Furthermore, if a user wants to have a separate originalCountries and validatedCountries they can always make a copy:

var validatedCountries = originalCountries.ToList();
ApplyDefaults(validatedCountries);
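Option B could be sketched roughly as below. Again, the Code property and the IsValid rule are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;

public class CountryDto
{
    public string Code { get; set; } = "";
}

public static class InPlaceFilter
{
    // Hypothetical validation rule, for illustration only.
    private static bool IsValid(CountryDto dto)
    {
        return !string.IsNullOrWhiteSpace(dto.Code);
    }

    // Option B: mutates the caller's list and returns nothing.
    public static void RemoveInvalidCountries(IList<CountryDto> dtos)
    {
        // Walk backwards so that removing an item never shifts
        // an index that has not been visited yet.
        for (int i = dtos.Count - 1; i >= 0; i--)
        {
            if (!IsValid(dtos[i]))
            {
                dtos.RemoveAt(i);
            }
        }
    }

    public static void Main()
    {
        var countries = new List<CountryDto>
        {
            new CountryDto { Code = "" },
            new CountryDto { Code = "FR" },
            new CountryDto { Code = "" }
        };

        RemoveInvalidCountries(countries);

        Console.WriteLine(countries.Count); // 1
    }
}
```

If the parameter were the concrete List<CountryDto> rather than IList<CountryDto>, List<T>.RemoveAll would do the same job in a single pass.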

Ultimately, which one you choose might depend on performance. If validation/removal is cheap and rare, then modifying the list in-place might be best. If you're expecting a lot of changes to the list, it might simply be faster to produce a new copy every time.

Regardless, I would suggest you name the method with a little more clarity as well. For example:

public IList<CountryDto> GetValidCountries(IEnumerable<CountryDto> dtos)

public void RemoveInvalidCountries(IList<CountryDto> dtos)

Of course, the naming might be different depending on your actual code context (I suspect ApplyDefaults is a common/inherited method name and not specific to CountryDto)

Upvotes: 1

Anton Gorlin

Reputation: 57

In your case I would leave it void, since the name ApplyDefaults clearly states what it's doing. Also, it might be a good idea to put ApplyDefaults in the collection itself. Subclass List (or implement IList, or whatever fits) and then you'd call it like this:

myCollection.ApplyDefaults();

Which is just obvious.
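One possible shape for such a collection, as a sketch only: the CountryDtoCollection name, the Code property, and the "blank code is invalid" rule are all invented here for illustration.

```csharp
using System;
using System.Collections.Generic;

public class CountryDto
{
    public string Code { get; set; } = "";
}

// Hypothetical collection type that owns its own validation step.
public class CountryDtoCollection : List<CountryDto>
{
    public void ApplyDefaults()
    {
        // List<T>.RemoveAll drops every item matching the predicate, in place.
        RemoveAll(dto => string.IsNullOrWhiteSpace(dto.Code));
    }
}

public static class SubclassDemo
{
    public static void Main()
    {
        var myCollection = new CountryDtoCollection
        {
            new CountryDto { Code = "FR" },
            new CountryDto { Code = "" }
        };

        myCollection.ApplyDefaults();

        Console.WriteLine(myCollection.Count); // 1
    }
}
```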

Upvotes: 0

Grimarr

Reputation: 41

(My answer is based on the Java viewpoint; C++ and C# programmers might have a different take.) I think it's best to return the collection. The fact that the collection you're returning is the same collection that was given is just an implementation detail, and in future versions of the code, you might want to change that. Document that the collection returned might not be the same one passed in.

If, on the other hand, you want to lock in the design that this method modifies a collection in place, document it that way and don't return the collection. I prefer not to do it this way, but I can see advantages in some contexts.

Upvotes: 0

Dmitrii Bychenko

Reputation: 186668

I'd rather return a boolean (or an enum in a more elaborate case: collection left intact, changed, can't be validated, etc.)

// true if the collection is changed, false otherwise
public Boolean ApplyDefaults(IList<CountryDto> dtos) {
  Boolean result = false;
  //Iterates the collection
  //Validates the items in collection
  //If items are invalid:
  //  Removes items e.g dtos.Remove(currentCountryDto)
  //  result = true;
  ...
  return result; 
}

...

if (ApplyDefaults(myData)) {
  // Collection is changed, do some extra stuff
}

Upvotes: 1

Samy Arous

Reputation: 6814

Technically, I would say there is not much difference between the two.

However, as you pointed out, a commonly used convention is that a function should only return an object it creates. Basically, that would mean that a function that returns an object is generating one, while a function that doesn't return anything is modifying the object passed as a parameter.

Again, this is only a convention, and it is not widely used within the C# community, but in the Python community, for example, it is.

Some people return a Boolean (or an error code) instead, as an indicator of an error (like the old DOS command line). I don't like this approach and by far prefer raising exceptions that I can handle later on.

Finally, the best approach in my view is to return a value that indicates whether a change was made by the function, and possibly a value indicating how much was changed. It can be a Boolean, or it can be the number of inserted/removed elements...
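As a small sketch of the "how much changed" variant, the method below returns the number of removed items. The Code property and the validity rule are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public class CountryDto
{
    public string Code { get; set; } = "";
}

public static class CountingFilter
{
    // Removes invalid items in place and reports how many were removed.
    // "Invalid" here means a blank Code, which is a made-up rule.
    public static int RemoveInvalidCountries(IList<CountryDto> dtos)
    {
        int removed = 0;
        for (int i = dtos.Count - 1; i >= 0; i--)
        {
            if (string.IsNullOrWhiteSpace(dtos[i].Code))
            {
                dtos.RemoveAt(i);
                removed++;
            }
        }
        return removed;
    }

    public static void Main()
    {
        var countries = new List<CountryDto>
        {
            new CountryDto { Code = "FR" },
            new CountryDto { Code = "" },
            new CountryDto { Code = " " }
        };

        int removed = RemoveInvalidCountries(countries);

        Console.WriteLine(removed);         // 2
        Console.WriteLine(countries.Count); // 1
    }
}
```

A caller that only cares whether anything changed can simply test the result against zero.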

In any case, try to be consistent with the approach you choose, if not in all your code, then at least within a single project. Sometimes you will have no other choice but to abide by the convention used by your teammates.

Upvotes: 0

Tarec

Reputation: 3255

First of all: you cannot change the caller's reference to the collection you pass as a parameter, because by default the method gets a copy of that reference (not a copy of the collection). You'd need to use the ref keyword in order to be able to change it.
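A quick sketch of the difference, using a plain IList<int> and made-up method names:

```csharp
using System;
using System.Collections.Generic;

public static class RefDemo
{
    // Reassigning the parameter only changes the method's local copy
    // of the reference; the caller's variable is unaffected.
    public static void ReplaceWithoutRef(IList<int> items)
    {
        items = new List<int>();
    }

    // Mutating through the reference is visible to the caller,
    // because both references point at the same list object.
    public static void MutateInPlace(IList<int> items)
    {
        items.RemoveAt(0);
    }

    // With ref, the caller's variable itself can be pointed at a new list.
    public static void ReplaceWithRef(ref IList<int> items)
    {
        items = new List<int>();
    }

    public static void Main()
    {
        IList<int> numbers = new List<int> { 1, 2, 3 };

        ReplaceWithoutRef(numbers);
        Console.WriteLine(numbers.Count); // 3

        MutateInPlace(numbers);
        Console.WriteLine(numbers.Count); // 2

        ReplaceWithRef(ref numbers);
        Console.WriteLine(numbers.Count); // 0
    }
}
```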

Secondly: if your method has a return type, then it has to return something. Your method is not called GetNewCollectionWithAppliedDefaults but ApplyDefaults, which implies that the collection will be modified. You should either return a boolean to tell the caller whether changes were made, or always return the parameter's collection (to allow chained or nested method calls).

Also, why would you think it doesn't make sense to return a collection? I'd say there's no argument against it. Turn the question around: "why wouldn't I return the collection and could it harm my code"?

Upvotes: 0

Harmlezz

Reputation: 8058

In every programming language, keeping your own code or library consistent with the approach of the core libraries is of high value. Hence, looking at how Java's Collections.sort(), Collections.swap() and Collections.shuffle() are defined (they all return void), I would suggest not returning the input parameter if you intend to modify it. In addition, your method should be named so that it is obvious the input parameter gets modified. Otherwise the modification will come as an unexpected side effect.

Returning a value most often suggests that it is a new instance which reflects the work, performed by the method or is used for method-chaining in case of builders.
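The .NET core libraries appear to follow the same split, which is relevant since the question is about C#. In this small sketch, List<T>.Sort returns void and sorts in place, while LINQ's OrderByDescending leaves its source alone and returns a new sequence:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CoreLibraryConvention
{
    public static void Main()
    {
        var numbers = new List<int> { 3, 1, 2 };

        // In-place modification: void return type, the list itself changes.
        numbers.Sort();

        // New result: the source list is not modified by the query.
        IEnumerable<int> descending = numbers.OrderByDescending(n => n);

        Console.WriteLine(string.Join(",", numbers));    // 1,2,3
        Console.WriteLine(string.Join(",", descending)); // 3,2,1
    }
}
```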

Upvotes: 2
