Reputation: 2470
I have a method that accepts a collection, as below:
public IList<CountryDto> ApplyDefaults(IList<CountryDto> dtos)
{
    // Iterates the collection
    // Validates the items in the collection
    // If items are invalid,
    // removes them, e.g. dtos.Remove(currentCountryDto)
    return dtos; // Do I need to do this?
}
My question is: since the reference to the collection is not changed, should I return the collection again from the method?
What is the best approach in this case?
Note: I am not sure if this question is opinion-based. I think I am probably missing something here on the design side.
Upvotes: 2
Views: 415
Reputation: 23198
Given your comments/requirements:
I think based on these, this method definitely should not return the reference to the incoming list, even if no validation is applied. Firstly, unless the API is clearly built around method chaining (which you indicated you do not want), returning a List<T> type usually indicates a new list is being created. Secondly, if a new list is not created, users may find themselves modifying the list in ways they didn't expect.
Consider:
IList<CountryDto> originalCountries = Service.GetCountries();
IList<CountryDto> validatedCountries = ApplyDefaults(originalCountries);
validatedCountries.Add(mySpecialCountry);
OutputOriginalCountries(originalCountries);
OutputValidatedCountries(validatedCountries);
This code isn't very special; it's a fairly common pattern. If ApplyDefaults returned a reference to the same originalCountries collection, then mySpecialCountry would also be added to originalCountries. This would violate the Principle of Least Astonishment.
This would be exacerbated if this behaviour changed depending on whether or not items were validated/filtered. Since the validation logic is a black box of behaviour that the caller doesn't know or care about, the API consumer could not depend on whether or not it returned the same reference. They would either have to do their own reference check (e.g., if (myValidatedCountries == myInputCountries)), or simply make a copy every time. Regardless, this becomes another weird behaviour that the programmer has to juggle when working with the API.
I think that the method should either:
A) always return a copied list with the items filtered out (public IList<CountryDto> ApplyDefaults(IEnumerable<CountryDto> dtos)), or
B) modify the incoming list in-place (public void ApplyDefaults(IList<CountryDto> dtos))
For option A, depending on the size of your list, this incurs the possibly unnecessary work of creating a copied list every time, even if no filtering is performed. However, the validation/filtering logic might be simpler: you might be able to use LINQ queries to apply the filtering nicely. Additionally, removing items from a list is generally costly, as each removal has to shift the remaining elements of the internal array, so it might actually be faster to build a new list. You may even simplify the signature here to return IEnumerable<CountryDto>; this allows for wider usage and makes it extremely obvious that you're creating a new collection.
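A minimal sketch of option A, assuming a hypothetical CountryDto with a Name property and a made-up IsValid rule (neither is shown in the question); it uses the GetValidCountries name rather than ApplyDefaults to make the "new list" behaviour explicit:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical DTO; the real CountryDto is not shown in the question.
public class CountryDto
{
    public string Name { get; set; } = "";
}

public static class CountryFilters
{
    // Assumed validation rule, for illustration only.
    private static bool IsValid(CountryDto dto) =>
        dto != null && !string.IsNullOrWhiteSpace(dto.Name);

    // Option A: never touches the input and always builds a new list.
    public static IList<CountryDto> GetValidCountries(IEnumerable<CountryDto> dtos) =>
        dtos.Where(IsValid).ToList();
}
```

Because ToList() always allocates, the caller can add to or remove from the result without affecting the collection that was passed in.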
For option B, if no validation is required, then no work is done and the method is essentially "free" (no element shifting, no copying, no reference changes). But if there is significant validation, the removal aspect may be costly. Since you're not method chaining, this version should have a void return type, as it's much more obvious to the developer that this is modifying the list in-place. This follows other commonly known methods like List<T>.Sort. Furthermore, if a user wants to have a separate originalCountries and validatedCountries, they can always make a copy:
var validatedCountries = originalCountries.ToList();
ApplyDefaults(validatedCountries);
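For comparison, a sketch of what the in-place version (option B) might look like. The CountryDto shape and the IsValid rule are assumptions made for illustration, and the method uses the RemoveInvalidCountries name instead of ApplyDefaults:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical DTO; the real CountryDto is not shown in the question.
public class CountryDto
{
    public string Name { get; set; } = "";
}

public static class CountryCleaner
{
    // Assumed validation rule, for illustration only.
    private static bool IsValid(CountryDto dto) =>
        dto != null && !string.IsNullOrWhiteSpace(dto.Name);

    // Option B: modifies the caller's list and returns nothing.
    public static void RemoveInvalidCountries(IList<CountryDto> dtos)
    {
        // Walk backwards so RemoveAt never shifts items that are still to be visited.
        for (int i = dtos.Count - 1; i >= 0; i--)
        {
            if (!IsValid(dtos[i]))
            {
                dtos.RemoveAt(i);
            }
        }
    }
}
```

Note that arrays also implement IList<T> but are fixed-size, so passing one to a method like this would throw NotSupportedException on the first removal.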
Ultimately, which one you choose might depend on performance. If validation/removal is cheap and rare, then modifying the list in-place might be best. If you're expecting a lot of changes to the list, it might simply be faster to produce a new copy every time.
Regardless, I would suggest you name the method with a little more clarity as well. For example:
public IList<CountryDto> GetValidCountries(IEnumerable<CountryDto> dtos)
public void RemoveInvalidCountries(IList<CountryDto> dtos)
Of course, the naming might be different depending on your actual code context (I suspect ApplyDefaults is a common/inherited method name and not specific to CountryDto).
Upvotes: 1
Reputation: 57
In your case I would leave it as void, since ApplyDefaults clearly states what it's doing.
Also, it might be a good idea to ApplyDefaults in the collection itself. Subclass IList or List or whatever and then you'd call like this:
myCollection.ApplyDefaults();
Which is just obvious.
Upvotes: 0
Reputation: 41
(My answer is based on the Java viewpoint; C++ and C# programmers might have a different take.) I think it's best to return the collection. The fact that the collection you're returning is the same collection that was given is just an implementation detail, and in future versions of the code, you might want to change that. Document that the collection returned might not be the same one passed in.
If, on the other hand, you want to lock in the design that this method modifies a collection in place, document it that way and don't return the collection. I prefer not to do it this way, but I can see advantages in some contexts.
Upvotes: 0
Reputation: 186668
I'd rather return a boolean (or an enum in a more elaborate case: collection preserved intact, changed, can't be validated, etc.):
// true if the collection is changed, false otherwise
public Boolean ApplyDefaults(IList<CountryDto> dtos) {
    Boolean result = false;
    // Iterates the collection
    // Validates the items in the collection
    // If items are invalid:
    //   Removes items, e.g. dtos.Remove(currentCountryDto)
    //   result = true;
    ...
    return result;
}
...
if (ApplyDefaults(myData)) {
    // Collection is changed, do some extra stuff
}
Upvotes: 1
Reputation: 6814
Technically, I would say there is not much difference between the two.
However, as you pointed out, a commonly used convention is that a function should only return an object it creates. That is, a function that returns an object is generating one, while a function that doesn't return anything is modifying the object passed as a parameter.
Again, this is only a convention, and it is not widely used within the C# community, though it is in the Python community, for example.
Some people return a Boolean (or an error code) as an indicator of an error (like the old DOS command line). I don't like this approach and by far prefer raising exceptions that I can handle later on.
Finally, the best approach, in my view, is to return a value that indicates whether the function made a change, and possibly a value indicating how much changed. It can be a Boolean, or it can be the number of inserted/removed elements.
In any case, try to be consistent with the approach you choose, if not in all your code, at least within a single project. Sometimes you will have no other choice but to abide by the convention used by your teammates.
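As a rough sketch of the count-returning variant: List<T>.RemoveAll already returns the number of elements it removed, so if the parameter can be a concrete List<T> rather than IList<T>, the method can simply pass that value through. The CountryDto shape, the IsValid rule, and the method name are assumptions for illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical DTO; the real CountryDto is not shown in the question.
public class CountryDto
{
    public string Name { get; set; } = "";
}

public static class CountryCleanup
{
    // Assumed validation rule, for illustration only.
    private static bool IsValid(CountryDto dto) =>
        dto != null && !string.IsNullOrWhiteSpace(dto.Name);

    // Modifies the list in place and reports how many items were removed.
    public static int RemoveInvalidCountries(List<CountryDto> dtos) =>
        dtos.RemoveAll(dto => !IsValid(dto));
}
```

A caller that only cares whether anything changed can compare the result with zero.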
Upvotes: 0
Reputation: 3255
First of all: you cannot change the caller's reference to the collection you pass as a parameter, because by default the method receives a copy of that reference. You'd need to use the ref keyword in order to be able to change it.
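A small illustration of the difference (the class and method names here are made up for the example): reassigning a parameter without ref leaves the caller's variable alone, while mutating the list through the copied reference is visible to the caller.

```csharp
using System;
using System.Collections.Generic;

public static class RefDemo
{
    // Reassigns only the method's local copy of the reference.
    public static void ReplaceWithoutRef(IList<string> items)
    {
        items = new List<string> { "replaced" };
    }

    // With ref, the caller's variable itself is reassigned.
    public static void ReplaceWithRef(ref IList<string> items)
    {
        items = new List<string> { "replaced" };
    }

    // Mutating through the copied reference affects the caller's list.
    public static void Mutate(IList<string> items)
    {
        items.Add("added");
    }
}
```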
Secondly: if your method has a return type, then it has to return an object. Your method is not called GetNewCollectionWithAppliedDefaults, but ApplyDefaults, which implies that the collection will be modified. You should either return a boolean true/false to inform the user that changes were made, or always return the parameter's collection (to allow nested method calls).
Also, why would you think it doesn't make sense to return a collection? I'd say there's no argument against it. Turn the question around: "why wouldn't I return the collection and could it harm my code"?
Upvotes: 0
Reputation: 8058
In every programming language, consistency of your own code/library with the approach of the core libraries is of high value. Hence, looking at how Collections.sort(), Collections.swap() and Collections.shuffle() are defined, I would suggest not returning the input parameter if you intend to modify it. In addition, your method should be named in such a way that it is obvious the input parameter gets modified. Otherwise your method will be considered to have hidden side effects.
Returning a value most often suggests that it is a new instance which reflects the work performed by the method, or that it is used for method chaining, as in the case of builders.
Upvotes: 2