pencilCake

Reputation: 53263

What design pattern(s) would fit LOAD-CONVERT-WRITE-like scenarios?

I do not want to re-invent the wheel.

Is there a design pattern (or a combination of patterns) that would fit the workflow below? The idea is to have a generic solution that covers every case of: LOAD DATA --> CONVERT IT --> WRITE THE CONVERTED DATA

Like:

(1) LOAD DATA : Loads data from DataSource and produces an IEnumerable

(2) CONVERT LOADED DATA : Walks through the loaded data and converts each item to the TConverted type according to some conversion logic

(3) WRITE CONVERTED DATA : Walks through the converted IEnumerable and writes each item into a .txt file

Upvotes: 1

Views: 544

Answers (3)

Sameer

Reputation: 4389

The "Template Method" pattern can help you build a generic framework that can be used to implement this process for different kinds of data. There would be an abstract base class like this:

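using System.Collections;

public abstract class ETLProcess {
    // Template method: fixes the order of the steps.
    // It is not virtual, so subclasses cannot override it.
    public void runETL() {
        IEnumerable rawData = extract();
        IEnumerable transformedData = transform(rawData);
        load(transformedData);
    }

    // Each concrete process supplies its own steps.
    protected abstract IEnumerable extract();
    protected abstract IEnumerable transform(IEnumerable rawData);
    protected abstract void load(IEnumerable transformedData);
}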

You can then implement the process for different kinds of data by extending the ETLProcess class. The advantage of this pattern is that the overall process is defined once in the abstract class, while the individual steps are defined in concrete classes. Common code, such as shared error handling, can live in the base class.
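As a rough illustration of what a concrete process could look like, here is a minimal sketch. It assumes a generic variant of the base class; `EtlProcess<TRaw, TConverted>`, `DoublingProcess`, and the `TextWriter` target are hypothetical names chosen for the example, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical generic variant of the base class.
// The template method Run() fixes the order of the three steps.
public abstract class EtlProcess<TRaw, TConverted>
{
    public void Run()
    {
        IEnumerable<TRaw> raw = Extract();
        IEnumerable<TConverted> converted = Transform(raw);
        Load(converted);
    }

    protected abstract IEnumerable<TRaw> Extract();
    protected abstract IEnumerable<TConverted> Transform(IEnumerable<TRaw> raw);
    protected abstract void Load(IEnumerable<TConverted> converted);
}

// Hypothetical concrete process: parses strings as integers,
// doubles them, and writes one value per line.
public sealed class DoublingProcess : EtlProcess<string, int>
{
    private readonly IEnumerable<string> _source;
    private readonly TextWriter _output;

    public DoublingProcess(IEnumerable<string> source, TextWriter output)
    {
        _source = source;
        _output = output;
    }

    protected override IEnumerable<string> Extract()
    {
        return _source;
    }

    protected override IEnumerable<int> Transform(IEnumerable<string> raw)
    {
        return raw.Select(s => int.Parse(s, CultureInfo.InvariantCulture) * 2);
    }

    protected override void Load(IEnumerable<int> converted)
    {
        foreach (int item in converted)
        {
            _output.WriteLine(item);
        }
    }
}
```

To write to a .txt file, the same class could be given a `StreamWriter` instead of an in-memory writer, since both derive from `TextWriter`.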

Upvotes: 1

Josh C.

Reputation: 4363

I believe you are looking for the Adapter pattern. I often think of the conversion as an intermediary class leaning neither to the client nor the adaptee. The idea of a wrapper doesn't always "feel" very abstract. However, it is still probably best to write classes specifically designed to adapt incoming data to what the client expects. If you feel it is violating your abstraction, consider creating base classes or interfaces and implementing those for the specifics of your incoming data.
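A minimal sketch of that idea, assuming the incoming data arrives as comma-separated lines and the client wants typed objects. All names here (`CsvFeed`, `Person`, `IPersonSource`, `CsvFeedAdapter`) are hypothetical and only serve the example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical adaptee: incoming data exposed as comma-separated lines.
public sealed class CsvFeed
{
    public IEnumerable<string> ReadLines()
    {
        return new[] { "Ada,36", "Linus,28" };
    }
}

// Hypothetical type the client works with.
public sealed class Person
{
    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }

    public string Name { get; private set; }
    public int Age { get; private set; }
}

// Interface the client is written against.
public interface IPersonSource
{
    IEnumerable<Person> GetPeople();
}

// Adapter: wraps the feed and converts its lines into what the client expects.
public sealed class CsvFeedAdapter : IPersonSource
{
    private readonly CsvFeed _feed;

    public CsvFeedAdapter(CsvFeed feed)
    {
        _feed = feed;
    }

    public IEnumerable<Person> GetPeople()
    {
        return _feed.ReadLines()
                    .Select(line => line.Split(','))
                    .Select(parts => new Person(
                        parts[0],
                        int.Parse(parts[1], CultureInfo.InvariantCulture)));
    }
}
```

The client depends only on `IPersonSource`, so a different incoming format would need a new adapter class rather than changes to the client.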

Upvotes: 0

sll

Reputation: 62544

I believe the Pipelines pattern fits here; there is a good C# .NET 4.0 implementation described on MSDN.

The idea is to extract the stages, schedule a new TPL Task for each stage, and then tie them all together via BlockingCollection<T> instances that act as intermediate buffers.

Also worth noting: as mentioned in the referenced MSDN paper, BlockingCollection<T>.GetConsumingEnumerable() returns an IEnumerable<T>, which is what you want.
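The stages described above can be sketched roughly as follows. This is a simplified illustration, not the MSDN implementation: `PipelineSketch`, `Run`, and the in-memory `written` list (standing in for the .txt file) are assumptions made for the example, and the collections are left unbounded to keep it short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PipelineSketch
{
    // Runs load -> convert -> write as three concurrent stages
    // and returns what the last stage "wrote".
    public static List<string> Run(IEnumerable<int> source, Func<int, string> convert)
    {
        var loaded = new BlockingCollection<int>();
        var converted = new BlockingCollection<string>();
        var written = new List<string>();

        // Stage 1: load items into the first buffer.
        Task load = Task.Factory.StartNew(() =>
        {
            try
            {
                foreach (int item in source)
                {
                    loaded.Add(item);
                }
            }
            finally
            {
                loaded.CompleteAdding(); // lets the next stage finish its loop
            }
        });

        // Stage 2: convert each loaded item as soon as it is available.
        Task transform = Task.Factory.StartNew(() =>
        {
            try
            {
                foreach (int item in loaded.GetConsumingEnumerable())
                {
                    converted.Add(convert(item));
                }
            }
            finally
            {
                converted.CompleteAdding();
            }
        });

        // Stage 3: consume converted items (stand-in for writing to a file).
        Task write = Task.Factory.StartNew(() =>
        {
            foreach (string line in converted.GetConsumingEnumerable())
            {
                written.Add(line);
            }
        });

        Task.WaitAll(load, transform, write);
        return written;
    }
}
```

With one producer and one consumer per buffer, items keep their original order. Passing a bounded capacity to the `BlockingCollection<T>` constructor would limit how far a fast stage can run ahead of a slow one, at the cost of more careful error handling.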

[Image: general flow example]

Upvotes: 1
