AWhitford

Reputation: 4008

Why does Data Flow Sink Cache not have all Data Preview results?

I'm seeing a significant discrepancy in Data Flow results when using a Cache Sink vs a Data Set Sink. I recreated a simple example to demonstrate.

I uploaded a simple JSON file to Azure Data Lake Storage Gen 2:

{
  "data": [
    {
      "id": 123,
      "name": "ABC"
    },
    {
      "id": 456,
      "name": "DEF"
    },
    {
      "id": 789,
      "name": "GHI"
    }
  ]
}

I created a simple Data Flow that loads this JSON file, flattens it out, then returns it via a Sink. I'm primarily interested in using a Cache Sink because the output is small and I will ultimately need the output for the next pipeline step. (Write to activity output is checked.)

The Data Preview shows all 3 rows. (I have two sinks in this example only to illustrate that their results do not match.)

Next, I created a pipeline to run the data flow.

Now, when I debug it, the Data Flow output only shows 1 record:

        "output": {
            "TestCacheSink": {
                "value": [
                    {
                        "id": 123,
                        "name": "ABC"
                    }
                ],
                "count": 1
            }
        },

However, the second Data Set Sink contains all 3 records:

{"id":123,"name":"ABC"}
{"id":456,"name":"DEF"}
{"id":789,"name":"GHI"}

I expect that the output from the Cache Sink would also have 3 records. Why is there a discrepancy?

Upvotes: 1

Views: 1743

Answers (1)

KarthikBhyresh-MT

Reputation: 5044

When you choose a cache sink, you are not allowed to use logging. If a logging level other than "None" is selected, validation fails with an error before debug starts.

To fix that error you select "None" for logging, and doing so automatically checks the "First row only" property. That is why only the first row is written to the cache sink. You just have to manually uncheck "First row only" before running debug so that all rows are written.
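One way to catch this from the pipeline side is to compare the number of rows in the Data Flow activity output with the number you expect. The following is only a minimal Python sketch, assuming the activity output has the shape shown in the question (`output`, then the sink name, then `value` and `count`). The helper name `cache_sink_rows` is hypothetical and is not part of any Azure SDK.

```python
import json

# Debug output copied from the question: only the first row reached the cache sink.
ACTIVITY_OUTPUT = json.loads("""
{
  "output": {
    "TestCacheSink": {
      "value": [{"id": 123, "name": "ABC"}],
      "count": 1
    }
  }
}
""")


def cache_sink_rows(activity_output, sink_name):
    """Return the rows a cache sink wrote to the activity output.

    Hypothetical helper: assumes the
    {"output": {sink: {"value": [...], "count": n}}} shape from the question.
    """
    sink = activity_output["output"][sink_name]
    rows = sink["value"]
    if len(rows) != sink["count"]:
        raise ValueError("'count' does not match the number of rows in 'value'")
    return rows


rows = cache_sink_rows(ACTIVITY_OUTPUT, "TestCacheSink")
expected = 3  # number of objects in the source JSON file's "data" array

if len(rows) < expected:
    print(f"Got {len(rows)} of {expected} rows; "
          "check whether 'First row only' is still checked.")
```

If the check reports fewer rows than the source file contains, the "First row only" setting is the first thing to look at.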

Upvotes: 9
