FKrauss
FKrauss

Reputation: 418

BQ giving incomplete results using CONTAINS condition

I'm working with Bigquery to process some Adwords data and, more precisely, to extract all the url parameters from our destination URLs so we can organize it better and etc.

I wrote the following Query to give me back all the parameters available in the "DestinationURL" field in the table. As follows:

SELECT Parameter
  FROM (SELECT NTH(1, SPLIT(Params,'=')) as Parameter,
        FROM (SELECT 
              AdID,
              NTH(1, SPLIT(DestinationURL,'?')) as baseurl,
              split(NTH(2, SPLIT(DestinationURL,'?')),'&') as Params
              FROM [adwords_accounts_ads.ads_all] 
              HAVING Params CONTAINS '='))

 GROUP BY 1

Runnig this will give me 6 parameters. That is correct but incomplete, because in this testing table I know there are 2 other parameters in the URLs that were not fetched. One called 'group' and the other called 'utm_content'.

Now if I run:

SELECT Parameter
  FROM (SELECT NTH(1, SPLIT(Params,'=')) as Parameter,
        FROM (SELECT 
              AdID,
              NTH(1, SPLIT(DestinationURL,'?')) as baseurl,
              split(NTH(2, SPLIT(DestinationURL,'?')),'&') as Params
              FROM [adwords_accounts_ads.ads_all] 
              HAVING Params CONTAINS 'p='))

 GROUP BY 1

I get the "group" parameter showing.

question is: shouldn't the

"CONTAINS '='" 

condition include the

"CONTAINS 'p='" 

In the result? same happens for 't=' instead of '='

Does anyone know how I can fix that? or even how to extract all the parameters from a string that contains a URL?

ps: using LIKE yields the exact same thing

Thanks!

Upvotes: 1

Views: 113

Answers (1)

Pentium10
Pentium10

Reputation: 207838

Split creates a REPEATED output type, and you have to FLATTEN the table to see correctly.

Here I used flatten on params and the output is now good:

SELECT nth(1,SPLIT(Params,'=')) AS Param,
       nth(2,SPLIT(Params,'=')) AS Value
FROM flatten(SELECT 
                AdID, 
                NTH(1, SPLIT(DestinationURL,'?')) AS baseurl, 
                split(NTH(2, SPLIT(DestinationURL,'?')),'&') AS Params
                FROM
                    (SELECT 1 AS AdID,'http://www.example.com.br/?h=Passagens+Aereas&source=google&vt=0' AS DestinationURL) 
                HAVING Params CONTAINS '=',
            params
            )

Outputs:

+-----+--------+------------------+---+
| Row | Param  |      Value       |   |
+-----+--------+------------------+---+
|   1 | h      | Passagens+Aereas |   |
|   2 | source | google           |   |
|   3 | vt     | 0                |   |
+-----+--------+------------------+---+

NOTE: The Web UI always flattens your result but If you select a destination table and uncheck "flatten results", you will get a single row with a repeated parts column.

Upvotes: 2

Related Questions