Harshit Agarwal
Harshit Agarwal

Reputation: 21

Escaping Hindi Characters in mysql

I'm trying to fetch data from database using MySQL, the table has some entries in hindi, I'm looking to filter out those hindi results in google sheets automatically by running SQL queries in google sheets.

I've been able to fetch data into google sheets from database but unable to fetch hindi filtered data

select property, week,
  sum(Original_Stories) "Original Stories",
  sum(Wire_Stories) "Wire Stories",
  sum(Photo_Gallery) "Photo Gallery",
  sum(Video_Gallery) "Video Gallery",
from
  (select *, 
     case when a= "Original_Stories" then total end as "Original_Stories", 
     case when a= "Wire" then total end as "Wire_Stories", 
     case when a= "Photo" then total end as "Photo_Gallery", 
     case when a= "Video_Gallery" then total end as "Video_Gallery"
   from
     (select "Zee Hin" as property,
        yearweek(date_format(date_published,"%Y-%m-%d"),6) week,
        case
          when ((article_type like 'live_blog') or
                (article_type like '%article%' and agency_name like 'ज़ी न्यूज़ डेस्क') or
                (article_type like '%article%' and agency_name like 'एक्सक्लूसिव') or
                (article_type like '%article%' and agency_name like 'ज़ी मीडिया ब्‍यूरो') or
                (article_type like '%article%' and agency_name like 'Exclusive'))
            then "Original_Stories" 
          when (article_type like '%gallery%') then "Photo"
          when (article_type like '%video%') then "Video_Gallery"
          else 'Wire'
        end as a,
        count(distinct article_id) total,
        count(distinct date_format(date_published, "%Y-%m-%d")) days
      from
        zeenews_hindi
      where date_format(date_published, "%Y-%m-%d") >= '2018-12-30'
      group by 2,3
    ) temp
  ) agg
group by 1,2

I need data for hindi filters

Upvotes: 2

Views: 606

Answers (3)

Rick James
Rick James

Reputation: 142296

SELECT ... WHERE HEX(col) REGEXP '(..)*E0A[45]'

Will fetch rows where col contains some Devanagari characters (and possibly others).

UTF-8 hex, mostly E0A4xx has the letters; E0A5xx has the 'signs'.

Upvotes: 0

Tom Blodget
Tom Blodget

Reputation: 20802

Supported in MySQL 8

agency_name regexp '\\p{script=Devanagari}'

Here \p introduces a class of characters meeting a character property filter. In this case, the Unicode script "Devanagari". If that doesn't cover all the characters you want to find for Hindi, you can add other Unicode scripts, blocks or ranges. (Detecting the language used in text, mi amigo, is a different animal, n'est-ce pas?)

Upvotes: 0

O. Jones
O. Jones

Reputation: 108686

See this: How can I find non-ASCII characters in MySQL?

With MySQL you can detect non-roman characters with this kind of query.

SELECT whatever
  FROM tableName 
 WHERE columnToCheck <> CONVERT(columnToCheck USING latin1)

This works by comparing your column to its own value rendered in latin1 (aka iso8859-1), a character set suitable for Western Europe and the USA.

For example, this query

SELECT 'Some Hindi: ज़ी मीडिया ब्‍यूरो', CONVERT ('Some Hindi: ज़ी मीडिया ब्‍यूरो' USING LATIN1);

yields these results

Some Hindi: ज़ी मीडिया ब्‍यूरो         Some Hindi: ??? ?????? ???????

The ? characters are the replacement characters.

Upvotes: 1

Related Questions