Test002

Reputation: 11

Google Translate API - Reading and Writing to Cloud Storage - Python

I'm using the Google Translation API to translate a CSV file with multiple columns and rows. The target language is English, and the file contains text in multiple languages.

The code posted below uses local files for testing, but I'd like to read the input file from a Cloud Storage bucket and write the translated file to a different Cloud Storage bucket.

I've tried to run the script below with my sample file and got an error message: "FileNotFoundError: [Errno 2] No such file or directory"

I found the documentation page "Reading and Writing to Cloud Storage", but I was not able to work the suggested solution into the script below: https://cloud.google.com/appengine/docs/standard/python/googlecloudstorageclient/read-write-to-cloud-storage#reading_from_cloud_storage

May I ask for a suggested modification of the script to import (and translate) the file from google cloud bucket and export the translated file to a different google cloud bucket? Thank you!

Script mentioned:

from google.cloud import translate
import csv


def listToString(s):
    """ Transform list to string"""
    str1 = " "
    return (str1.join(s))

def detect_language(project_id,content):
    """Detecting the language of a text string."""

    client = translate.TranslationServiceClient()
    location = "global"
    parent = f"projects/{project_id}/locations/{location}"

    response = client.detect_language(
        content=content,
        parent=parent,
        mime_type="text/plain",  # mime types: text/plain, text/html
    )

    for language in response.languages:
        return language.language_code


def translate_text(text, project_id,source_lang):
    """Translating Text."""

    client = translate.TranslationServiceClient()
    location = "global"
    parent = f"projects/{project_id}/locations/{location}"

    # Detail on supported types can be found here:
    # https://cloud.google.com/translate/docs/supported-formats
    response = client.translate_text(
        request={
            "parent": parent,
            "contents": [text],
            "mime_type": "text/plain",  # mime types: text/plain, text/html
            "source_language_code": source_lang,
            "target_language_code": "en-US",
        }
    )

    # Display the translation for each input text provided
    for translation in response.translations:
        print("Translated text: {}".format(translation.translated_text))
        
def main():

    project_id="your-project-id"
    csv_files = ["sample1.csv","sample2.csv"]
    # Perform your content extraction here if you have a different file format #
    for csv_path in csv_files:
        # Read every cell of the file into one flat list; the file is closed on exit
        with open(csv_path, newline="") as csv_file:
            content_csv = []
            for row in csv.reader(csv_file):
                content_csv.extend(row)
        content = listToString(content_csv) # convert list to string
        detect = detect_language(project_id=project_id,content=content)
        translate_text(text=content,project_id=project_id,source_lang=detect)

if __name__ == "__main__":
    main()

Upvotes: 0

Views: 1031

Answers (1)

CaioT

Reputation: 2211

You could download the file from GCS, run your logic against the downloaded local copy, and then upload the result to another GCS bucket. Example:

Download the file from "my-bucket" to /tmp:

from google.cloud import storage

client = storage.Client()

bucket = client.get_bucket("my-bucket")
source_blob = bucket.blob("blob/path/file.csv")
new_file = "/tmp/file.csv"
# download_to_filename writes the object to new_file and returns None
source_blob.download_to_filename(new_file)
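
If you would rather not write a temporary file, the object can probably also be read straight into memory. The following is only a sketch: the helper names `rows_from_csv_text` and `read_rows_from_gcs` are made up for illustration, and it assumes the CSV is small enough to hold in memory and is stored as UTF-8 text.

```python
import csv
import io


def rows_from_csv_text(csv_text):
    """Parse CSV text (e.g. the result of blob.download_as_text()) into rows."""
    return list(csv.reader(io.StringIO(csv_text)))


def read_rows_from_gcs(bucket_name, blob_path):
    """Download a CSV object as text and parse it, without a temp file.

    bucket_name and blob_path are placeholders for your own values.
    """
    # Imported lazily so the parsing helper above works without the library.
    from google.cloud import storage

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_path)
    return rows_from_csv_text(blob.download_as_text())
```

Something like `read_rows_from_gcs("my-bucket", "blob/path/file.csv")` would then give you a list of rows to feed into your existing loop in place of `csv.reader(open(...))`.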

After translating/running your code logic, upload to a bucket:

bucket = client.get_bucket('my-other-bucket')
blob = bucket.blob('myfile.csv')
# The argument is the local path of the translated file written by your script
blob.upload_from_filename('myfile.csv')
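
To tie this back to the script in the question, here is one possible way to translate cell by cell and upload the result without an intermediate file. Treat it as a sketch under assumptions: `translate_rows`, `rows_to_csv_text` and `upload_csv_text` are hypothetical helpers, and `translate_fn` stands for a version of your `translate_text` that returns `translation.translated_text` instead of printing it.

```python
import csv
import io


def translate_rows(rows, translate_fn):
    """Apply translate_fn to every non-empty cell, keeping the row/column layout."""
    return [
        [translate_fn(cell) if cell.strip() else cell for cell in row]
        for row in rows
    ]


def rows_to_csv_text(rows):
    """Serialize rows back to CSV text using the csv module's default dialect."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def upload_csv_text(bucket_name, blob_path, csv_text):
    """Upload CSV text to a bucket; bucket_name and blob_path are placeholders."""
    # Imported lazily so the pure helpers above work without the library.
    from google.cloud import storage

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_path)
    blob.upload_from_string(csv_text, content_type="text/csv")
```

Translating each cell separately keeps the columns intact, whereas joining the whole file into one string (as the question's script does) loses them. It does mean one API call per cell, so batching several cells into the `contents` list may be worth considering for larger files.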

Upvotes: 1
