Reputation: 61
I am trying to set up streamingRecognize() from Google Cloud Speech-to-Text V2 in Node.js to stream audio data, and it always throws the same error on the initial recognizer request that sets up the stream:
Error: 3 INVALID_ARGUMENT: Invalid resource field value in the request.
    at callErrorFromStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/call.ts:81:17)
    at Object.onReceiveStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client.ts:701:51)
    at Object.onReceiveStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client-interceptors.ts:416:48)
    at /Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/resolving-call.ts:111:24
    at processTicksAndRejections (node:internal/process/task_queues:77:11)
for call at
    at ServiceClientImpl.makeBidiStreamRequest (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client.ts:685:42)
    at ServiceClientImpl.<anonymous> (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/make-client.ts:189:15)
    at /Users/<filtered>/backend/node_modules/@google-cloud/speech/build/src/v2/speech_client.js:318:29
    at /Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streamingApiCaller.ts:71:19
    at /Users/<filtered>/backend/node_modules/google-gax/src/normalCalls/timeout.ts:54:13
    at StreamProxy.setStream (/Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streaming.ts:204:20)
    at StreamingApiCaller.call (/Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streamingApiCaller.ts:88:12)
    at /Users/<filtered>/backend/node_modules/google-gax/src/createApiCall.ts:118:26
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
{
  code: 3,
  details: 'Invalid resource field value in the request.',
  metadata: Metadata {
    internalRepr: Map(2) {
      'google.rpc.errorinfo-bin' => [Array],
      'grpc-status-details-bin' => [Array]
    },
    options: {}
  },
  statusDetails: [
    ErrorInfo {
      metadata: [Object],
      reason: 'RESOURCE_PROJECT_INVALID',
      domain: 'googleapis.com'
    }
  ],
  reason: 'RESOURCE_PROJECT_INVALID',
  domain: 'googleapis.com',
  errorInfoMetadata: {
    service: 'speech.googleapis.com',
    method: 'google.cloud.speech.v2.Speech.StreamingRecognize'
  }
}
The stream setup has two steps: 1. send an initial recognizer request object that tells Google which recognizer to use for the following audio data (it consists of the path to the recognizer as a string, plus an optional config object to override certain options of that recognizer), and 2. send the same kind of request, but without a config and with an audio Buffer containing the audio to be transcribed.
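As I understand it, the two writes would look roughly like this (just a sketch with placeholder names, not my actual code):

// Step 1: one-time config request naming the recognizer to use
stream.write({
  recognizer: "projects/<projectId>/locations/global/recognizers/_",
  streamingConfig: { /* optional overrides of the recognizer's options */ },
});
// Step 2 and onwards: requests that carry only audio bytes
stream.write({ audio: audioBuffer });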
I never got to sending the audio data, since the initial recognizer request always fails.
It would be great if someone could help me with this; it seems to be a rather simple issue that might be super obvious if you know where it originates.
My guesses as to where I made a mistake are below, after the code. I have read through the Google Cloud Speech-to-Text V2 docs and tried to implement everything as described; in the end it should return the transcribed audio.
I have tried several times to implement streamingRecognize() as follows, with some slight variations:
public async initialize() {
  const recognizerName = `projects/${this.projectId}/locations/global/recognizers/_`;
  const transcriptionRequest = {
    recognizer: recognizerName,
    streaming_config: streamingConfig, // streamingConfig is defined elsewhere (omitted here)
  };
  const stream = this.client
    .streamingRecognize()
    .on("data", function (response) {
      console.log(response);
    })
    .on("error", function (response) {
      console.log(response);
    });
  // Write request objects.
  stream.write(transcriptionRequest);
}
I have also tried several recognizer_ids instead of "_" in recognizerName. I have tried several different transcriptionRequests where I omitted streaming_config or renamed it to just "config". I have triple-checked my projectId, and I have also swapped it for the project number (found on the main page of the Google Cloud console). Nothing worked, and I always receive the same error.
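One way to double-check the project reference (a sketch, assuming the same client instance): the client can report which project it actually resolved from the credentials, which should match the ID used in the recognizer path:

// Sanity check: getProjectId() comes from the underlying google-gax auth
const resolvedProjectId = await this.client.getProjectId();
console.log(resolvedProjectId); // should equal this.projectId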
Besides that, I have also made normal createRecognizer and recognize requests using v2, like the following, and those worked fine:
// Creates a Recognizer: WORKS
public async createRecognizer() {
  const recognizerRequest = {
    parent: `projects/${this.projectId}/locations/global`,
    recognizerId: "rclatest",
    recognizer: {
      languageCodes: ["en-US"],
      model: "telephony",
    },
  };
  const operation = await this.client.createRecognizer(recognizerRequest);
  const recognizer = operation[0].result;
  const recognizerName = recognizer; //.name;
  console.log(`Created new recognizer: ${recognizerName}`);
}
// Transcribes Audio: WORKS
public async transcribeFile() {
  const recognizerName = `projects/${this.projectId}/locations/global/recognizers/${this.recognizerId}`;
  const content = fs.readFileSync(this.audioFilePath).toString("base64");
  const transcriptionRequest = {
    recognizer: recognizerName,
    config: {
      // Automatically detects audio encoding
      autoDecodingConfig: {},
    },
    content: content,
  };
  const response = await this.client.recognize(transcriptionRequest);
  for (const result of response[0].results) {
    console.log(`Transcript: ${result.alternatives[0].transcript}`);
  }
}
Upvotes: 6
Views: 4696
Reputation: 21
I finally got this to work. I followed the types provided by the TypeScript definitions to understand the nested config structure that is needed here. If you use plain JS, just leave out the type annotations.
Three things to note first (detailed after the code): the config request is written once and only once, audio chunks are then sent as separate { audio: ... } writes, and the recognizer path and region matter.
Here is my code that works:
// this is where I got the types from, which I used to figure out this nested config structure
import { google } from '@google-cloud/speech/build/protos/protos';
import { v2 as speech } from '@google-cloud/speech'; // the v2 client is the one with _streamingRecognize()

// Must have the GOOGLE_APPLICATION_CREDENTIALS environment variable set
const speechClient = new speech.SpeechClient();

const recognitionConfig: google.cloud.speech.v2.IRecognitionConfig = {
  // autoDecodingConfig and explicitDecodingConfig are alternatives (a proto oneof);
  // normally you set only one of them
  autoDecodingConfig: {},
  explicitDecodingConfig: {
    encoding: event.encoding, // event.* comes from my surrounding code
    sampleRateHertz: event.sampleRateHertz,
    audioChannelCount: 1,
  },
  languageCodes: [event.languageCode],
  model: 'long', // 'video' does not exist in v2
};

const streamingRecognitionConfig: google.cloud.speech.v2.IStreamingRecognitionConfig = {
  config: recognitionConfig,
  streamingFeatures: {
    interimResults: true,
  },
};

const streamingRecognizeRequest: google.cloud.speech.v2.IStreamingRecognizeRequest = {
  recognizer: `projects/${GOOGLE_PROJECT_ID}/locations/global/recognizers/_`,
  streamingConfig: streamingRecognitionConfig,
};

const recognizeStream = speechClient
  ._streamingRecognize()
  .on('error', (err) => {
    console.error(err);
  })
  .on('data', async (data) => {
    // your code to react to answers from the API
  });
recognizeStream.write(streamingRecognizeRequest); // Do this once and only once
When sending audio chunks, you must send
recognizeStream.write({ audio: data }); // where data is your audio chunk
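For example, here is a rough sketch of feeding chunks from a Node.js read stream (the file name and chunk size are placeholders; any Buffer source works the same way):

import * as fs from 'fs';

// Stream raw audio to the API in small chunks
const audioSource = fs.createReadStream('audio.raw', { highWaterMark: 4096 });
audioSource.on('data', (chunk) => {
  recognizeStream.write({ audio: chunk });
});
audioSource.on('end', () => recognizeStream.end());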
Note the GOOGLE_PROJECT_ID, where you put the ID of your project. You can find it in the Google Cloud Console.
Now about the recognizer URL: if I use another region, the call fails. I suspect you would have to create a recognizer in that region first. I think you have to do this in code, as I found no way to create one in the Google Cloud Console. See more in this issue, as per Vladislav's answer.
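An untested sketch of what creating a regional recognizer might look like, combining the createRecognizer call from the question with a regional apiEndpoint (the region and recognizerId here are placeholders):

import { v2 as speech } from '@google-cloud/speech';

// The regional endpoint must match the location in the parent path
const regionalClient = new speech.SpeechClient({
  apiEndpoint: 'europe-west4-speech.googleapis.com',
});
const [operation] = await regionalClient.createRecognizer({
  parent: `projects/${GOOGLE_PROJECT_ID}/locations/europe-west4`,
  recognizerId: 'my-regional-recognizer', // placeholder
  recognizer: { languageCodes: ['en-US'], model: 'long' },
});
const [recognizer] = await operation.promise();
console.log(`Created: ${recognizer.name}`);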
I am on version "@google-cloud/speech": "6.1.1",
Upvotes: 2
Reputation: 3908
We have the following working solution for dynamic batch speech recognition. Note the importance of setting the proper endpoint in the client config, and of matching the recognizer location and model to it.
The Speech-to-Text V2 API has an option to use dynamic batch. Dynamic batch processes audio at a lower level of urgency. If you enable dynamic batch, you will be billed at a discounted rate.
const speech = require('@google-cloud/speech').v2;

const GOOGLE_PROJECT_ID = "your-project-id";
const gcsUri = "gs://speech-samples-00/commercial_mono.wav"; // must be in Google Cloud Storage

const configSpeechGoogle = {
  projectId: GOOGLE_PROJECT_ID,
  keyFilename: 'google-credentials.json',
  apiEndpoint: "europe-west4-speech.googleapis.com" // needed for the chirp model
};

const speechClient = new speech.SpeechClient(configSpeechGoogle);
const recognizer = `projects/${GOOGLE_PROJECT_ID}/locations/europe-west4/recognizers/_`; // note the Europe location, which must match the endpoint above
Submit Job:
const batchConfig = {
  languageCodes: ["cs-CZ"],
  model: "chirp", // available in europe-west4, us-central1, asia-southeast1
  // autoDecodingConfig and explicitDecodingConfig are alternatives (a proto oneof)
  autoDecodingConfig: {},
  explicitDecodingConfig: {
    encoding: "LINEAR16",
    sampleRateHertz: 8000,
    audioChannelCount: 1
  }
};

const configRequest = {
  recognizer: recognizer,
  config: batchConfig,
  files: [{
    uri: gcsUri
  }],
  recognitionOutputConfig: {
    gcsOutputConfig: {
      uri: "gs://my-results-bucket/outputs"
    }
  },
  processingStrategy: 'DYNAMIC_BATCHING'
};
Get Results:
let operation = await speechClient.batchRecognize(configRequest);
let data = await operation[0].promise();
console.log('Transcribe response', data[0].results);
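The response maps each input URI to a per-file result; with gcsOutputConfig the transcript JSON itself is written to the output bucket. A rough way to inspect what came back (a sketch; field names may differ slightly between client versions):

for (const [inputUri, fileResult] of Object.entries(data[0].results)) {
  // fileResult describes where the transcript JSON was written
  console.log(inputUri, JSON.stringify(fileResult));
}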
Upvotes: 1
Reputation: 415
I put my working code here: nodejs-docs-samples/issues/3578
In short:
- Create a recognizer with the client.createRecognizer function (the code is in the issue above).
- If you did recognizingClient.write(audioData) before, now you should first do (but only once!) recognizingClient.write(newConfigWithRecognizer), and then recognizingClient.write({ audio: audioData }) for each chunk.
The config goes into the streamingConfig field of the request; here is the relevant part of the type definitions:

public streamingConfig?: (google.cloud.speech.v2.IStreamingRecognitionConfig|null);

/** Properties of a StreamingRecognitionConfig. */
interface IStreamingRecognitionConfig {

    /** StreamingRecognitionConfig config */
    config?: (google.cloud.speech.v2.IRecognitionConfig|null);

    /** StreamingRecognitionConfig configMask */
    configMask?: (google.protobuf.IFieldMask|null);

    /** StreamingRecognitionConfig streamingFeatures */
    streamingFeatures?: (google.cloud.speech.v2.IStreamingRecognitionFeatures|null);
}
When instantiating the streaming client, use _streamingRecognize() (the underscore-prefixed name will probably change in a future release).
Upvotes: 0
Reputation: 26
The following code works in my case:
const recognizer = `projects/${projectId}/locations/global/recognizers/_`;
const google_model = "latest_long";

const streamingConfig = {
  config: {
    languageCodes: ["en-US"],
    model: google_model,
    autoDecodingConfig: {}
  },
};

const configRequest = {
  recognizer: recognizer,
  streamingConfig: streamingConfig,
};

const recognizeStream = client
  ._streamingRecognize()
  .on('error', (err) => {
    console.error(err);
  })
  .on('data', (data) => {
    console.log(data);
  });

recognizeStream.write(configRequest);
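After that one-time config request, each audio chunk would go in as its own write, as the other answers describe (audioChunk here is a placeholder Buffer):

recognizeStream.write({ audio: audioChunk }); // repeat for every chunk
recognizeStream.end(); // once the audio source is done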
UPD: at the request of @Cybersupernova, I have added a screenshot with the code and the run results: Screenshot
Upvotes: 0