Reputation: 720
I am trying to customise a Puppeteer script that plays a song on SoundCloud and records it. Using a CSS selector, I would like to print the song duration as well. I can't seem to get the CSS selector to work. The URL I am working with is https://soundcloud.com/octasine/octasine-audio-example-1
I have a working CSS selector now and can grab the minutes and seconds from the page. The challenge I am seeing is that sometimes the page hasn't finished rendering and I get an empty array in return. Using await page.waitForNavigation(); causes the promise to just fail.
What am I missing to get puppeteer to work more reliably?
This is how I am using the CSS selector:
const work = async () => {
  const inputsValues = [];
  const inputElements = await page.$$('span.sc-visuallyhidden');
  for (const element of inputElements) {
    let inputValue;
    inputValue = await element.getProperty('innerText');
    inputValue = await inputValue.jsonValue();
    if (inputValue.includes('Duration')) {
      console.log("DURATION");
      mins = inputValue.split(" ")[1];
      secs = inputValue.split(" ")[3];
      console.log(mins);
      console.log(secs);
      console.log(inputValue);
    }
    inputsValues.push(inputValue);
  }
  console.log(inputsValues)
}
await work();
My complete script, example.js:
// example.js -- node version v14.17.2 -- dependency installed with npm i puppeteer-stream
const { launch, getStream } = require("puppeteer-stream");
const fs = require("fs");
const { Console } = require("console");
const file = fs.createWriteStream(__dirname + "/test.webm");
async function test() {
  const browser = await launch();
  const page = await browser.newPage();
  await page.goto("https://soundcloud.com/octasine/octasine-audio-example-1");
  // await page.waitForNavigation();
  let html_var = await page.content();
  // Write the file
  fs.writeFile("example.html", html_var, function (err) {
    // Checks if there is an error
    if (err) return console.log(err);
  });
  console.log("Wrote html to example.html");
  // await page.click("//a[contains(text(), 'Play')]");
  await page.evaluate(() => {
    let elements = document.getElementsByClassName('snippetUXPlayButton');
    for (let element of elements)
      element.click();
  });
  const work = async () => {
    const inputsValues = [];
    const inputElements = await page.$$('span.sc-visuallyhidden');
    for (const element of inputElements) {
      let inputValue;
      inputValue = await element.getProperty('innerText');
      inputValue = await inputValue.jsonValue();
      if (inputValue.includes('Duration')) {
        console.log("DURATION");
        mins = inputValue.split(" ")[1];
        secs = inputValue.split(" ")[3];
        console.log(mins);
        console.log(secs);
        console.log(inputValue);
      }
      inputsValues.push(inputValue);
    }
    console.log(inputsValues)
  }
  await work();
  let page_url = await page.url();
  console.log(page_url)
  await page.evaluate(() => {
    let elements = document.getElementsByClassName('sc-visuallyhidden');
    for (let element of elements)
      console.log(element.innerHTML);
  });
  const stream = await getStream(page, { audio: true, video: true });
  console.log("recording");
  stream.pipe(file);
  setTimeout(async () => {
    await stream.destroy();
    file.close();
    console.log("finished");
    browser.close();
  }, 1000 * 5 + mins * 60000 + secs * 1000);
}
test();
Script based on example script from https://www.npmjs.com/package/puppeteer-stream
Upvotes: 2
Views: 1408
Reputation: 8871
The elements matching the span.sc-visuallyhidden selector are added to the DOM dynamically, one by one, so the length of $$('span.sc-visuallyhidden') grows as the page loads. At the moment you populate your inputElements array, it may not contain the Duration element yet.
To make sure it is available in your set of elements, you need to wait until it has been rendered into the DOM, e.g. by waiting for its exact selector:
await page.waitForSelector('.playbackTimeline__duration > span.sc-visuallyhidden')
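Note that waitForSelector rejects with a timeout error if the element never shows up (for example, if the page markup changes). Here is a minimal sketch of wrapping the wait so the script can react to that case. The helper name waitForDurationText and the 15-second timeout are illustrative assumptions, not part of the original script:

```javascript
// Hypothetical helper: waits for the duration element, then returns its text.
// `page` is expected to be a Puppeteer Page; the selector is the one used above.
const DURATION_SELECTOR = '.playbackTimeline__duration > span.sc-visuallyhidden';

async function waitForDurationText(page, timeoutMs = 15000) {
  try {
    // Resolves once the element exists in the DOM, rejects after timeoutMs.
    await page.waitForSelector(DURATION_SELECTOR, { timeout: timeoutMs });
  } catch (err) {
    // Returning null lets the caller decide what to do when the duration is missing.
    return null;
  }
  // Read the element's text inside the page context.
  return page.$eval(DURATION_SELECTOR, el => el.innerText);
}
```

Inside test() this could be called as `const durationText = await waitForDurationText(page);` right after page.goto(), and the recording skipped or given a default length when the result is null.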
I also suggest refactoring your work() function to use page.$$eval, like this:
const inputsValues = await page.$$eval('span.sc-visuallyhidden', elems => elems.map(el => el.innerText))
Output is:
8 months ago, 2,452 plays, View all likes, View all reposts, 10 followers, 2 tracks, 414 plays, View all likes, View all comments, Current time: 0 seconds, Duration: 2 minutes 26 seconds, Current track: Octasine Audio Example 1
...which contains "Duration: 2 minutes 26 seconds". You can parse that into mins and secs as before.
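As a rough sketch of that last step, the duration text can be turned into a millisecond value for the recording timeout without relying on word positions. The function name durationToMs is hypothetical. It assumes the English unit words shown in the output above, and support for an "hour" unit is a guess about how longer tracks are labelled:

```javascript
// Hypothetical helper: converts text such as "Duration: 2 minutes 26 seconds"
// into milliseconds. Assumes English unit words; "hour" support is a guess.
function durationToMs(text) {
  const unitMs = { hour: 3600000, minute: 60000, second: 1000 };
  let total = 0;
  let matched = false;
  // Matches number/unit pairs such as "2 minutes" or "1 second".
  const pattern = /(\d+)\s*(hour|minute|second)s?/g;
  for (const [, amount, unit] of text.matchAll(pattern)) {
    total += Number(amount) * unitMs[unit];
    matched = true;
  }
  if (!matched) {
    throw new Error(`Could not parse a duration from: ${text}`);
  }
  return total;
}

// Sample values taken from the output above.
const sampleValues = ['Current time: 0 seconds', 'Duration: 2 minutes 26 seconds'];
const durationText = sampleValues.find(text => text.startsWith('Duration'));
// 5 seconds of padding, as in the question's setTimeout call.
const recordMs = 5000 + durationToMs(durationText);
console.log(recordMs); // prints 151000
```

Throwing when nothing matches makes a missing or changed label fail loudly instead of scheduling the timeout with NaN.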
Upvotes: 3