Reputation: 2412
I am scraping the web page and navigating to correct location, however as being a new to the whole c# world I am stuck with downloading pdf file.
Link is hiding behind this
var reportDownloadButton = driver.FindElementById("company_report_link");
It is something like: www.link.com/key/489498-654gjgh6-6g5h4jh/link.pdf
How to download the file to C:\temp\?
Here is my code:
using System.Linq;
using OpenQA.Selenium.Chrome;
namespace WebDriverTest
{
class Program
{
static void Main(string[] args)
{
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments("headless");
// Initialize the Chrome Driver // chromeOptions
using (var driver = new ChromeDriver(chromeOptions))
{
// Go to the home page
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
// Get the page elements
var userNameField = driver.FindElementById("loginForm:username");
var userPasswordField = driver.FindElementById("loginForm:password");
var loginButton = driver.FindElementById("loginForm:loginButton");
// Type user name and password
userNameField.SendKeys("username");
userPasswordField.SendKeys("password");
// and click the login button
loginButton.Click();
driver.Navigate().GoToUrl("www.link2.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
var reportSearchField = driver.FindElementByClassName("form-control");
reportSearchField.SendKeys("Company");
var reportSearchButton = driver.FindElementById("search_filter_button");
reportSearchButton.Click();
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
EDIT:
EDIT 2:
I am not the sharpest pen on Stackoverflow community yet. I don't understand how to do it with Selenium. I have done it with
var reportDownloadButton = driver.FindElementById("company_report_link");
var text = reportDownloadButton.GetAttribute("href");
// driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
WebClient client = new WebClient();
// Save the file to desktop for debugging
var desktop = System.Environment.GetFolderPath(System.Environment.SpecialFolder.Desktop);
string fileName = desktop + "\\myfile.pdf";
client.DownloadFile(text, fileName);
However web page seems to be a little bit tricky. I am getting
System.Net.WebException: 'The remote server returned an error: (401) Unauthorized.'
Debugger pointing at:
client.DownloadFile(text, fileName);
I think it should really simulate Right click and Save Link As, otherwise this download will not work. Also if I just click on button, it opens PDF in new Chrome tab.
EDIT3:
Should it be like this?
using System.Linq;
using OpenQA.Selenium.Chrome;
namespace WebDriverTest
{
class Program
{
static void Main(string[] args)
{
// declare chrome options with prefs
var options = new ChromeOptionsWithPrefs();
options.AddArguments("headless"); // we add headless here
// declare prefs
options.prefs = new Dictionary<string, object>
{
{ "download.default_directory", downloadFilePath }
};
// declare driver with these options
//driver = new ChromeDriver(options); we don't need this because we already declare driver below.
// Initialize the Chrome Driver // chromeOptions
using (var driver = new ChromeDriver(options))
{
// Go to the home page
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
// Get the page elements
var userNameField = driver.FindElementById("loginForm:username");
var userPasswordField = driver.FindElementById("loginForm:password");
var loginButton = driver.FindElementById("loginForm:loginButton");
// Type user name and password
userNameField.SendKeys("username");
userPasswordField.SendKeys("password");
// and click the login button
loginButton.Click();
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
var reportSearchField = driver.FindElementByClassName("form-control");
reportSearchField.SendKeys("company");
var reportSearchButton = driver.FindElementById("search_filter_button");
reportSearchButton.Click();
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
driver.Navigate().GoToUrl("www.link.com");
// click the link to download
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
// if clicking does not work, get href attribute and call GoToUrl() -- this may trigger download
var href = reportDownloadButton.GetAttribute("href");
driver.Navigate().GoToUrl(href);
}
}
}
}
}
Upvotes: 2
Views: 4087
Reputation: 5909
You could try setting the download.default_directory
Chrome driver preference:
// declare chrome options with prefs
var options = new ChromeOptionsWithPrefs();
// declare prefs
options.prefs = new Dictionary<string, object>
{
{ "download.default_directory", downloadFilePath }
};
// declare driver with these options
driver = new ChromeDriver(options);
// ... run your code here ...
// click the link to download
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
// if clicking does not work, get href attribute and call GoToUrl() -- this may trigger download
var href = reportDownloadButton.GetAttribute("href");
driver.Navigate().GoToUrl(href);
If reportDownloadButton
is a link that triggers a download, then the file should download to the filePath
you have set in download.default_directory
.
Neither of these threads are in C#, but they speak of a similar issue:
How to control the download of files with Selenium + Python bindings in Chrome
How to use chrome webdriver in selenium to download files in python?
Upvotes: 2