Reputation: 35
EDIT: I'm not looking for Facebook APIs! I'm simply using Facebook as an example. I intend to get my browser to perform actions on different websites that likely have no APIs.
Let's say I wish to create a program that will log into Facebook, look up my friends list, visit each friend's profile, extract the date and text of each post, and write this to a file.
I have an idea how the algorithm should work. But I have absolutely no clue how to interface my code with the browser itself.
I'm a Java programmer, so I imagine the pseudocode in Java would create a Browser object and then convert the current page's contents to HTML so that the data can be parsed. I've provided example code below of what I think it ought to look like.
However, is this the right way to do it? If it is, where can I find a web browser object? Are there any parsers I can use to 'read' the content? How do I get it to execute JavaScript, such as clicking on a 'Like' button?
Or are there other ways to do it? Is there a GUI-based tool that I can simply command to go to an X/Y pixel position and click on something? Or is there a way to write the code directly inside Firefox and run it from there?
I really have no clue how to go about doing this. Any help would be greatly appreciated! Thanks!
    // Pseudocode: Browser, HtmlPage, TextField and XMLElement are imagined classes, not a real library
    Browser browser = new Browser();
    browser.goToUrl("http://facebook.com");

    // Retrieve page in HTML format to parse
    HtmlPage facebookCom = browser.toHtml();

    // Set username & password
    TextField username = facebookCom.getTextField("username");
    TextField password = facebookCom.getTextField("password");
    username.setText("user123");
    password.setText("password123");
    facebookCom.updateTextField("username", username);
    facebookCom.updateTextField("password", password);

    // Update HTML contents
    browser.setHtml(facebookCom);

    // Click the login button and wait for it to load
    browser.getButton("login").click();
    while (browser.isNotLoaded()) {
        continue;
    }

    // Click the friends button and wait for it to load
    browser.getButton("friends").click();
    while (browser.isNotLoaded()) {
        continue;
    }

    // Convert the current page (Friends List) into HTML code to parse
    HtmlPage facebookFriends = browser.toHtml();

    // Retrieve the data for each friend
    ArrayList<XMLElement> friendList = facebookFriends.getXmlElementToArray("friend");
    for (XMLElement friend : friendList) {
        String id = friend.getId();

        // Visit the friend's page
        browser.goToUrl("http://facebook.com/" + id);
        while (browser.isNotLoaded()) {
            continue;
        }

        // Retrieve the data for each post
        HtmlPage friendProfile = browser.toHtml();
        ArrayList<XMLElement> friendPosts = friendProfile.getXmlElementToArray("post");

        // Write the date+text of every post to a text file.
        // BufferedWriter wraps a Writer (here a FileWriter), and try-with-resources closes it.
        try (BufferedWriter writer = new BufferedWriter(new FileWriter("C:/Desktop/facebook/" + id))) {
            for (XMLElement post : friendPosts) {
                String date = post.get("date");
                String text = post.get("text");
                String content = date + "\n" + text;
                writer.append(content);
                writer.newLine(); // keep consecutive posts on separate lines
            }
        }
    }
Upvotes: 2
Views: 109
Reputation: 261
I think you are approaching this the wrong way. You wouldn't really want to write a program that scrapes the screen via the browser. It looks like you could take advantage of Facebook's REST API and query for the data you are looking for. Here is a link to the documentation for getting a user's posts via the REST API:
https://developers.facebook.com/docs/graph-api/reference/v2.6/user/feed
You could get the users' IDs from this endpoint:
https://developers.facebook.com/docs/graph-api/reference/friend-list/
Then plug the user IDs into the first REST endpoint linked above. Once you have the data coming back correctly via the REST API, it's fairly trivial to write that data out to a file.
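As a rough sketch of that flow in Java (11 or later), something like the following could work. It assumes the feed endpoint lives at https://graph.facebook.com/v2.6/{user-id}/feed and takes an access_token query parameter, which you should check against the docs linked above. The class name GraphFeedSketch, the helper names, the user IDs and the token are all made up for illustration, and the code saves the raw response body rather than parsing it.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class GraphFeedSketch {
    // Assumed base URL and version, based on the v2.6 documentation link above.
    static final String BASE = "https://graph.facebook.com/v2.6/";

    // Builds the feed URL for one user ID. The token is a placeholder you must obtain yourself.
    public static String feedUrl(String userId, String accessToken) {
        return BASE + URLEncoder.encode(userId, StandardCharsets.UTF_8)
                + "/feed?access_token=" + URLEncoder.encode(accessToken, StandardCharsets.UTF_8);
    }

    // Sends a GET request and returns the raw response body as a string.
    public static String fetch(HttpClient client, String url) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Writes one response body to <dir>/<userId>.json and returns the file path.
    public static Path save(Path dir, String userId, String body) throws IOException {
        Files.createDirectories(dir);
        return Files.writeString(dir.resolve(userId + ".json"), body);
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String token = "YOUR_ACCESS_TOKEN";            // placeholder, not a real token
        List<String> userIds = List.of("1001", "1002"); // hypothetical IDs from the friend-list endpoint
        for (String id : userIds) {
            String body = fetch(client, feedUrl(id, token));
            System.out.println("Saved " + save(Path.of("facebook"), id, body));
        }
    }
}
```

With a placeholder token the API will most likely answer with an error payload, so treat main as the shape of the loop rather than something that returns real posts.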
Upvotes: 1