Tutorial: Selenium for scraping

Riddle · 08-08-2016, 02:19 AM

I'm not that good at Java; I've just used it enough to know what I'm doing when it comes to Selenium.

Code:
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class Main {

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver(); // First we need to create the driver, like so.

        // We will be scraping https://sinister.ly/memberlist
        int members = 503; // There are 500 users per page, in table rows 3 to 502, so the loop stops before 503.
        int pageNumber = 1; // First page.

        // This is how we go to a site. I saw "perpage=20" in the URL, so I changed it to 500 so we don't have to load as many pages (also faster).
        driver.get("https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page=1");

        // Normal for loop. Note that it never ends by itself: it runs until findElement throws on a row that doesn't exist (e.g. after the last page).
        for (int x = 3; x < members; x++) {

            /* Now we need to find a class, id, XPath, etc.
               We will use XPath since there are multiple elements with the "trow1" class.
               I only know how to get the XPath in Chrome (since I only use Chrome): right click the text or whatever you want to scrape and choose Inspect, right click the HTML code it highlights, then Copy -> Copy XPath.

               If we do that for the first user on the page, the XPath will be
               //*[@id="content"]/table/tbody/tr[3]/td[2]/a/span

               This is for that person and that person only, so let's do the same for the second member.
               //*[@id="content"]/table/tbody/tr[4]/td[2]/a/span

               There, we can see a change: the row index incremented by one.
               Now I'm going to scroll all the way to the bottom so I can see when to tell my loop to stop.
               //*[@id="content"]/table/tbody/tr[502]/td[2]/a/span

               That is the last member on the page, so let's get to work writing this.
             */

            // This is the XPath; we put +x+ in it so we can get each user's name. getText() returns the text of the WebElement.
            System.out.println(driver.findElement(By.xpath("//*[@id=\"content\"]/table/tbody/tr[" + x + "]/td[2]/a/span")).getText());

            if (x == 502) { // Last user on this page, so we go to the next page.
                x = 2; // Reset to 2, because the loop's x++ then brings us back to row 3 (the first user).
                pageNumber++; // Increment pageNumber by 1 so we go to the next page.
                driver.get("https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page=" + pageNumber); // Load the next page to scrape from.
            }
        }
    }
}

Without comments:

Code:
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class Main {

    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        int members = 503;
        int pageNumber = 1;

        driver.get("https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page=1");

        for (int x = 3; x < members; x++) {
            System.out.println(driver.findElement(By.xpath("//*[@id=\"content\"]/table/tbody/tr[" + x + "]/td[2]/a/span")).getText());
            System.out.println(x);

            if (x == 502) {
                x = 2; // x++ brings this back to row 3 on the next page
                pageNumber++;
                driver.get("https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page=" + pageNumber);
            }
        }
    }
}
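
The row and page arithmetic above is easy to get wrong, so here is a minimal sketch that builds the page URL and row XPath in small helper methods and walks pages and rows with two nested loops instead of resetting x. It contains no Selenium calls, so it runs on its own; the class name MemberListPaths and its method names are made up for illustration, and the page and row ranges in main are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative and not from the original post.
class MemberListPaths {

    // Base URL copied from the tutorial, minus the page number.
    static final String BASE_URL =
            "https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page=";

    // URL of the given 1-based page.
    static String pageUrl(int pageNumber) {
        return BASE_URL + pageNumber;
    }

    // XPath of the username in the given table row (row 3 is the first user).
    static String rowXPath(int row) {
        return "//*[@id=\"content\"]/table/tbody/tr[" + row + "]/td[2]/a/span";
    }

    // Lists every page URL and row XPath pair in visiting order for the given ranges.
    static List<String> plan(int firstPage, int lastPage, int firstRow, int lastRow) {
        List<String> steps = new ArrayList<>();
        for (int page = firstPage; page <= lastPage; page++) {
            for (int row = firstRow; row <= lastRow; row++) {
                steps.add(pageUrl(page) + " -> " + rowXPath(row));
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // Two pages, rows 3 to 5 only, to keep the output short.
        for (String step : plan(1, 2, 3, 5)) {
            System.out.println(step);
        }
    }
}
```

In the scraper itself, the outer loop would call driver.get(pageUrl(page)) once per page and the inner loop would pass rowXPath(row) to By.xpath, so no row is skipped when the page changes.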

BORW3 · 08-26-2016, 09:03 AM

I need to know: can I use Selenium on any PC after creating a Java application, or do I need to install Selenium on each PC I intend to use the application on?

Inori · 08-26-2016, 04:54 PM

I can't get a great look at the code since I'm on mobile, but you should methodize the program (i.e., use methods to split it into chunks) and let it take command line arguments (the String[] args in main) to scrape the user's desired site. Also, with sites using Cloudflare, you need to give it a few seconds to pass the DDoS protection page.

(08-26-2016, 09:03 AM)BORW3 Wrote: I need to know: can I use Selenium on any PC after creating a Java application, or do I need to install Selenium on each PC I intend to use the application on?

You don't install Selenium itself on each PC. It's just a library (a set of JAR files), so as long as you ship those JARs with your compiled ("created", in your terms) program, or bundle everything into one JAR, it will run on any computer that has Java installed, not just PC. Import statements don't do that bundling, though: they only tell the compiler which classes the names in your code refer to. Each machine also needs the browser you're driving (Chrome here) and a matching driver executable (chromedriver for ChromeDriver).
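
As a rough sketch of the "methodize and take command line arguments" suggestion, the class below reads the member list URL and the per-page count from args and derives the last table row from it. Everything here is an assumption made for illustration: the class name ScraperArgs, the argument order, and the default of 20 (the perpage value the tutorial mentions seeing in the URL). The caller is assumed to pass a URL whose perpage value matches the second argument.

```java
// Hypothetical settings holder; class, field and method names are illustrative assumptions.
class ScraperArgs {
    final String baseUrl;
    final int perPage;

    ScraperArgs(String baseUrl, int perPage) {
        this.baseUrl = baseUrl;
        this.perPage = perPage;
    }

    // args[0] = member list URL without the page number, args[1] = users per page (optional).
    static ScraperArgs parse(String[] args) {
        if (args.length < 1) {
            throw new IllegalArgumentException("usage: java ScraperArgs <baseUrl> [perPage]");
        }
        int perPage = args.length > 1 ? Integer.parseInt(args[1]) : 20;
        if (perPage < 1) {
            throw new IllegalArgumentException("perPage must be positive");
        }
        return new ScraperArgs(args[0], perPage);
    }

    // In the tutorial's table the first user is in row 3, so the last one is in row perPage + 2.
    int lastRow() {
        return perPage + 2;
    }

    public static void main(String[] args) {
        ScraperArgs settings = parse(args);
        System.out.println("Scraping " + settings.baseUrl + " rows 3 to " + settings.lastRow());
        // The Selenium calls from the tutorial (driver.get, driver.findElement) would go here.
    }
}
```

With perPage set to 500 this gives a last row of 502, matching the hard-coded check in the tutorial.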

BORW3 · 08-27-2016, 12:05 PM

Does Selenium support JavaScript on submit buttons? I tried using HtmlUnit and it didn't submit my requests when clicking buttons. I read that if the website has JavaScript errors, the page won't load and work properly. Does Selenium have the same problem?

Inori · 08-27-2016, 02:33 PM

(08-27-2016, 12:05 PM)BORW3 Wrote:
Does Selenium support JavaScript on submit buttons? I tried using HtmlUnit and it didn't submit my requests when clicking buttons. I read that if the website has JavaScript errors, the page won't load and work properly. Does Selenium have the same problem?

I mainly use the Python bindings for Selenium, so I'm not sure if it's a Java-specific issue, but any JavaScript events/actions/listeners should work if they're put together correctly.

BORW3 · 08-29-2016, 01:54 PM

(08-27-2016, 02:33 PM)Ao- Wrote:
I mainly use the Python bindings for Selenium, so I'm not sure if it's a Java-specific issue, but any JavaScript events/actions/listeners should work if they're put together correctly.

Okay, I have tried it; it works better with JavaScript than HtmlUnit ever did. I also tried creating a YouTube bot to rack up views with Selenium. I used HtmlUnitDriver with Selenium and a get() to a video page, but it didn't add to the view count, while opening the link normally in Firefox worked fine. So... wondering if you can assist me here.

Ex094 · 08-29-2016, 03:34 PM

(08-27-2016, 12:05 PM)BORW3 Wrote:
Does Selenium support JavaScript on submit buttons? I tried using HtmlUnit and it didn't submit my requests when clicking buttons. I read that if the website has JavaScript errors, the page won't load and work properly. Does Selenium have the same problem?

Are you trying to click a submit button via JS?

Inori · 08-29-2016, 04:22 PM

(08-29-2016, 01:54 PM)BORW3 Wrote:
Okay, I have tried it; it works better with JavaScript than HtmlUnit ever did. I also tried creating a YouTube bot to rack up views with Selenium. I used HtmlUnitDriver with Selenium and a get() to a video page, but it didn't add to the view count, while opening the link normally in Firefox worked fine. So... wondering if you can assist me here.

Numberphile did a good video on this topic. YouTube tries their best to veto "counterfeit" views, so it's harder than just visiting the page.

Wildfire · 08-30-2016, 04:03 AM

You are making a pretty big mistake here. NEVER assume an element exists in Selenium: if the element isn't on the page, findElement will throw an exception and your whole program dies with it.
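
One way to act on this, sketched under some assumptions: treat each row lookup as something that can come back empty, and stop cleanly at the first missing row. The block below uses a stand-in lookup so that it runs without a browser; RowReader and readRows are hypothetical names, not Selenium API. With Selenium, the lookup could call driver.findElements(By.xpath(...)), which returns an empty list instead of throwing when nothing matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.IntFunction;

// Hypothetical sketch; RowReader and readRows are illustrative names, not Selenium API.
class RowReader {

    // Reads names from firstRow onward and stops at the first missing row instead of throwing.
    // `lookup` returns Optional.empty() when the row's element is not on the page.
    static List<String> readRows(IntFunction<Optional<String>> lookup, int firstRow, int lastRow) {
        List<String> names = new ArrayList<>();
        for (int row = firstRow; row <= lastRow; row++) {
            Optional<String> name = lookup.apply(row);
            if (!name.isPresent()) {
                break; // Row missing: a short last page or a changed layout.
            }
            names.add(name.get());
        }
        return names;
    }

    public static void main(String[] args) {
        // Stand-in for a page that only has rows 3 and 4.
        // A Selenium-backed lookup would map an empty findElements result to Optional.empty().
        IntFunction<Optional<String>> fakePage = row ->
                row == 3 ? Optional.of("alice")
              : row == 4 ? Optional.of("bob")
              : Optional.<String>empty();
        System.out.println(readRows(fakePage, 3, 502)); // prints [alice, bob]
    }
}
```

Breaking out of the loop on the first empty result also gives the scraper a natural stopping point on the last page, which the original loop lacks.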

BORW3 · 08-31-2016, 09:12 AM

(08-29-2016, 04:22 PM)Ao- Wrote:
Numberphile did a good video on this topic. YouTube tries their best to veto "counterfeit" views, so it's harder than just visiting the page.

LOL. Man. 301 is end of bot counters.