Tutorial: Selenium tut for scraping
Selenium tut for scraping #1
I'm not that good at Java; I've just used it enough to know what I'm doing when it comes to Selenium.

Code:
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

import java.io.IOException;

public class Main {

   public static void main(String[] args) throws IOException, InterruptedException {


       WebDriver driver = new ChromeDriver(); // First we need to create the driver, like so.
       // We will be scraping from https://sinister.ly/memberlist
       int members = 503; // Loop bound: rows 3 through 502 hold the 500 users on each page.

       int pageNumber = 1; // First page

       // Open the first page. The URL originally had "perpage=20"; I changed it to 500 so we don't have to visit as many pages (also faster).
       driver.get("https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page=1");
       for (int x = 3; x < members; x++) { // Normal for loop over the table rows.
            /* Now we need to find the element by class, id, XPath, etc.
           We will use XPath since there are multiple elements with the "trow1" class.

           I only know how to get the XPath in Chrome (since I only use Chrome): right-click the text you want to scrape, choose Inspect, right-click the HTML it highlights, then Copy -> Copy XPath.
           If we do that for the first user on the page, the XPath is
           //*[@id="content"]/table/tbody/tr[3]/td[2]/a/span

           That only matches that one person, so let's do the same for the second member.
           //*[@id="content"]/table/tbody/tr[4]/td[2]/a/span

           There we can see the change: the tr index went up by one.
           Now I'm going to scroll all the way to the bottom to see where to tell my loop to stop.
           //*[@id="content"]/table/tbody/tr[502]/td[2]/a/span

           That is the last member on the page, so let's get to work writing this.
            */
           // Build the XPath with the current row number x so each pass reads the next user's name; getText() returns the text of the WebElement.
           // Note: findElement throws NoSuchElementException if the row does not exist (for example, past the last member).
           System.out.println(driver.findElement(By.xpath("//*[@id=\"content\"]/table/tbody/tr["+x+"]/td[2]/a/span")).getText());
           if (x == 502){ // Last row on the page, so move on to the next page
               x = 2; // The loop's x++ then brings us back to 3, the first user (setting 3 here would skip that user)
               pageNumber++; // Increment pageNumber by 1 so we go to the next page
               driver.get("https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page=" + pageNumber); // Load the next page to scrape from
           }
       }

   }

}
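
The XPath pattern worked out in the comments (only the tr index changes from user to user) and the page URL can be pulled into small helpers. This is just a sketch: the class and method names are made up for illustration, and the row range 3 to 502 assumes the page layout described above.

```java
// Sketch only: MemberListLocators and its method names are hypothetical.
class MemberListLocators {
    // Rows 3 through 502 held the 500 member names in the layout described above.
    static final int FIRST_ROW = 3;
    static final int LAST_ROW = 502;

    // Builds the XPath for the member name in table row `row`.
    static String memberXPath(int row) {
        return "//*[@id=\"content\"]/table/tbody/tr[" + row + "]/td[2]/a/span";
    }

    // Builds the member list URL for the given page number.
    static String pageUrl(int pageNumber) {
        return "https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page="
                + pageNumber;
    }

    public static void main(String[] args) {
        System.out.println(memberXPath(FIRST_ROW));
        System.out.println(memberXPath(LAST_ROW));
        System.out.println(pageUrl(2));
    }
}
```

In the scraper these strings would be passed to By.xpath(...) and driver.get(...), which keeps the string building in one place.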


Without comments:


Code:
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

import java.io.IOException;

public class Main {

   public static void main(String[] args) throws IOException, InterruptedException {

       WebDriver driver = new ChromeDriver();

       int members = 503;
       int pageNumber = 1;

       driver.get("https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page=1");
       for (int x = 3; x < members; x++) {
           System.out.println(driver.findElement(By.xpath("//*[@id=\"content\"]/table/tbody/tr[" + x + "]/td[2]/a/span")).getText());
           System.out.println(x);
           if (x == 502) {
               x = 2;
               pageNumber++;
               driver.get("https://sinister.ly/memberlist.php?sort=lastvisit&order=descending&perpage=500&page=" + pageNumber);
           }
       }

   }

}
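
The loop above resets its own counter and never stops by itself. One alternative, sketched here under assumptions, is an outer loop over pages and an inner loop over rows that ends when a page comes back short. RowReader is a hypothetical stand-in for the Selenium lookup so the paging logic can be run without a browser; it is not part of Selenium.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: PagedScrape and RowReader are hypothetical names.
class PagedScrape {
    interface RowReader {
        // Returns the member name at (page, row), or null when that row does not exist.
        String read(int page, int row);
    }

    static List<String> collect(RowReader reader, int firstRow, int lastRow, int maxPages) {
        List<String> names = new ArrayList<>();
        for (int page = 1; page <= maxPages; page++) {
            int found = 0;
            for (int row = firstRow; row <= lastRow; row++) {
                String name = reader.read(page, row);
                if (name == null) {
                    break; // ran out of rows on this page
                }
                names.add(name);
                found++;
            }
            if (found < lastRow - firstRow + 1) {
                break; // a short (or empty) page means it was the last one
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Fake data: 2 rows per page (rows 3 and 4), 5 members in total.
        String[] members = {"a", "b", "c", "d", "e"};
        RowReader fake = (page, row) -> {
            int index = (page - 1) * 2 + (row - 3);
            return index < members.length ? members[index] : null;
        };
        System.out.println(collect(fake, 3, 4, 10)); // prints [a, b, c, d, e]
    }
}
```

With a real driver, read(page, row) would load the page when the page number changes and look up the row's XPath, returning null when nothing matches.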


RE: Selenium tut for scraping #2
I need to know: can I use Selenium on any PC after creating a Java application, or do I need to install Selenium on each PC I intend to run the application on?


RE: Selenium tut for scraping #3
I can't get a great look at the code since I'm on mobile, but you should split the program into methods (one chunk of work per method) and let it take command-line arguments (the String[] args in main) so it can scrape whatever site the user wants. Also, with sites using Cloudflare, you need to give it a few seconds to pass the DDoS protection page.
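
A minimal sketch of the command-line idea, assuming a target URL as the first argument and an optional pause in seconds as the second. ScrapeOptions and its fields are hypothetical names, and the 5-second default is an assumption to tune per site.

```java
// Sketch only: ScrapeOptions is a hypothetical helper, not part of Selenium.
class ScrapeOptions {
    final String url;
    final int waitSeconds;

    ScrapeOptions(String url, int waitSeconds) {
        this.url = url;
        this.waitSeconds = waitSeconds;
    }

    // Expected arguments: <url> [waitSeconds]
    static ScrapeOptions fromArgs(String[] args) {
        if (args.length < 1) {
            throw new IllegalArgumentException("usage: <url> [waitSeconds]");
        }
        int wait = 5; // assumed default pause for an interstitial page
        if (args.length >= 2) {
            wait = Integer.parseInt(args[1]);
            if (wait < 0) {
                throw new IllegalArgumentException("waitSeconds must not be negative");
            }
        }
        return new ScrapeOptions(args[0], wait);
    }

    public static void main(String[] args) throws InterruptedException {
        ScrapeOptions options;
        try {
            options = fromArgs(args);
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            return;
        }
        System.out.println("Would open " + options.url);
        // In the real scraper, driver.get(options.url) would go here, followed by the pause.
        Thread.sleep(options.waitSeconds * 1000L);
        System.out.println("Waited " + options.waitSeconds + "s, ready to scrape");
    }
}
```

A fixed sleep is the simplest form of the wait; Selenium's WebDriverWait can poll for a specific element instead, which usually returns sooner.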



(08-26-2016, 09:03 AM)BORW3 Wrote: I need to know: can I use Selenium on any PC after creating a Java application, or do I need to install Selenium on each PC I intend to run the application on?

After compiling ("creating", in your terms) the program, you can run it on any computer that has Java installed, not just a PC, since Java is cross-platform. Note that import statements don't bundle anything into the final product: they only let you refer to classes by their short names. The Selenium JARs still have to travel with your program (on the classpath or packed into your JAR), and each machine needs the browser you are driving plus a matching driver such as chromedriver.
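
One thing worth checking on a new machine is whether the browser driver is actually there. The sketch below is a hypothetical preflight helper (DriverCheck is a made-up name). It relies on "webdriver.chrome.driver", the system property Selenium's ChromeDriver reads for an explicit driver path. Newer Selenium releases may locate the driver on their own, so treat this as optional.

```java
import java.io.File;

// Sketch only: DriverCheck is a hypothetical helper, not part of Selenium.
class DriverCheck {
    // True when the path points at an existing, executable file.
    static boolean driverLooksUsable(String path) {
        if (path == null || path.isEmpty()) {
            return false;
        }
        File file = new File(path);
        return file.isFile() && file.canExecute();
    }

    public static void main(String[] args) {
        String path = System.getProperty("webdriver.chrome.driver");
        if (driverLooksUsable(path)) {
            System.out.println("Driver found at " + path);
        } else {
            System.out.println("No usable driver at: " + path
                    + " (set -Dwebdriver.chrome.driver=... or put chromedriver on PATH)");
        }
    }
}
```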
It's often the outcasts, the iconoclasts ... those who have the least to lose because they
don't have much in the first place, who feel the new currents and ride them the farthest.


RE: Selenium tut for scraping #4

Does Selenium support JavaScript on submit buttons? I tried HtmlUnit and clicking buttons with it didn't submit my requests. I read that if the website has JavaScript errors, the page won't load and work properly. Does Selenium have the same problem?


RE: Selenium tut for scraping #5

I mainly use the Python bindings for Selenium, so I'm not sure if it's a Java-specific issue, but any JavaScript events/actions/listeners should work if they're put together correctly.

RE: Selenium tut for scraping #6

Okay, I have tried it; it works better with JavaScript than HtmlUnit ever did. I also tried creating a YouTube bot to rack up views with Selenium. I used HtmlUnitDriver with Selenium and a get() to a video page, but it didn't add to the view count, whereas opening the link in Firefox normally worked fine. So I'm wondering if you can assist me here.


RE: Selenium tut for scraping #7

Are you trying to click a submit button via JS?


RE: Selenium tut for scraping #8
(08-29-2016, 01:54 PM)BORW3 Wrote: Okay, I have tried it; it works better with JavaScript than HtmlUnit ever did. I also tried creating a YouTube bot to rack up views with Selenium. I used HtmlUnitDriver with Selenium and a get() to a video page, but it didn't add to the view count, whereas opening the link in Firefox normally worked fine. So I'm wondering if you can assist me here.

Numberphile did a good video on this topic. YouTube tries its best to filter out "counterfeit" views, so it's harder than just visiting the page.

RE: Selenium tut for scraping #9
You are making a pretty big mistake here. NEVER assume an element exists in Selenium; if anything goes wrong, findElement will throw an exception and you will be screwed.
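
A rough sketch of one way to guard a lookup, assuming you would rather get an empty result than a crash. SafeLookup is a hypothetical helper; the Supplier stands in for a call such as driver.findElement(...).getText(). Selenium itself also offers findElements, which returns an empty list instead of throwing when nothing matches.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Sketch only: SafeLookup is a hypothetical helper, not part of Selenium.
class SafeLookup {
    // Runs the lookup and turns a failure into an empty Optional.
    static <T> Optional<T> tryGet(Supplier<T> lookup) {
        try {
            return Optional.ofNullable(lookup.get());
        } catch (RuntimeException e) {
            // Selenium's NoSuchElementException is a RuntimeException; a real
            // scraper would catch that specific type instead of everything.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Optional<String> present = tryGet(() -> "some user");
        Optional<String> missing = tryGet(() -> {
            throw new IllegalStateException("no such element");
        });
        System.out.println(present.orElse("<missing>")); // prints some user
        System.out.println(missing.orElse("<missing>")); // prints <missing>
    }
}
```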


RE: Selenium tut for scraping #10
(08-29-2016, 04:22 PM)Ao- Wrote: Numberphile did a good video on this topic. YouTube tries its best to filter out "counterfeit" views, so it's harder than just visiting the page.

LOL, man. 301 is the end of bot counters.
