Sunday, 25 March 2018

Captcha Automation using Selenium

Automating captcha using Selenium and Tesseract

<b>Problem Statement:</b>
Automating captcha in web pages without downloading the captcha image.

<b>Technical Stack:</b>
Selenium
C#
Tesseract OCR

<b>Solution:</b>
Tesseract OCR:
Tesseract is an open source OCR (Optical Character Recognition) engine, originally developed at Hewlett-Packard and later sponsored by Google. It extracts text from images.
The latest NuGet package for Tesseract can be found at the link below:
https://www.nuget.org/packages/Tesseract/
(PM> Install-Package Tesseract -Version 3.0.2)
Download and install the latest version of Tesseract from the link below:
<i>https://digi.bib.uni-mannheim.de/tesseract/ </i>
Copy the "tessdata" folder from the installed location and paste it into the project directory.

<b>Selenium:</b>
Identify the captcha image using Selenium WebDriver.
The located WebElement (the captcha image) is then passed to the Tesseract engine, which extracts the text from the image.

<b>Source Code:</b>
// Requires: using System; using System.Drawing; using System.IO;
//           using OpenQA.Selenium; using Tesseract;
public static string GetCaptchaText(IWebElement element)
{
    var ocrText = string.Empty;
    try
    {
        // Take a screenshot of the visible part of the page.
        ITakesScreenshot screenshotDriver = ReferenceBrowser as ITakesScreenshot;
        byte[] arrScreen = screenshotDriver.GetScreenshot().AsByteArray;

        using (var msScreen = new MemoryStream(arrScreen))
        using (var screen = new Bitmap(msScreen))
        {
            // Element.Location is page-relative; subtract the vertical scroll
            // offset to get the element's position inside the screenshot.
            IJavaScriptExecutor executor = (IJavaScriptExecutor)ReferenceBrowser;
            var yOffsetObject = executor.ExecuteScript("return window.pageYOffset;");
            int offsetY = Convert.ToInt32(yOffsetObject);

            var point = new Point(element.Location.X, element.Location.Y - offsetY);
            var rectCrop = new Rectangle(point, element.Size);

            // Point to the tessdata location and define the language.
            var tessDataPath = Path.Combine(Directory.GetParent(BasePath).Parent.FullName, "tessdata");
            using (var imgSource = screen.Clone(rectCrop, screen.PixelFormat))
            using (var engine = new TesseractEngine(tessDataPath, "eng", EngineMode.TesseractAndCube))
            using (var img = PixConverter.ToPix(imgSource))
            using (var page = engine.Process(img))
            {
                // Keep only the first line of the recognized text.
                ocrText = page.GetText().Trim().Split('\n')[0];
            }
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e.StackTrace);
    }
    return ocrText;
}
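
The cropping step above is mostly coordinate arithmetic, so it can be illustrated on its own. The following is a minimal Java sketch, not the author's code: the class and method names (CaptchaCrop, viewportRect, crop) are hypothetical, and it assumes, as the method above does, that only vertical scrolling needs to be compensated. It also clips the rectangle to the screenshot, which the original method does not do.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

class CaptchaCrop {
    /**
     * Converts a page-relative element box into screenshot coordinates by
     * subtracting the vertical scroll offset, then clips it to the screenshot.
     */
    static Rectangle viewportRect(int elemX, int elemY, int width, int height,
                                  int scrollY, int shotWidth, int shotHeight) {
        Rectangle wanted = new Rectangle(elemX, elemY - scrollY, width, height);
        // intersection() keeps only the part that lies inside the screenshot
        return wanted.intersection(new Rectangle(0, 0, shotWidth, shotHeight));
    }

    /** Crops the screenshot to the element; throws if the element is not visible. */
    static BufferedImage crop(BufferedImage screenshot, int elemX, int elemY,
                              int width, int height, int scrollY) {
        Rectangle r = viewportRect(elemX, elemY, width, height, scrollY,
                screenshot.getWidth(), screenshot.getHeight());
        if (r.isEmpty()) {
            throw new IllegalArgumentException("element is outside the screenshot");
        }
        return screenshot.getSubimage(r.x, r.y, r.width, r.height);
    }

    public static void main(String[] args) {
        // Hypothetical numbers: an 800x600 screenshot, the element at page
        // position (100, 950), and the page scrolled down by 700 pixels.
        BufferedImage shot = new BufferedImage(800, 600, BufferedImage.TYPE_INT_RGB);
        BufferedImage captcha = crop(shot, 100, 950, 120, 40, 700);
        System.out.println(captcha.getWidth() + "x" + captcha.getHeight()); // prints 120x40
    }
}
```

Clipping first means a captcha that is scrolled partly out of view yields a smaller image or a clear error instead of a failure inside the image library.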

<b>LIMITATIONS:</b>
Only simple captchas that are clear to read can be automated.
Accuracy level is 90%*
*Accuracy varies according to the captcha image and the Tesseract version.
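
Since the accuracy depends on the images, it may be worth measuring it on your own labeled samples. The sketch below is one hedged way to do that in Java: OcrAccuracy, normalize and accuracy are hypothetical names, the sample strings are made up, and removing whitespace inside the text is an assumption that the captcha itself contains no spaces.

```java
class OcrAccuracy {
    /** Keeps the first line of raw OCR output and removes all whitespace from it. */
    static String normalize(String raw) {
        if (raw == null) {
            return "";
        }
        String trimmed = raw.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        String firstLine = trimmed.split("\\R")[0];
        // Assumption: the captcha contains no spaces, so any whitespace is OCR noise
        return firstLine.replaceAll("\\s+", "");
    }

    /** Fraction of samples whose normalized OCR output equals the expected text. */
    static double accuracy(String[] expected, String[] ocrOutput) {
        if (expected.length == 0 || expected.length != ocrOutput.length) {
            throw new IllegalArgumentException("need two non-empty arrays of equal length");
        }
        int hits = 0;
        for (int i = 0; i < expected.length; i++) {
            if (expected[i].equals(normalize(ocrOutput[i]))) {
                hits++;
            }
        }
        return (double) hits / expected.length;
    }

    public static void main(String[] args) {
        // Made-up sample data: the second OCR result has one wrong character
        String[] expected = {"7GX2K", "ab3dE"};
        String[] ocrOutput = {" 7GX2K \n\n", "ab3dF\n"};
        System.out.println(accuracy(expected, ocrOutput)); // prints 0.5
    }
}
```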

Sunday, 2 February 2014

Selenium - Webdriver wait functions

Explicit and Implicit Waits

Waiting means pausing the automated task execution for a certain amount of time before continuing with the next step.

Explicit Waits

An explicit wait is code you define to wait for a certain condition to occur before proceeding further in the code. The worst case of this is Thread.sleep(), which always waits for a fixed time period, even if the condition is met earlier. There are some convenience methods provided that help you write code that will wait only as long as required. WebDriverWait in combination with ExpectedCondition is one way this can be accomplished.

WebDriver driver = new FirefoxDriver();
driver.get("http://somedomain/url_that_delays_loading");
// Wait up to 10 seconds for the element to be present in the DOM
WebElement myDynamicElement = (new WebDriverWait(driver, 10))
        .until(ExpectedConditions.presenceOfElementLocated(By.id("myDynamicElement")));
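
To make the difference from Thread.sleep() concrete: WebDriverWait re-checks the condition at a short interval (500 ms by default) and returns as soon as the condition yields a value that is neither null nor false. The following plain-Java sketch imitates that polling loop without Selenium. PollingWait and waitUntil are hypothetical names, and it throws IllegalStateException where Selenium would throw its own TimeoutException.

```java
import java.util.function.Supplier;

class PollingWait {
    /**
     * Calls the condition repeatedly until it returns a value that is neither
     * null nor false, or until the timeout has elapsed.
     */
    static <T> T waitUntil(Supplier<T> condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            T value = condition.get();
            if (value != null && !Boolean.FALSE.equals(value)) {
                return value; // condition met: stop waiting immediately
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException(
                        "condition not met within " + timeoutMillis + " ms");
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Hypothetical condition: the value appears about 300 ms after start
        String result = waitUntil(
                () -> System.currentTimeMillis() - start >= 300 ? "ready" : null,
                2000, 50);
        System.out.println(result); // prints ready
    }
}
```

With a 2000 ms timeout, this returns after roughly 300 ms, whereas a fixed sleep would always block for the full period.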
 

Implicit Waits

An implicit wait is to tell WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available. The default setting is 0. Once set, the implicit wait is set for the life of the WebDriver object instance.


WebDriver driver = new FirefoxDriver();
// Applies to every element lookup made with this driver instance
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);

Saturday, 1 February 2014

Selenium


Starting with Selenium - Eclipse

Open Eclipse.
File -> New -> Project -> Java Project

Download the Selenium library files from the following link
Download the library files for Java language.

Right click on the Project name and select Properties.
Select Java build path.
Click on the Libraries tab, click on Add External JARs, and add all the Selenium JAR files.
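
To confirm that the JAR files were picked up, a small check like the one below can be run inside the project. It is only a sketch: ClasspathCheck and isOnClasspath are hypothetical names, and it tests for the org.openqa.selenium.WebDriver interface that the example code imports.

```java
class ClasspathCheck {
    /** Returns true if the named class can be located on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Selenium on build path: "
                + isOnClasspath("org.openqa.selenium.WebDriver"));
    }
}
```

If this prints false, the JAR files are most likely missing from the Java build path of the project.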

Run your first automation test with the following code.


// package org.openqa.selenium.example;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.support.ui.WebDriverWait;

public class Selenium2Example  {
    public static void main(String[] args) {
        // Create a new instance of the Firefox driver
        // Notice that the remainder of the code relies on the interface,
        // not the implementation.
        WebDriver driver = new FirefoxDriver();

        // And now use this to visit Google
        driver.get("http://www.google.com");
        // Alternatively the same thing can be done like this
        // driver.navigate().to("http://www.google.com");

        // Find the text input element by its name
        WebElement element = driver.findElement(By.name("q"));

        // Enter something to search for
        element.sendKeys("Cheese!");

        // Now submit the form. WebDriver will find the form for us from the element
        element.submit();

        // Check the title of the page
        System.out.println("Page title is: " + driver.getTitle());
       
        // Google's search is rendered dynamically with JavaScript.
        // Wait for the page to load, timeout after 10 seconds
        (new WebDriverWait(driver, 10)).until(new ExpectedCondition<Boolean>() {
            public Boolean apply(WebDriver d) {
                return d.getTitle().toLowerCase().startsWith("cheese!");
            }
        });

        // Should see: "cheese! - Google Search"
        System.out.println("Page title is: " + driver.getTitle());
       
        //Close the browser
        driver.quit();
    }
}