This article introduces Selenium and explains the difference between Selenium IDE and Selenium WebDriver. It also discusses how to convert Selenium 1 tests to Selenium 2 tests and, in the last section, covers the current state of Selenium 2 and some issues you can run into when using the WebDriver API.

What is Selenium?

As the Selenium homepage states, Selenium automates browsers. With Selenium, you can script your browser. Usually, it is used for automated testing of web applications, but sometimes it is used to automate certain repetitive web-based tasks as well.

Selenium tests

When writing Selenium tests, there is one big choice to make: whether to use Selenium IDE or Selenium WebDriver.

Selenium IDE tests

Selenium IDE is a Firefox add-on that does simple record-and-playback of interactions with the browser.
You can download it from the Selenium website and install it as a plugin. Once it is installed, open Firefox and go to Tools > Selenium IDE. We are going to record a simple test that goes to dzone.com and clicks on the “New Links” link.
To record, just press the red button in the IDE. From then on, every time you click or type something in Firefox, the action is recorded in the script, so that in the end Selenium IDE can reproduce everything you did when interacting with the browser.

In our case, the resulting Selenium 1 test looks like this (click on Source in Selenium IDE):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head profile="http://selenium-ide.openqa.org/profiles/test-case">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="selenium.base" href="http://www.dzone.com/" />
<title>New Test</title>
</head>
<body>
<table cellpadding="1" cellspacing="1" border="1">
<thead>
<tr><td rowspan="1" colspan="3">New Test</td></tr>
</thead><tbody>
<tr>
	<td>open</td>
	<td>/links/index.html</td>
	<td></td>
</tr>
<tr>
	<td>clickAndWait</td>
	<td>link=New links</td>
	<td></td>
</tr>
</tbody></table>
</body>
</html>

To run the test, you can just click the green arrow. If you do, Selenium IDE goes to the dzone.com website and clicks the “New Links” link.

Selenium WebDriver tests

Selenium WebDriver is a collection of language-specific bindings (Java, Python, ...) to drive a browser. Unlike Selenium IDE, Selenium WebDriver uses native calls to the browser.
It was introduced with Selenium 2.0, which is a new and extended version of the Selenium 1.0 API. Usually, when talking about Selenium 2.0, people mean Selenium making use of the WebDriver API.
Instead of recording tests with a browser plugin, Selenium WebDriver expects you to write code. If you want to code along in Java, you will have to fire up your Java IDE (I am going to use Eclipse myself) and make sure you add the Selenium library jar and the driver jars to the project. If you are using Maven, the dependencies would look like:

<dependency>
	<groupId>org.seleniumhq.selenium</groupId>
	<artifactId>selenium-htmlunit-driver</artifactId>
	<version>${selenium.version}</version>
</dependency>
<dependency>
	<groupId>org.seleniumhq.selenium</groupId>
	<artifactId>selenium-chrome-driver</artifactId>
	<version>${selenium.version}</version>
</dependency>
<dependency>
	<groupId>org.seleniumhq.selenium</groupId>
	<artifactId>selenium-firefox-driver</artifactId>
	<version>${selenium.version}</version>
</dependency>
<dependency>
	<groupId>org.seleniumhq.selenium</groupId>
	<artifactId>selenium-ie-driver</artifactId>
	<version>${selenium.version}</version>
</dependency>
<dependency>
	<groupId>org.seleniumhq.selenium</groupId>
	<artifactId>selenium-support</artifactId>
	<version>${selenium.version}</version>
</dependency>

Note: Selenium libraries for other programming languages such as Python are available as well. The approach for those is just the same.
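
The dependencies above reference a ${selenium.version} property, which has to be defined elsewhere in the POM. A minimal sketch of that definition follows; the version number 2.52.0 is only an assumed example of a Selenium 2 release, not necessarily the version used for this article:

```xml
<properties>
	<!-- Assumed version, for illustration only; use the release your project targets. -->
	<selenium.version>2.52.0</selenium.version>
</properties>
```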

Our Selenium 2 test looks like this:

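import java.io.File;

import org.junit.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxBinary;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.PageFactory;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.support.ui.WebDriverWait;

public class SeleniumTestCase {
	@Test
	public void testFramework() {
		// Point the driver at a specific Firefox installation.
		File firefoxBin =
                     new File("C:\\Users\\Public\\Mozilla Firefox 3.6\\firefox.exe");
		FirefoxBinary firefoxBinary = new FirefoxBinary(firefoxBin);
		final FirefoxDriver firefoxDriver =
                     new FirefoxDriver(firefoxBinary, null);

		firefoxDriver.get("http://www.dzone.com");

		final DzonePage dzonePage = new DzonePage();

		// Wait up to 30 seconds, polling every 200 ms.
		// Note: the timeout argument is in seconds, the polling interval in milliseconds.
		WebDriverWait wait = new WebDriverWait(firefoxDriver, 30, 200);
		wait.until(new ExpectedCondition<Boolean>() {
		      @Override
		      public Boolean apply(WebDriver driver) {
		    	  // Bind the page fields to the current DOM, then check the link.
		    	  PageFactory.initElements(firefoxDriver, dzonePage);
		    	  return dzonePage.newLinksLink.isDisplayed();
		      }
		});

		dzonePage.newLinksLink.click();
	}
}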

First, the driver is instantiated. Here we supply the driver with the location of the Firefox binary, which is needed when Firefox is not installed in a default location. For some drivers, such as the InternetExplorerDriver, this is not necessary.
Then we instruct the driver to send an HTTP GET request to www.dzone.com. After that, we wait for the page to load. We do this by instantiating WebDriverWait, which wraps our driver in a construct that allows it to wait for a condition. In this case, we want to wait for the “New links” link to be displayed, so we can click it.
Note that in Selenium 2 there is a Page class concept. Such a page class is an abstract representation of the opened web page; its annotated fields are bound to elements on the page by PageFactory.initElements. It looks like this:

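import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.FindBy;

public class DzonePage {
	// The link whose visible text is "New links".
	@FindBy(linkText = "New links")
	public WebElement newLinksLink;

	@FindBy(tagName = "body")
	public WebElement body;
}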

From within Eclipse, this test can be run via “Run As > JUnit Test”.
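
The WebDriverWait used in the test above is, conceptually, a loop that re-evaluates a condition until it holds or a timeout expires. The following is a minimal sketch of that idea using only the JDK. PollingWait is a hypothetical class name, not part of Selenium, and the sketch is deliberately simplified: it treats every runtime exception thrown by the condition as "not yet", which the real class may handle more selectively.

```java
import java.util.function.Supplier;

/** Hypothetical, simplified illustration of a polling wait; not Selenium's implementation. */
final class PollingWait {
    private final long timeoutMillis;
    private final long sleepMillis;

    PollingWait(long timeoutMillis, long sleepMillis) {
        this.timeoutMillis = timeoutMillis;
        this.sleepMillis = sleepMillis;
    }

    /**
     * Re-evaluates the condition until it returns true or the timeout expires.
     * A runtime exception from the condition counts as "condition not met yet".
     */
    void until(Supplier<Boolean> condition) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        RuntimeException lastFailure = null;
        while (true) {
            try {
                if (Boolean.TRUE.equals(condition.get())) {
                    return;
                }
            } catch (RuntimeException e) {
                lastFailure = e; // for example: the element is not in the page yet
            }
            if (System.currentTimeMillis() >= deadline) {
                throw new IllegalStateException(
                        "Condition not met within " + timeoutMillis + " ms", lastFailure);
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }
}
```

The polling interval is a trade-off: a short interval reacts faster once the page is ready, but queries the browser more often while waiting.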

Converting from Selenium 1 to Selenium 2

If you have a bunch of Selenium 1 tests to convert, you can consider automating the conversion. The Selenium 1 tests can easily be parsed as XML files, and you can come up with code snippets that map one-to-one onto the Selenium 1 commands.
However, writing a framework like that is beyond the scope of this article. I am just going to discuss some issues I encountered when doing such a conversion.
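
That said, the parsing step on its own is small. Below is a minimal sketch, assuming the test files are well-formed XHTML like the example shown earlier. SeleneseParser and Step are hypothetical names, and the sketch only extracts the (command, target, value) rows; it does not generate any WebDriver code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: reads the command rows out of a Selenium IDE test file. */
final class SeleneseParser {

    /** One row of the test table: command, target, value. */
    public static final class Step {
        public final String command;
        public final String target;
        public final String value;

        Step(String command, String target, String value) {
            this.command = command;
            this.target = target;
            this.value = value;
        }
    }

    /** Parses the XHTML source of a test and returns its steps in order. */
    static List<Step> parse(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Resolve the XHTML DTD to an empty document instead of downloading it.
            builder.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            List<Step> steps = new ArrayList<>();
            NodeList bodies = doc.getElementsByTagName("tbody");
            if (bodies.getLength() == 0) {
                return steps;
            }
            // Only rows in the table body are commands; the header row holds the title.
            NodeList rows = ((Element) bodies.item(0)).getElementsByTagName("tr");
            for (int i = 0; i < rows.getLength(); i++) {
                NodeList cells = ((Element) rows.item(i)).getElementsByTagName("td");
                if (cells.getLength() == 3) {
                    steps.add(new Step(text(cells, 0), text(cells, 1), text(cells, 2)));
                }
            }
            return steps;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed Selenium IDE test file", e);
        }
    }

    private static String text(NodeList cells, int index) {
        return cells.item(index).getTextContent().trim();
    }
}
```

For the recorded test above, this would yield two steps: open with target /links/index.html, and clickAndWait with target link=New links.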

Coming up with Selenium WebDriver equivalents of the Selenium IDE commands

Although many mappings are rather straightforward, and there is also a guide for converting Selenium 1 tests to Selenium 2 tests, some are a bit harder to come up with.
For example, it is often necessary to wait for a certain element to load after doing a click. We already saw that in our example. Generic code that waits for the element that needs to be loaded could be implemented like this:

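// seleniumContext is assumed to be a field of the surrounding test base class
// that holds the driver and a shared WebDriverWait.
protected final void waitForExistsAndVisible(final WebElement element) {
     seleniumContext.getWebDriverWait().until(new ExpectedCondition<Boolean>() {
	@Override
	public Boolean apply(WebDriver driver) {
		return element.isDisplayed();
	}
     });
}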

From time to time, there are some catches too. For example, to wait for an element not to be present, one could expect driver.findElement to return null if the element is not present. However, this method never returns null. If the element is not present, a NoSuchElementException is thrown instead. Hence, the method needs to look like:

protected final void waitForElementNotPresent(final String htmlId) {
     seleniumContext.getWebDriverWait().until(new ExpectedCondition<Boolean>() {
	@Override
	public Boolean apply(WebDriver driver) {
	     try {
		// Found: the element is still present, so keep waiting.
		driver.findElement(By.id(htmlId));
		return false;
	     } catch (NoSuchElementException e) {
		// Not found: the element is gone.
		return true;
	     }
	}
     });
}

There are also some more exotic ones that dig a bit deeper into the API, such as code to assert and dismiss an alert. Like findElement, switchTo().alert() does not return null when there is no alert; it throws a NoAlertPresentException instead:

protected final boolean assertAndDismissAlert(String alertText) {
     try {
	Alert alert = seleniumContext.getWebDriver().switchTo().alert();
	boolean ok = alertText.equals(alert.getText());
	alert.dismiss();
	return ok;
     } catch (NoAlertPresentException e) {
	// No alert is open, so the assertion fails.
	return false;
     }
}

Or to check whether an element has focus:

protected final boolean hasFocus(WebElement element) {
     WebElement focusedElement = seleniumContext.getWebDriver()
          .switchTo().activeElement();
     return element.equals(focusedElement);
}
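
For the straightforward mappings mentioned at the start of this section, much of the work is translating Selenium 1 locator strings such as link=New links into WebDriver By expressions. Below is a minimal sketch of such a code generator. LocatorTranslator is a hypothetical name, and the mapping is simplified: it covers only the most common locator prefixes, treats identifier locators as ids (Selenium 1 also falls back to the name attribute), and ignores pattern prefixes such as glob: or regexp: in link locators.

```java
/** Hypothetical code generator: turns a Selenium 1 locator into WebDriver "By" source code. */
final class LocatorTranslator {

    /** Returns Java source for a By expression, e.g. By.linkText("New links"). */
    static String toByExpression(String locator) {
        // In Selenium 1, a locator starting with "//" is an implicit XPath.
        if (locator.startsWith("//")) {
            return call("By.xpath", locator);
        }
        int eq = locator.indexOf('=');
        if (eq <= 0) {
            // No prefix: an implicit identifier locator, simplified here to an id lookup.
            return call("By.id", locator);
        }
        String prefix = locator.substring(0, eq);
        String value = locator.substring(eq + 1);
        switch (prefix) {
            case "id":
            case "identifier": // simplification: Selenium 1 also tries the name attribute
                return call("By.id", value);
            case "name":
                return call("By.name", value);
            case "link":
                return call("By.linkText", value);
            case "css":
                return call("By.cssSelector", value);
            case "xpath":
                return call("By.xpath", value);
            default:
                throw new IllegalArgumentException("Unsupported locator: " + locator);
        }
    }

    /** Formats a method call with one string argument, escaping it as a Java literal. */
    private static String call(String method, String argument) {
        String escaped = argument.replace("\\", "\\\\").replace("\"", "\\\"");
        return method + "(\"" + escaped + "\")";
    }
}
```

Throwing on an unknown prefix, rather than guessing, makes the commands that still need a hand-written equivalent show up early during a conversion.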

Selenium WebDriver issues

Since Selenium WebDriver is a rather new library, and a rather ambitious project, it is to be expected that there are some issues with it. For example, there appear to be some discrepancies between what a human user sees and can interact with in the browser window and what the driver sees and can interact with.

a. Element is visible for user, but not for Selenium 2

Selenium 2 makes some assumptions that do not always hold. For example, if a DOM element has a height or width of 0, Selenium 2 considers it not visible and not clickable, while in reality it often is. A typical case is an element whose content is floated with CSS: its computed height becomes 0, but the content is still visible.
A workaround that usually works is to click an element inside the element that has height 0.

b. Element not found

There are differences in behavior between drivers. This is an intrinsic effect of WebDriver using each browser's native APIs.
Because of this, different browser drivers can react a bit differently. For example, in Firefox, clicking on a span element within a link works; in IE6 it does not.
Also, different browsers have slightly different XPath implementations, which is something to keep in mind too.

c. Click problems

Selenium sometimes doesn’t correctly click an element on certain browsers. This appears to be a viewport problem: if the element is not in the current browser view, Selenium 2 does not scroll to the link, and the click does not trigger anything, although no exception is thrown. This can happen with other commands on WebElements too, such as when calling clear() on an input WebElement.

d. Selenium 2 sometimes shows very specific OS/browser dependent behaviour

The previous problem is probably a more specific case of this one, and here is another example of it.
With Firefox, unlike with IE, the driver always scrolls to the link before it clicks it. However, we once had a bunch of tests failing for Firefox on Windows while all these tests were green for Firefox on Ubuntu.
Apparently, there was some JavaScript on the page that caused a breadcrumbs bar to scroll with the page. On Windows, to click on a link, the Selenium 2 driver scrolled to the link (to make it visible) and then tried to click on it. However, right after the scroll, the breadcrumbs bar appeared over the link, and by the time the driver simulated the click, the link was no longer clickable, so nothing happened and all the mentioned tests timed out.