Web UI Automation and Test using PowerShell
Lately, I had to do lots of repetitive tasks at work in order to fix defects in an application. As I grew tired of filling numerous forms with the mandatory data needed for them to validate, I thought that I could maybe use PowerShell to automate the form-filling process.
I very quickly stumbled upon an MSDN article titled Web UI Automation with Windows PowerShell. It was definitely the way to go, but I was not really satisfied with everything I found over there.
Getting a Browser Object
This part is similar to the one used in the MSDN article.
$global:ie = New-Object -com "InternetExplorer.Application"
$global:ie.Navigate("about:blank")
$global:ie.visible = $true
Please note that I will use $global:ie and $global:doc as global variables throughout. I could have used $script: scope variables, but global variables suit this task better. Here, we are building a bunch of reusable functions; declaring these variables as $global: ensures that they will be available in scripts that include this utility script. See here for more information on PowerShell variable scopes.
So this sets $global:ie to a COM Internet Explorer object, navigates to “about:blank” and makes the window visible. Now we need some helper functions to navigate, fill forms and click elements easily.
Helper Functions to Navigate, Click, etc…
First of all, we need a function that pauses the script while the current page is loading. What it should do is simple: wait for the current page to be loaded, then set the $global:doc variable to the loaded document.
Function WaitForPage([int] $delayTime = 100) {
    $loaded = $false
    while ($loaded -eq $false) {
        [System.Threading.Thread]::Sleep($delayTime)
        # If the browser is not busy, the page is loaded
        if (-not $global:ie.Busy) {
            $loaded = $true
        }
    }
    $global:doc = $global:ie.Document
}
I used an optional parameter here for the delay between checks of the browser’s status. It’s pretty straightforward: while the browser ($global:ie) is busy, wait. Once it’s not busy anymore, assign the document to the $global:doc variable. Let’s now define a function to navigate to a given URL.
Function NavigateTo([string] $url, [int] $delayTime = 100) {
    Write-Verbose "Navigating to $url"
    $global:ie.Navigate($url)
    WaitForPage $delayTime
}
Note that I used Write-Verbose in these functions to output some useful information, making it easier to spot mistakes while running a script.
Now, as this was made to fill web forms, let’s define a function that will fill an input text field with a given value.
Function SetElementValueByName($name, $value, [int] $position = 0) {
    if ($global:doc -eq $null) {
        Write-Error "Document is null"
        break
    }
    $elements = @($global:doc.getElementsByName($name))
    if ($elements.Count -ne 0) {
        $elements[$position].Value = $value
    } else {
        Write-Warning "Couldn't find any element with name ""$name"""
    }
}
This is heavily used in my scripts. An HTML form usually has lots of input elements with unique names that need to be filled. So, if you need to fill the username text input with the value Philippe, just call this function:
SetElementValueByName "username" "Philippe"
Note that there is also an optional parameter that is used as the element’s position in the array returned by $global:doc.getElementsByName. By default, the position is 0 because most forms will only have one element with a given name. However, in some (badly designed?) forms, two elements may share the same name. In this case, you can specify which one you want to fill.
Now, I won’t explain all the functions, but here are the ones I wrote:
Function ClickElementByTagName($tagName, [int] $position = 0) {
    if ($global:doc -eq $null) {
        Write-Error "Document is null"
        break
    }
    $elements = @($global:doc.getElementsByTagName($tagName))
    if ($elements.Count -ne 0) {
        $elements[$position].Click()
        WaitForPage
    } else {
        Write-Error "Couldn't find element ""$tagName"" at position ""$position"""
        break
    }
}

Function ClickElementById($id) {
    $element = $global:doc.getElementById($id)
    if ($element -ne $null) {
        $element.Click()
        WaitForPage
    } else {
        Write-Error "Couldn't find element with id ""$id"""
        break
    }
}

Function ClickElementByName($name, [int] $position = 0) {
    if ($global:doc -eq $null) {
        Write-Error "Document is null"
        break
    }
    $elements = @($global:doc.getElementsByName($name))
    if ($elements.Count -ne 0) {
        $elements[$position].Click()
        WaitForPage
    } else {
        Write-Error "Couldn't find element with name ""$name"" at position ""$position"""
        break
    }
}
These are used to click on DOM elements in order to submit the form. These functions are not error-proof, but I haven’t had any issues with them so far.
A Little Example
Let’s write something very simple to test these functions. I will do an advanced search on Google using them:
NavigateTo "http://www.google.com/advanced_search"
SetElementValueByName "as_oq" "Unisys Fenix PLDA"
SetElementValueByName "num" "30"
SetElementValueByName "lr" "lang_en"
ClickElementByName "btnG"
This gives you a little example of the kind of things you can do. It’s rather simple, but very powerful if you have repetitive tasks to do on some websites.
Download the full script here.
Java Inheritance VS C# Inheritance
This topic is very basic, but I felt like writing something on simple subjects that may be misunderstood. It is also a good excuse to dig into the language specifications and read all the details that most people don’t like to read… As I’m working in Java now, I quite like to compare the two languages to see where the differences are and make sure I don’t make any silly mistakes.
So, when it comes to inheritance, there is a big difference between Java and C#.
Java Inheritance
In a Java subclass, you can override any instance method of the superclass that is not private or final. The method that is to be called is always determined at run time. So for example, if you write code like this:
public class Parent {
    public String sayHello() {
        return "Hello from Parent";
    }
}

public class Child extends Parent {
    public String sayHello() {
        return "Hello from Child";
    }
}
If you create a new instance of the Child class and call the sayHello() method, it is always the Child one that will be called, no matter what the declared type of the variable is. So, you can treat an instance of Child as an instance of Parent, but the methods called will be the ones from Child (if they are overridden, of course).
Code like this:
Parent o = new Child();
System.out.println(o.sayHello());
will output this:
Hello from Child
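As a self-contained sketch of the same behavior, the class below nests Parent and Child inside a hypothetical DispatchDemo wrapper (the wrapper and the greetings() helper are my own names, added only so that everything fits in one file). It suggests that neither the declared type nor an explicit upcast changes which sayHello() runs:

```java
import java.util.ArrayList;
import java.util.List;

public class DispatchDemo {
    static class Parent {
        public String sayHello() {
            return "Hello from Parent";
        }
    }

    static class Child extends Parent {
        @Override
        public String sayHello() {
            return "Hello from Child";
        }
    }

    // Hypothetical helper: collects greetings obtained through Parent-typed expressions.
    static List<String> greetings() {
        List<String> out = new ArrayList<>();
        Parent declaredAsParent = new Child();
        out.add(declaredAsParent.sayHello());       // run-time type is Child
        out.add(((Parent) new Child()).sayHello()); // an upcast does not change the run-time type
        out.add(new Parent().sayHello());           // a genuine Parent instance
        return out;
    }

    public static void main(String[] args) {
        greetings().forEach(System.out::println);
    }
}
```

Only the last call, made on a real Parent instance, reaches the Parent implementation.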
C# Inheritance
In C#, things are a bit different. The C# language needs to be told which methods can be overridden (declared as virtual) and which methods override them (declared as override). So to have the same behavior as Java, C# code has to look like this:
class Parent
{
    public virtual String SayHello()
    {
        return "Hello from Parent";
    }
}

class Child : Parent
{
    public override String SayHello()
    {
        return "Hello from Child";
    }
}
If virtual and override are omitted (or just override, actually), then it is the Parent’s method that is executed when the Child object is declared as Parent.
The C# language specification clearly explains how this behaves:
In a virtual method invocation, the run-time type of the instance for which that invocation takes place determines the actual method implementation to invoke. In a non-virtual method invocation, the compile-time type of the instance is the determining factor.
Summary
To sum up, in Java the method called is determined by the run-time class of the instance, while in C# it depends on how the class and the calling code are written. C# gives you more flexibility, but it is more complicated, and whether a class can override a method is decided by its parent class. With Java, you lose the ability to call the parent’s method on a Child instance from outside the class, but is that very useful? On the other hand, C# ensures that if you don’t want a method to be overridden, it won’t be (in Java, you have to mark the method final for that).
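To make the Java side of this summary concrete, here is a small sketch under my own assumptions: the SuperAndStaticDemo wrapper, sayParentHello() and describe() are hypothetical names that do not appear in the code above. It illustrates two points: the parent’s implementation stays reachable from inside Child through super, and static methods, which are hidden rather than overridden, are resolved from the compile-time type, much like a non-virtual call in C#.

```java
public class SuperAndStaticDemo {
    static class Parent {
        public String sayHello() {
            return "Hello from Parent";
        }

        public static String describe() {
            return "static Parent";
        }
    }

    static class Child extends Parent {
        @Override
        public String sayHello() {
            return "Hello from Child";
        }

        // Hypothetical helper: only code inside Child can reach the parent's implementation.
        public String sayParentHello() {
            return super.sayHello();
        }

        // Static methods are hidden, not overridden.
        public static String describe() {
            return "static Child";
        }
    }

    static String viaSuper() {
        return new Child().sayParentHello();
    }

    @SuppressWarnings("static-access")
    static String staticThroughParentReference() {
        Parent p = new Child();
        // Legal but discouraged: the call is bound to Parent.describe() at compile time.
        return p.describe();
    }

    public static void main(String[] args) {
        System.out.println(viaSuper());
        System.out.println(staticThroughParentReference());
        System.out.println(Child.describe());
    }
}
```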
Oh and we forgot to talk about the new keyword in C#. I never came across code that used it, but a nice description is given here.
Boolean Logical Operators and Boolean Conditional Logical Operators
I’m often amazed that most programmers don’t know that in C# and in Java you can use the simple & and | as logical operators. It seems that most people don’t even know that they exist!
But what’s the difference between these simple and double logical operators?
Back To Basics
Let’s look at the C# language specification:
The result of x & y is true if both x and y are true. Otherwise, the result is false.
The result of x | y is true if either x or y is true. Otherwise, the result is false.
Now let’s have a look at the Java language specification:
For &, the result value is true if both operand values are true; otherwise, the result is false.
For |, the result value is false if both operand values are false; otherwise, the result is true.
It’s pretty clear: they do what most people would expect, namely the and and or operations.
But then, what are those && and || that most developers use everywhere, wasting bytes like there’s no tomorrow?
Again, specifications are there to give full explanations.
In C#:
The && and || operators are conditional versions of the & and | operators:
- The operation x && y corresponds to the operation x & y, except that y is evaluated only if x is true.
- The operation x || y corresponds to the operation x | y, except that y is evaluated only if x is false.
And in Java:
The && operator is like & (§15.22.2), but evaluates its right-hand operand only if the value of its left-hand operand is true.
The || operator is like | (§15.22.2), but evaluates its right-hand operand only if the value of its left-hand operand is false.
Tadaaaam! Now it makes perfect sense, doesn’t it? These operators that most people use everywhere are smart and only evaluate the right-hand side if it is of any use (if the left-hand side is true for and, if the left-hand side is false for or).
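One way to observe the difference is to count how many operands actually get evaluated. The sketch below is in Java; the ShortCircuitDemo class and its touch() helper are hypothetical names of mine, used only to record each evaluation:

```java
public class ShortCircuitDemo {
    private static int calls = 0;

    // Hypothetical helper: records that an operand was evaluated, then returns its value.
    static boolean touch(boolean value) {
        calls++;
        return value;
    }

    // Number of operands evaluated for "false && true".
    static int callsWithConditionalAnd() {
        calls = 0;
        boolean ignored = touch(false) && touch(true);
        return calls;
    }

    // Number of operands evaluated for "false & true".
    static int callsWithPlainAnd() {
        calls = 0;
        boolean ignored = touch(false) & touch(true);
        return calls;
    }

    // Number of operands evaluated for "true || false".
    static int callsWithConditionalOr() {
        calls = 0;
        boolean ignored = touch(true) || touch(false);
        return calls;
    }

    // Number of operands evaluated for "true | false".
    static int callsWithPlainOr() {
        calls = 0;
        boolean ignored = touch(true) | touch(false);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("&& evaluated " + callsWithConditionalAnd() + " operand(s)");
        System.out.println("&  evaluated " + callsWithPlainAnd() + " operand(s)");
        System.out.println("|| evaluated " + callsWithConditionalOr() + " operand(s)");
        System.out.println("|  evaluated " + callsWithPlainOr() + " operand(s)");
    }
}
```

The conditional versions stop after the left-hand operand in these cases, while the simple versions always evaluate both.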
So, let’s give a small example of how this can be useful. Let’s pretend you have an object that has a member that can be null. With these operators, you can test it without any fear of the dreaded NullReferenceException or NullPointerException. Here is a small piece of C# code that shows the point:
var p = new Parent { Name = "Father" };
p.Childs = new Person[] { new Person() { Name = "Son" } };
if (p.Childs != null && p.Childs.ElementAt(0).Name != null)
{
    Console.WriteLine("First Child's name: {0}", p.Childs.ElementAt(0).Name);
}
If the Childs collection were left null, this code would still work, even though p.Childs.ElementAt(0) would normally throw an ArgumentNullException. As p.Childs != null returns false, the right-hand operand is not evaluated, so no exception is thrown.
Now, you may say “Well in this case, why does it matter? Let’s use && and || everywhere, so we make no mistake!”. Technically, it is true. However, as one of my university teachers said:
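The same guard can be sketched in Java, where the failure mode is a NullPointerException. The NullGuardDemo class and its method names are hypothetical; the point is only to contrast && with & on a null reference:

```java
import java.util.List;

public class NullGuardDemo {
    // Safe: the right-hand operand is skipped when children is null.
    static boolean hasChildrenConditional(List<String> children) {
        return children != null && !children.isEmpty();
    }

    // Unsafe: both operands are always evaluated, so a null list is dereferenced.
    static boolean hasChildrenPlain(List<String> children) {
        return children != null & !children.isEmpty();
    }

    // Reports whether the plain version fails on a null list.
    static boolean plainThrowsOnNull() {
        try {
            hasChildrenPlain(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasChildrenConditional(null));
        System.out.println(hasChildrenConditional(List.of("Son")));
        System.out.println(plainThrowsOnNull());
    }
}
```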
Les gens qui utilisent les opérateurs conditionnels booléens partout sont des gens qui ne savent pas ce qu’ils font.
Translation: “People who use conditional logical operators everywhere are people who don’t know what they’re doing”.
Of course, his point was that you should understand the code you are writing and know which parts of an expression will be evaluated, and when. I have no clue whether there is a performance gain when using non-conditional operators, nor whether the compiler optimizes them in any way.
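A case where the choice visibly matters is when both operands have side effects that you want to happen. The Java sketch below is only an illustration under my own assumptions (the ValidateAllDemo class, the check() helper and the validation rules are all hypothetical): with & every check runs and every error is collected, while with && the second check is skipped as soon as the first one fails.

```java
import java.util.ArrayList;
import java.util.List;

public class ValidateAllDemo {
    // Hypothetical helper: records a message when the condition fails, then returns the condition.
    static boolean check(boolean ok, String message, List<String> errors) {
        if (!ok) {
            errors.add(message);
        }
        return ok;
    }

    // Non-conditional &: both checks always run, so all errors are reported.
    static List<String> errorsWithPlainAnd(String name, int age) {
        List<String> errors = new ArrayList<>();
        boolean valid = check(!name.isEmpty(), "name is empty", errors)
                & check(age >= 0, "age is negative", errors);
        return errors;
    }

    // Conditional &&: the second check is skipped once the first one fails.
    static List<String> errorsWithConditionalAnd(String name, int age) {
        List<String> errors = new ArrayList<>();
        boolean valid = check(!name.isEmpty(), "name is empty", errors)
                && check(age >= 0, "age is negative", errors);
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(errorsWithPlainAnd("", -1));
        System.out.println(errorsWithConditionalAnd("", -1));
    }
}
```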
The Third Operator
There is also a third operator in both languages: ^.
In C#:
The result of x ^ y is true if x is true and y is false, or x is false and y is true. Otherwise, the result is false. When the operands are of type bool, the ^ operator computes the same result as the != operator.
In Java:
For ^, the result value is true if the operand values are different; otherwise, the result is false.
I’ve never seen this operator used. As pointed out in the C# specification, for bool operands it gives the same result as the != operator.
And, of course, it wouldn’t make any sense to have a ^^ operator, as both operands have to be evaluated in all cases…
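The equivalence with != can be checked exhaustively, since there are only four combinations of boolean operands. Here is a minimal Java sketch (the XorDemo class and its method name are hypothetical):

```java
public class XorDemo {
    // Returns true if x ^ y equals x != y for every combination of boolean operands.
    static boolean xorMatchesNotEquals() {
        boolean[] values = { false, true };
        for (boolean x : values) {
            for (boolean y : values) {
                if ((x ^ y) != (x != y)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        boolean[] values = { false, true };
        for (boolean x : values) {
            for (boolean y : values) {
                System.out.println(x + " ^ " + y + " = " + (x ^ y));
            }
        }
        System.out.println("Same as != in all cases: " + xorMatchesNotEquals());
    }
}
```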
