How to emulate real users touching your web apps using Python controlling Selenium and Beautiful Soup for reading HTML
Overview
- Some misconceptions
- Running example
- Docker images
- Ansible
- Run using Maven after GitHub
- Maven
- View Sample Selenium scripts
- Selenium3Hello1
- Invoke sample command
- Coding
- Obtain jars and drivers
- Maven
- Manual Download
- Eclipse IDE
- IntelliJ IDE
- Java Coding
- Add-on functionality
- Rock Stars
- Video Tutorials
- CA DevTest
- SauceLab Cloud
- Python Web scraping
- Other info
This article contains notes on installing, coding, and running Selenium.
- Run in Google Cloud a Docker image containing Selenium and associated software.
- Run a Docker image containing Selenium, built using the Ansible scripts that created the image.
- Run after download from GitHub, then invoke “as is” using Maven.
- Explain sample files.
- Invoke a run using starter samples.
- Run across various browsers (Firefox, IE, etc.).
- Obtain jars and drivers if they have changed.
- Adapt a basic starter (Java in Selenium driving Chrome).
- Update of results to SonarQube.
- Add CSV data processing.
- Add Excel data processing.
- Add OpenCV (via SikuliX2) to recognize portions of pictures.
- Add Tesseract to extract text from pictures (OCR = Optical Character Recognition).
- Run by CA DevTest.
- Run in SauceLab Cloud.
Some misconceptions
When we mention “Selenium” and “performance testing” in the same sentence, the first thing that most people say is “you can only run a few users on a machine”.
But I’m not talking about running several browser instances on each single machine.
I’m talking about running Selenium once while another program converts what goes back and forth over the network into a script.
The secret sauce is …
Running example
Selenium has no GUI. It runs as a console program.
However, reports are produced by TestNG, a testing framework commonly used alongside Selenium.
Docker images
Docker images containing Selenium server:
Ansible
Ansible task files to establish Selenium server:
- https://github.com/arknoll/ansible-role-selenium
- https://github.com/quarkslab/ansible-selenium-server
- https://mtlynch.io/testing-ansible-selenium/
Run using Maven after GitHub
- Install Maven.
- Install Selenium.
- Install the various browsers Selenium will control (Chrome, Firefox, etc.).
- Navigate to or create a folder to hold the new folder that Git will create.
- Clone from GitHub a repository containing sample tests:
git clone https://github.com/wilsonmar/Selenium-samples
cd Selenium-samples
- Select a folder:
cd Python-soup
Look at the root layer of the repository. These files are there for use with Eclipse IDE:
- .classpath
- .project
- .settings
- .metadata
Some prefer to add them to .gitignore so they are not in the repo.
.DS_Store files should be ignored. They are created by macOS. There is an entry for them in the .gitignore file so they are not stored in GitHub.
Maven
- Invoke Maven to download dependencies and run Selenium:
mvn clean verify -Pbrowser-phantomjs
Maven creates a folder named target to receive downloads before starting Selenium.
The -Pbrowser-phantomjs option specifies use of the PhantomJS headless browser. Alternatively, for other browsers:
mvn clean verify -Pbrowser-chrome
mvn clean verify -Pbrowser-firefox
mvn clean verify -Pbrowser-edge
mvn clean verify -Pbrowser-internet-explorer
mvn clean verify -Pbrowser-opera
PROTIP: Several separate runs are needed to test on several browsers.
At the end of the run, you should see:
SUCCESS
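Since each run targets one profile, a small wrapper can issue the runs back to back. This is a minimal sketch, assuming the profile names used in the commands above exist in the pom.xml and that mvn is on the PATH. The names maven_command and run_all are made up for illustration, and nothing is executed unless run_all() is called.

```python
import subprocess

# Profile suffixes taken from the mvn commands shown above.
BROWSERS = ["chrome", "firefox", "edge", "internet-explorer", "opera"]


def maven_command(browser):
    """Build the argument list for one browser-specific Maven run."""
    return ["mvn", "clean", "verify", f"-Pbrowser-{browser}"]


def run_all(browsers=BROWSERS):
    """Run each profile in turn and return a map of browser -> exit code."""
    results = {}
    for browser in browsers:
        completed = subprocess.run(maven_command(browser))
        results[browser] = completed.returncode
    return results


# Only build and show the commands here; call run_all() to execute them.
commands = [maven_command(b) for b in BROWSERS]
print(commands[0])
```

An exit code of 0 from Maven corresponds to the SUCCESS line above.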
View Sample Selenium scripts
- If you’re not using an IDE, use the Atom text editor to open a folder list:
atom .
- View the pom.xml file.
The browsers handled by Selenium are identified by a profile with a property within the pom.xml file, which also tells Maven what dependencies to download and how to run Selenium.
PROTIP: If you work within enterprise firewalls, change the external URLs to internal ones, which may be managed within Nexus or Artifactory servers.
- Dive into the folder to view src/test/resources/webdrivermanager.properties.
This is where Maven learns which drivers to download.
PROTIP: LATEST is specified. But a specific version would ensure that all drivers downloaded are the ones previously tested to work with each other. Specific versions can be specified in the java invocation:
-Dwdm.chromeDriverVersion=2.25
-Dwdm.internetExplorerVersion=2.46
-Dwdm.operaDriverVersion=0.2.0
-Dwdm.edgeVersion=3.14366
-Dwdm.phantomjsDriverVersion=2.1.1
-Dwdm.geckoDriverVersion=0.11.1
- Dive into the src folder.
Notice there is a main and a test folder. Under each is a folder path:
java/selenium/utils
At the end of that path under main is a TestUtils.java file, which defines generic Java utility functions such as randomBetween, isDuplicatePresent, and isAllEquals.
Selenium is all about testing, so the end of the path under test contains many more Java files to control the browser.
test/java/selenium/configurations/TestConfig.java makes use of properties browser.name and base_url, retrieved by variables in main/java/selenium/configurations/TypedProperties.java.
PROTIP: Properties controlling a specific test are defined in properties files rather than hard-coded so that different properties can be used during a run by temporarily replacing the file.
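The same idea carries over to the Python scripts later in this article. The sketch below reads key=value lines in the style of a Java properties file. The property names browser.name and base_url come from the text above, while the sample values and the load_properties helper are assumptions for illustration only.

```python
def load_properties(text):
    """Parse Java-style key=value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Hypothetical file content; a real run would read it from disk instead.
SAMPLE = """
# test.properties (illustrative values)
browser.name = chrome
base_url = https://example.com
"""

props = load_properties(SAMPLE)
print(props["browser.name"], props["base_url"])
```

Swapping in a different file changes the browser or target site without touching the test code.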
SeleniumTestWrapper
Tests are driven by a wrapper which app-specific test code extends:
- Look in SeleniumTestWrapper.java, defined as an abstract class that is extended by other code.
Code in the file controls agent strings and cookies that browsers automatically send back to servers.
The code also manages the screen dimensions of the browser window.
These enable app-specific test code to focus on business logic. Under the Annotations folder are files that define code generation by the Java compiler.
Annotations
Each Java compiler annotation (@Before, @Rule, and @After) is defined within imported code:
import org.junit.Before;
import org.junit.Rule;
import org.junit.After;
The imported annotation code adds functionality such as logging. These annotations also provide metadata (data about data).
Instead of using Java inheritance, the Java frameworks Spring and Hibernate use AOP (aspect-oriented programming) to inject code for pre-processing and post-processing around an event. AOP provides a “hook” before and after a method executes, so that consumer code can run in those places.
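For readers more at home in Python, a decorator gives a rough analogue of such before-and-after hooks. This is only an illustration of the idea, not how JUnit, Spring, or Hibernate implement it. The names with_hooks and search_test are hypothetical.

```python
import functools


def with_hooks(before, after):
    """Wrap a function so `before` runs first and `after` runs last."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            before(func.__name__)
            try:
                return func(*args, **kwargs)
            finally:
                # Runs even if the wrapped function raises.
                after(func.__name__)
        return wrapper
    return decorator


events = []


@with_hooks(before=lambda name: events.append(f"before {name}"),
            after=lambda name: events.append(f"after {name}"))
def search_test():
    events.append("running search_test")
    return "passed"


result = search_test()
print(events)
```

The test body stays free of setup and teardown code, which is the point of the hook.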
App-specific Tests
- Edit the file defined to test an app:
src/test/java/selenium/testcases/SearchIT.java
The file contains annotations:
@BrowserDimension(XLARGE)
@Browser(skip = { INTERNET_EXPLORER, EDGE, PHANTOMJS })
These annotations are defined by imports:
import selenium.utils.annotations.browser.BrowserDimension;
import selenium.utils.annotations.browser.Browser;
The earlier version, 2.53 and below, used this library:
import org.openqa.selenium.Dimension;
The earlier 2.x code was:
WebDriver driver = new FirefoxDriver();
driver.navigate().to("http://google.co.in");
System.out.println(driver.manage().window().getSize());
Dimension d = new Dimension(420, 600);
// Resize the current window to the given dimension
driver.manage().window().setSize(d);
QUESTION: How can we read the code in these annotation libraries?
Page Objects
The core driving code refers to definitions within the pageobjects folder:
StartPage startPage = PageFactory.initElements(getDriver(), StartPage.class);
HeaderSearch search = PageFactory.initElements(getDriver(), HeaderSearch.class);
SearchResultPage searchResultPage = PageFactory.initElements(getDriver(), SearchResultPage.class);
See Selenium Design Considerations
The Page Object design pattern puts locator definitions in a separate Java file so that tests only need to refer to a reference rather than redundantly specifying how to locate objects on each page. This reduces maintenance over time as the website HTML changes. Change the locator technique in one place and all tests are good again.
@find
The basic rule is that tests don’t declare variables (to manage state on their own), manipulate the DOM directly, nor create objects (using the “new” constructor keyword).
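The pattern looks much the same in Python. The sketch below uses stand-in FakeDriver and FakeElement classes so it runs without a browser. A real test would pass in a Selenium WebDriver, which offers the same find_element(by, value) call. The CSS selectors are hypothetical locators, not taken from the sample repository.

```python
class FakeElement:
    """Stand-in for a WebElement; records what was typed or clicked."""

    def __init__(self, log, name):
        self.log = log
        self.name = name

    def send_keys(self, text):
        self.log.append((self.name, "type", text))

    def click(self):
        self.log.append((self.name, "click", None))


class FakeDriver:
    """Stand-in for a WebDriver exposing only find_element(by, value)."""

    def __init__(self):
        self.log = []

    def find_element(self, by, value):
        return FakeElement(self.log, value)


class HeaderSearch:
    """Page object: the only place that knows how to locate the search box."""

    SEARCH_INPUT = ("css selector", "input[name='q']")          # hypothetical
    SEARCH_BUTTON = ("css selector", "button[type='submit']")   # hypothetical

    def __init__(self, driver):
        self.driver = driver

    def search_for(self, term):
        self.driver.find_element(*self.SEARCH_INPUT).send_keys(term)
        self.driver.find_element(*self.SEARCH_BUTTON).click()


driver = FakeDriver()
HeaderSearch(driver).search_for("selenium")
print(driver.log)
```

If the site's HTML changes, only the two locator constants need editing. Tests that call search_for() stay as they are.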
Webdriver Install
There are Web Driver programs for each combination of operating system (macOS, Windows, Linux, etc.), internet browser (Chrome, Firefox, etc.) and each version (78, 79, etc.). This means the Web Driver you install today would likely be obsolete when a new version of the browser is automatically installed on your machine.
PROTIP: There are two ways to install Selenium Web Driver. One is to manually install for whatever is your current version. The other is to use a Webdriver package manager.
Webdriver Manager
Professor Boni Garcia (in Madrid, Spain) took the initiative to create and maintain https://github.com/bonigarcia/webdrivermanager
- Checks for the latest version of the WebDriver binary
- Downloads the WebDriver binary if it’s not present on your system
- Exports the required WebDriver Java environment variables needed by Selenium
Its automation makes use of Maven utility commonly used by Java.
The WebDriver can download files from an open source repository:
Selenium3Hello1
Obtain Selenium to test a sample call to Google Search:
-
Selenium3Hello1 is used to verify whether the Selenium core install works. It doesn’t use any browser driver.
-
Selenium3FirefoxGoogleSearch1
-
Selenium3GoogleSearch1 works on multiple browsers.
Look into the folder:
NOTE: It is not best practice, but the sample scripts in my GitHub contain binary files copied from binary repositories.
File Sea.jpg is required only by the sample program from Neotys.
PROTIP: For team/production coding, place photos in a folder such as pics.
The file is placed at the project’s root folder to make it easy to specify its path.
File chromedriver.exe and other browser driver files are at the root of each script folder.
The more correct way is to specify the files (and their specific versions) in a pom.xml file that points to the location of those external dependencies, and then have each user of the repository run Maven to obtain those files.
https://github.com/Ardesco/Selenium-Maven-Template/blob/master/pom.xml
Nevertheless, the drivers are included so you can get going quickly.
Invoke sample command
On Windows:
It is assumed that the Java program is within the PATH which the operating system looks for executables.
- View the sample command file Selenium3Chrome1.bat
REM cd Selenium3Usahidi1NeoloadChrome1
REM Selenium3 must use Java 1.8+
javac Selenium3Usahidi1NeoloadChrome1.java
REM -D options must come before -jar to be read as JVM system properties
java.exe -Dnl.selenium.proxy.mode=Design ^
  -Ddriver=chromedriver.exe ^
  -Druntype=Landing ^
  -Dimg=Sea.jpg ^
  -jar Selenium3Usahidi1NeoloadChrome1.jar
PROTIP: Windows command prompt (cmd.exe) uses the caret ^ (Shift + 6) character to indicate line continuation.
- Open a Terminal window.
-
Run the sample Selenium Java program on a Windows machine:
Selenium3Usahidi1NeoloadChrome1.bat
On Macs & Linux
- View the sample command file Selenium3Chrome1.sh
# cd Selenium3Usahidi1NeoloadChrome1
# Selenium3 must use Java 1.8+
javac Selenium3Usahidi1NeoloadChrome1.java
# -D options must come before -jar to be read as JVM system properties
java -Dnl.selenium.proxy.mode=Design \
  -Ddriver=chromedriver \
  -Druntype=Landing \
  -Dimg=Sea.jpg \
  -jar Selenium3Usahidi1NeoloadChrome1.jar
Notice there is no “.exe” in the driver.
PROTIP: Bash shell scripts use the back-slash character (above the Enter/return key) to indicate line continuation.
- Open a Terminal window.
- Grant permissions:
chmod +x *.sh
- Run the sample Selenium Java program:
./Selenium3Usahidi1NeoloadChrome1.sh
Coding
Drivers for browsers go into the Reference folder.
Implicit waits: don’t use them, especially together with explicit waits, because mixing the two can cause unpredictable wait times.
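An explicit wait polls for one specific condition and gives up after a stated timeout. Selenium’s Python bindings provide this through WebDriverWait(driver, timeout).until(condition). The sketch below is not that class. It is a plain-Python illustration of the same polling idea, with a made-up wait_until helper and a simulated element, so that it runs without a browser.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Simulate an element that only "appears" on the third check.
attempts = {"count": 0}


def element_present():
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


found = wait_until(element_present, timeout=2.0, poll=0.01)
print(found, attempts["count"])
```

Each wait names the condition it depends on, so a slow page delays only the step that needs it.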
Cross platform
Windows vs. Mac vs. Linux
Cross browser
System.setProperty("webdriver.gecko.driver", "\\selenium-java-3.5.0"); // the value should be the path to the geckodriver executable
Obtain jars and drivers
There are several ways to assemble what Selenium needs:
a). Copy a working Selenium project (as shown above) that already has the files needed. Then edit it for your own uses. This is the quickest and simplest way.
b). Copy a set of jars from an in-house binary repository (such as Nexus or Artifactory).
https://selenium-release.storage.googleapis.com/index.html
c). Code in pom.xml and run Maven (or equivalent tool such as Ant).
d). Manual download from websites, which you’ll need to do when a new version comes along.
e). Build from source.
PROTIP: Download all the latest install files at one sitting to test compatibility of the set. Assemble them together in a single folder for copying into each Selenium project folder so each can stand alone when distributed.
This explains the driver operation.
Maven
How to Install Selenium WebDriver With Java And Maven On Mac OS X10 - a 13 Minute SpeedRun
Manual Download
If you’re using a sample script, skip this.
PROTIP: Download files for all operating systems so the scripts begin as cross-system capable (works on Windows, Mac, Linux).
- Under the “Selenium Standalone Server” section heading, click the link next to “Download version”:
Download version 3.5.0
- Click Save to save the file named with the same version number, such as selenium-server-standalone-3.5.0.jar, to the Downloads folder.
Java bindings
- Under the “Selenium Client & WebDriver Language Bindings” section heading, click the Download link associated with the programming language you use, such as Java.
- Click Save to save the file named with the same version number, selenium-java-3.5.0.zip, to the Downloads folder.
Windows Internet Explorer driver
- On Windows only, under “The Internet Explorer Driver Server” section heading, click the link 64-bit Windows IE.
- Click Save to save the file named with the same version number, such as IEDriverServer_x64_3.5.0.zip, to the Downloads folder.
Mac Safari WebDriver
This section is only applicable on an Apple Mac.
Under the “Safari” section heading in the Selenium webpage, the SafariDriver.safariextz file is deprecated. So no need to download it.
Starting with Safari 10 that comes with OS X El Capitan and macOS Sierra, in 2016, WebKit running Java 1.8+ supports the new W3C WebDriver browser automation API. So Selenium automatically launches the driver without further configuration.
However, it needs to be enabled because it’s off by default.
- Open the Safari browser.
- Press command+, (comma) or, in the menu bar, select Safari, Preferences, Advanced tab (wheel icon).
- Check “Show Develop menu in menu bar” to see “Develop” appear in the menu bar behind the dialog. Exit the dialog.
- Click “Develop” on the menu bar and select “Allow Remote Automation”.
Authorize safaridriver to launch the webdriverd service which hosts the local web server.
- Open a Terminal window to run:
cd /usr/bin
./safaridriver -p 5678
BLAH: The response asks for a port number.
Usage: safaridriver [options]
-h, --help    Prints out this usage information.
-p, --port    Port number the driver should use. If the server is already running, the port cannot be changed. If port 0 is specified, a default port will be used.
-
Complete the authentication prompt.
-
To verify, run the Selenium script:
SeleniumSafariGoogleSearch1.java
Eclipse IDE
New Lib folder
- Within Eclipse, right-click on your project name to select New, Folder.
- Type in name “lib” (for library). Finish.
Server into lib folder
- Press command+tab to switch to the File Explorer window.
- Drag from within the Downloads folder file selenium-server-standalone-3.5.0.jar and drop it within the lib folder when the mouse turns into a “+” sign.
- Click OK to the pop-up dialog for Copy files.
Selenium Java into lib folder
- Press command+tab to switch to the File Explorer window.
- Within the Downloads folder, unzip by double-clicking on file selenium-java-3.5.0.zip.
On a Mac, this should result in the creation of folder selenium-java-3.5.0. But if you see a .cpgz file created, move that to trash and use another utility such as the “RAR Extractor” utility or unzip in a Terminal window.
- Dive into the folders to drag selenium-java-3.5.0.jar to the lib folder.
- Click OK to the pop-up dialog for Copy files.
Eclipse folder
- In Eclipse, press command + I or right-click on your project to select Properties.
- At Resource, Location is the path of the project.
-
Click on the door icon and a Finder window opens up.
new References folder for Browser drivers
- Double-click to expand selenium-java-3.5.0.zip.
- Drill down to the “selenium-3.5.0” folder.
- Drill down to that jar.
- Drag the client-combined-3.5.0-nodeps.jar into the lib folder. ???
- Choose Copy Files. OK.
- Drill down into the lib folder within the selenium-java-3.5.0 folder.
-
Select all the files.
NOTE: Previous versions did not include JUnit and others.
- Drag them all into the lib folder.
-
Choose Copy Files. OK.
Establish lib as References
To make Eclipse recognize the jar files:
- Highlight the two files in the lib folder.
Click the first file. Hold Ctrl (command on a Mac) while clicking the second file.
- Right-click for Build Path. Add to Build Path.
A Referenced Libraries item should appear under Package Explorer.
Download Chrome driver
- Open Chrome browser and finish work in all windows, as you’ll need to relaunch by the time this is done.
- Identify your version by clicking the three-dot icon at the upper-right, then select Help, About Chrome. This automatically updates the browser to the latest version. Take a break because this takes several minutes.
- Click “Relaunch” and wait until you see “Google Chrome is up to date”.
- Notice “Version 79.0.3945.79” or whatever it is when you do this.
- Navigate to:
http://sites.google.com/a/chromium.org/chromedriver/downloads
- Scroll down to the bottom of the page to click the highest number folder (above the LATEST_RELEASE links).
- Right-click the file for your operating system, such as:
chromedriver_mac64.zip
- Select “Save Link As…” in the pop-up.
- Click “Save” and wait for the download to finish.
- Click on the file to unzip it to file chromedriver.
- Open a new Finder window to /usr/local/bin.
- Drag the chromedriver file to that folder.
- Restart your Chrome.
-
Try your Selenium WebDriver code.
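If the browser fails to start, one thing worth checking is whether the driver and browser versions line up. The sketch below assumes the convention ChromeDriver has followed since version 73, where the driver’s major version number matches the major version of the Chrome release it supports. The helper names and the driver version strings are hypothetical examples.

```python
def major_version(version):
    """Return the leading number of a dotted version string as an int."""
    return int(version.split(".")[0])


def driver_matches_browser(browser_version, driver_version):
    """True when both versions share the same major number."""
    return major_version(browser_version) == major_version(driver_version)


# Browser version from the About Chrome page; driver versions are examples.
print(driver_matches_browser("79.0.3945.79", "79.0.3945.36"))
print(driver_matches_browser("79.0.3945.79", "78.0.3904.105"))
```

A False result means a driver for the browser’s current major version needs to be downloaded.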
Use Gecko Firefox driver
Video 26 May 2016
- Navigate to https://github.com/mozilla/geckodriver/releases
The “Latest release” is shown at the top. Earlier releases are below.
Mozilla also calls it the “Marionette Proxy”.
- Scroll down to the Downloads section and click on the file for your operating system.
geckodriver-v0.18.0-macos.tar.gz for Mac (1.31 MB)
geckodriver-v0.18.0-win64.zip for Windows
- Unzip the macos file for the geckodriver executable.
- Unzip the win64 file for the geckodriver.exe executable.
- Copy the geckodriver file to the /usr/local/bin folder.
- Restart your system and try your Selenium WebDriver code as shown in video.
Headless drivers
Ghost Driver or PhantomJS turns Selenium “headless”, accessing the DOM.
HTML Unit from https://selenium-release.storage.googleapis.com/index.html selenium-html-runner-3.5.0.jar
Junit into lib folder
- Navigate to http://sourceforge.net/projects/junit/files/junit
- Click the text to the right of “Looking for the latest version?”, such as Download junit-4.10.jar (253.2 kB).
This downloads file junit-4.10.jar (from 2011).
- Drag from within the Downloads folder file junit-4.10.jar and drop it within the lib folder when the mouse turns into a “+” sign.
New Class
- In the Package Explorer, right-click on “src”. Select New, Java Class.
- For Package, type “com.demo.testcases”.
- For Name, type “SeleniumFirefoxDemo1”.
- Check “public static void main(String[] args)”. Finish.
IntelliJ IDE
VIDEO: How to Install Java, Maven and IntelliJ on Apple Mac
Java Coding
TODO: Annotations
TODO: Random number
Add-on functionality
TestNG
TestNG has more built-in annotations than JUnit, making testing easier.
TestNG requires a download from http://testng.org/
Its @DataProvider and parameters enable data-driven testing.
JUnit does not generate an HTML report. But TestNG generates an XSLT report.
- https://www.youtube.com/watch?v=OTtFSnZY4f8
Logging
VIDEO on emitting industry-standard logs.
VIDEO on Advanced topics.
import org.apache.log4j.Logger;
public class LogDemo {
public static void main(String[] args){
Logger logger=Logger.getLogger("LogDemo"); // the class name
}
}
- https://www.youtube.com/watch?v=0UQ9pAlY3qg
Read CSV files
To get your Selenium Java code to read CSV files:
Specify the dependency in Maven, Ant, etc. or
-
Download from https://mvnrepository.com/artifact/net.sf.opencsv/opencsv/
opencsv-2.3.jar
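- Add in your Java code:
// CsvToBeanBuilder is in the com.opencsv.bean package of newer opencsv releases, not in the 2.3 jar
List<Visitors> beans = new CsvToBeanBuilder<Visitors>(new FileReader("yourfile.csv"))
    .withType(Visitors.class)
    .build()
    .parse();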
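For the Python scripts covered later in this article, the standard-library csv module does a similar job of turning rows into objects. A minimal sketch follows. The Visitor fields and the sample rows are invented for illustration, and io.StringIO stands in for an opened file.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Visitor:
    """One row of test data; the field names are hypothetical."""
    name: str
    email: str


def read_visitors(handle):
    """Turn each CSV row (keyed by the header line) into a Visitor."""
    return [Visitor(**row) for row in csv.DictReader(handle)]


# io.StringIO stands in for open("yourfile.csv", newline="").
sample = io.StringIO("name,email\nAda,ada@example.com\nBob,bob@example.com\n")
visitors = read_visitors(sample)
print(visitors[0])
```

Each Visitor can then feed one iteration of a data-driven test.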
Read Excel files
Video tutorials Data driven framework, Read, Write, BLOG
To get your Selenium Java code to read Excel files:
- Download Apache POI, the Java library for Microsoft documents, from
https://poi.apache.org/download.html
Choose the binary distribution for the latest stable version, such as poi-bin-3.16-20170419.tar.gz
Unzip using RAR so you don’t extract
-
Select the file under the HTTP heading.
Rock Stars
Simon Stewart (@shs96c, blog.rocketpoweredjetpants.com) invented WebDriver and is now its Lead Committer.
- In State of the Union at SeleniumConf Austin, 5 April 2017, he says the original vision was:
FIT Test > FIT Fixture > Page Objects > WebDriver
“Definitely do not use FIT”
“We test so that when we release software, we are confident it works, as early as possible.”
- Automation Best Practices 14 Jul 2016
Video Tutorials
http://learn-automation.com
- Selenium WebDriver Eclipse Java Project Setup: For the absolute beginner on Windows with Java 1.6 [13:58]
https://www.youtube.com/watch?v=ZUM9jEhLie0
https://www.youtube.com/watch?v=E3hKgb4aLHM
https://www.youtube.com/watch?v=X8Xw7FWw49E
https://www.dropbox.com/s/inuirqwhlr3w7zf/slides.pdf?dl=0 Dave Hoeffer
https://www.youtube.com/watch?v=zylSll8hsPs
https://www.youtube.com/watch?v=nq97dfaVmC4
CA DevTest
DevTest
SauceLab Cloud
-
Open an account at SauceLab.com.
- Set environment variables:
SAUCE_USERNAME
SAUCE_ACCESS_KEY
SELENIUM_BROWSER
SELENIUM_VERSION
SELENIUM_PLATFORM
NAME
BUILD
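A script can read these variables at start-up and fail early if the credentials are missing. The sketch below makes two assumptions: that only the two SAUCE_ variables are mandatory, and that the rest are optional. The sauce_settings helper is a made-up name, and the demo values are placeholders rather than real credentials.

```python
import os

REQUIRED = ["SAUCE_USERNAME", "SAUCE_ACCESS_KEY"]   # assumed mandatory
OPTIONAL = ["SELENIUM_BROWSER", "SELENIUM_VERSION", "SELENIUM_PLATFORM",
            "NAME", "BUILD"]                        # assumed optional


def sauce_settings(env=os.environ):
    """Collect the variables above, failing early if credentials are missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    settings.update({name: env[name] for name in OPTIONAL if env.get(name)})
    return settings


# Demonstrate with a stand-in environment rather than real credentials.
demo_env = {"SAUCE_USERNAME": "demo-user", "SAUCE_ACCESS_KEY": "demo-key",
            "SELENIUM_BROWSER": "chrome"}
settings = sauce_settings(demo_env)
print(sorted(settings))
```

Calling sauce_settings() with no argument reads the real process environment.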
Python Web scraping
This is based on Pluralsight’s 1h 7m video course “Scraping Dynamic Web Pages with Python and Selenium” by Pratheerth Padman, which covers use of Python invoked by Jupyter Notebook. Python has the library Beautiful Soup (to scrape HTML and XML from web pages) and Selenium 2.0 WebDriver (to emulate keyboard and mouse movements based on JSON commands).
- Install Python with virtualenv and Anaconda.
- Install Selenium for Python3:
pip3 install selenium
Sample response:
Collecting selenium
Downloading https://files.pythonhosted.org/packages/80/d6/4294f0b4bce4de0abf13e17190289f9d0613b0a44e5dd6a7f5ca98459853/selenium-3.141.0-py2.py3-none-any.whl (904kB)
|████████████████████████████████| 911kB 507kB/s
Requirement already satisfied: urllib3 in /Users/wilson_mar/Library/Python/3.7/lib/python/site-packages (from selenium) (1.25.6)
Installing collected packages: selenium
Successfully installed selenium-3.141.0
- Download Chrome driver.
-
Download Beautiful Soup 4 for Python
pip3 install beautifulsoup4
Sample response:
Collecting beautifulsoup4
Downloading https://files.pythonhosted.org/packages/3b/c8/a55eb6ea11cd7e5ac4bacdf92bac4693b90d3ba79268be16527555e186f0/beautifulsoup4-4.8.1-py3-none-any.whl (101kB)
|████████████████████████████████| 102kB 528kB/s
Collecting soupsieve>=1.2
Using cached https://files.pythonhosted.org/packages/81/94/03c0f04471fc245d08d0a99f7946ac228ca98da4fa75796c507f61e688c2/soupsieve-1.9.5-py2.py3-none-any.whl
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.8.1 soupsieve-1.9.5
- Download my repository containing Jupyter Notebooks (they are in its Selenium folder):
git clone https://github.com/wilsonmar/DevSecOps
-
Open a Terminal and navigate to that folder. See https://wilsonmar.github.io/jupyter
Call Selenium from Python
- In Jupyter Notebook, run file demo_mod2.ipynb, which opens a hard-coded web page, then quits.
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://www.pluralsight.com/")
driver.quit()  # close window
To set browser options and arguments (as if manually in Settings) before opening:
options = webdriver.ChromeOptions()
options.add_argument("--ignore-certificate-errors")
options.add_argument("--incognito")
options.add_argument("--headless")  # run without opening a visible window
driver = webdriver.Chrome(options=options)
- Run file demo_mod3 -2.ipynb, which manipulates iframes:
driver.switch_to.frame("frame name")
# Switch back:
driver.switch_to.default_content()
To handle popups:
alert = driver.switch_to.alert  # switch_to_alert() is the older, deprecated form
-
Run file demo_mod4.ipynb in Jupyter to use the soup class to scrape quotes attributed to Soccer player Wayne Rooney from premierleague.com.
NOTE: There is an error in this script.
- Additional Python code that can be added in Jupyter for other work can include Machine Learning such as sentiment analysis. The course author’s personal GitHub
Additional information:
https://www.pythonforbeginners.com/beautifulsoup/beautifulsoup-4-python
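As a rough picture of the parsing step, the sketch below pulls quote text out of an HTML string. It deliberately swaps in Python’s built-in html.parser module for Beautiful Soup so that it runs with no extra packages. With Beautiful Soup installed, BeautifulSoup(html, "html.parser").find_all("blockquote") would do the equivalent search. The sample page and the choice of the blockquote tag are assumptions, not the structure of premierleague.com.

```python
from html.parser import HTMLParser


class QuoteCollector(HTMLParser):
    """Collect the text inside every <blockquote> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # greater than 0 while inside a <blockquote>
        self.quotes = []

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.depth += 1
            if self.depth == 1:
                self.quotes.append("")

    def handle_endtag(self, tag):
        if tag == "blockquote" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.quotes[-1] += data


# In a real run this string would come from Selenium's driver.page_source.
PAGE = """
<html><body>
  <h1>Quotes</h1>
  <blockquote>First sample quote.</blockquote>
  <p>Unrelated text.</p>
  <blockquote>Second <em>sample</em> quote.</blockquote>
</body></html>
"""

collector = QuoteCollector()
collector.feed(PAGE)
quotes = [q.strip() for q in collector.quotes]
print(quotes)
```

Selenium renders the dynamic page, and the parser then works on the resulting HTML as ordinary text.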
Other info
NOTE: All “Selenium3” samples require Java 1.8+.
PROTIP: A number is included with each component to provide for version control, since everything changes all the time in IT.
https://www.gridlastic.com/java-code-example.html
macos-install-all/tests/firefox_pycon_search.py