
Wilson Mar




How to emulate real users touching your web apps using Python controlling Selenium and Beautiful Soup for reading HTML



This article contains notes on installing, coding, and running Selenium.

  1. Run in Google Cloud a Docker image containing Selenium and associated software.
  2. Run Docker image containing Selenium built using Ansible scripts that created the image.
  3. Run after download from GitHub, then invoke “as is” using Maven

  4. Explain sample files.
  5. Invoke a run using starter samples.
  6. Run across various browsers - Firefox browser, IE, etc.
  7. Obtain jars and drivers if they have changed.
  8. Adapt a basic starter (Java in Selenium driving Chrome).

  9. Upload results to SonarQube.
  10. Add CSV data processing
  11. Add Excel data processing
  12. Add OpenCV (via SikuliX2) to recognize portions of pictures
  13. Add Tesseract to extract text from pictures (OCR = Optical Character Recognition)

  14. Run by CA DevTest
  15. Run in Sauce Labs Cloud

Some misconceptions

When we mention “Selenium” and “performance testing” in the same sentence, the first thing that most people say is “you can only run a few users on a machine”.

But I’m not talking about running several browser instances on each single machine.

I’m talking about running Selenium once while another program converts what goes back and forth over the network into a script.

The secret sauce is …

Running example

Selenium has no GUI. It runs as a console program.

However, reports are produced by TestNG, a testing framework used alongside Selenium.

Docker images

Docker images containing Selenium server:


Ansible task files to establish Selenium server:

  • https://github.com/arknoll/ansible-role-selenium
  • https://github.com/quarkslab/ansible-selenium-server
  • https://mtlynch.io/testing-ansible-selenium/

Run using Maven after GitHub

  1. Install Maven.
  2. Install Selenium.
  3. Install the various browsers Selenium will control (Chrome, Firefox, etc.).
  4. Navigate to or create a folder to hold a new folder to be created by Git.
  5. Clone from GitHub a repository containing sample tests:

    git clone https://github.com/wilsonmar/Selenium-samples
    cd Selenium-samples

  6. Select a folder:

    cd Python-soup

    Look at the root layer of the repository. These files are there for use with Eclipse IDE:

    • .classpath
    • .project
    • .settings
    • .metadata

    Some prefer to add them in .gitignore so they are not in the repo.

    .gitignore .DS_Store

    .DS_Store files should be ignored. They are created by macOS. There is an entry for them in the .gitignore file so they are not stored in GitHub.


  7. Invoke Maven to download dependencies and run Selenium:

    mvn clean verify -Pbrowser-phantomjs

    Maven creates a folder named target to receive downloads before starting Selenium.

    The -Pbrowser-phantomjs option activates the Maven profile that selects the PhantomJS headless browser. Alternatively, select another browser:

    mvn clean verify -Pbrowser-chrome

    mvn clean verify -Pbrowser-firefox

    mvn clean verify -Pbrowser-edge

    mvn clean verify -Pbrowser-internet-explorer

    mvn clean verify -Pbrowser-opera

    PROTIP: Several separate runs are needed to test on several browsers.

    At the end of the run, you should see:


View Sample Selenium scripts

  1. If you’re not using an IDE, use the Atom text editor to open a folder list:

    atom .

  2. View the pom.xml file.

    Each browser handled by Selenium is identified by a profile with a property within the pom.xml file, which also tells Maven what dependencies to download and how to run Selenium.

    PROTIP: If you work within enterprise firewalls, change the external URLs to internal ones, which may be managed within Nexus or Artifactory servers.

  3. Dive into folder to view:

    Here is where Maven knows to download drivers.

    PROTIP: The LATEST is specified. But a specific version would ensure that all drivers downloaded are the ones previously tested to work with each other. Specific versions can be specified with java invocation:


    See https://github.com/bonigarcia/webdrivermanager

  4. Dive into the src folder.

    Notice there is a main and a test folder. Under each is a folder path:

    The end of that path under main contains a TestUtils.java file, which defines generic Java utility functions such as randomBetween, isDuplicatePresent, and isAllEquals.

    Selenium is all about testing, so the end of the path under test contains many more java files to control the browser.

    test/java/selenium/configurations/TestConfig.java makes use of properties browser.name and base_url retrieved by variables in

    PROTIP: Properties controlling a specific test are defined in properties files rather than hard-coded into code so that different properties can be used during a run by temporarily replacing the file.


    Tests are driven by a wrapper that app-specific test code extends:

  5. Look in SeleniumTestWrapper.java, which defines an abstract class that is extended by other code.

    Code in the file controls agent strings and cookies that browsers automatically send back to servers.

    The code also manages the screen dimensions of the browser window.

    These enable app-specific test code to focus on business. Under the Annotations folder are files that define code generation by the Java compiler.



    Each Java annotation (@Before, @Rule, and @After) is defined within imported code:

    import org.junit.Before;
    import org.junit.Rule;
    import org.junit.After;

    The imported annotation code adds functionality such as logging. These annotations also provide metadata (data about data).

    Instead of using Java inheritance, Java frameworks Spring and Hibernate use AOP (Aspect Oriented Programming) to provide a mechanism to inject code for preProcessing and postProcessing for an event. AOP provides a “hook” before and after a method executes, so consumer code can run in those places.

    App-specific Tests

  6. Edit the file defined to test an app:


    The file contains annotations:


    These annotations are defined by imports:

    import selenium.utils.annotations.browser.BrowserDimension;
    import selenium.utils.annotations.browser.Browser;

    The earlier version, 2.53 and below, used these libraries:

    import org.openqa.selenium.Dimension;

    The earlier 2.x code was:

         WebDriver driver = new FirefoxDriver();
         Dimension d = new Dimension(420,600);
         //Resize the current window to the given dimension
         driver.manage().window().setSize(d);

    Maximize window size

    QUESTION: How can we read the code in these annotation libraries?

    Page Objects

    The core driving code refers to definitions within the pageobjects folder:

    StartPage startPage = PageFactory.initElements(getDriver(), StartPage.class);
    HeaderSearch search = PageFactory.initElements(getDriver(), HeaderSearch.class);
    SearchResultPage searchResultPage = PageFactory.initElements(getDriver(), SearchResultPage.class);

    See Selenium Design Considerations

    The Page Object design pattern keeps locator definitions in a separate Java file so that tests only need to refer to a page object rather than redundantly specifying how to locate objects on each page. This reduces maintenance over time as the website HTML changes. Change the locator technique in one place and all tests are good again.


    The basic rule is that tests don’t declare variables (to manage state on their own), manipulate the DOM directly, nor create objects (using the “new” constructor keyword).
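The pattern is the same in any language. Below is a minimal Python sketch of it, not the repository's Java code. The class name HeaderSearch is borrowed from the PageFactory lines above, but the locator values and the FakeDriver and FakeElement stand-ins are hypothetical. They exist only so the sketch runs without a browser. A real Selenium WebDriver could be passed in instead, because it also exposes find_element(by, value).

```python
class HeaderSearch:
    """Hypothetical page object for a site-wide search box."""

    # Locators are defined once here; tests never repeat them.
    SEARCH_INPUT = ("id", "search-input")    # assumed element id
    SEARCH_BUTTON = ("id", "search-submit")  # assumed element id

    def __init__(self, driver):
        # driver is anything with find_element(by, value),
        # such as a Selenium WebDriver instance.
        self.driver = driver

    def search_for(self, text):
        """Type a query into the search box and submit it."""
        self.driver.find_element(*self.SEARCH_INPUT).send_keys(text)
        self.driver.find_element(*self.SEARCH_BUTTON).click()


class FakeElement:
    """Stand-in element that records what was done to it."""

    def __init__(self, log, name):
        self.log = log
        self.name = name

    def send_keys(self, text):
        self.log.append((self.name, "send_keys", text))

    def click(self):
        self.log.append((self.name, "click"))


class FakeDriver:
    """Stand-in driver so the sketch runs without a browser."""

    def __init__(self):
        self.log = []

    def find_element(self, by, value):
        return FakeElement(self.log, value)


# The "test" only talks to the page object, never to locators.
driver = FakeDriver()
HeaderSearch(driver).search_for("selenium")
print(driver.log)
```

If the search box's id changes, only the two locator constants change; every test that calls search_for keeps working.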

    Webdriver Install

    There are Web Driver programs for each combination of operating system (macOS, Windows, Linux, etc.), internet browser (Chrome, Firefox, etc.) and each version (78, 79, etc.). This means the Web Driver you install today would likely be obsolete when a new version of the browser is automatically installed on your machine.

    PROTIP: There are two ways to install Selenium Web Driver. One is to manually install for whatever is your current version. The other is to use a Webdriver package manager.
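To illustrate why a driver goes stale, here is a small Python sketch that compares the major version of a browser with that of its driver. It assumes the rule that applies to Chrome and ChromeDriver (the major versions must match); other browsers may follow different compatibility rules. The function names and the sample driver version strings are illustrative only.

```python
import re


def major_version(version_text):
    """Return the leading integer of a dotted version found in the text."""
    match = re.search(r"(\d+)(?:\.\d+)+", version_text)
    if match is None:
        raise ValueError(f"no version number in {version_text!r}")
    return int(match.group(1))


def driver_matches_browser(browser_version, driver_version):
    """True when browser and driver report the same major version."""
    return major_version(browser_version) == major_version(driver_version)


# Sample strings in the style shown by "About Chrome" and the driver.
print(driver_matches_browser("Version 79.0.3945.79", "ChromeDriver 79.0.3945.36"))
print(driver_matches_browser("Version 79.0.3945.79", "ChromeDriver 78.0.3904.105"))
```

The first call prints True and the second prints False, which is the situation you land in after the browser updates itself but the driver does not.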

    Webdriver Manager

    Professor Boni Garcia (in Madrid, Spain) took the initiative to create and maintain https://github.com/bonigarcia/webdrivermanager

    • Checks for the latest version of the WebDriver binary
    • Downloads the WebDriver binary if it’s not present on your system
    • Exports the required WebDriver Java environment variables needed by Selenium

    Its automation makes use of the Maven utility commonly used with Java.

    The WebDriver can download files from an open source repository:



Obtain Selenium to test a sample call to Google Search:

  • Selenium3Hello1 is used to verify whether the Selenium core install works. It doesn’t use any browser driver.

  • Selenium3FirefoxGoogleSearch1

  • Selenium3GoogleSearch1 works on multiple browsers.

Look into the folder:

NOTE: It is not best practice, but the sample scripts in my GitHub contain binary files copied from binary repositories.

File Sea.jpg is required only by the sample program from Neotys.

PROTIP: For team/production coding, place photos in a folder such as pics.

The file is placed at the project’s root folder to make it easy to specify its path.

File chromedriver.exe and other browser driver files are at the root of each script folder.

The more correct way is to specify the files (and their specific versions) in a pom.xml file that points to the location of those external dependencies, and then have each user of the repository run Maven to obtain those files.


Nevertheless, the drivers are included so you can get going quickly.

Invoke sample command

On Windows:

It is assumed that the Java program is within the PATH that the operating system searches for executables.

  1. View the sample command file Selenium3Chrome1.bat

    REM cd Selenium3Usahidi1NeoloadChrome1
    REM Selenium3 must use Java 1.8+
    javac         Selenium3Usahidi1NeoloadChrome1.java
    java.exe -jar Selenium3Usahidi1NeoloadChrome1.jar ^
    -Dnl.selenium.proxy.mode=Design ^
    -Ddriver=chromedriver.exe ^
    -Druntype=Landing ^

    PROTIP: The Windows command prompt (cmd.exe) uses the caret ^ (Shift + 6) character to indicate line continuation.

  2. Open a Command Prompt window.
  3. Run the sample Selenium Java program on a Windows machine:


On Macs & Linux

  1. View the sample command file Selenium3Chrome1.sh

    # cd Selenium3Usahidi1NeoloadChrome1
    # Selenium3 must use Java 1.8+
    javac     Selenium3Usahidi1NeoloadChrome1.java
    java -jar Selenium3Usahidi1NeoloadChrome1.jar \
    -Dnl.selenium.proxy.mode=Design \
    -Ddriver=chromedriver \
    -Druntype=Landing \

    Notice there is no “.exe” in the driver.

    PROTIP: Bash shell scripts use the back-slash character (above the Enter/return key) to indicate line continuation.

  2. Open a Terminal window.
  3. Grant permissions:

    chmod +x *.sh

  4. Run the sample Selenium Java program:



Drivers for browsers go into the Reference folder.

Implicit waits: don’t use them, especially together with explicit waits. Mixing the two can cause unpredictable wait times.
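An explicit wait polls for one specific condition until it holds or a timeout expires. The Python sketch below shows that idea with the standard library only, so it runs without a browser. The names wait_until and page_ready are hypothetical. In real Selenium Python code, WebDriverWait(driver, timeout).until(...) plays this role.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll condition() until it returns a truthy value or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Usage: a stand-in condition that becomes true on the third poll.
attempts = []


def page_ready():
    attempts.append(1)
    return len(attempts) >= 3


wait_until(page_ready, timeout=2.0, poll_interval=0.01)
print(len(attempts))
```

Because the wait is tied to a named condition, a failure says exactly what the test was waiting for, which a blanket implicit wait cannot do.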

Cross platform

Windows vs. Mac vs. Linux
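One small cross-platform difference is the driver file name: Windows needs the “.exe” suffix and the others do not. A script can pick the name at run time instead of keeping separate .bat and .sh variants. This is a minimal Python sketch; the function name driver_filename is hypothetical.

```python
import platform


def driver_filename(base_name, system=None):
    """Return the driver executable name for the given operating system."""
    # platform.system() returns "Windows", "Darwin" (macOS), or "Linux".
    system = system or platform.system()
    return base_name + ".exe" if system == "Windows" else base_name


print(driver_filename("chromedriver", "Windows"))  # chromedriver.exe
print(driver_filename("chromedriver", "Darwin"))   # chromedriver
print(driver_filename("geckodriver"))              # depends on this machine
```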

Cross browser


    System.setProperty("webdriver.gecko.driver", "\\selenium-java-3.5.0");

Obtain jars and drivers

There are several ways to assemble what Selenium needs:

a). Copy a working Selenium project (as shown above) that already has the files needed. Then edit it for your own uses. This is the quickest and simplest way.

b). Copy a set of jars from an in-house binary repository (such as Nexus or Artifactory).


c). Code dependencies in pom.xml and run Maven (or an equivalent tool such as Ant).

d). Manual download from websites, which you’ll need to do when a new version comes along.

e). Build from source.

PROTIP: Download all the latest install files at one sitting to test compatibility of the set. Assemble them together in a single folder for copying into each Selenium project folder so each can stand alone when distributed.

This explains the driver operation.


How to Install Selenium WebDriver With Java And Maven On Mac OS X10 - a 13 Minute SpeedRun


Manual Download

If you’re using a sample script, skip this.

PROTIP: Download files for all operating systems so the scripts begin as cross-system capable (works on Windows, Mac, Linux).

  1. Go to http://docs.seleniumhq.org/download

  2. Under the “Selenium Standalone Server” section heading, click the link next to “Download version”:

    Download version 3.5.0

  3. Click Save of file named with the same version number, such as:
    to the Downloads folder.

    Java bindings

  4. Under the “Selenium Client & WebDriver Language Bindings” section heading, click the Download link associated with the programming language you use, such as Java.

  5. Click Save of file named with the same version number:
    to the Downloads folder.

    Windows Internet Explorer driver


  6. On Windows only, under “The Internet Explorer Driver Server” section heading, click the link 64-bit Windows IE

  7. Click Save of file named with the same version number, such as:
    to the Downloads folder.

    Mac Safari WebDriver

    This section is only applicable on an Apple Mac.


    Under the “Safari” section heading in the Selenium webpage, the SafariDriver.safariextz file is deprecated. So no need to download it.

    Starting with Safari 10 that comes with OS X El Capitan and macOS Sierra, in 2016, WebKit running Java 1.8+ supports the new W3C WebDriver browser automation API. So Selenium automatically launches the driver without further configuration.

    However, it needs to be enabled because it’s off by default.

  8. Open the Safari browser.
  9. Press command+, (comma) or in the menu bar, select Safari Preferences Advanced tab (wheel icon).
  10. Check “Show Develop menu in menu bar” to see “Develop” appear in the menu bar behind the dialog. Exit the dialog.
  11. Click “Develop” on the menu bar and select “Allow Remote Automation”.

    Authorize safaridriver to launch the webdriverd service which hosts the local web server.

  12. Open a Terminal window to run:

    cd /usr/bin
    ./safaridriver -p 5678

    BLAH: The response asks for a port number.

    Usage: safaridriver [options]
    -h, --help                Prints out this usage information.
    -p, --port                Port number the driver should use. If the server
                              is already running, the port cannot be changed.
                              If port 0 is specified, a default port will be used.
  13. Complete the authentication prompt.

  14. To verify, run the Selenium script:


Eclipse IDE

New Lib folder

  1. Within Eclipse, right-click on your project name to select New, Folder.
  2. Type in name “lib” (for library). Finish.

    Server into lib folder

  3. Press command+tab to switch to the File Explorer window.
  4. Drag from within the Downloads folder file
    and drop it within the lib folder when the mouse turns into a “+” sign.
  5. Click OK to the pop-up dialog for Copy files.

    Selenium Java into lib folder

  6. Press command+tab to switch to the File Explorer window.
  7. Within the Downloads folder, unzip by double-clicking on file

    On a Mac, this should result in the creation of folder
    selenium-java-3.5.0. But if you see a .cpgz file created, move that to trash and use another utility such as “RAR Extractor” or unzip in a Terminal window.

  8. Dive into the folders to drag
    to the lib folder.

  9. Click OK to the pop-up dialog for Copy files.

    Eclipse folder

  10. In Eclipse, press command + I or right-click on your project to select Properties.
  11. At Resource, Location is the path of the project.
  12. Click on the door icon and a Finder window opens up.

    new References folder for Browser drivers


  13. Double-click to expand selenium-java-3.5.0.zip.
  14. Drill down to the “selenium-3.5.0” folder.
  15. Drill down to that jar.
  16. Drag the client-combined-3.5.0-nodeps.jar into the lib folder. ???
  17. Choose Copy Files. OK.
  18. Drill down into the lib folder within the selenium-java-3.5.0 folder.
  19. Select all the files.

    NOTE: Previous versions did not include JUnit and others.

  20. Drag them all into the lib folder.
  21. Choose Copy Files. OK.

    Establish lib as References


    To make Eclipse recognize the jar files:

  22. Highlight the two files in the lib folder.

    Click the first file. Hold Ctrl while clicking the second file.

  23. Right-click for Build Path. Add to Build Path.

    A Referenced Libraries item should appear under Package Explorer.

    Download Chrome driver


  24. Open Chrome browser, finish work on all windows as you’ll need to relaunch by the time this is done.


  25. Identify your version by clicking the three-dot icon at the upper-right, select Help, About Chrome. This would automatically update the browser to the latest version. Take a break because this takes several minutes.
  26. Click “Relaunch” and wait until you see “Google Chrome is up to date”.
  27. Notice “Version 79.0.3945.79” or whatever it is when you do this.

  28. Navigate to:


  29. Scroll down to the bottom of the page to click the highest number folder (above the LATEST_RELEASE links).
  30. Right-Click the file for your operating system, such as:

  31. Select “Save Link As…” in the pop-up.
  32. Click “Save” and wait for the download to finish.

  33. Click on the file to unzip it to file chromedriver.
  34. Open a new Finder window to /usr/local/bin.
  35. Use your mouse to drag to move the chromedriver file to that folder.

  36. Restart your Chrome.
  37. Try your Selenium WebDriver code.

    Use Gecko Firefox driver

    Video 26 May 2016

  38. Navigate to

    The “Latest release” is shown at the top. Earlier releases are below.

    Mozilla also calls it the “Marionette Proxy”.

  39. Scroll down to the Downloads section and click on the file for your operating system.

    geckodriver-v0.18.0-macos.tar.gz for Mac (1.31 MB)

    geckodriver-v0.18.0-win64.zip for Windows

  40. Unzip the macos file for the geckodriver executable.

  41. Unzip the win64 file for the geckodriver.exe executable.

  42. Copy the geckodriver file to the /usr/local/bin folder.
  43. Restart your system and try your Selenium WebDriver code as shown in video.

    Headless drivers

    Ghost Driver or PhantomJS turns Selenium “headless”, accessing the DOM.

    HTML Unit from https://selenium-release.storage.googleapis.com/index.html selenium-html-runner-3.5.0.jar

    Junit into lib folder

  44. Navigate to
  45. Click the text to the right of “Looking for the latest version?”, such as Download junit-4.10.jar (253.2 kB).

    This downloads file junit-4.10.jar (from 2011).

  46. Drag from within the Downloads folder file
    and drop it within the lib folder when the mouse turns into a “+” sign.

    New Class

  47. In the Package Explorer, right-click on “src”. Select New, Java Class.
  48. For Package, type “com.demo.testcases”.
  49. For Name, type “SeleniumFirefoxDemo1”.
  50. Check “public static void main(String[] args)”. Finish.

IntelliJ IDE

VIDEO: How to Install Java, Maven and IntelliJ on Apple Mac

Java Coding

TODO: Annotations

TODO: Random number

Add-on functionality

TestNG


TestNG has more built-in annotations than JUnit, making testing easier.

TestNG requires a download from http://testng.org/

Its @DataProvider and parameters enable data-driven testing.

JUnit does not generate an HTML report, but TestNG generates an XSLT report.

  • https://www.youtube.com/watch?v=OTtFSnZY4f8

Logging

VIDEO on emitting industry-standard logs.

VIDEO on Advanced topics.

   import org.apache.log4j.Logger;

   public class LogDemo {
      public static void main(String[] args) {
         Logger logger = Logger.getLogger("LogDemo");  // the class name
         logger.info("Test started");  // illustrative message
      }
   }

  • https://www.youtube.com/watch?v=0UQ9pAlY3qg

Read CSV files

To get your Selenium Java code to read CSV files:

Specify the dependency in Maven, Ant, etc. or

  1. Download from https://mvnrepository.com/artifact/net.sf.opencsv/opencsv/


  2. Add in your Java code:

      List<MyBean> beans = new CsvToBeanBuilder<MyBean>(new FileReader("yourfile.csv"))
            .withType(MyBean.class).build().parse();
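For the Python route covered later in this article, the standard library’s csv module can feed test data the same way. This is a minimal sketch under assumptions: the column names (username, password) and the inline sample data are hypothetical stand-ins for yourfile.csv.

```python
import csv
import io

# Inline sample standing in for yourfile.csv (hypothetical columns).
SAMPLE = "username,password\nalice,secret1\nbob,secret2\n"


def read_rows(csv_file):
    """Return each CSV row as a dict keyed by the header line."""
    return list(csv.DictReader(csv_file))


rows = read_rows(io.StringIO(SAMPLE))
for row in rows:
    # Each row could drive one iteration of a data-driven test.
    print(row["username"], row["password"])
```

To read a real file, pass open("yourfile.csv", newline="") instead of the StringIO object.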

Read Excel files

Video tutorials Data driven framework, Read, Write, BLOG

To get your Selenium Java code to read Excel files:

  1. Download Apache POI - the Java library for Microsoft documents from
    the binary distribution for the latest stable version, such as

  2. Unzip using RAR so you don’t extract

  3. Select the file under the HTTP heading.

Rock Stars

Simon Stewart (@shs96c, blog.rocketpoweredjetpants.com) invented WebDriver and is now its Lead Committer.

FIT Test > FIT Fixture > Page Objects > WebDriver

“Definitely do not use FIT”

“We test so that when we release software, we are confident it works, as early as possible.”

Sauce Labs

Video Tutorials





https://www.dropbox.com/s/inuirqwhlr3w7zf/slides.pdf?dl=0 Dave Hoeffer



CA DevTest


Sauce Labs Cloud

  1. Open an account at saucelabs.com.

  2. Set environment variables SAUCE_USERNAME and SAUCE_ACCESS_KEY.
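A test script can read those two variables rather than hard-coding credentials. The Python sketch below builds a remote WebDriver URL from them. The function name sauce_remote_url is hypothetical and the endpoint host is an assumption, so check the Sauce Labs documentation for the address of your data center.

```python
import os


def sauce_remote_url(env=None):
    """Build a remote WebDriver URL from Sauce Labs credentials."""
    env = os.environ if env is None else env
    try:
        user = env["SAUCE_USERNAME"]
        key = env["SAUCE_ACCESS_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"set {missing.args[0]} before running") from None
    # Assumed endpoint; verify it against the Sauce Labs documentation.
    return f"https://{user}:{key}@ondemand.saucelabs.com:443/wd/hub"


# Usage with made-up credentials passed in explicitly.
print(sauce_remote_url({"SAUCE_USERNAME": "demo", "SAUCE_ACCESS_KEY": "abc123"}))
```

The resulting URL is the kind of value passed as command_executor to Selenium’s webdriver.Remote. Failing early with a clear message is deliberate: a missing variable is easier to diagnose here than as an authentication error from the cloud.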



Python Web scraping

This is based on Pluralsight’s 1h 7m video course “Scraping Dynamic Web Pages with Python and Selenium” by Pratheerth Padman, which covers use of Python invoked by Jupyter Notebook. Python has the libraries Beautiful Soup (to scrape HTML and XML from web pages) and Selenium 2.0 WebDriver (to emulate keyboard and mouse movements based on JSON commands).

  1. Install Selenium for Python3 (covered above)
  2. Install Python with virtualenv and Anaconda.

    pip3 install selenium

    Sample response:

    Collecting selenium
      Downloading https://files.pythonhosted.org/packages/80/d6/4294f0b4bce4de0abf13e17190289f9d0613b0a44e5dd6a7f5ca98459853/selenium-3.141.0-py2.py3-none-any.whl (904kB)
      |████████████████████████████████| 911kB 507kB/s
    Requirement already satisfied: urllib3 in /Users/wilson_mar/Library/Python/3.7/lib/python/site-packages (from selenium) (1.25.6)
    Installing collected packages: selenium
    Successfully installed selenium-3.141.0
  3. Download Chrome driver.
  4. Download Beautiful Soup 4 for Python

    pip3 install beautifulsoup4

    Sample response:

    Collecting beautifulsoup4
      Downloading https://files.pythonhosted.org/packages/3b/c8/a55eb6ea11cd7e5ac4bacdf92bac4693b90d3ba79268be16527555e186f0/beautifulsoup4-4.8.1-py3-none-any.whl (101kB)
      |████████████████████████████████| 102kB 528kB/s
    Collecting soupsieve>=1.2
      Using cached https://files.pythonhosted.org/packages/81/94/03c0f04471fc245d08d0a99f7946ac228ca98da4fa75796c507f61e688c2/soupsieve-1.9.5-py2.py3-none-any.whl
    Installing collected packages: soupsieve, beautifulsoup4
    Successfully installed beautifulsoup4-4.8.1 soupsieve-1.9.5
  5. Download my repository containing Jupyter Notebooks:

    git clone https://github.com/wilsonmar/DevSecOps
    cd DevSecOps/Selenium
  6. Open a Terminal and navigate to that folder. See https://wilsonmar.github.io/jupyter

    Call Selenium from Python

  7. In Jupyter Notebook, run the demo_mod2.ipynb file, which opens a hard-coded web page, then quits.

    from selenium import webdriver
    driver = webdriver.Chrome()
    driver.get("https://www.example.com")  # placeholder URL; the notebook hard-codes its own
    driver.quit()  # close window

    To open browser options and set arguments (as if set manually in “Settings”) before opening:

    options = webdriver.ChromeOptions()
    # window at position 0,0, in full screen.
    options.add_argument("--window-position=0,0")
    options.add_argument("--start-fullscreen")
    driver = webdriver.Chrome(options=options)
  8. Run the demo_mod3 -2.ipynb file, which manipulates iframes:

    driver.switch_to.frame("frame name")
    # Switch back:
    driver.switch_to.default_content()

    To handle popups:

  9. Run file demo_mod4.ipynb in Jupyter to use the soup class to scrape quotes attributed to Soccer player Wayne Rooney from premierleague.com.

    NOTE: There is an error in this script.

  10. Additional Python code that can be added in Jupyter for other work can include Machine Learning such as sentiment analysis. The course author’s personal GitHub

Additional information:


Other info

NOTE: All “Selenium3” samples require Java 1.8+.

PROTIP: A number is included with each component to provide for version control, since everything changes all the time in IT.