Before writing any automation code, you need a mental model of what happens when a test runs. Understanding this end-to-end flow makes every concept in this path click into place.
Every automated test follows the same pattern: Arrange → Act → Assert.
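```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.Assert;
import org.testng.annotations.Test;

public class AddToCartTest {
    @Test
    public void testAddToCart() {
        // Arrange: start a browser and open the product page
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://shop.example.com/products/laptop");
            // Act: click the "Add to cart" button
            driver.findElement(By.id("add-to-cart")).click();
            // Assert: TestNG's assertEquals takes (actual, expected)
            String cartCount = driver.findElement(By.id("cart-count")).getText();
            Assert.assertEquals(cartCount, "1");
        } finally {
            // Close the browser even if the assertion fails
            driver.quit();
        }
    }
}
```

When you run a Selenium test, several components work together: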
| Component | Role | Example |
|---|---|---|
| Your Java Code | Sends commands — "click this button", "type this text" | driver.findElement(By.id("login")) |
| Selenium WebDriver | Translates Java commands into W3C WebDriver protocol requests for the browser driver | Converts click() into an HTTP request to chromedriver |
| Browser Driver | Receives instructions and controls the actual browser | chromedriver.exe for Chrome, geckodriver for Firefox |
| Browser | Executes the actions and returns results | Chrome opens the page, clicks the element |
| TestNG/JUnit | Runs the tests, reports pass/fail, generates reports | @Test annotation marks a method as a test |
| Maven | Manages dependencies, compiles code, runs the test suite | Downloads Selenium JAR, runs `mvn test` |
```
Your Java Code
    ↓ (Selenium API call)
Selenium WebDriver Library
    ↓ (HTTP request)
Browser Driver (chromedriver)
    ↓ (DevTools Protocol)
Chrome Browser
    ↓ (renders page, clicks element)
Result sent back up the chain
    ↓
TestNG compares actual vs expected
    ↓
✅ PASS or ❌ FAIL
```

You do not need to understand every layer right now. The key takeaway: your Java code talks to the Selenium library, which talks to the browser driver, which controls the browser. When you write `driver.findElement(By.id("login")).click()`, that single line triggers this entire chain.
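To make the "HTTP request" hop concrete, here is a rough sketch of the two requests that a call like `driver.findElement(By.id("login")).click()` corresponds to under the W3C WebDriver protocol. It only builds and prints the requests with the JDK's `java.net.http` classes; nothing is sent. The driver address (localhost:9515, chromedriver's default port), the session ID, and the element ID are placeholder assumptions, and the real Selenium library handles all of this for you.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class WebDriverWireSketch {
    // Assumed values for illustration only; a real session ID and element ID
    // are generated by the browser driver at runtime.
    static final String DRIVER_URL = "http://localhost:9515";
    static final String SESSION_ID = "example-session";
    static final String ELEMENT_ID = "example-element";

    // W3C "Find Element" command: POST /session/{session id}/element
    // The protocol has no "id" strategy, so an ID lookup is expressed as a CSS selector.
    static HttpRequest findElementById(String id) {
        String body = "{\"using\": \"css selector\", \"value\": \"#" + id + "\"}";
        return HttpRequest.newBuilder()
                .uri(URI.create(DRIVER_URL + "/session/" + SESSION_ID + "/element"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // W3C "Element Click" command: POST /session/{session id}/element/{element id}/click
    static HttpRequest clickElement() {
        return HttpRequest.newBuilder()
                .uri(URI.create(DRIVER_URL + "/session/" + SESSION_ID
                        + "/element/" + ELEMENT_ID + "/click"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{}"))
                .build();
    }

    public static void main(String[] args) {
        // Print the method and URI of each request; nothing goes over the network.
        for (HttpRequest request : new HttpRequest[] {findElementById("login"), clickElement()}) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

Each request is answered with a JSON response, which is how results travel back up the chain to your test code.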
Q: Explain the architecture of Selenium WebDriver.
A: Selenium WebDriver has a client-server architecture. The test code (Java, Python, etc.) uses the Selenium client library to send commands. These commands are sent as HTTP requests to the browser-specific driver (chromedriver, geckodriver). The driver translates them into browser-native commands using protocols like Chrome DevTools Protocol. The browser executes the action and returns the result back through the same chain.
Not all tests need a browser. The testing pyramid describes roughly how many tests to have at each level.
Most of your tests should be unit tests (fast, reliable). API tests sit in the middle. UI tests at the top are the slowest and most fragile — a single CSS change can break them. This path focuses on UI automation with Selenium, but understanding the pyramid helps you make better decisions about what to test and how.
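As a rough illustration of a test at the base of the pyramid, the sketch below applies Arrange, Act, Assert to plain Java with no browser involved. The `Cart` class and its methods are hypothetical, invented here only for the example; a real project would write the check as a TestNG or JUnit test rather than a `main` method.

```java
import java.util.ArrayList;
import java.util.List;

public class CartUnitTestSketch {
    // Hypothetical domain class, standing in for the logic behind a cart counter.
    static class Cart {
        private final List<String> items = new ArrayList<>();

        void add(String productName) {
            items.add(productName);
        }

        int count() {
            return items.size();
        }
    }

    // Same Arrange -> Act -> Assert shape as a UI test, but without a browser.
    static void testAddToCart() {
        // Arrange
        Cart cart = new Cart();
        // Act
        cart.add("laptop");
        // Assert
        if (cart.count() != 1) {
            throw new AssertionError("expected 1 item, got " + cart.count());
        }
    }

    public static void main(String[] args) {
        testAddToCart();
        System.out.println("testAddToCart passed");
    }
}
```

Because nothing here starts a browser or touches the network, a check like this runs in milliseconds, which is why the pyramid puts most tests at this level.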
Q: What is the testing pyramid?
A: The testing pyramid is a model for test distribution. The base is unit tests (about 70%) — fast and stable, testing individual functions. The middle is API/integration tests (about 20%) — testing how components communicate. The top is UI/E2E tests (about 10%) — slow, fragile, reserved for critical user flows. The principle is to catch most bugs quickly at the lowest level possible.
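The percentages translate directly into rough test counts. A minimal sketch, assuming a hypothetical suite of 200 tests and treating 70/20/10 as a guideline rather than a rule:

```java
public class PyramidSplitSketch {
    // Number of tests at one level, given the suite size and that level's share in percent.
    static int testsAtLevel(int totalTests, int percent) {
        return totalTests * percent / 100;
    }

    public static void main(String[] args) {
        int total = 200; // assumed suite size, for illustration only
        System.out.println("Unit tests:   " + testsAtLevel(total, 70)); // 140
        System.out.println("API tests:    " + testsAtLevel(total, 20)); // 40
        System.out.println("UI/E2E tests: " + testsAtLevel(total, 10)); // 20
    }
}
```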
| Type | What It Tests | When It Runs | Tools |
|---|---|---|---|
| Smoke | Critical features work after a deploy | Every deployment | Selenium, RestAssured |
| Regression | Existing features still work after changes | Every sprint or build | Selenium, TestNG |
| Functional | Features do what they are supposed to | During development | Selenium, RestAssured |
| Performance | App handles expected load | Before major releases | JMeter, Gatling, k6 |
| Cross-browser | App works on Chrome, Firefox, Safari, Edge | Before releases | Selenium Grid, BrowserStack |
Q: What is the difference between smoke testing and regression testing?
A: Smoke testing is a quick sanity check (typically 10-20 tests, under 5 minutes) that answers "is the build alive?" It runs after every deployment. Regression testing is a thorough rerun of all existing tests (often 100-500+ tests, 30+ minutes) that answers "did we break anything?" It runs on every sprint or build, and before major releases.
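One way to picture the relationship: the smoke suite is a small tagged subset, while a regression run executes everything. The sketch below is hand-rolled plain Java with made-up test names. It is not TestNG's or JUnit's API; both frameworks offer their own mechanisms (groups in TestNG, tags in JUnit 5) for the same purpose.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SuiteSelectionSketch {
    // Hypothetical test descriptor: a name plus a flag marking it as part of the smoke suite.
    static class TestCase {
        final String name;
        final boolean smoke;

        TestCase(String name, boolean smoke) {
            this.name = name;
            this.smoke = smoke;
        }
    }

    // Made-up catalogue of tests, for illustration only.
    static final List<TestCase> ALL_TESTS = Arrays.asList(
            new TestCase("loginWorks", true),
            new TestCase("addToCartWorks", true),
            new TestCase("searchReturnsCorrectResults", false),
            new TestCase("profileEditSavesChanges", false));

    // Smoke run: only the tests tagged as critical.
    static List<String> smokeSuite() {
        List<String> names = new ArrayList<>();
        for (TestCase test : ALL_TESTS) {
            if (test.smoke) {
                names.add(test.name);
            }
        }
        return names;
    }

    // Regression run: every existing test, smoke tests included.
    static List<String> regressionSuite() {
        List<String> names = new ArrayList<>();
        for (TestCase test : ALL_TESTS) {
            names.add(test.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Smoke:      " + smokeSuite());
        System.out.println("Regression: " + regressionSuite());
    }
}
```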
Exercise 1: Read the testAddToCart code example at the top of this lesson. Identify the Arrange, Act, and Assert sections in it.
Exercise 2: For each scenario, identify the test type (smoke, regression, functional, performance):
(a) Running 200 tests before a release to check nothing broke.
(b) Checking if login works right after deploying to production.
(c) Verifying that the search feature returns correct results.
(d) Testing whether the app crashes under 1000 simultaneous users.