If you have made it this far and your distributed test worked on the first try, you are either very lucky or very experienced. For the rest of us, here is the troubleshooting guide I wish I had when I started. Every error in this list is one I have personally encountered and debugged with chai in hand and frustration in heart.
The single most common error. You start the test and see: "java.rmi.ConnectException: Connection refused to host: 192.168.1.101"
| Cause | How to Check | Fix |
|---|---|---|
| jmeter-server not running on worker | SSH to worker, run: ps aux \| grep jmeter-server | Start jmeter-server on the worker machine |
| Wrong IP in remote_hosts | Verify IP with: hostname -I (on worker) | Update remote_hosts in jmeter.properties with correct IP |
| Firewall blocking port 1099 | From master: telnet 192.168.1.101 1099 | Open port 1099 in firewall (see firewall rules above) |
| RMI bound to wrong interface | Check jmeter-server startup message for the bound IP | Start with -Djava.rmi.server.hostname=CORRECT_IP |
| Worker behind NAT | Worker shows 10.x IP but master uses public IP | Set java.rmi.server.hostname to the IP visible to master |
| Port already in use | netstat -tlnp \| grep 1099 | Kill existing process or use a different port |
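For the "Wrong IP in remote_hosts" row, it helps to see exactly which hosts and ports the master will try to reach. Here is a minimal sketch, assuming remote_hosts is a comma-separated list of host or host:port entries and that entries without a port use 1099. The function name list_remote_hosts is my own helper, not something JMeter ships.

```shell
#!/usr/bin/env bash
# list_remote_hosts: hypothetical helper (not part of JMeter).
# Prints one "host port" pair per remote_hosts entry in a properties file.
# Assumption: entries without an explicit port use the default RMI port 1099.
list_remote_hosts() {
  local props="$1"
  # Skip commented-out lines; if the property is set more than once, use the last one.
  grep -E '^[[:space:]]*remote_hosts[[:space:]]*=' "$props" | tail -n 1 \
    | cut -d= -f2- | tr ',' '\n' \
    | while IFS= read -r entry; do
        # Strip stray spaces around each entry.
        entry="$(printf '%s' "$entry" | tr -d '[:space:]')"
        [ -z "$entry" ] && continue
        case "$entry" in
          *:*) printf '%s %s\n' "${entry%%:*}" "${entry##*:}" ;;
          *)   printf '%s 1099\n' "$entry" ;;
        esac
      done
}

# Example usage on the master (path is the one used elsewhere in this guide):
#   list_remote_hosts /opt/jmeter/bin/jmeter.properties
```

You can then probe each printed pair with nc -zv, the same check used in the steps that follow.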
# Step 1: Is jmeter-server running on the worker?
ssh user@192.168.1.101 "ps aux | grep jmeter-server"
# Step 2: Can you reach the worker port from master?
telnet 192.168.1.101 1099
# Or if telnet is not installed:
nc -zv 192.168.1.101 1099
# Expected: "Connected" or "Connection succeeded"
# If timeout/refused: firewall or process not running
# Step 3: Check what IP jmeter-server is bound to (on worker)
netstat -tlnp | grep 1099
# Look for 0.0.0.0:1099 (listening on all interfaces) or specific IP
# Step 4: Verify no DNS resolution issues
nslookup 192.168.1.101
ping -c 3 192.168.1.101
You see: "javax.net.ssl.SSLHandshakeException: Remote host terminated the handshake" or "java.rmi.ConnectIOException: non-JRMP server at remote endpoint"
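Both errors usually send me looking for a machine whose SSL setup differs from the others. Comparing values by eye across four terminals is error-prone, so here is a small sketch that takes "host value" lines on standard input and reports whether every host returned the same value. The name report_mismatch is a hypothetical helper, and the ssh loop in the comment assumes the same hosts and paths used in this guide.

```shell
#!/usr/bin/env bash
# report_mismatch: hypothetical helper (not part of JMeter).
# Reads "host value" lines on stdin. Prints "consistent" and returns 0 when
# every value is identical; otherwise prints each value with its hosts and returns 1.
report_mismatch() {
  awk '
    { count[$2]++; hosts[$2] = hosts[$2] " " $1 }
    END {
      n = 0
      for (v in count) n++
      if (n <= 1) { print "consistent"; exit 0 }
      for (v in hosts) print v ":" hosts[v]
      exit 1
    }'
}

# Example usage (assumes passwordless ssh and the keystore path from this guide):
#   for host in master worker1 worker2 worker3; do
#     printf '%s %s\n' "$host" \
#       "$(ssh "user@$host" "md5sum /opt/jmeter/bin/rmi_keystore.jks | cut -d' ' -f1")"
#   done | report_mismatch
```

The same function works for any per-host value you want to compare, such as the server.rmi.ssl.disable setting.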
# Check if SSL is enabled/disabled on each machine
grep "server.rmi.ssl.disable" /opt/jmeter/bin/jmeter.properties
# Should return the same value on ALL machines
# Check if keystore exists on each machine
ls -la /opt/jmeter/bin/rmi_keystore.jks
# Should exist on every machine with the same file size/checksum
# Compare keystore checksums across machines
md5sum /opt/jmeter/bin/rmi_keystore.jks
# All machines must return the same hash
# Quick fix for test environments: disable SSL everywhere
for host in master worker1 worker2 worker3; do
  ssh user@$host "echo 'server.rmi.ssl.disable=true' >> /opt/jmeter/bin/jmeter.properties"
done
The test starts, master says "Starting remote workers," but then nothing happens. No results come back. The test seems hung.
Workers crash with java.lang.OutOfMemoryError: Java heap space. This means you are running more threads than the worker can handle with its current memory allocation.
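Before picking heap values, I like to get a rough number first. This sketch applies the rule of thumb used in this guide (about 1 GB of heap per 500 medium-complexity threads) and rounds up. Treat the result as a starting point, not a guarantee, because heavy samplers and listeners can need much more. The function name heap_gb_for_threads is hypothetical.

```shell
#!/usr/bin/env bash
# heap_gb_for_threads: hypothetical helper (not part of JMeter).
# Assumption: roughly 1 GB of heap per 500 medium-complexity threads, rounded up.
heap_gb_for_threads() {
  local threads="$1"
  echo $(( (threads + 499) / 500 ))
}

# Example: estimate heap for 1200 threads on one worker.
#   gb="$(heap_gb_for_threads 1200)"
#   export JVM_ARGS="-Xms${gb}g -Xmx${gb}g"
```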
# Increase JMeter heap memory on workers
# Option 1: Set environment variable before starting
export JVM_ARGS="-Xms2g -Xmx4g -XX:MaxMetaspaceSize=256m"
./jmeter-server
# Option 2: Edit jmeter script directly
# In bin/jmeter file, find the HEAP line and change:
HEAP="-Xms2g -Xmx4g"
# Option 3: For very heavy tests
export JVM_ARGS="-Xms4g -Xmx8g -XX:MaxMetaspaceSize=512m -XX:+UseG1GC"
./jmeter-server
# Rule of thumb:
# 1 GB heap per 500 medium-complexity threads
# Monitor with: jcmd <pid> GC.heap_info
One worker shows 200ms average response time while another shows 2,000ms. Before blaming the application, check these worker-side issues.
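To find the outlier worker in the first place, you can average the elapsed time per host from the results file. This is a sketch under two assumptions: the results are saved as CSV with a header row, and the Hostname column is present (it is only written when jmeter.save.saveservice.hostname=true). It also splits on plain commas, so it will misread rows where a quoted field contains a comma. The function name avg_elapsed_by_host is hypothetical.

```shell
#!/usr/bin/env bash
# avg_elapsed_by_host: hypothetical helper (not part of JMeter).
# Prints "host sample_count average_elapsed_ms" for each host in a CSV results file.
# Assumptions: header row present, columns named "elapsed" and "Hostname",
# and no quoted fields containing commas.
avg_elapsed_by_host() {
  awk -F',' '
    NR == 1 {
      # Locate the columns by header name instead of hard-coding positions.
      for (i = 1; i <= NF; i++) {
        if ($i == "elapsed")  e = i
        if ($i == "Hostname") h = i
      }
      if (!e || !h) { print "missing elapsed or Hostname column" > "/dev/stderr"; exit 1 }
      next
    }
    { sum[$h] += $e; n[$h]++ }
    END { for (k in n) printf "%s %d %d\n", k, n[k], sum[k] / n[k] }
  ' "$1" | sort
}

# Example usage (results.csv is a placeholder file name):
#   avg_elapsed_by_host results.csv
```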
ERROR - jmeter.threads.JMeterThread: Test failed!
java.lang.IllegalArgumentException: File /home/user/test-data/users.csv must exist and be readable
at org.apache.jmeter.services.FileServer.resolveFileFromPath
Check jmeter-server is running on ALL workers (ps aux | grep jmeter)
Verify network connectivity from master to every worker (nc -zv IP 1099)
Verify network connectivity from every worker to the target application (curl target-url)
Check SSL configuration is consistent across all machines (grep ssl jmeter.properties)
Verify JMeter and Java versions match on all machines
Confirm CSV files and plugins exist at correct paths on all workers
Check worker logs: /opt/jmeter/bin/jmeter-server.log on each worker
Check master logs for timeout or connection errors
Run a smoke test with 1 thread per worker to isolate configuration issues from load issues
If smoke test passes, gradually increase load to find the capacity ceiling
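For the CSV and worker-log items in this checklist, a quick way to see which data files a worker could not find is to pull the paths out of its log. This sketch assumes the log contains the "must exist and be readable" message shown in the error above. The function name missing_files_from_log is hypothetical.

```shell
#!/usr/bin/env bash
# missing_files_from_log: hypothetical helper (not part of JMeter).
# Prints each distinct file path that a log reports as missing or unreadable.
# Assumption: the log uses the "File <path> must exist and be readable" wording.
missing_files_from_log() {
  sed -n 's/.*File \(.*\) must exist and be readable.*/\1/p' "$1" | sort -u
}

# Example usage on a worker (log path is the one used in this guide):
#   missing_files_from_log /opt/jmeter/bin/jmeter-server.log
```

Any path it prints needs to exist at that exact location on that worker.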
Always run a smoke test (1-5 threads per worker) before the full load test. It catches 90% of configuration issues in 30 seconds instead of wasting a 30-minute test run.
Q: You started a distributed JMeter test but it is not working. How would you troubleshoot it?
A: I would follow a systematic approach, starting from the most basic checks. First, verify that jmeter-server is running on all worker machines (ps aux | grep jmeter-server). Second, test network connectivity from the master to each worker on port 1099 (nc -zv worker_ip 1099). If the connection is refused, check firewalls and whether jmeter-server is bound to the correct network interface. Third, verify that the SSL configuration is consistent: either all machines have server.rmi.ssl.disable=true or all have the same rmi_keystore.jks. Fourth, confirm that JMeter and Java versions are identical across all machines. Fifth, check that CSV data files and any custom plugins exist at the correct paths on every worker. Sixth, review jmeter-server.log on each worker for specific error messages. Finally, run a minimal smoke test with 1-2 threads per worker to isolate configuration issues from capacity issues. This approach eliminates variables one by one instead of guessing randomly.
Key Point: Most distributed testing failures come down to: jmeter-server not running, firewall blocking RMI ports, SSL misconfiguration, or missing CSV files on workers.