embeddedSftpServer error
Overview
If the SFTP Gateway Java application is unable to start, there can be several causes. One culprit on Azure is the Reset Password tool in the Azure Portal. One of its options resets OpenSSH so that it listens on its default port 22. As a result, SFTP Gateway is unable to start its SFTP service because the port is already taken.
This article explains this scenario in more detail, and also provides steps to fix the issue.
Symptoms to look for
After a reboot, you encounter the following loading screen on the SFTP Gateway web admin portal:
Please wait while SFTP Gateway finishes setting up and try again in a minute...
Meanwhile, take a moment to view our documentation for setting up SFTP Gateway
This loading screen is supposed to turn into the login page when Java finishes loading. But if Java never loads successfully, the next step is to SSH into the VM and check the logs.
When trying to SSH over port 2222, you may see this error:
Roberts-MacBook-Pro:~ robertchen$ ssh azureuser@51.8.113.70 -p 2222
ssh: connect to host 51.8.113.70 port 2222: Connection refused
Next, try to SSH over port 22. If it lets you in, this is a major clue that the OpenSSH configuration was reset:
Roberts-MacBook-Pro:~ robertchen$ ssh azureuser@51.8.113.70 -p 22
Welcome to Ubuntu 22.04.4 LTS (GNU/Linux 6.5.0-1019-azure x86_64)
Next, tail the logs:
sudo su
cd /opt/sftpgw/log/
tail -f *
Look for any stack traces and note the error message at the very top:
2024-10-03T20:39:40.517Z ERROR 2290 --- [sftpgateway] [main] c.s.b.sftp.server.EmbeddedSftpServer : Exception when starting embedded server
java.io.IOException: Server failed to start
This is a clue that the SFTP service is unable to start.
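If the logs are noisy, it can be easier to search for the two messages shown above than to watch the tail output. The following is a minimal sketch. The function name find_sftp_start_errors is hypothetical, and it assumes the error text matches the sample stack trace in this article.

```shell
# Hypothetical helper: list log lines that report the embedded SFTP server
# failing to start. The search strings are taken from the sample stack trace
# in this article; other versions may word the error differently.
find_sftp_start_errors() {
  local log_dir="$1"
  # -r searches every file under the directory, -n prints line numbers
  grep -rn -e "Exception when starting embedded server" \
           -e "Server failed to start" "$log_dir"
}
```

For example, running find_sftp_start_errors /opt/sftpgw/log prints each matching line with its file name and line number. No output means neither message was found.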
Confirming the root cause
To verify our hunch that the SFTP Gateway Java service is unable to start because OpenSSH is squatting on port 22, we need to list all listening services and ports:
ss -nltp
You may see the following output (formatted for readability):
State Local Address:Port Process
LISTEN 0.0.0.0:22 users:(("sshd",pid=589,fd=3))
LISTEN 0.0.0.0:80 users:(("nginx",pid=931,fd=6),("nginx",pid=930,fd=6))
LISTEN 127.0.0.53%lo:53 users:(("systemd-resolve",pid=476,fd=14))
LISTEN 0.0.0.0:443 users:(("nginx",pid=931,fd=8),("nginx",pid=930,fd=8))
LISTEN 127.0.0.1:5432 users:(("postgres",pid=624,fd=5))
LISTEN [::]:22 users:(("sshd",pid=589,fd=4))
LISTEN [::]:80 users:(("nginx",pid=931,fd=7),("nginx",pid=930,fd=7))
LISTEN [::]:443 users:(("nginx",pid=931,fd=9),("nginx",pid=930,fd=9))
The most important things to note are:
- The sshd service is listening on port 22 instead of 2222
- The java service and port 8080 are missing from the picture
If this is what you are seeing, refer to the next section, which covers how to fix the issue.
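Rather than scanning the ss -nltp output by eye, you can filter it for a single port. The following is a minimal sketch. The function name listener_for_port is hypothetical, and it assumes the output layout shown above (a LISTEN state column, a local address ending in :port, and a users:(...) process column).

```shell
# Hypothetical helper: print the name of each process listening on a TCP port.
# Reads `ss -nltp` output on standard input.
listener_for_port() {
  local port="$1"
  awk -v port="$port" '
    $1 == "LISTEN" {
      local_addr = ""; users = ""
      for (i = 2; i <= NF; i++) {
        # The first field ending in ":<digits>" is the local address:port
        if (local_addr == "" && $i ~ /:[0-9]+$/) local_addr = $i
        if ($i ~ /^users:/) users = $i
      }
      # The port is whatever follows the last colon (also works for [::]:22)
      n = split(local_addr, parts, ":")
      if (parts[n] != port) next
      # users:(("sshd",pid=589,fd=3)) -> sshd
      if (match(users, /"[^"]+"/)) print substr(users, RSTART + 1, RLENGTH - 2)
    }' | sort -u
}
```

For example, ss -nltp | listener_for_port 22 prints sshd in the broken state described above. Run it as root, since ss only shows process names for other users' sockets when it has sufficient privileges.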
Give port 22 back to SFTP Gateway
At the moment, OpenSSH (sshd) is listening on port 22. So the first step is to move this service to another port (e.g. 2222).
Edit the following file:
/etc/ssh/sshd_config
Around line 7, look for this line:
Port 22
and change it to this:
Port 2222
Save the file, then restart OpenSSH:
service sshd restart
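As an alternative to editing the file by hand before the restart, the port change can be scripted. The following is a minimal sketch, not an official SFTP Gateway tool. The function name set_sshd_port is hypothetical, and it assumes the file contains an active (uncommented) Port line, as described above. It keeps a backup copy next to the original.

```shell
# Hypothetical helper: change the active "Port <number>" line in an
# sshd_config-style file, keeping a backup at <file>.bak.
set_sshd_port() {
  local config_file="$1"
  local new_port="$2"
  # Stop if there is no active Port line; edit the file manually in that case.
  grep -Eq '^[[:space:]]*Port[[:space:]]+[0-9]+' "$config_file" || return 1
  # Back up first, then rewrite the original in place from the backup.
  cp -p "$config_file" "$config_file.bak" || return 1
  sed -E "s/^[[:space:]]*Port[[:space:]]+[0-9]+.*$/Port $new_port/" \
    "$config_file.bak" > "$config_file"
}
```

For example, run set_sshd_port /etc/ssh/sshd_config 2222 as root. Before restarting OpenSSH, sshd -t checks the configuration file for syntax errors, which helps avoid locking yourself out of the VM.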
Re-run ss -nltp to check your work (sshd should be listening on 2222 now).
Now that port 22 is available, restart Java:
service sftpgw-admin-api restart
Re-run ss -nltp to check your work. You want to see output like this:
State Local Address:Port Process
LISTEN 0.0.0.0:2222 users:(("sshd",pid=4000,fd=3))
LISTEN 0.0.0.0:80 users:(("nginx",pid=931,fd=6),("nginx",pid=930,fd=6))
LISTEN 127.0.0.53%lo:53 users:(("systemd-resolve",pid=476,fd=14))
LISTEN 0.0.0.0:443 users:(("nginx",pid=931,fd=8),("nginx",pid=930,fd=8))
LISTEN 127.0.0.1:5432 users:(("postgres",pid=624,fd=5))
LISTEN *:8080 users:(("java",pid=4060,fd=40))
LISTEN [::]:2222 users:(("sshd",pid=4000,fd=4))
LISTEN *:22 users:(("java",pid=4060,fd=36))
LISTEN [::]:80 users:(("nginx",pid=931,fd=7),("nginx",pid=930,fd=7))
LISTEN [::]:443 users:(("nginx",pid=931,fd=9),("nginx",pid=930,fd=9))
You are looking for the following:
- Java should be listening on 22
- Java should be listening on 8080
- sshd should be listening on 2222
At this point, SFTP Gateway should be running again. Refresh your browser window, and the web admin portal should load as expected.
Root cause analysis
To prevent this from happening again, we should understand how this happened in the first place.
In the Azure Portal, there is a Reset Password tool on the VM details page. This tool is useful for updating or injecting new credentials so that you can SSH into the VM.
Toward the top, there is a Mode section that has 3 radio buttons:
- Reset Password
- Add SSH public key
- Reset configuration only
The Reset configuration only option is very dangerous for SFTP Gateway. Our VM image customizes the /etc/ssh/sshd_config file to move OpenSSH to port 2222. If this custom configuration is wiped out and reset to Linux defaults, it can create problems:
- At first, nothing happens. This is the scary part, because it sets a time bomb.
- The next time someone reboots the VM (which could be months in the future), sshd will start first (because it's lightweight), and it will occupy port 22
- Java will lose the race, and it will repeatedly fail to start because its SFTP subsystem is unable to bind to port 22
- When trying to troubleshoot, sysadmins will try to SSH in on port 2222 but will be unable to do so because sshd got moved to port 22
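One way to catch this time bomb before the next reboot is to check the configured port right after anyone uses the Reset Password tool. The following is a minimal sketch, not part of SFTP Gateway. The function names configured_sshd_port and sshd_port_is_safe are hypothetical, and the parsing is deliberately simplified: it reads only the first active Port line of a single file and does not follow Include directives.

```shell
# Hypothetical helper: print the port from the first active "Port" line in an
# sshd_config-style file, or 22 (the OpenSSH default) if there is none.
configured_sshd_port() {
  local config_file="$1"
  local port
  # Keywords are case-insensitive; commented lines ("#Port 22") are ignored
  port="$(awk 'tolower($1) == "port" { print $2; exit }' "$config_file")"
  echo "${port:-22}"
}

# Hypothetical helper: succeed only when sshd is configured away from port 22,
# leaving that port free for the SFTP Gateway Java service.
sshd_port_is_safe() {
  [ "$(configured_sshd_port "$1")" != "22" ]
}
```

For example, sshd_port_is_safe /etc/ssh/sshd_config || echo "sshd is configured for port 22" flags a reset configuration while the VM is still running normally. To see the settings sshd itself would use, sshd -T (run as root) prints the effective configuration, including the port.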