This article looks at how Ansible can be used to manage the server agent (sftd) on a fleet of Linux servers. It assumes there is an existing Ansible deployment and that the controller can connect to, and run playbooks on, the managed servers.
Note: I'm not an Ansible guru; I started looking at it a few days ago to figure out some systems management tasks for sftd. The setup I used is very basic and could be improved, and the playbooks I created could probably be better. The examples just show how it could be done. YMMV.
Introduction
In a small environment with a handful of servers managed by Okta Privileged Access, manually managing the components isn’t a problem. But if you have many server instances, you need some automation/systems management to manage the components. Ansible is a popular tool for automation/systems management across fleets of servers. It can be used to manage the Okta Privileged Access infrastructure components like the server agent (sftd) and the gateway (sft-gatewayd).
This article looks at how some simple tasks can be implemented into Ansible using playbooks. The tasks are:
- Checking that sftd is running on a set of servers and, if not, restarting it,
- Checking the version of sftd on a set of servers,
- Updating sftd to the latest version, and
- Moving the server enrollment to another resource group/project.
You could also use it to install and configure the server agent, but we won’t cover that in this article.
Before looking at these use cases, we will look at the Ansible setup for this environment.
Ansible Setup
The Ansible setup for this was a simple, non-production deployment. If you already have Ansible installed and configured, yours will probably look different.
In this environment there was an Ansible controller and two managed servers running the sftd process. All were Ubuntu 22.04 servers.
Security
From a security perspective, the aim was to use native mechanisms rather than rely on what Okta Privileged Access itself could provide.
We created an ordinary user called "ansible" on each managed server.
Rather than rely on a username and password, we generated an SSH key pair on the Ansible controller (using ssh-keygen) and distributed the public key to the new ansible account on each target server (using ssh-copy-id).
So that the ansible user could run sudo without a password, the user was added to the /etc/sudoers.d/90-cloud-init-users file (the same file used for the default ubuntu account).
# Created by cloud-init v. 24.3.1-0ubuntu0~22.04.1 on Mon, 25 Nov 2024 04:48:39 +0000
# User rules for ubuntu
ubuntu ALL=(ALL) NOPASSWD:ALL
# User rules for ansible
ansible ALL=(ALL) NOPASSWD:ALL
This is not security best practice. There are many ways you could tighten security on that account, such as restricting which sudo commands can be run and controlling where the user can log in from. Note that Ansible also has a Vault feature that may be useful for managing account credentials.
Other Setup
For this environment, all files are in a single directory (which wouldn’t be the case in a production deployment of Ansible).

This includes an inventory file that has a single group ([ubuntu]) containing the two test servers, the connection method, and the account to use (the ansible account mentioned above).

There are some files with enrollment tokens for projects in Okta Privileged Access. There are a set of playbook files (sftd.*.*.yml). We also had a simple script that runs a ping via Ansible to check connectivity.
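As an illustration, the inventory file looked something like the sketch below. The second hostname and the variable values are stand-ins, not the exact contents used in this environment:

```ini
# inventory: one [ubuntu] group with the two managed servers
[ubuntu]
ip-172-31-12-41
# placeholder name for the second test server
ip-172-31-0-10

[ubuntu:vars]
ansible_connection=ssh
ansible_user=ansible
```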

The use of the files will be explained in the sections below.
Ansible Automation Examples
Let’s look at some examples of how Ansible can be used to manage the sftd processes.
Check Status and Start sftd Process
The first use case is checking to see if the sftd process is running, and if it’s not, to start it. Ansible provides a set of out-of-the-box modules, including one called ansible.builtin.service which is used to work with Linux services, like sftd.
The playbook shown below uses this module to check whether the service is in a started state; if not, the handlers section is notified to restart it. Note that most modern Linux distributions use systemd, which is what this module talks to under the covers.
The playbook uses the become and become_user arguments to elevate the ansible user to root (i.e. run the commands via sudo). This is why we needed to allow the user to run sudo without a password; if you didn't, you would need to pass a password in.
- name: Verify sftd running
  hosts: ubuntu
  remote_user: ansible
  become: true
  become_user: root
  tasks:
    - name: Ensure sftd is running
      ansible.builtin.service:
        name: sftd
        state: started
      notify: Restart sftd
  handlers:
    - name: Restart sftd
      ansible.builtin.service:
        name: sftd
        state: restarted
To test this we log directly into one of the managed servers, stop the sftd service, and check the status of the process.

It’s now down.
On the Ansible controller, we run the ansible-playbook command, passing the inventory file mentioned earlier and the playbook file above. The output shows that nothing was done on one of the servers (ip-172-31-12-41), but on the other (where we stopped the process) it did make a change.

Checking the service again, we can see it was restarted.
This is a fairly simple use case, and you could schedule the command to run frequently. If the sftd process is already running, as you would expect, nothing is changed.
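For example, the check could be scheduled from cron on the controller. The paths and playbook file name below are hypothetical stand-ins, not the actual file names from this environment:

```
# crontab entry on the Ansible controller: run the check every 15 minutes
*/15 * * * * ansible-playbook -i /home/ansible/inventory /home/ansible/sftd.check.running.yml
```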
Check sftd Versions
For the next use case we want to check the version of the sftd installation on each server (which we will use in the use case after this one). The sftd -version command returns the agent version, and it runs on any supported server agent OS.
To do this in an Ansible playbook, we could use the ansible.builtin.command or ansible.builtin.shell modules. There are subtle differences between the two; because we wanted to pipe the output through awk to parse out the version, we chose the shell module. The playbook uses the register argument to capture the command output in a variable, and a debug task to display it.
- name: Check sftd version
  hosts: ubuntu
  tasks:
    - name: Get sftd version
      ansible.builtin.shell: /usr/sbin/sftd -version | awk '{print $3}'
      changed_when: false
      register: sftdver
    - name: Show sftd version
      ansible.builtin.debug:
        msg: "Current version = {{ sftdver.stdout }}"
Running the playbook includes some text in the output showing the version number on each of the two servers: one is at the current version (1.87.1, as of Nov 2024) and the other is at an older version (1.83.1).
Now that we know we have some older versions in our environment, perhaps we can use Ansible to update them?
Update sftd Version
The process to update the sftd package (scaleft-server-tools) differs across Linux platforms depending on the platform's package manager, but none of these update processes are documented. You need to look at the install steps to see which package manager is used and extrapolate from there. Ubuntu/Debian uses apt; Red Hat (RHEL), Amazon Linux, Alma Linux, and Fedora use dnf or yum; SUSE Linux uses zypper. The standard update option should work, but we only tested updating sftd on Ubuntu.
There is an Ansible module, ansible.builtin.apt, to run apt commands (there are also built-in dnf and yum modules, but no built-in module for zypper; a community.general.zypper module exists).
The playbook for updating sftd to the latest version is shown below. Note that you do not need to tell it to restart the service; that is handled for you. As shown earlier, this runs as (sudo to) root. You could use the same module to install sftd if it wasn't already installed, but installation requires additional steps, so the only_upgrade argument is used to restrict this playbook to upgrades.
- name: Update sftd version
  hosts: ubuntu
  tasks:
    - name: Update sftd version
      become: true
      become_user: root
      ansible.builtin.apt:
        name: scaleft-server-tools
        state: latest
        install_recommends: true
        only_upgrade: true
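As an aside, if you manage a mixed fleet, Ansible's generic ansible.builtin.package module will dispatch to the platform's own package manager, though it only supports arguments common to all managers (so apt-specific options like only_upgrade are not available). A minimal sketch:

```yaml
- name: Update sftd version (generic package manager)
  hosts: ubuntu
  tasks:
    - name: Update scaleft-server-tools to the latest version
      become: true
      become_user: root
      ansible.builtin.package:
        name: scaleft-server-tools
        state: latest
```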
Running this playbook across both Ubuntu servers results in one not being upgraded (as it was already at the latest version) and one being changed (i.e. updated).

Running the versions playbook from earlier confirms that the second machine is now at the latest version.

This completes the standard systems management tasks.
Move Server Enrollment to Another Project
For a more advanced use case, let's look at how we could use Ansible to re-enroll a server agent into a different resource group/project in Okta Privileged Access. Enrollment is controlled by the enrollment token passed to the server agent: if you change the enrollment token, you change the project the server is mapped to. This mechanism could be part of the automation for a wider migration exercise.
The playbook will walk through the steps to do this:
- Stop the sftd process
- Delete all of the state files associated with the agent (in /var/lib/sftd)
- Recreate the /var/lib/sftd directory, setting its owner (root) and permissions
- Copy an enrollment token file for the new project from the local directory to the target server as /var/lib/sftd/enrollment.token, again setting the owner (root) and permissions
- Restart the sftd process
It uses two other built-in modules, ansible.builtin.file and ansible.builtin.copy, and runs as (sudo to) root.
The playbook file has the target server and source token file hardcoded; in production you would parametrise these.
- name: Copy token file and restart sftd
  hosts: ip-172-31-12-41
  become: true
  become_user: root
  tasks:
    - name: Stop sftd service
      ansible.builtin.service:
        name: sftd
        state: stopped
    - name: Delete /var/lib/sftd
      ansible.builtin.file:
        path: /var/lib/sftd
        state: absent
    - name: Create /var/lib/sftd
      ansible.builtin.file:
        path: /var/lib/sftd
        state: directory
        owner: root
        group: root
        mode: '0755'
    - name: Copy new token file
      ansible.builtin.copy:
        src: ./new-enrollment.token
        dest: /var/lib/sftd/enrollment.token
        owner: root
        group: root
        mode: '0644'
    - name: Restart sftd service
      ansible.builtin.service:
        name: sftd
        state: started
This playbook contains just the commands needed. In production, you would want some error handling between the steps and some stdout/stderr notifications written out.
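As a sketch of what that hardening could look like, the steps can be wrapped in a block/rescue so a failure part-way through is reported, with the host and token file passed in as variables (via --extra-vars). The variable names here are our own invention, and the task list is abbreviated to the stop and copy steps:

```yaml
- name: Re-enroll server into a new project
  hosts: "{{ target_host }}"
  become: true
  become_user: root
  tasks:
    - name: Replace enrollment token
      block:
        - name: Stop sftd service
          ansible.builtin.service:
            name: sftd
            state: stopped
        - name: Copy new token file
          ansible.builtin.copy:
            src: "{{ token_file }}"
            dest: /var/lib/sftd/enrollment.token
            owner: root
            group: root
            mode: '0644'
      rescue:
        - name: Report which step failed
          ansible.builtin.debug:
            msg: "Re-enrollment failed on {{ inventory_hostname }}: {{ ansible_failed_task.name }}"
```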
Running the playbook performs each step (if you monitor the /var/lib/sftd directory you will see it be cleared out, and then recreated with a different inode).

Checking Okta Privileged Access you can see the server has been defined in the new project.

NOTE: this process does NOT delete the old server definition in the original resource group/project, and there is no way to do that programmatically from within Ansible. While the two server definitions exist, users cannot connect to them. You can wait for the old definition to be automatically removed from Okta Privileged Access, or go into the admin console and delete it from the old project.
You may also need to revisit any policies that reference the server to make sure they still behave as you expect.
This completes this use case.
Conclusion
Ansible is an ideal tool for automation and systems management across a large fleet of servers, and it can be leveraged for the day-to-day management of the Okta Privileged Access infrastructure components.
This article has shown a simple Ansible deployment and how playbooks can be used to check/restart the sftd process, check versions, automatically update to the latest version and even move a server from one project to another in Okta Privileged Access.
The examples shown are trivial implementations, but they should provide a start towards managing Okta Privileged Access infrastructure components with Ansible. There are ample examples of Ansible commands and playbooks across the web, and the Ansible documentation is excellent. We'd love to see what you come up with to manage your environment.
