Let’s Automate VMware Part 2: Building Linux Templates with Packer

With Packer ready to go, the next step was to specify the installation options for the template. In CentOS, that’s achieved with a text file called a kickstart file. The easiest way to create a kickstart file is to manually install CentOS in a throwaway VM using the desired options. Once the installation completed, I downloaded the /root/anaconda-ks.cfg file from the VM and customized it using the CentOS documentation. My manual install included automatic partitioning, disabling kdump, choosing the minimal installation set, enabling the network adapter, and creating a root password and an ansible user for future use. After downloading the file, I added a few options and made some changes, all commented below.

#version=RHEL8

# Autopartition as lvm without home partition (CHANGED)
ignoredisk --only-use=sda
autopart --type=lvm --nohome

# Partition clearing information
clearpart --none --initlabel

# Use graphical install
graphical

# Use CDROM installation media
cdrom

# Keyboard layouts
keyboard --vckeymap=us --xlayouts='us'

# System language
lang en_US.UTF-8

# Network information - disable ipv6 (CHANGED)
network  --bootproto=dhcp --device=eth0 --noipv6 --activate
network  --hostname=localhost.localdomain

repo --name="AppStream" --baseurl=file:///run/install/repo/AppStream

# Root password
rootpw --iscrypted $6$poUOPKJL.FwIJlYW$6qcoGwXwoxcusg92P7vQTDg6naxmYovIXfrU6Ck6pRV.LoDWxYu0GzmFWnrQHF5kBaF6rIx2t0C6IlUO0qXXO.

# Don't run the Setup Agent on first boot (CHANGED)
firstboot --disable

# Enable firewall with ssh allowed (ADDED)
firewall --enabled --ssh

# Do not configure the X Window System
skipx

# System services (ADDED VMTOOLS)
services --disabled="chronyd"
services --enabled="vmtoolsd"

# System timezone (not UTC in VM environment) (CHANGED)
timezone America/Chicago

# Add ansible service account and add to 'wheel' group
user --groups=wheel --name=ansible --password=$6$D5GqhqMWaRMWrrGP$WVktoxs2QAhyEtozouyUEwjMFShVGEXUvKJ0hG89LW38jU1Uax5tCkFvp4O3z0piP3NiSV34XBn6qQk3RwfQ5. --iscrypted --gecos="ansible"

# Reboot after installation (ADDED)
reboot

%packages
@^minimal-environment
wget
open-vm-tools

%end

%addon com_redhat_kdump --disable --reserve-mb='auto'

%end

%anaconda
pwpolicy root --minlen=6 --minquality=1 --notstrict --nochanges --notempty
pwpolicy user --minlen=6 --minquality=1 --notstrict --nochanges --emptyok
pwpolicy luks --minlen=6 --minquality=1 --notstrict --nochanges --notempty
%end

I added a few packages, including open-vm-tools, to the install list. Any additional packages can be installed later with Ansible, but getting VMware Tools installed up front is important.

Once I created the file, I had to figure out how to get it into a place where the CentOS installer could reach it. Packer has the ability to drop the file onto a virtual “floppy disk” and mount it to the VM, but CentOS 8 no longer supports floppy disks during boot. There are many other options, including FTP, HTTP and NFS, so I put Nginx on my automation machine and dropped the Kickstart file into the html folder.
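
Setting that up on the ‘automation’ VM amounted to something like this (a sketch; the firewall step may or may not be needed in your environment, and the web root is the CentOS/Nginx default):

# Install and start Nginx, open HTTP, and publish the kickstart file
sudo dnf -y install nginx
sudo systemctl enable --now nginx
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --reload
sudo cp ks_centos8_minimal.cfg /usr/share/nginx/html/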

Packer’s legacy configuration language is JSON. At the time of this post it was in the process of moving to HCL2, but since HCL2 support was still in beta, I stayed with JSON. Here is my initial configuration file for CentOS on vSphere.

{
"variables": {
  "vcenter-server": "vcsa.lab.clev.work",
  "vcenter-username": "automation@vsphere.local",
  "vcenter-password": "VMware1!",
  "vcenter-datacenter": "datacenter",
  "vcenter-cluster": "cluster1",
  "vcenter-datastore": "storage1",
  "vcenter-folder": "Templates",
  "vm-name": "CentOS_8_Minimal",
  "vm-cpu": "2",
  "vm-ram": "2048",
  "vm-disk": "81920",
  "vm-network": "VM Network",
  "ks-path": "http://automation.lab.clev.work/ks_centos8_minimal.cfg",
  "iso-path": "[storage2] iso/CentOS-8.1.1911-x86_64-dvd1.iso"
},
"builders": [
{
  "type": "vsphere-iso",
  "vcenter_server": "{{user `vcenter-server`}}",
  "username": "{{user `vcenter-username`}}",
  "password": "{{user `vcenter-password`}}",
  "insecure_connection": "true",
  "datacenter": "{{user `vcenter-datacenter`}}",
  "cluster": "{{user `vcenter-cluster`}}",
  "datastore": "{{user `vcenter-datastore`}}",
  "folder": "{{user `vcenter-folder`}}",
  "ssh_username": "root",
  "ssh_password": "RootPW1#",
  "convert_to_template": "false",
  "vm_name": "{{user `vm-name`}}",
  "notes": "Built by Packer at {{timestamp}}",
  "guest_os_type": "centos8_64Guest",
  "CPUs": "{{user `vm-cpu`}}",
  "RAM": "{{user `vm-ram`}}",
  "RAM_reserve_all": "false",
  "disk_controller_type": "pvscsi",
  "disk_size": "{{user `vm-disk`}}",
  "disk_thin_provisioned": "true",
  "network_card": "vmxnet3",
  "network": "{{user `vm-network`}}",
  "iso_paths": ["{{user `iso-path`}}"],
  "boot_command": [
    "<esc><wait>",
    "linux ks={{user `ks-path`}}<enter>"
    ]
  }
]
}

It looks a little intimidating at first, but the layout is pretty straightforward. The first section, “variables”, defines values that are used later in the template. Most of these are self-explanatory: they set the specs for the VM, such as disk size and network, as well as where it will be deployed in vCenter, such as which datacenter, cluster and datastore it will be written to. Also note ks-path, which is where the CentOS installer will find its Kickstart file, and iso-path, which is where I uploaded the CentOS ISO.

The second section, “builders”, passes those variables on to the builder, which does the actual work. Each insertion is wrapped in double braces, and the variable name is further wrapped in backticks. Again, everything is fairly straightforward until we get to the boot_command field. Here, Packer tells vCenter to type those characters directly into the VM’s console, interrupting the boot and instructing the boot loader to start the installer and fetch the Kickstart file from the web server.

So, with the template saved in the packer directory as centos8_minimal.json, I was ready to build.

./packer build centos8_minimal.json

Because I was interested, I ran this command and immediately jumped into vCenter. The “CentOS_8_Minimal” VM appeared almost instantly and I launched a remote console into it. There I could see the boot options literally being typed into the window. The installer then booted and proceeded with a completely hands-off graphical install. It then rebooted the VM and finished.

The process was so seamless that I ran it again with the “time” command to show how long the entire process took:
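
# Same build, wrapped in time; the duration will vary with hardware and network speed
time ./packer build centos8_minimal.json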

I fired up the VM and verified that everything was in place and that both the root and ansible users were created correctly. After deleting the VM and changing “convert_to_template” to “true”, I reran Packer and it dutifully rebuilt the VM and converted it to a template of the same name.

Packer has the ability to install additional software, run scripts and perform other tasks on a freshly prepared VM before shutting it down, using “provisioners.” My goal is to manage my infrastructure using Ansible, so I won’t be using provisioners to perform that task here; however, I don’t want my VMs to be deployed from a hopelessly out-of-date template, so I added in a quick provisioner:

"provisioners": [
  {
    "type": "shell",
    "inline": [
    "sudo yum -y upgrade"
    ]
  }
]

The “provisioners” section sits at the top level of the template, alongside “variables” and “builders”, and the output of the yum upgrade is echoed into Packer’s own output, so I could verify that both the build and the upgrade were successful. With that, I had a basic, clean, freshly upgraded installation of CentOS 8, templated and ready to deploy in vCenter. The final task was to schedule regular rebuilds. I wanted them to run on Saturdays, and since Windows patches come out on the second Tuesday of the month, I decided to rebuild all of my templates on the second Saturday of each month. Since my automation VM runs Linux, I used cron to schedule the job. Cron, however, has no concept of “the second Saturday of the month.” Since the second Saturday always falls between the eighth and the fourteenth of the month, inclusive, I set cron to run every day in that window and only proceed if the day was a Saturday:

0 2 8-14 * * [ `date +\%u` = 6 ] && /opt/automation/packer build -force /opt/automation/centos8_minimal.json

This crontab entry kicks off a build at 2:00am on the second Saturday of each month. The “-force” option overwrites the previous template with the newly built one. In the next post, I will build Windows Server templates with Packer.

Let’s Automate VMware Part 1: Introduction and Packer Prerequisites

Just a little warning up front: I’ve never done this before. I’m taking a journey into slowly replacing myself with a small shell script, or at least with some automation tools. I’m completely new to this and working my way through it with the official documentation, a little bit of Googling and my cobbled-together vSphere 6.7 homelab. Feel free to join me on this journey, but please don’t blame me if you try this in your environment and get eaten by a grue.

Ultimately, the goal of this project is to completely automate my homelab, from hardware provisioning to hypervisor installation, cluster management, image management, package management, log aggregation and backups. Stretch goals include extending into the cloud, a CI/CD pipeline and “employee” self-service.

The first part of this journey was building base images so I would have something to operate, back up and manage. This seemed like a logical place to start: it’s “low-hanging fruit” that would be easy to implement in any environment where VMs are deployed regularly. Rather than hand-building some base CentOS and Windows templates, it made sense to delve into using Packer to build those templates for me. So in the next couple of blog posts, I will use Packer to build base Linux and Windows images that I can use as templates in my VMware environment, and then schedule Packer to keep those images up to date.

While I could run Packer straight from my desktop, ultimately I wanted to have this code running on a dedicated virtual machine in my environment. I created a barebones Linux VM based on CentOS 8 that I named ‘automation’ as the base for all of this activity. While I’m using CentOS 8 for this, feel free to use whatever *nix you want or even Windows–Packer itself is platform agnostic.

A quick note up front: some Red Hat derivatives, including CentOS 8, already ship a program called “packer”, which is part of cracklib, a library for checking passwords for good entropy. It’s actually a link to another binary, so I bypassed the name collision altogether by issuing an “unlink /usr/sbin/packer” command without any regard for whether it might break the rest of the system. Do this at your own peril.

Installation was straightforward. I downloaded the Linux x64 binary from the Packer website (https://www.packer.io/downloads.html) and unzipped it into /usr/local. Running /usr/local/packer without any arguments proved that it was working. You can also find prebuilt packages for Packer or compile it yourself.
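
In shell terms, the install amounted to roughly this (a sketch; the zip file name depends on the Packer version you download):

# Unzip the downloaded Packer release into /usr/local and sanity-check it
sudo dnf -y install unzip
sudo unzip packer_*_linux_amd64.zip -d /usr/local
/usr/local/packer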

My next step was to get permissions to my vSphere cluster. Packer uses the vSphere API, and the permissions needed can be found in Packer’s documentation for the vSphere builders. I fired up the vSphere client to create a user called “automation” and assign it to the Administrators group. I then logged into the vCenter using PowerCLI in PowerShell:

Connect-VIServer -server vcsa.lab.clev.work

After allowing the vCenter certificate, I entered the credentials for the new ‘automation’ user into the popup dialog box and connected. To test the new user, I mapped the datastore named “storage2” to a local PowerShell drive called “isostore” and then uploaded the latest CentOS ISO to the “iso” folder on that datastore.

New-PSDrive -Location (Get-Datastore "storage2") -Name isostore -PSProvider VimDatastore -Root "\"
Copy-DatastoreItem -Item d:\isos\CentOS-8.1.1911-x86_64-dvd1.iso -Destination isostore:\iso\

Note that this process is slow; the upload took about 20 minutes in my homelab and the progress bar in PowerCLI did not move during the copy. Fortunately, I could see the file growing in the vSphere client as the upload proceeded. In the next post, I use my freshly-uploaded ISO to build my first Linux image from my ‘automation’ VM.

Automatically Disable Inactive Users in Active Directory

While Microsoft provides the ability to set an expiration date on an Active Directory user account, there’s no built-in facility in Group Policy or Active Directory to automatically disable a user who hasn’t logged in within a defined period of time. This is surprising, since many companies have such a policy and some information security standards, such as PCI DSS, require it.

There are software products on the market that provide this functionality, but for my homelab, my goal is to do this on the cheap. Well, not the cheap so much as the free. After reading up on the subject, I found that it is not quite as straightforward as it may seem. For one thing, Active Directory doesn’t provide very good tools out of the box for determining when a user last logged on.

The Elusive Time Stamp

Active Directory actually provides three different timestamps for determining when a user last logged on, and none of them are awesome. Here are the three you have to choose from:

lastLogon – This provides a time stamp of the user’s last logon, with the caveat that it is not a replicated attribute. Each domain controller keeps its own copy, holding the last time the user logged on against that particular domain controller. This means that any script relying on this attribute has to pull it from every domain controller in the domain and take the most recent of those timestamps to determine the actual last logon (see the sketch after this list).

lastLogonTimeStamp – This is a replicated version of lastLogon, but it is not replicated immediately. To reduce replication traffic, the attribute is only updated (and therefore replicated) when its current value is older than the domain’s msDS-LogonTimeSyncInterval minus a random offset of up to five days. By default, msDS-LogonTimeSyncInterval is unset, which makes it behave as 14 days, so lastLogonTimeStamp ends up being refreshed somewhere between 9 and 14 days after the previously replicated value. Needless to say, this is not useful for our purposes. In addition, the attribute is stored as a 64-bit FILETIME value that must be converted to a proper date/time to be useful in PowerShell.

lastLogonDate – A lot of blogs state that this is not a replicated timestamp. Technically that’s true: it isn’t stored in AD at all, but is a convenience property that the ActiveDirectory PowerShell module derives from lastLogonTimeStamp, already converted to a standard date/time. Otherwise it behaves exactly like lastLogonTimeStamp and carries the same 9-to-14-day replication delay.
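
As an aside, here is a rough sketch of what an accurate lookup has to do if you really need one: query lastLogon on every domain controller and convert the highest FILETIME yourself. The account name is made up, and the script later in this post doesn’t bother with this.

# Each DC keeps its own copy of lastLogon, so ask all of them and keep the newest
$sam = "jdoe"   # hypothetical account, for illustration only
$latest = Get-ADDomainController -Filter * | ForEach-Object {
    (Get-ADUser $sam -Server $_.HostName -Properties lastLogon).lastLogon
} | Sort-Object -Descending | Select-Object -First 1

# lastLogon is a 64-bit FILETIME; convert it to a readable date/time
[DateTime]::FromFileTime($latest)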

Another caveat that applies to all three is that the timestamp is updated by many kinds of logon operations, including interactive and network logons and logons passed through from another service such as RADIUS or another Kerberos realm. There is also a Kerberos operation called “Service-for-User-to-Self” (“S4U2Self”) that lets a service request a Kerberos ticket on a user’s behalf, in order to perform group and access checks, without supplying the user’s credentials; those requests update the timestamp as well. For more information, see this blog post.

Caveats aside, this is what we have to work with. For the purposes of my relatively small domain, I’m comfortable with increasing the replication frequency of lastLogonTimeStamp. Be aware before using this in production that it will increase replication traffic, especially during periods when many users are logging in simultaneously; domain controllers will be replicating this attribute daily instead of every 9 to 14 days.

As none of these caveats applied in my homelab, I launched Active Directory Users and Computers. Under “View”, I selected “Advanced Features” to expose the attributes I needed to view or change. I then right-clicked my domain and selected “Properties.” The msDS-LogonTimeSyncInterval was “not set” as expected, so I changed it to “1” to ensure that the timestamp was replicated daily for all users.
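
If you would rather skip the clicking, the same change can be made from PowerShell with Set-ADObject (a sketch; run it as a domain admin against your own domain):

# Set msDS-LogonTimeSyncInterval on the domain object to 1 day
Set-ADObject -Identity (Get-ADDomain).DistinguishedName -Replace @{'msDS-LogonTimeSyncInterval' = 1}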

I then created the PowerShell script below and saved it to a directory on the machine that would run it.

# disableUsers.ps1  
# Set msDS-LogonTimeSyncInterval (days) to a sane number.  By
# default lastLogonDate only replicates between DCs every 9-14 
# days unless this attribute is set to a shorter interval.

# Also, make sure to create the EventLog source before running, or
# comment out the Write-EventLog lines if no event logging is
# needed.  Only needed once on each machine running this script.
# New-EventLog -LogName Application -Source "DisableUsers.ps1"

# Remove "-WhatIf"s before putting into production.

Import-Module ActiveDirectory

$inactiveDays = 90
$neverLoggedInDays = 90
$disableDaysInactive=(Get-Date).AddDays(-($inactiveDays))
$disableDaysNeverLoggedIn=(Get-Date).AddDays(-($neverLoggedInDays))

# Identify and disable users who have not logged in in x days

$disableUsers1 = Get-ADUser -SearchBase "OU=Users,OU=Demo Accounts,DC=lab,DC=clev,DC=work" -Filter {Enabled -eq $TRUE} -Properties lastLogonDate, whenCreated, distinguishedName | Where-Object {($_.lastLogonDate -lt $disableDaysInactive) -and ($_.lastLogonDate -ne $NULL)}

$disableUsers1 | ForEach-Object {
   Disable-ADAccount $_ -WhatIf
   Write-EventLog -Source "DisableUsers.ps1" -EventId 9090 -LogName Application -Message "Attempted to disable user $_ because the last login was more than $inactiveDays ago."
   }

# Identify and disable users who were created x days ago and never logged in.

$disableUsers2 = Get-ADUser -SearchBase "OU=Users,OU=Demo Accounts,DC=lab,DC=clev,DC=work" -Filter {Enabled -eq $TRUE} -Properties lastLogonDate, whenCreated, distinguishedName | Where-Object {($_.whenCreated -lt $disableDaysNeverLoggedIn) -and (-not ($_.lastLogonDate -ne $NULL))}

$disableUsers2 | ForEach-Object {
   Disable-ADAccount $_ -WhatIf
   Write-EventLog -Source "DisableUsers.ps1" -EventId 9091 -LogName Application -Message "Attempted to disable user $_ because user has never logged in and $neverLoggedInDays days have passed."
   }

You may notice two blocks of similar code. The lastLogonDate is null for newly created accounts that have never logged in. Rather than have them all handled in a single block, I created two separate handlers for accounts that have logged in and those that haven’t. This might be useful for some organizations that want to disable inactive accounts after 90 days but disable accounts that have never logged in after only 14 or 30 days.
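
That split is just a matter of changing the two variables at the top of the script, for example:

# Disable stale accounts after 90 days, but never-used accounts after only 30
$inactiveDays = 90
$neverLoggedInDays = 30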

Note also that I have included event logging in this script. This is completely optional, but if you are bringing Windows logs into a SIEM like Splunk, it’s useful to have custom scripts writing to the Windows event logs so that the security team can track and act on these events. To write Windows events from a PowerShell script, you need to register the script as an event source. This is a simple one-time command on each machine that runs the script. Here’s the command I used to register mine:

New-EventLog -LogName Application -Source "DisableUsers.ps1"

This gives my script the ability to write events into the Application log, with the source shown as “DisableUsers.ps1”. The LogName parameter can be used to log events to a different standard Windows log, or even to create a completely separate log. I also picked two event IDs, 9090 and 9091, for the two event types my script logs. A quick Google search confirmed that these weren’t already used by Windows, although duplicate IDs from different sources are fine anyway.

By default, the script has a “-WhatIf” switch on each Disable-ADAccount command so it can be tested against the domain to make sure it behaves as expected. During a test run, the script displays the accounts it would have disabled, and the event log still records an event for each one regardless of the WhatIf switch. Once thoroughly tested, the “-WhatIf”s can be removed to make the script active. The SearchBase should also be changed to an appropriate OU in your domain. Note that this script does not discriminate: any user in its path is subject to being disabled, including service accounts and the Administrator account, if they have not logged in within the inactivity window.

Finally, it’s time to put the script into play. It’s as easy as creating a daily repeating task that launches powershell.exe with the arguments “-ExecutionPolicy Bypass c:\path\to\DisableUsers.ps1”. If this is run on a domain controller, it can run as the NT AUTHORITY\System user so that no credentials need to be stored or updated. I ran it in my homelab test domain and viewed the event log; sure enough, several of my test accounts were disabled. Note that I used the weasel words “attempted to” in the log messages: the script doesn’t actually verify that each account was disabled successfully.
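
For reference, here is a sketch of registering that daily task with the ScheduledTasks module; the task name, start time and script path are illustrative:

# Run DisableUsers.ps1 daily as SYSTEM
$action  = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-ExecutionPolicy Bypass c:\scripts\DisableUsers.ps1"
$trigger = New-ScheduledTaskTrigger -Daily -At 3am
Register-ScheduledTask -TaskName "Disable Inactive Users" -Action $action -Trigger $trigger -User "NT AUTHORITY\SYSTEM" -RunLevel Highest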

I did see one blog post where the author added a command to update the description field of each account so that administrators could see at a glance that it was auto-disabled. I didn’t do that here, but if you want to, here’s a sample command you can add inside each ForEach-Object block:

Set-ADUser $_ -Description ("Account Auto-Disabled by DisableUsers.ps1 on $(Get-Date)")

Hopefully this will help others work around a glaring oversight by Microsoft. Please drop me a comment if you have any suggestions for improvement; I’m not a PowerShell coder by trade and I’m always looking for tips.

Building and Populating Active Directory for my Homelab

There are a multitude of Active Directory how-to blog posts and videos out there. This is my experience putting together a quick and dirty Active Directory server, populating it with OUs and users, and getting users randomly logged in each day. I’m going to skip the standard steps for installing AD as it’s well documented in other blogs, and it’s the populating of AD that’s more interesting to me.

Why put in all this work you ask? I’m doing some experimentation in AD, and rather than blow up a production environment, I want a working and populated AD at home. I also need users to “log in” randomly so I can use the login activity to write some scripts. So, here’s how I did it.

Step 1 was to actually install Active Directory. Again, there are numerous blogs about how to do this. Since I was running Server 2016 in evaluation mode, I wasn’t going to do a lot of customizing; it was going to expire eventually anyway. I had already configured the 2016 server as DNS and DHCP for my lab network, so installing AD was as easy as adding the role and management utilities from Server Manager. Utilizing the “Next” button repeatedly, I created a new forest and domain for the lab.

The next step was to populate the directory. For this, I went with the “CreateDemoUsers” script from the Microsoft TechNet Gallery. I downloaded the PowerShell script and accompanying CSV and executed the script. In minutes, I had a complete company including multiple OUs filled with over a thousand user accounts and groups. Each user had a complete profile, including title, location and manager.

With my sample Active Directory populated, my next goal was to record random logins for these users. I needed this initially for some scripts I was writing, but my eventual goal was to generate regular activity to bring into Splunk. For the simulated login activity, I wrote a script that picks a user at random and executes a dummy command as that user so that a login is recorded on their user record. I began by modifying the “Default Domain Controllers Policy” in Group Policy Management, adding “Domain Users” to “Allow log on locally.” (This is found in Computer Configuration -> Windows Settings -> Security Settings -> Local Policies -> User Rights Assignment.) This step isn’t necessary if the script runs from a domain member, but since I didn’t have any computers joined to the domain, it allows me to run the script directly on the domain controller.

Here is the script that I created. I put it in c:\temp so it could be run by any user account.

# Login.ps1
# Pick one user at random and generate a real logon for it by launching
# a trivial process under that user's credentials.

$aduser = Get-ADUser -Filter * | Select-Object -Property UserPrincipalName | Sort-Object {Get-Random} | Select-Object -First 1
$username = $aduser.UserPrincipalName
# All of the demo accounts share this password
$password = ConvertTo-SecureString 'Password1' -AsPlainText -Force

$credential = New-Object System.Management.Automation.PSCredential $username, $password
# hostname.exe exits immediately; the point is the logon it records
Start-Process hostname.exe -Credential $credential

I wanted my user base as a whole to log in on average every 90 days, which for a thousand-plus users works out to about 11 logins per day. Easy enough: I created a scheduled task that runs at 12:00am every day and repeats every 2 hours, giving 12 logins a day. That was close enough for my purposes.

Since I was running this on a lone DC in a lab, I ran it as NT AUTHORITY\SYSTEM so I didn’t have to mess with passwords. The task runs powershell.exe with the arguments -ExecutionPolicy Bypass c:\temp\login.ps1.
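
For anyone who prefers the command line to the Task Scheduler GUI, an equivalent task can be created in one line with schtasks (a sketch; the task name is made up):

schtasks /Create /TN "Simulated Logins" /TR "powershell.exe -ExecutionPolicy Bypass c:\temp\login.ps1" /SC DAILY /ST 00:00 /RI 120 /DU 24:00 /RU SYSTEM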

After saving the task, I right-clicked it and chose “Run.” Because I chose hostname.exe as the execution target, the program opened and immediately closed. While this briefly flashed on my screen when testing, nothing appears on screen when it runs as a scheduled task. A quick look through the event log confirmed that user “LEClinton” logged in, and Lisa Clinton’s AD record showed the new logon as well.
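
Here is a rough sketch of those spot-checks, assuming the Security log is queried on the domain controller itself:

# Recent logon events (ID 4624); Properties[5] is the TargetUserName field
Get-WinEvent -FilterHashtable @{ LogName = 'Security'; Id = 4624 } -MaxEvents 10 |
    Select-Object TimeCreated, @{ Name = 'Account'; Expression = { $_.Properties[5].Value } }

# The logon also shows up on the user's AD record
Get-ADUser LEClinton -Properties lastLogonDate | Select-Object Name, lastLogonDate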

While my eventual goal is to script all of the installation and configuration, this was enough to unblock the other work I needed to do, as well as provide a semi-active AD environment for further testing and development.