About Tim Clevenger

I'm a 20+ year veteran of the IT industry who enjoys modern and vintage technology.

The Littlest Datacenter Part 2: Internet and Firewalls

Part 1 of this saga can be found here.

As mentioned before, this was a SaaS-focused business.  Most of the vital business functions, including ordering, shipping and receiving, pricing, accounting and customer service, were SaaS, which meant a rock-solid Internet connection was required.  But again, a small business runs on a small budget.  Add in the fact that the business was in a strip mall, and we were lucky to get Internet service at all.

Fortunately, we were able to get Fios at a reasonable cost and have it installed reasonably quickly.  The business had previously been running IPCop on a tiny fanless Jetway PC, but I felt we had outgrown IPCop, and the Jetway box, though still working, was a bit underpowered for what I needed.  I settled on pfSense as my firewall of choice, but I didn’t want to run it on desktop hardware.

Lenovo had a nearly perfect solution for my budget: the RS140 server, a 1U rackmount box with a four-core Xeon E3 processor (with AES-NI for fast crypto) and 4GB of RAM for a hair over $400.  The price was so good I bought two.  I fitted each with an additional 4GB of RAM and two 240GB SSDs, one from SanDisk and one from Intel.  There was a bit of consternation when I discovered that the servers shipped with no drive trays, but the SSDs fit into 3.5″ adapters that mounted directly into the chassis with no drilling.

The SanDisk and Intel SSDs in each server were mirrored in a software RAID-1 using the motherboard's onboard RAID, and the integrated IPMI was finicky but good enough that I could remotely KVM into the boxes if need be.  The two servers were then configured as an active/passive pair in pfSense, and a new HPe 8-port switch connected them to the Fios modem.

The firewalls worked so well that I bought a matching pair for the other location and connected the two sites with an IPSec tunnel so they could share files more securely.

You may ask why I used dedicated hardware for the firewalls instead of virtualizing them.  The answer is that I initially did virtualize them in Hyper-V.  However, I just wasn’t comfortable running my firewalls on the same hardware as my workloads.  There have long been rumors of ways to escape a VM and compromise the host, and recent revelations about hypervisor compromise through buggy virtual floppy drivers and side-channel data leakage à la Spectre and Meltdown have confirmed my suspicions about virtualized firewalls.

Coming soon: Backup, environmental, monitoring and security.

The Littlest Datacenter Part 1: Compute and Storage

I was tasked with building a datacenter.  Okay, not really.  The company was expanding into a low-cost strip mall, which meant limited connectivity options, no power redundancy and strict rules regarding modifications.  It also meant that I was limited to two racks in a tiny closet in the middle of an office space.  Finally, as always, there was minimal budget.

The Requirements

The COO was very SaaS-focused for business applications.  As the sole IT person (with additional ancillary duties), I was happy to oblige.  File storage, office applications, email, CRM, shipping and accounting functions were duly shipped off to folks who do that kind of thing for a living, leaving me with a relatively small build: AD/DNS/DHCP, phone system and surveillance.  While the systems I was replacing used independent servers that replicated VMs between them, failover was a decidedly more… manual process than I wanted.  Because the business was penalized for missing shipping deadlines, systems needed to be redundant and self-healing to the extent possible within the thin budget.  Finally, I knew that I would eventually be handing off the environment to either a managed service provider or a junior admin, so everything needed to be as simple and self-explanatory as possible.

The infrastructure VM (AD, DNS, etc.) and ancillary VMs were pretty straightforward.  The elephant in the room was the surveillance system.  Attached to 27 high-resolution surveillance cameras, it had to retain 90 days of video from most of the cameras for insurance reasons.  Once loaded with 90 days of video, it would consume 26TB of disk space and average about 50GB/hour of disk churn during business hours.
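
As a sanity check on those numbers, a quick back-of-envelope calculation lines up.  The roughly 1 Mbit/s long-term average per camera is my assumption for illustration; only the camera count and retention window come from the requirements above.

# Rough sizing sketch.  The per-camera average bitrate is an assumed figure;
# the camera count and retention period come from the requirements.
$cameras       = 27
$retentionDays = 90
$avgMbps       = 1.0   # assumed long-term average per camera (motion-based recording)

$bytesPerDay = $cameras * $avgMbps * 1e6 / 8 * 86400
$totalTB     = $bytesPerDay * $retentionDays / 1e12

"{0:N0} GB/day, {1:N1} TB over {2} days" -f ($bytesPerDay / 1e9), $totalTB, $retentionDays
# Roughly 292 GB/day and about 26 TB over 90 days, in line with the observed footprint.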

The Software

Because of cost, I settled on Hyper-V as my hypervisor.  It’s included with the Windows licenses I was already buying, and it offers live migration, storage migration, backup APIs, remote replication and failover capabilities.  Standard licensing also allows two Windows VMs to run on one licensed host, further reducing costs.
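
None of this configuration appears in the post, but as a rough sketch of what those features look like in practice, enabling live migration on a Hyper-V host and moving a running VM goes something like this.  The host name, VM name and storage path are hypothetical:

# Sketch only: enable live migration on each Hyper-V host (hypothetical names).
Enable-VMMigration
Set-VMHost -VirtualMachineMigrationAuthenticationType Kerberos -MaximumVirtualMachineMigrations 2

# Live-migrate a running VM, including its storage, to the partner host.
Move-VM -Name 'Infra-DC1' -DestinationHost 'HV2' -IncludeStorage -DestinationStoragePath 'D:\VMs\Infra-DC1'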

Next to consider was the storage solution.  As I mentioned, the existing server pair consisted of two independent Hyper-V systems, with one active and one passive.  Hyper-V replication kept the passive host up to date, but in the event of a failure or maintenance, failing over and failing back was a long and arduous process.  I opted for shared storage to allow HA.  Rather than roll my own shared storage, I decided to buy.
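
For a sense of why that was arduous: a planned failover with Hyper-V Replica is a multi-step, per-VM sequence.  This is only a sketch of the general flow with a hypothetical VM name, not the exact procedure from that environment:

# On the primary host: shut the VM down and prepare the planned failover.
Stop-VM -Name 'Infra-DC1'
Start-VMFailover -VMName 'Infra-DC1' -Prepare

# On the replica host: fail over, reverse the replication direction, and start the VM.
Start-VMFailover -VMName 'Infra-DC1'
Set-VMReplication -VMName 'Infra-DC1' -Reverse
Start-VM -Name 'Infra-DC1'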

After talking with several vendors, I settled on Starwind vSAN.  I had used their trial version with good results, and it was well reviewed by people who had deployed it.  Because it ran on two independent servers with independent copies of the data, it protected against disk failure as well as host, backplane, operating system, RAID controller and motherboard failure.  Starwind also sold a turnkey appliance, an OEM-branded but very familiar Dell T630 tower server, so I ordered two.  That was substantially cheaper than sourcing the servers and vSAN software separately, and about a sixth of the cost of an equivalent pair of Dell servers plus a separate SAN.
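
The Starwind-side setup happens in its own management console, but as a sketch of the Windows-side plumbing, attaching a host to the iSCSI targets with multipathing looks roughly like this.  The portal address is a placeholder, and the Multipath-IO feature is assumed to be installed already:

# Sketch only: claim iSCSI disks for MPIO, then connect to the partner's iSCSI portal.
Enable-MSDSMAutomaticClaim -BusType iSCSI

New-IscsiTargetPortal -TargetPortalAddress 10.10.10.2
Get-IscsiTarget | Connect-IscsiTarget -IsMultipathEnabled $true -IsPersistent $true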

The Hardware

I settled on a pair of midrange Xeons with 12 cores each, for 24 cores (48 threads) per host.  That was enough to process video from all of the cameras while leaving plenty of overhead for other tasks.  The T630 is an 18-bay unit with a rack-mount option.  Dual gigabit connections went to the dedicated camera switch, while another pair went to the core switches.  For Starwind, a dual-port 10 gigabit card was installed in each host: one port carried Starwind iSCSI traffic and the other carried Starwind sync traffic.  Both links were made redundant in software, and both were direct-connected between the hosts with TwinAx.  Storage for each host consisted of sixteen 4TB Dell drives plus two 200GB solid state drives for Starwind’s caching.

In an effort to reduce complexity, I went with a flat network.  Two HPe switches provided redundant gigabit links to the teamed server NICs and the other equipment in the rack.  Stacked and dual-uplinked HPe switches connected the workstations and ancillary equipment to the core.
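
As a sketch of the host side of that design, the 2012 R2 LBFO team on each server would look something like the line below.  The team and adapter names are placeholders, not the names from the actual build:

# Sketch only: build a switch-independent team from two gigabit NICs.
New-NetLbfoTeam -Name 'LAN-Team' -TeamMembers 'NIC1','NIC2' -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic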

The Deployment

Windows Server 2012 R2 Standard provided the backbone, with Starwind vSAN running on top.  Two Windows VMs powered the AD infrastructure server and the surveillance recording server.  I later purchased an additional Windows license and built a second DC/DNS/DHCP VM running on the second host.
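
The post doesn't spell out the clustering step, but with the Starwind volumes presented to both hosts, the HA layer on Windows Server 2012 R2 would look roughly like this sketch.  The cluster name, node names, IP address and VM name are all hypothetical:

# Hypothetical sketch: form the two hosts into a failover cluster, expose the
# shared Starwind disk as a Cluster Shared Volume, and make a VM highly available.
New-Cluster -Name 'HV-CLUSTER' -Node 'HV1','HV2' -StaticAddress 192.168.1.10

$disk = Get-ClusterAvailableDisk | Add-ClusterDisk
Add-ClusterSharedVolume -Name $disk.Name

Add-ClusterVirtualMachineRole -VMName 'Infra-DC1'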

Coming Soon:  Firewalls, backup, environmental, monitoring and security


The Datastores That Would Not Die

As part of a recent cleanup of our vSphere infrastructure, I was tasked with removing disused datastores from our Nimble CS500.  The CS500 had been replaced with a newer-generation all-flash Nimble, and the VMs had been moved off a couple of months earlier.  Now that the new array was up and had accumulated some snapshots, I was cleaning up the old volumes so the array could be repurposed.  I noticed, however, that even though all of the files had been removed from the datastores, there were still a lot of VMs that “resided” on the old volumes.

VMware addresses this in a KB article (2105343) titled “Virtual machines show two datastores in the Summary tab when all virtual machine files are located on one datastore.” It suggests that the VM is pointing to an ISO that no longer exists on the old datastore.

After looking at the configs, I realized that, sure enough, some of the VMs were still pointing to an ISO file that was no longer on that datastore.  Easy, right?  Except that when I set the optical drive on one of the test VMs back to “Client Device,” it was still pointing at the old datastore.

Looking through the config again, I noticed that the floppy drive setting is missing from the HTML5 client.  I fired up the Flex client and set the floppy drive to “Client Device” as well.  Still no go.  For the few VMs that were pointing at a nonexistent ISO, setting the optical drive back to “Client Device” worked, but for VMs pointing at a nonexistent floppy image, changing the floppy to “Client Device” wasn’t working.  A bug in the floppy handling?  Perhaps.

I created a blank floppy image on one of my new datastores and pointed the VM’s floppy to that new image.  Success!  The VM was no longer listing the old datastore, and I could then set the floppy to “Client Device.”  After checking out other VMs, I realized that I had over 100 VMs that had some combination of optical drive or floppy drive pointing at a non-existent file on the old datastores.  PowerCLI to the rescue!

# Pass the VM name as the first (and only) argument to the script.
$vm = $args[0]

# Grab the VM's current optical and floppy drive objects.
$cd = Get-CDDrive -VM $vm
$floppy = Get-FloppyDrive -VM $vm

# Point the floppy at a real (blank) image first so vSphere releases the missing file...
Set-FloppyDrive -Floppy $floppy -FloppyImagePath "[datastorename] empty-floppy.flp" -StartConnected:$false -Confirm:$false

# ...then detach media from both the floppy and the CD drive.
Set-FloppyDrive -Floppy $floppy -NoMedia -Confirm:$false
Set-CDDrive -CD $cd -NoMedia -Confirm:$false

Simply save this as a .ps1 file and pass it the name of the VM (in quotes if it contains spaces).  It will get the current floppy and CD objects from the VM, point the floppy at the blank image created earlier, and then set both the CD and floppy to “NoMedia.”  This was a quick and dirty script, so you will have to install PowerCLI and do your own Connect-VIServer first.  Once connected, however, you can either specify VMs one at a time or modify the script to pull VM names from a file or from vCenter itself.  All of these settings can be changed while the VM is running, so there is no need to schedule downtime to run this script.
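
If you want to pull the VM list from vCenter rather than feeding in names one at a time, a variation along these lines works.  It's only a sketch; the datastore name is a placeholder, and the floppy image is the same blank one created earlier:

# Sketch: iterate over every VM that vCenter still associates with the old datastore.
$oldDatastore = Get-Datastore -Name 'old-datastore'

foreach ($vm in (Get-VM -Datastore $oldDatastore)) {
    $cd = Get-CDDrive -VM $vm
    $floppy = Get-FloppyDrive -VM $vm

    if ($floppy) {
        # Real image first so vSphere releases the missing file, then detach.
        Set-FloppyDrive -Floppy $floppy -FloppyImagePath "[datastorename] empty-floppy.flp" -StartConnected:$false -Confirm:$false
        Set-FloppyDrive -Floppy $floppy -NoMedia -Confirm:$false
    }
    if ($cd) {
        Set-CDDrive -CD $cd -NoMedia -Confirm:$false
    }
}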

After all of this work, I found that there were still a few VMs showing up on the old datastores.  Another quick Google search revealed that any VM with a snapshot taken while the CD or floppy was mounted would still show up on that datastore.  Drat!  After clearing out the snapshots, I finally freed up the datastores and was able to delete them using the Nimble Connection Manager.
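
For hunting those down, something like this sketch helps.  It lists the snapshots first on purpose, since removing a snapshot is irreversible:

# Sketch: list all snapshots so the stale ones can be reviewed before deletion.
Get-VM | Get-Snapshot | Select-Object VM, Name, Created, SizeGB

# Then remove the offenders (scoped to a single, hypothetical VM as an example).
Get-VM -Name 'example-vm' | Get-Snapshot | Remove-Snapshot -Confirm:$false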

So now a little root cause analysis: why were there so many machines with a nonexistent CD and floppy mounted?  After seeing that they were all Windows Server 2016 VMs, I went back to our templates and realized that the tech who built the 2016 template had left the Windows ISO (long since deleted) and a floppy image (used to load the paravirtual SCSI driver during OS installation) attached when he created the template.  I converted the template to a VM, removed the two mounts (using the same two-step method for the floppy) and converted it back to a template.
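
In PowerCLI, that round trip looks roughly like the sketch below.  The template name is hypothetical, and the floppy gets the same image-then-NoMedia treatment as before:

# Sketch: convert the template to a VM, detach the stale media, then convert it back.
$vm = Set-Template -Template 'Win2016-Template' -ToVM

$cd = Get-CDDrive -VM $vm
$floppy = Get-FloppyDrive -VM $vm

Set-CDDrive -CD $cd -NoMedia -Confirm:$false
Set-FloppyDrive -Floppy $floppy -FloppyImagePath "[datastorename] empty-floppy.flp" -StartConnected:$false -Confirm:$false
Set-FloppyDrive -Floppy $floppy -NoMedia -Confirm:$false

# Mark the VM as a template again through the vSphere API.
$vm.ExtensionData.MarkAsTemplate()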

With that job done, I’m continuing to plug away at converting our other vCenters from 6.5 to 6.7U1.  Have a great day, everyone!

The Dell PowerEdge FX2: A Dead End?

A small crowd was gathered by the server enclosure at the edge of the Dell EMC VMworld booth, gawking at the blinkenlights on the newly announced MX7000 blade chassis.  I dodged the eye candy and the suited salesdroids surrounding it and instead searched the booth for their PowerEdge FX2 display.  Alas, the entire booth was dedicated to the 7 rack unit MX7000 and its array of server, storage and networking building blocks.

When the FX2 was announced in 2014, it held a lot of promise.  A 2U enclosure with horizontally mounted blades, the FX2 could hold up to eight dual-processor Xeon servers, consolidating power supplies, cooling and management.  The FC430 blade crammed two Xeons and 8 DIMM slots into a quarter-width sled.  The FC630 blade offered three times as much RAM capacity, access to another PCIe chassis slot and a “real” PERC controller in a half-width sled, and the FC830 was a full-width, quad-Xeon sled with more DIMM slots and extra drive bays.

With the introduction of the 14th generation Dell PowerEdge lineup, the FC630 half-width blade was replaced by the FC640, but the FC830 and FC430 were never updated.

Seeing the writing on the wall, I grabbed one of the Dell EMC guys who was floating around the booth and asked him about the future of the FX2.  He told me that the FC430 and FC830 wouldn’t be updated because the “processors were too big,” and that he couldn’t comment further on the future of the FX2 platform.

Now, I’ve been in IT for a while, but that explanation just seemed strange.  What did he mean by the processors being “too big”?  Do they need bigger heatsinks because they run hotter?  Are they physically bigger dies?  After digging a bit, the answer appears to be “both of the above.”

The Broadwell-based Xeon E5s that underpin the 13th-generation PowerEdge blades use the LGA2011-3 socket, with a package measuring roughly 52.5mm by 45mm.  However, the Skylake-based “Purley” Xeons that power the 14th generation servers use a much larger socket called LGA3647, named for its 3,647 pins, more than 1,600 beyond the old 2011-3.  Those extra pins feed additional memory channels, among other features.

Those extra pins mean that the package itself had to grow, in this case to 76mm by 56.5mm.  That’s nearly twice the real estate of the previous generation CPU (roughly 4,300 square millimeters versus about 2,400).  The FC430 blade was already so tight that it gave up DIMM slots and the option of a decent RAID controller to fit in the space allotted.  There is simply no room for a Purley Xeon without radical rework, like breaking a lot of support hardware out onto a daughterboard, which complicates manufacturing, repair and cooling.

A PowerEdge FC430 blade.

So, I now understand the demise of the FC430, but what about the future of the FX2 as a platform?

The folks over at Wccf Tech have posted Intel documents about the Xeon roadmap.  The Cooper Lake and Ice Lake chips, due in 2019 and 2020 respectively, are expected to use a socket called LGA4189.  With 15% more pins than the LGA3647, I would expect the socket to be 15-20% larger as well.  While that doesn’t sound like much, it may be just too much to fit in a dense enough blade for the FX2 to continue to make sense.

Heat will also be a factor.  The same source shows Ice Lake with a TDP of “up to 230W,” 75 watts higher than the hottest processor Dell offers in the FC640.  Handling the more powerful Ice Lake processors will probably require larger heatsinks and better cooling than any of the current FX2 form factors can provide.

So how will these larger and hotter processors affect the blade landscape?  Dell knows the answer, and it lies in the MX7000.  Dell’s newest blade architecture, released to the public in the past couple of months, is your traditional big box with vertical blades in the same vein as the old M1000e.  However, unlike the M1000e, which offered full-height and half-height blades, the MX7000 only offers full-height single-width (2 CPU) and double-width (4 CPU) options.  Again, the helpful salesguy at the Dell EMC booth said that they were “unlikely” to offer half-height blades because the days of compact, 125-watt CPU packages are over.

So, are blades still worth it?  The M1000e could fit 16 servers into 10U and the FX2 can still fit 20 current generation servers into 10U.  The MX7000 can only fit 8 servers into 7U, and Lenovo can fit 14 servers into 10U.  The FX2 offers very high density in a form factor that is more flexible because it can be purchased and racked in smaller increments than the larger blade systems, but it appears that the days may be numbered for that form factor.



The End of an Era?

I was heartbroken to read about the demise of Weird Stuff Warehouse, a Silicon Valley institution.

I remember when they were just called Weird Stuff and were located in a commercial storefront near Fry’s in Milpitas.  They had glass display cases with a few dozen parts for sale, such as hard drives and peripheral cards. Once in a while, they would have something crazy, like a giant minicomputer hard drive with a spindle motor that looked like it belonged in a washing machine.  We mostly visited just to see what was new, though I do remember when they had trash cans full of ping pong balls that they were selling by the bagful.  We bought a few dozen to throw at each other at the office.

Imagine my surprise, then, when I was assigned a weeklong project in Milpitas a few years ago, a good decade after I had moved to SoCal.  My GPS took me to a nondescript warehouse entrance.  When I walked inside, it was like a massive museum.  Stack after stack of 30-year-old hard drives, cards, motherboards, power supplies, test equipment, industrial equipment, cables, wires, displays, servers, switches, cabinets, modem banks… I spent every evening after work walking up and down the aisles, admiring and sometimes touching the Silicon Valley of my youth.

With my (and my 17-year-old son’s) excitement building about the upcoming Vintage Computer Festival West in Mountain View this summer, I Googled Weird Stuff so that my son, too, could experience the fruits of Silicon Valley on those shelves.  Alas, it turns out that Google itself had contributed to the death of this institution.  The search giant bought the building, and Weird Stuff Warehouse closed its doors and sold its inventory to a company that, as far as I can tell, doesn’t have a retail presence.

In light of recent events involving Facebook, Uber and other companies, there’s a growing sentiment that Silicon Valley is not what it used to be.  I can’t speak to that myself; I moved out of the Bay Area almost two decades ago and haven’t followed it as closely as I used to.  But it seems that Silicon Valley, which used to be about inventing and building better stuff (hence the “silicon” in the name), has forgotten its roots a bit in its bid to grab some of that VC gold rush money.  Perhaps Silicon Valley needs to get back to building more weird stuff instead.

The Motivation

First, a confession:  this is not my first attempt at a blog.

My previous attempts typically died after a post or two.  What makes things different this time?  A couple of months ago I heard a podcast where the guests listed reasons to start a blog and stick with it.  Improving my writing skills, sharing tips and tricks I’ve found during the course of my work, networking with peers, and having a public body of work were all reasons that resonated with me.

So, what should readers expect?  My goal is to post something at least weekly.  With my upcoming VMware VCP training, lots of vintage equipment finding its way into my house and plenty of excitement at the office, I should be able to post with some regularity.  I expect to add YouTube at some point, but I’m going to crawl before I try to walk.  I did fire up a Twitter account for the stuff that’s better suited to short form.

Incidentally, I’m not getting paid to say this, but it’s easy to find WordPress.com offer codes on various podcast networks right now, and it’s awesome having somebody else do the heavy lifting for less than four bucks a month.  Having had to maintain and patch WordPress sites in the past, I like being able to focus on what I want to write and not on whether the next WP patch is going to nuke a custom theme.