Working in the Hot Aisle: Sealing It Up

“The hot stays hot and the cool stays cool.” -McDonald’s McDLT commercials

Part 1 of this saga can be found here.

We were all just a bunch of starry-eyed kids with our plan for the future: rack some equipment, buy back-to-front cooled switches, snap in some filler blanks and the world would be a beautiful place. We were soon in for a big reality check. Let’s start with a simple task: buying some switches.

It turns out that it’s actually difficult to find switches that suck air in the back and blow it out the front.  In fact, it’s even fairly difficult to find switches that blow front to back.  Many (most?) switches pull air from the left side of the switch and blow it out the right or vice versa.  In a contained data center, that means that the switches are pulling in hot air from the hot aisle and blowing out even hotter air.  In fact, there are diagrams on the Internet showing rows of chassis switches mounted in 2-post racks where the leftmost switch is getting relatively cool air and blowing warmer air into the switch to its right.  This continues down the line until you get to the switch at the other end of the row, which is slowly melting into a pile of slag.  Needless to say, this is not good for uptime.

There are companies that make various contraptions for contained-aisle use. These devices have either passive ducts or active fan-fed ducts that pull air from the cold aisle and duct it into the intake side of the switch. Unfortunately, switch manufacturers can’t even agree on which side of the switch to pull air from or where the grilles on the chassis are located. Alas, this means that unless somebody makes a cooler specific to your chassis, you have to figure out which contraption is closest to your needs. In our case, we were dealing with a Cisco 10-slot chassis with right-to-left cooling. No contraption fit correctly, so we used an APC 2U switch cooler. This cooler pulls air in from the front and blows it up along the intake side of the switch in the hot aisle. While not as energy efficient as contraptions with custom-fitted ducts that enclose the intake side of the switch, it works well enough and includes redundant fans and power inputs.

For the top-of-rack and core switches, only the Cisco Nexus line offered back-to-front cooling options (among Cisco switches, that is). That’s fine since we were looking at Nexus anyway, but it’s unfortunate that it’s not an option on Catalyst switches. Front-to-back cooling is an option, but then the switch ports are in the cold aisle, meaning that cables must be passed through the rack and into the hot aisle. It can work, but it’s not as clean.

However, buying back-to-front cooled switches is but the beginning of the process.  The switches are mounted to the back of the cabinet and are shorter than the cabinet, meaning that there is a gap around the back of the switch where it’s not sealed to the front of the cabinet.  Fortunately, the contraption industry has a solution for that as well.  In our case, we went with the HotLok SwitchFix line of passive coolers.  These units are expandable; they use two fitted rectangles of powder-coated steel that telescope to close the gap between the switch and the cabinet.  They come in different ranges of depths to fit different rack depth and switch depth combinations and typically mount inside the switch rails leading to the intake side of the switch.  Nylon brush “fingers” allow power and console cables to pass between the switch and the SwitchFix and into the hot aisle.

The rear of the switch as viewed from the cold aisle side. The SwitchFix bridges the gap between the rear of the switch and the front of the rack.

While this sounds like an ideal solution, in reality the heavy gauge steel was difficult to expand and fit correctly, and we ended up using a short RJ-45 extension cable to bring the console port out of the SwitchFix and into the cold aisle for easy switch configuration.  The price was a little heart-stopping as well, though it was still better than cobbling together homemade plastic and duct-tape contraptions to do the job.

With the switches sorted, cable managers became the next issue.  The contractor provided standard 2U cable managers, but they had massive gaps in the center for cables to pass through–great for a 2-post telco rack, but not so great for a sealed cabinet.  We ended up using some relatively flat APC 2U cable managers and placed a flat steel 2U filler plate behind them, spaced out with #14 by 1.8″ nylon spacers from Grainger.  With the rails fully forward in the cabinet, the front cover of the cable manager just touched the door, but didn’t scrape or significantly flex.

Once racks are in place and equipment is installed, the rest of the rack needs to be filled to prevent mixing of hot and cold air. There are a lot of options, from molded plastic fillers that snap into the square mounting holes to powder-coated sheets of steel with holes drilled for mounting. Although the cost was significantly higher, we opted for the APC 1U snap-in fillers. Because they didn’t need screws, cage nuts or tools, they were easy to install and easy to remove. With the rails adjusted all the way up against the cabinet on the cold aisle side, no additional fillers were needed around the sides.

With every rack unit filled with switches, servers, cable managers, telco equipment and snap-in fillers, sealing the remaining gaps was our final issue to tackle from an efficiency perspective.  While the tops of the cabinets were enclosed by the roof system, there was still a one-inch gap underneath the cabinets that let cold air through.  Even though the gap under the cabinet was only an inch high, our 18 cabinets had gaps equivalent to about three square feet of open space!  We bought a roll of magnetic strip to attach to the bottom of the cabinets to block that airflow, reduce dust intrusion and clean up the look.
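
In case the three-square-foot figure seems surprising, here is the back-of-the-envelope arithmetic, assuming standard 600 mm (roughly 24-inch) wide cabinets; your cabinet width may differ:

```python
# Rough check on the under-cabinet bypass area.
# Assumes 600 mm (~24 in) wide cabinets; actual widths vary by vendor.
cabinets = 18
gap_height_in = 1.0        # gap under each cabinet, inches
cabinet_width_in = 24.0    # assumed cabinet width, inches

total_gap_sq_in = cabinets * gap_height_in * cabinet_width_in
total_gap_sq_ft = total_gap_sq_in / 144   # 144 square inches per square foot

print(f"Total bypass opening: {total_gap_sq_ft:.1f} sq ft")  # -> 3.0 sq ft
```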

Lessons Learned

There’s no other way of saying this: this was a lot of work. A lot of stuff had to be purchased after racking began. There are a lot of gotchas that have to be considered when planning this, and the biggest one is just being able to seal everything up. Pretty much the entire compute equipment industry has standardized on front-to-back cooling, which makes using that equipment in a contained-aisle environment simple. Unfortunately, switch manufacturers are largely just not on board. I don’t know if it’s because switching equipment is typically left out of such environments, or if they just don’t see enough of it in the enterprise and small business markets, but cooling switches involves an awful lot of random metal and plastic accessories that have high markups and slow shipping times.

However, I have to say that having equipment sitting at rock-stable temperatures is a huge plus. We were able to raise the server room temperatures, and we don’t have the hot-spot issues that cause fans to spin up and down throughout the day. Our in-row chillers run much less than the big CRAC units in the previous data center, even though there is much more equipment in there today. The extra work helped build a solid foundation for expansion.

Working in the Hot Aisle: Power and Cooling

Part 1 of this saga can be found here.

An hour of battery backup is plenty of time to shut down a dozen servers… until it isn’t.  The last thing you want to see is that clock ticking down while a couple thousand Windows virtual machines decide to install updates before shutting down.

We were fortunate that one of the offices we were consolidating was subleased from a manufacturer who had not only APC in-row chillers they were willing to sell, but also a lightly used generator. Between those and a new Symmetra PX UPS, we were on our way to breathing easy when the lights went out. The PX provides several hours of power and is backed by the generator, which also backs up the chillers in the event of a power outage. The PX is a marvel of engineering, but it is also a single point of failure. We witnessed this firsthand with an older Symmetra LX, which had a backplane failure a couple of years earlier that took down everything. With that in mind, we opted to go with two PDUs in each server cabinet: one fed from the UPS and generator, and one fed from city power with a massive power conditioner in front of it. These circuits also extend into the IDFs so that building-wide network connectivity stays up in the event of a power issue.

Most IT equipment comes with redundant power supplies, so splitting the load is easy: one power supply goes to each PDU.  For the miscellaneous equipment with a single power supply, an APC 110V transfer switch handles the switching duties.  A 1U rackmount unit, it is basically a UPS with two inputs and no batteries, and it seamlessly switches from one source circuit to the other when a voltage drop is detected.

As mentioned, cooling duties are handled by APC in-row chillers, two in each aisle. They are plumbed to rooftop units and are backed by the generator in case of power failure. Temperature sensors on adjacent cabinets provide readings that help them work as a group to optimize cooling, and network connectivity allows monitoring via SNMP and/or dedicated software. Since we don’t yet need the cooling power from all four units, we will be programming them to run on a schedule that balances running hours across the group.
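
As a rough illustration of what that SNMP monitoring can look like, the sketch below shells out to net-snmp’s snmpget to poll a unit. The hostname and community string are hypothetical, and the OID shown is the standard sysDescr used as a stand-in; the actual temperature and status OIDs come from the vendor’s MIB for your specific hardware.

```python
# Minimal SNMP polling sketch using net-snmp's snmpget (assumes net-snmp
# is installed). Hostname and community string are placeholders, and
# sysDescr.0 is a stand-in OID: swap in the temperature/status OIDs from
# the vendor MIB for your cooling units.
import subprocess

HOST = "inrow-crac-01.example.com"   # hypothetical cooling unit hostname
COMMUNITY = "public"                 # read-only community string
OID = "1.3.6.1.2.1.1.1.0"            # SNMPv2-MIB::sysDescr.0 (placeholder)

def snmp_get(host: str, oid: str) -> str:
    """Return the value at `oid` on `host` as a plain string."""
    result = subprocess.run(
        ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", host, oid],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    print(f"{HOST}: {snmp_get(HOST, OID)}")
```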

Cooling in the IDFs is handled by the building’s chiller, with an independent thermostat-controlled exhaust fan as backup. As each IDF basically hosts just one chassis switch, cooling needs are easily handled in this manner. As users are issued laptops that can ride out most outages, we were able to sidestep having to provide UPS power to work areas.

Next time:  Keeping the hot hot and the cool cool.

Working In the Not-Yet-Hot Aisle

“This is no longer a vacation.  It’s a quest.  It’s a quest for fun.” -Clark W. Griswold

The 48U enclosures and the in-row CRACs are in place and bolted together, but there’s no noise except the shrill shriek of a chop saw in the next room.  Drywall dust coats every surface despite the daily visits of the kind folks with mops and wet rags.  The lights overhead are working, but the three-cabinet UPS and zero-U PDUs are all lifeless and dark.

Even in this state, the cabinets are being virtually filled. In a recently stood-up NetBox implementation back at HQ, top-of-rack switches are, contrary to their name, being placed in the middle of the enclosures. Servers are being virtually installed, while in the physical world, blanking panels are being snapped into place and patch panels are installed by the cabling vendor, leaving gaping holes where there will soon be humming metal boxes with blinkenlights on display. Some gaps are bridged with blue painters’ tape, labeled in permanent marker with the eventual resting place of ISP-provided equipment.
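
For anyone curious what “virtually filling” a cabinet looks like in practice, here is a minimal sketch using the pynetbox client against NetBox’s REST API. The URL, token, names and IDs are all placeholders, and writable field names (for example, role versus device_role) vary by NetBox version, so treat this as an outline rather than a drop-in script.

```python
# Sketch: record a rack-mounted device in NetBox via its REST API using
# pynetbox. URL, token, names and numeric IDs below are placeholders;
# field names vary by NetBox version.
import pynetbox

nb = pynetbox.api("https://netbox.example.com", token="0123456789abcdef")

# Look up the target rack by name (hypothetical rack name).
rack = nb.dcim.racks.get(name="ROW1-CAB03")

# "Virtually install" a top-of-rack switch mid-rack, ports facing the
# hot aisle (rear of the cabinet).
device = nb.dcim.devices.create(
    name="tor-sw-03",
    device_type=12,   # placeholder device-type ID
    role=3,           # placeholder role ID ("device_role" on older NetBox)
    site=1,           # placeholder site ID
    rack=rack.id,
    position=24,      # mid-rack, despite the "top of rack" name
    face="rear",
)
print(f"Created {device.name} in {rack.name} at U{device.position}")
```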

During the first couple of weeks after everything is bolted down, we’re pretty much limited to planning and measuring because the room is packed with contractors running Cat6, terminating fiber runs, plumbing the CRACs, putting batteries in the UPS, connecting the generator, wiring up the fire system–it’s barely controlled chaos. Within a couple of weeks, the pace slows a bit; it’s still a hardhat-required zone, but the fiber runs to the IDFs are done and being tested, patch panels are being terminated and the fire system has long since been inspected and signed off by the city. A couple more weeks and we know it’s time to get serious when the sticky mat gets stuck down inside the door, the hardhat rules are rescinded and the first CRACs are fired up.

Thus begins the saga of a small band of intrepid SysAdmins working to turn wrinkled printouts, foam weatherstripping, hundreds of cage nuts, blue painter’s tape and a couple hundred feet of Velcro into a working data center. This marks the first time I’ve worked in a hot-aisle/cold-aisle data center, much less put one together. It’s something I’ve wanted to do for years, but there’s remarkably little detailed information on the web about the process; the nitty-gritty of data center design and construction is usually delegated to consultants who like to keep their trade a closely guarded secret, and indeed, we consulted with a company on the initial design and construction of our little box of heaven.

The concept of hot-aisle/cold-aisle containment is pretty straightforward and detailed in hundreds of white papers on the Internet: server and network equipment use fans to pull cool air in one side of the unit and blow heat out the other. If you can turn your data center into two compartments, one that directs all of the cooled air from your A/C into the cold intake side of the equipment, and one that directs all of the heated air from your equipment back into the A/C return, you increase the efficiency and reduce the cost of running your A/C, and you keep hot exhaust air from returning to the equipment intakes. If done right, you can also raise the temperature in the cold aisle, further reducing your costs, since there are no “hot spots” where equipment picks up hotter exhausted air. The methods for achieving this vary greatly.
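
One rough way to see why containment pays off is the standard sensible-heat rule of thumb for air, Q ≈ 1.08 × CFM × ΔT°F: for a fixed airflow, the heat a cooling unit can remove scales with the difference between return-air and supply-air temperature, and containment keeps exhaust air from mixing back in and shrinking that difference. The airflow and temperature numbers below are purely illustrative assumptions, not measurements from our room:

```python
# Illustrative only: sensible-heat rule of thumb for air at standard density,
# Q [BTU/hr] ~= 1.08 * CFM * delta_T [degF].
def sensible_heat_btu_hr(cfm: float, delta_t_f: float) -> float:
    return 1.08 * cfm * delta_t_f

airflow_cfm = 5000  # hypothetical airflow through one cooling unit

# Uncontained room: exhaust mixes back in, so return air might be only
# ~10 degF warmer than supply. Contained hot aisle: perhaps ~25 degF warmer.
for label, delta_t in [("uncontained (mixed) return", 10),
                       ("contained hot-aisle return", 25)]:
    q = sensible_heat_btu_hr(airflow_cfm, delta_t)
    print(f"{label}: ~{q:,.0f} BTU/hr removed at {airflow_cfm} CFM")
```

With the same fans moving the same air, the contained case in this sketch removes two and a half times the heat, which is why the cold-aisle setpoint can come up and the A/C can work less.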

More importantly, it turns out that there are some caveats that can either increase the initial cash outlay significantly or reduce overall efficiency.

Stay tuned as I dig into the details of this new project.