
Your physical environment

Computing environments evolve, but the overall trend over the years has been that smaller boxes, such as the ones used in this class, are becoming more common. And even those are now being partitioned into smaller logical computers, such as with VirtualBox, Xen and VMware (reversing the trend seen in the mid to late 90s, when large computers such as the Sun Enterprise 10000 class were being divided into 8 or 16 ``domains''.)

Instead of large boxes sitting on their own on a computer floor, smaller computers are now generally ``rack-mounted'', and follow a convention of specifying their height in ``U'' notation: one U is 1.75 inches (44.45 mm) of vertical rack space, and a typical full-height rack holds 42U of equipment.
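
As a quick illustration of the arithmetic, here is a minimal sketch in Python (the 42U rack and 2U server sizes are just example values):

    # Rack-unit arithmetic: 1U is 1.75 inches (44.45 mm) of vertical rack space.
    U_INCHES = 1.75

    def rack_height_inches(units):
        """Vertical space, in inches, occupied by a given number of rack units."""
        return units * U_INCHES

    def servers_per_rack(rack_units, server_units):
        """How many servers of a given height fit, ignoring switches, PDUs, etc."""
        return rack_units // server_units

    # Example: 2U servers in a standard 42U rack.
    print(rack_height_inches(42))   # 73.5 inches of usable vertical space
    print(servers_per_rack(42, 2))  # 21 servers, before leaving room for anything else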

Fortunately, the physical challenges shrink with smaller boxes: they are less bulky, they weigh less, and they (generally) consume less power and need less cooling than the behemoths of old. Some challenges do grow: more power sockets are necessary to handle more machines in a single rack. Some boxes support multiple power cords in an effort to improve redundancy.

Maintenance? Should I buy a hardware maintenance contract?

Safe board-handling

Monitors: going to LCDs

Fortunately, we have almost completely moved away from old CRT technology, into the widespread use of LCD technology.

These are easier to move, and in the event of an earthquake (not likely here in Tallahassee!), are much less dangerous if they fall on or near you.

But for those occasional times when you might still work with the old glass CRTs, keep in mind that they are relatively fragile and are not safe to open up. If they don't work, there are no parts inside a monitor that are serviceable at the system administrator level.

Memory

Memory these days is a commodity item. Prices are very attractive (8 gigabytes of DDR3 memory from Newegg runs around $50 to $80 as of this morning for commodity memory), and generally you should avoid buying memory from the original vendor, even for a non-commodity machine, if you can.

Memory is very static sensitive, and you should strongly consider using a grounding strap whenever you work with it.

Removing memory is generally more challenging than inserting it. Memory usually snaps in easily (though certainly not always), but removal on some older machines was not as easy --- the worst in my memory was a box from Sun where I succeeded in breaking more than one (expensive) memory board on a particularly bad day.

Preventive maintenance (also known as ``p.m.'')

In the past, preventive maintenance was a big job duty of system administrators. Cleaning printers, cleaning tape units, vacuuming or blowing out dust: there was certainly a plebeian side to being the system administrator. Fortunately, there is less call for such mundania these days, except for dust problems.

Some preventive maintenance is done to combat human failings. Covering over vents is still all too common, and even rack-mounted systems are not immune to airflow problems. The worst that I have seen was when cardboard was casually inserted into a rack system that had closed doors. Until I happened to open the doors on the rack for that machine (which was showing a significant over-temperature reading on its internal sensors), I had no idea why it was getting too warm.

Among the parts most wont to fail are fans and power supplies. Fortunately, these days it is becoming common for servers to have N+1 separate power supplies, where any one of them can fail and the machine will continue to run. Both failing fans and failing power supplies can now often be detected by the computer hardware itself (though in the case of fans, it may be indirectly through temperature monitoring on some equipment.)

Environment

A 2009 visit to a Data Center in Orlando.

As does LAH, I recommend keeping your machine room around 66 to 68 degrees Fahrenheit (19 or so Celsius) with about 30%-50% humidity. I have seen a large machine room kept around 78 degrees, which I thought was too warm.

However, there are now studies that show that higher temperatures are feasible. The overall air temperature in a server room can be misleading as an indicator of what temperature your equipment is actually experiencing, and it is a very good idea to monitor your equipment's built-in sensors, which are excellent these days (even if the interpretations can still be tricky.)
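
For servers with a baseboard management controller, one way to keep an eye on those built-in sensors is simply to poll them on a schedule. Below is a minimal sketch in Python that wraps ipmitool (this assumes ipmitool is installed and can reach the BMC; the 30-degree warning threshold is just an example, and sensor names and output details vary by vendor):

    import subprocess

    # "ipmitool sensor" prints one reading per line with fields separated by "|":
    # name | value | unit | status | thresholds...
    WARN_TEMP_C = 30.0   # example warning threshold, not a vendor recommendation

    def sensor_readings():
        out = subprocess.run(["ipmitool", "sensor"],
                             capture_output=True, text=True, check=True)
        for line in out.stdout.splitlines():
            fields = [f.strip() for f in line.split("|")]
            if len(fields) >= 3:
                try:
                    yield fields[0], float(fields[1]), fields[2]
                except ValueError:
                    pass   # skip discrete or unreadable sensors ("na", etc.)

    for name, value, unit in sensor_readings():
        if "degrees C" in unit and value > WARN_TEMP_C:
            print("WARNING: %s is at %.1f %s" % (name, value, unit))

Run from cron or fed into your monitoring system, something this simple can catch a blocked vent or a dying fan well before the machine starts throttling or shutting itself down.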

However, cooling equipment can be a bit worrisome itself: if you have raised flooring, you may come in, as I did one day, to find that the cooling equipment has vented water down into the raised flooring, creating a lake (or at least largish puddles, as occurred in the machine room we had on the first floor of MCH.)

Having a separate temperature monitoring setup is a good idea. LAH mentions a standalone product (Phonetics Sensaphone); another possibility is other equipment such as PDUs, which often have their own environmental monitoring capabilities. (A PDU is a power distribution unit; these are very common in large server rooms.)
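
Most PDUs and standalone environmental probes expose their readings over SNMP. Here is a minimal sketch of polling one from Python by shelling out to net-snmp's snmpget; the hostname, community string, and OID below are placeholders, since the actual OIDs are vendor-specific and come from the device's MIB:

    import subprocess

    # Placeholder values -- substitute your PDU's address, SNMP community string,
    # and the vendor-specific OID for its temperature probe (taken from its MIB).
    PDU_HOST = "pdu-1.example.com"
    COMMUNITY = "public"
    TEMP_OID = "1.3.6.1.4.1.99999.1.1"   # hypothetical OID, not a real vendor's

    def snmp_get(host, community, oid):
        """Fetch a single value with snmpget; -Oqv prints just the value."""
        out = subprocess.run(["snmpget", "-v2c", "-c", community, "-Oqv", host, oid],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()

    print("PDU temperature probe reads:", snmp_get(PDU_HOST, COMMUNITY, TEMP_OID))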

Power

In providing power in a server room, many places, including our computer science server room, provide UPS power. Generally, the UPSs used in computer rooms provide power conditioning in addition to emergency power in the event of a power failure. Using such a UPS gives your computers a safe source of power even in the event of brownouts or power surges, and will likely also filter line noise (especially important in some industrial settings.)
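
If your UPSs are managed with something like Network UPS Tools (NUT), your monitoring scripts can also ask them how they are doing. A minimal sketch using NUT's upsc command follows; the UPS name ``myups@localhost'' is a placeholder, and while ups.status and battery.charge are standard NUT variable names, not every UPS reports all of them:

    import subprocess

    # Query a NUT-managed UPS; upsc prints "variable: value" lines.
    def ups_status(ups="myups@localhost"):   # placeholder UPS name and host
        out = subprocess.run(["upsc", ups],
                             capture_output=True, text=True, check=True)
        status = {}
        for line in out.stdout.splitlines():
            key, _, value = line.partition(":")
            status[key.strip()] = value.strip()
        return status

    info = ups_status()
    # In NUT's ups.status field, "OL" means on line power and "OB" means on battery.
    if "OB" in info.get("ups.status", ""):
        print("UPS is on battery; charge is at",
              info.get("battery.charge", "?"), "percent")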

Also in providing power, if you have any input into the design of a new server room, I recommend that you look at putting your rack power provisioning over the racks rather than under them (such as under raised flooring). As mentioned previously, raised flooring provides a great place for a new man-made lake, and that is the last thing that you want power cables to be sitting in. Try to get both sides of the racks powered (and if you have the luxury, try to get the power from two different sources, such as two different UPS units or at least two separate PDUs.) When you have large power requirements coming in, using PDUs to monitor and distribute the power can be advantageous. I have seen a 400-500 server environment that was very successful with two huge UPSs feeding many PDUs that then distributed and monitored the power environment.

Also, check with your electrical experts as to any phase issues that they might foresee. Modern PDUs are very good at displaying phase information.