Best Practices for Server Management

Wednesday, October 19, 2011

The proper care and feeding of server systems requires a variety of tasks that must be performed on a daily, weekly, monthly, or other routine basis to ensure that your servers operate at optimal levels. Here is a list of tasks that should be included in your server maintenance plan.

Review Event Logs

Using the system’s Event Viewer, check the System, Application, and Security logs for warning and error messages related to service startup, application failures, and unauthorized access attempts. If you manage more than one server, it is best to set up a centralized infrastructure to collect all of this information. Microsoft has a product called System Center Operations Manager that does an excellent job in this area.
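As a rough illustration of the review step, the sketch below filters a set of event records down to warnings and errors and counts them per log. The record layout and the sample entries are hypothetical stand-ins for whatever you export from Event Viewer or your collection tool; they are not an actual Event Viewer API.

```python
from collections import Counter

# Hypothetical records, shaped like rows exported from the event logs.
SAMPLE_EVENTS = [
    {"log": "System", "level": "Error", "message": "A service failed to start."},
    {"log": "Application", "level": "Information", "message": "Product installed."},
    {"log": "Security", "level": "Warning", "message": "An account failed to log on."},
    {"log": "System", "level": "Error", "message": "Bad block detected."},
]

# Levels that deserve a closer look during the routine review.
ATTENTION_LEVELS = {"Warning", "Error", "Critical"}


def events_needing_review(events):
    """Return only the events whose level is a warning or worse."""
    return [e for e in events if e["level"] in ATTENTION_LEVELS]


def summarize_by_log(events):
    """Count flagged events per log so the noisiest log stands out."""
    return Counter(e["log"] for e in events_needing_review(events))


if __name__ == "__main__":
    for log_name, count in summarize_by_log(SAMPLE_EVENTS).most_common():
        print(f"{log_name}: {count} event(s) to review")
```

With many servers, the same filter would run against the centrally collected records rather than one machine at a time.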

Archiving Logs

Depending on the business drivers, logs stored on the server should be archived for future reference. Logs related to security should be archived according to the organization’s security policy. Some logs need to be retained according to law.

Disaster Recovery

Backups should be performed regularly for each individual server, preferably on a daily schedule. It is not necessary to back up the entire contents of the hard drive(s), only the critical data. In a recovery scenario, you can simply re-install the operating system and applications before restoring the data from your backup set. A “bare-metal” restore, which relies on a backup set containing the entire contents of the hard drive, is much more complex to perform. Windows has a built-in backup solution, and there are third-party applications as well. Symantec offers very good backup solutions. To ensure that your backups are actually valid, perform test restores routinely. The moment you need to recover lost data is the wrong time to discover that your backups have not been running correctly. In addition, your backup strategy should include copies both on site and off site so that events such as natural disasters don't wipe out your only recovery option.
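One way to make a test restore meaningful is to compare checksums of the restored files against a known-good copy. The following is a minimal sketch under that assumption; the function names are made up for illustration, and it presumes the source directory has not changed since the backup was taken.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every file in the source was found in the restored copy with identical content; anything else is a reason to look at the backup job before you actually need it.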

Monitor System Performance

It is important to monitor your system’s performance. When you first set up your server(s), it is a very good idea to capture a performance baseline. You would then set up your monitoring to collect daily, weekly, and monthly statistics that help you determine how well your server is performing. Comparing those statistics to the baseline provides a method of predicting when the server may cross thresholds that will impact performance. You can use Windows’ built-in performance monitoring solution or a third-party application. If you have more than one server to monitor, you should consider an enterprise product. Again, you can use Microsoft’s solution, System Center Operations Manager. At a minimum, you should be tracking CPU performance, disk utilization, and memory consumption.
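To show how collected statistics can be turned into a prediction, here is a small sketch that fits a straight line to daily readings and estimates when a threshold would be crossed. The straight-line fit is an assumption chosen for simplicity, not the method any particular monitoring product uses, and the function name is made up for this example.

```python
def days_until_threshold(samples, threshold):
    """Estimate the days remaining until a metric reaches the threshold.

    samples: one reading per day, oldest first (for example disk use in %).
    Returns 0.0 if the latest reading is already at or over the threshold,
    and None if there are too few readings or the trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    if samples[-1] >= threshold:
        return 0.0

    # Ordinary least-squares line through (day index, reading).
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    # Express the answer relative to the most recent reading.
    return max(0.0, crossing_day - (n - 1))
```

For example, readings of 60, 62, 64, 66, and 68 percent grow by two points a day, so a 90 percent threshold is estimated to be 11 days past the last reading.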

Software and Hardware Monitoring

When it comes to servers, you want to be prepared for equipment failure. A failed fan, hard drive, or other component can bring the server down. You can mitigate some of these issues by building redundancy into the server, such as multiple fans, multiple CPUs, and RAID arrays. However, even with this added redundancy, monitoring the hardware is very important. Some server hardware vendors offer free products that you can use to monitor your server’s hardware. Generally, these products use SNMP. Some of them have a “phone-home” capability that alerts the vendor to failed hardware, and if the server is under warranty, you can expect replacement hardware to be shipped to you automatically. Aside from the hardware, you should also monitor the health of the server itself, including network availability, services, and scheduled tasks.
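For the network availability part, a very basic check is to see whether a service's TCP port accepts connections. The sketch below does only that, using the standard library; the target names and ports you pass in are your own, and a successful connection says nothing about whether the application behind the port is healthy.

```python
import socket


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(targets):
    """Map each (name, host, port) target to 'up' or 'down'."""
    return {
        name: ("up" if port_is_reachable(host, port) else "down")
        for name, host, port in targets
    }
```

A scheduled job could call `check_services` with a list of targets and raise an alert for anything reported as down.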

Disk Defragmentation

It is important to “defrag” your server’s hard drives regularly, especially on file servers. As data is stored on and deleted from the file system, new data begins to be written to the hard disk in fragmented form; that is, parts of a file are scattered across the disk. This is not a problem for the file system, which keeps track of those fragments. However, fragmentation degrades hard drive performance. If a file is fragmented into 100 pieces, it takes extra time to read those pieces and put the file back together.

Antivirus/Malware Protection

When running antivirus and anti-malware software on your systems, you need to ensure that the agents are running and that their engines and pattern definition files are up to date. The easiest way to manage this type of environment is to use a central management server, with the agents reporting back to a central data store. Aside from real-time protection, you should schedule routine jobs that scan the hard drives for malware. Keep in mind that some files may need to be excluded from the scans.

Service Packs and Hotfixes

It is very important to stay abreast of your vendors’ schedules for the public release of patches, hotfixes, and service packs. For many years now, Microsoft has released its patches on a standard monthly schedule. Other vendors, such as Adobe, have implemented a quarterly schedule. When patches become available, it is important to test them fully by applying them to a “control” group of servers before distributing them to all of your servers. While most patches do not create issues, some have in the past. You should always try to stay current on missing patches, especially those categorized as critical or security related.

Service Accounts

If you are running applications on your servers that require dedicated service accounts, always try to use domain accounts rather than local accounts. This gives you the ability to manage these accounts centrally. Since most service account passwords are not changed on a regular basis, it is important to always assign a strong, complex password to these accounts.
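If you need a way to produce such a password, the sketch below generates one with Python's `secrets` module. The default length of 24 and the particular symbol set are assumptions for illustration; adjust both to whatever your password policy and the application allow.

```python
import secrets
import string

# Assumed symbol set; some applications reject certain characters.
SYMBOLS = "!@#$%^&*()-_=+"
CHARACTER_CLASSES = [
    string.ascii_lowercase,
    string.ascii_uppercase,
    string.digits,
    SYMBOLS,
]


def generate_service_password(length=24):
    """Build a random password containing at least one character per class."""
    if length < len(CHARACTER_CLASSES):
        raise ValueError("length is too short to include every character class")

    alphabet = "".join(CHARACTER_CLASSES)
    # Guarantee one character from each class, then fill the remainder.
    chars = [secrets.choice(cls) for cls in CHARACTER_CLASSES]
    chars += [secrets.choice(alphabet) for _ in range(length - len(chars))]
    # Shuffle so the guaranteed characters are not always at the front.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


if __name__ == "__main__":
    print(generate_service_password())
```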

Asset Management

Asset management is an important part of server management. Servers should be tracked from the moment of procurement to the end of their life cycle. There are many applications that do a very good job in this arena. At the very minimum, if you are unable to track this information using software, maintain a manual spreadsheet that records each server’s location, hardware serial numbers, warranty information, owner, and system administrator contact information.
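If you go the spreadsheet route, a small script can at least keep the columns consistent. The sketch below writes the fields mentioned above to CSV, which any spreadsheet program can open. The `hostname` column and the single warranty expiry date are assumptions added for the example, as are the record and function names.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class ServerAsset:
    hostname: str            # assumed identifier, not listed in the article
    location: str
    serial_number: str
    warranty_expires: date   # assumed way to record warranty information
    owner: str
    admin_contact: str


def to_csv(assets):
    """Render the assets as CSV text with one header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ServerAsset)])
    writer.writeheader()
    for asset in assets:
        writer.writerow(asdict(asset))
    return buffer.getvalue()


def warranty_expired(asset, today):
    """Return True if the warranty ended before the given day."""
    return asset.warranty_expires < today
```

Writing the result of `to_csv` to a file gives you a starting inventory, and `warranty_expired` can flag servers that are due for a renewal decision.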

This is just a general overview of some of the most common tasks required to keep a server operating at optimal performance. When you add applications and other services into the picture, there are many other tasks that need to be considered.
