The proper care and feeding of server systems involves a variety of tasks that must be performed on a daily, weekly, monthly,
or other routine basis to ensure that your servers operate at optimal levels. Here is a listing of tasks that should be included in your server
Review Event Logs
Using the system’s Event Viewer, check the System, Application, and Security logs for warning and error messages related to service
startup, application failures, and unauthorized access attempts. If you manage more than one server, it is best to set up a centralized
infrastructure to collect all of this information. Microsoft has a product called System Center Operations Manager that does an excellent
job in this area.
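
As a rough illustration of the review step, the Python sketch below filters an exported log for warnings and errors and counts them by source. It assumes the log has been saved from Event Viewer as a CSV file with `Level`, `Source`, and `Event ID` columns; the column names, the function names, and the sample data are all assumptions made for this example, so adjust them to match your own export.

```python
import csv
import io
from collections import Counter

# Severity levels worth a closer look (assumed values of the "Level" column).
FLAGGED_LEVELS = {"Warning", "Error", "Critical"}


def flagged_events(csv_file):
    """Return the rows whose Level is a warning or worse."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if row.get("Level") in FLAGGED_LEVELS]


def summarize(rows):
    """Count flagged events per (Source, Level) pair."""
    return Counter((row.get("Source"), row.get("Level")) for row in rows)


# Hypothetical sample data standing in for an exported log file.
SAMPLE = """Level,Source,Event ID
Information,Service Control Manager,7036
Error,Service Control Manager,7000
Warning,Disk,51
Error,Service Control Manager,7000
"""

if __name__ == "__main__":
    rows = flagged_events(io.StringIO(SAMPLE))
    for (source, level), count in summarize(rows).most_common():
        print(f"{level:8} {source}: {count}")
```

To run it against a real export, pass an open file in place of the sample, for example `flagged_events(open("system_log.csv", newline=""))`, where the file name is a placeholder.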
Depending on your business requirements, logs stored on the server should be archived for future reference. Logs related to security should be
archived according to the organization’s security policy. Some logs must be retained by law.
Backups should be performed regularly for each individual server, preferably on a daily schedule. It is not necessary
to back up the contents of the entire hard drive(s), but rather the critical data. This is because in a recovery scenario, you can simply
re-install the operating system and applications prior to restoring the data from your backup set. A “bare-metal” restore, which uses a backup
set that contains the entire contents of the hard drive, is much more complex to perform. Windows has a backup solution built in, but there are
also third-party applications. Symantec offers very good backup solutions. To ensure that your backups are actually valid, routine test restores
should be performed. The moment you are required to restore lost data is not a good time to discover that your backups have not been running correctly.
In addition, you should have a backup strategy in place that includes backups on and off site to ensure that events such as natural disasters
don't wipe out your only recovery option.
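
One way to make routine test restores measurable is to compare checksums of the restored files against the originals. The Python sketch below is a minimal example of that idea, not a feature of any particular backup product; the function names and directory layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Compare every file under source_dir with its restored copy.

    Returns the relative paths that are missing from restored_dir
    or whose contents differ.
    """
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source_dir)
        restored = restored_dir / relative
        if not restored.is_file() or file_digest(src) != file_digest(restored):
            problems.append(str(relative))
    return problems
```

An empty list means every file was restored intact. Keep in mind that files modified after the backup was taken will also be reported as different, so this check works best against data that has not changed since the backup ran.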
Monitor System Performance
It is important to monitor your system’s performance. When you first set up your server(s), it is a very good
idea to capture a performance baseline. You would then set up your monitoring so that you can collect daily, weekly, and monthly statistics to
help you determine how well your server is performing. Comparing those collected statistics to the server baseline provides
a method of predicting when the server may cross certain thresholds that will impact performance. You can use the performance monitoring
solution built into Windows or a third-party application. If you have more than one server to monitor, you should consider an enterprise product.
Again, you can use Microsoft’s solution, System Center Operations Manager. You should be tracking CPU performance, Disk Utilization, and Memory
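
To illustrate how collected statistics can be used to predict a threshold crossing, the Python sketch below fits a straight line to daily readings and estimates how many days remain before the metric reaches a limit. It assumes one reading per day, a roughly linear trend, and an upper threshold; the function name and the sample figures are hypothetical.

```python
def days_until_threshold(samples, threshold):
    """Estimate the days until a rising metric reaches threshold.

    samples: daily readings, oldest first (e.g. disk utilization in percent).
    Fits a least-squares line through the readings and returns the number of
    days after the last reading at which the line reaches threshold.
    Returns None if there are too few readings or the trend is not rising.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope_den = sum((x - mean_x) ** 2 for x in range(n))
    slope = slope_num / slope_den
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    # Day n - 1 is the most recent reading; never report a negative wait.
    return max(0.0, crossing_day - (n - 1))


if __name__ == "__main__":
    # Hypothetical readings: disk utilization growing 2 points per day.
    readings = [50, 52, 54, 56, 58]
    print(days_until_threshold(readings, threshold=90))
```

Real workloads are rarely this regular, so treat the result as an early warning to investigate rather than a precise forecast.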
Software and Hardware Monitoring
When it comes to servers, you want to be prepared for equipment failure. Failed fans, hard drives, and other
components can bring the server down. You can mitigate some of these issues by building redundancy into the server,
such as multiple fans, multiple CPUs, and RAID arrays. However, even with this added redundancy, monitoring the hardware is very important. Some
server hardware vendors offer products that you can use free of charge to monitor your server’s hardware. Generally, these products use SNMP. Some of
these products have a “phone-home” capability that alerts the vendor to failed hardware; if the server is under warranty, the
replacement hardware may be shipped to you automatically. Aside from the hardware monitoring, you also should be monitoring the health of the
server, including network availability, services, and scheduled tasks.
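
As a small example of a network availability check, the Python sketch below tests whether a TCP port on a server accepts connections. It is only a basic reachability test and does not replace vendor hardware monitoring or SNMP; the function name is an assumption made for this example.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A scheduled job could call, for instance, `port_is_open("fileserver01.example.com", 445)` (a placeholder host name) and raise an alert when the result is `False`.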
It is important to “defrag” your server’s hard drives regularly, especially on file servers. As data is stored on and deleted from the file
system, you’ll find that new data begins to be placed on the hard disk in a fragmented form; that is, parts of a file are scattered across the
disk. This is not a problem for the file system, as those fragments are tracked. However, fragmentation degrades hard drive performance. If a file
is fragmented into 100 pieces, it takes extra time for the drive to read all of those pieces and put the file back together.
When running antivirus and anti-malware software on your systems, you need to ensure that the
agents are running and that their engine and pattern definition files are up to date. The easiest way to manage this type of environment is to use a
central management server to which the agents report, storing their status in a central data store. Aside from real-time protection, you should
schedule routine jobs that scan the hard drives for malware. Keep in mind that some files may need to be excluded from the scans.
Service Packs and Hotfixes
It is very important to remain abreast of your vendor’s schedule for the public distribution of patches,
hotfixes, and service packs. For many years now, Microsoft has standardized this on a monthly basis. Other vendors such as Adobe have
implemented a quarterly schedule. When patches become available, it is important that you fully test them by applying them to a “control”
group prior to general distribution to all of your servers. While most patches do not create issues on the servers, some have in the past. You
should always attempt to stay current on patches, especially those that have been categorized as critical, or security
If you are running applications on your servers that require dedicated service accounts, always try to use domain
accounts rather than local accounts. This will give you the ability to centrally manage these accounts. Since the majority of service account
passwords are not changed on a regular basis, it is important to always assign a strong, complex password to these accounts.
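
For the strong, complex password mentioned above, a random generator is more reliable than inventing one by hand. The Python sketch below uses the standard `secrets` module; the function name, the default length, and the rule requiring all four character classes are assumptions for this example, so check them against your own password policy.

```python
import secrets
import string


def generate_service_password(length=32):
    """Generate a random password with upper, lower, digit, and symbol characters."""
    if length < 4:
        raise ValueError("length must be at least 4")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every character class is represented at least once.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate


if __name__ == "__main__":
    print(generate_service_password())
```

Some applications reject certain punctuation characters in service account passwords, in which case the alphabet would need to be narrowed.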
Asset management is an important part of server management. Servers should be tracked from the moment of procurement to
the end of their life cycle. There are many applications that do a very good job in this arena. At the very minimum, if you are unable to track
this information using software, a manual spreadsheet should be maintained that includes the location of the server, hardware
serial numbers, warranty information, and owner and system admin contact information.
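
If you go the spreadsheet route, the Python sketch below shows one way to keep the inventory as a CSV file that any spreadsheet program can open. The columns follow the fields listed above; the `hostname` column, the class and function names, and the sample record are assumptions added for illustration.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class ServerAsset:
    hostname: str          # assumed identifier, not listed in the text above
    location: str
    serial_number: str
    warranty_expires: str  # ISO date string, e.g. "2026-03-31"
    owner: str
    admin_contact: str


def write_inventory(assets, out_file):
    """Write the assets to out_file as CSV with a header row."""
    writer = csv.DictWriter(out_file, fieldnames=[f.name for f in fields(ServerAsset)])
    writer.writeheader()
    for asset in assets:
        writer.writerow(asdict(asset))


if __name__ == "__main__":
    # Hypothetical record used only to show the output format.
    sample = ServerAsset("fs01", "Rack 4, Room B", "SN-0001", "2026-03-31",
                         "Finance", "admin@example.com")
    buffer = io.StringIO()
    write_inventory([sample], buffer)
    print(buffer.getvalue())
```

To save to disk instead, pass a file opened with `open("servers.csv", "w", newline="")`, where the file name is a placeholder.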
This is just a general overview of some of the most common tasks required to keep a server operating at optimal performance. When you add
applications and other services into the picture, there are many other tasks that need to be considered.