Intermittent Web Site Trouble - May Recur
2008-02-07 01:05 AM
The web site's server is experiencing periodic, excessively high loads. There may continue to be temporary outages on this web site until the problem is resolved.
Our web site is hosted on shared hardware with a number of other web sites. It's a very capable machine (Quad-Xeon processors at 2400 MHz with 4 GB of RAM), but some of the programs that support other web sites are excessively loading the machine's processors at certain times. It seems to happen in the late morning to early afternoon, Central Time. Our hosting company is working on the problem now. If this web site were critical to the mission of the ELS, we could migrate it to different server hardware, and we may do that yet. However, the problem might be fixed without that. (I, for one, hope so, now that Lent is upon us with services three days a week!)
I apologize for the inconvenience if you have tried to access the web site when it was not available. You would have received one of several errors from the main (Apache) web server program, complaining that another server program is not cooperating with it. The problem is that the other server - which runs most of this web site - is being killed (forced to quit) while it's working hard to do something useful, leaving the main (Apache) web server hanging with nothing to send your web browser. This happens automatically when the machine's processor usage gets too high. After it's killed, our web server program tries to start up again. Unfortunately, it creates a good bit of processor load when it does, which means it soon meets another untimely end. Now, imagine a number of different programs (maybe 25 or so) being killed and restarting at about the same time. That drives the machine's load even higher, worsening the problem and extending the downtime.
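For the technically curious, the kill-and-restart cycle described above can be illustrated with a toy simulation. This is only a rough sketch under invented assumptions, not a model of the actual server: the function name `simulate`, the load figures, the kill threshold, and the rule that every program is killed at once are all hypothetical, chosen just to show the feedback loop.

```python
def simulate(num_programs=25, ticks=20, normal_load=1.0, restart_load=3.0,
             restart_ticks=2, kill_threshold=40.0, spike_load=20.0,
             spike_ticks=1):
    """Toy model of the restart cascade (all numbers are made up).

    Returns one (total_load, programs_serving) tuple per tick.
    """
    # remaining[i] = ticks of startup work left for program i; 0 = serving
    remaining = [0] * num_programs
    history = []
    for tick in range(ticks):
        # A short burst of outside load starts the trouble.
        external = spike_load if tick < spike_ticks else 0.0
        # A starting program costs more processor time than a serving one.
        load = external + sum(
            restart_load if r > 0 else normal_load for r in remaining
        )
        serving = sum(1 for r in remaining if r == 0)
        if load > kill_threshold:
            # Load too high: everything is killed and must start over.
            remaining = [restart_ticks] * num_programs
            serving = 0
        else:
            # Load acceptable: programs that are starting make progress.
            remaining = [max(0, r - 1) for r in remaining]
        history.append((load, serving))
    return history


if __name__ == "__main__":
    for label, cost in (("expensive restarts", 3.0), ("cheap restarts", 1.5)):
        load, serving = simulate(restart_load=cost)[-1]
        print(f"{label}: final load {load}, programs serving {serving}")
```

With these invented numbers, one brief spike is enough to keep all 25 programs down for the whole run when restarts are expensive, because the combined restart load alone exceeds the threshold. When a restart costs less processor time, the same spike is followed by a recovery a few ticks later.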
As I mentioned, our hosting company is working now to solve the problem. However, it could easily recur for a little while. I've taken steps to keep our web server from using so much processor time when it restarts, in the hope of keeping it from being killed. However, that also means it will run more slowly - probably too slowly at times, so that both the Apache server and you get tired of waiting. So I again apologize for the inconvenience.
Meanwhile, I'll be developing an upgrade path for our web site to new versions of the Zope and Plone software that will hopefully eliminate our memory leak problem and make it unnecessary for the server to be restarted at all.
