<?xml version="1.0" encoding="utf-8"?><?xml-stylesheet title="XSL formatting" type="text/xsl" href="http://www.gandibar.net/feed/rss2/xslt" ?><rss version="2.0"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
  <title>Gandi Bar - Explanation of the outage that lasted 4 hours - Comments</title>
  <link>http://www.gandibar.net/</link>
  <atom:link href="http://www.gandibar.net/feed/rss2/comments/1321" rel="self" type="application/rss+xml"/>
  <description>Gandi blog, to share our opinions</description>
  <language>en</language>
  <pubDate>Sat, 13 Mar 2010 16:36:17 +0100</pubDate>
  <copyright></copyright>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Dotclear</generator>
  
    
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - David</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170640</link>
    <guid isPermaLink="false">urn:md5:4a6746b0b32ee2544a1c12ef314125fd</guid>
    <pubDate>Thu, 16 Jul 2009 00:59:40 +0200</pubDate>
    <dc:creator>David</dc:creator>
    
    <description>&lt;p&gt;Was there an outage? Neither I nor my server seems to have noticed (last reboot was sometime in December).&lt;/p&gt;


&lt;p&gt;I do sympathise with you though, it's just one of those things that jump out at you out of the blue. I too had an electrician come around to do some maintenance and lean on a certain panic button, which stopped a multi-million dollar operation dead in its tracks. Admittedly the button should have been properly guarded. It was in a cramped space and the electrician in question was a professional wrestler, thus a bit of a tight fit. I guess we all learned from that one &lt;img src=&quot;/themes/default/smilies/smile.png&quot; alt=&quot;:)&quot; class=&quot;smiley&quot; /&gt;&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - thom</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170453</link>
    <guid isPermaLink="false">urn:md5:17b8fc1091b138aff39e6cf3b98c6e80</guid>
    <pubDate>Tue, 07 Jul 2009 23:55:07 +0200</pubDate>
    <dc:creator>thom</dc:creator>
    
    <description>&lt;p&gt;My server was down for a few hours, and came back online by itself. I had trouble in the past, and I assume there'll be trouble in the future, but as Gandi hasn't been a hosting company for long and the virtualization technology is still a bit young, I don't really blame them. I'll pay attention to what the future tells us and will look at other solutions for sure, but big outages like that do happen sometimes, at Gandi or at other places ...&lt;br /&gt;
I guess you learned how to handle this kind of outage, and I guess we learned (again?) to have a backup server at hand ...&lt;/p&gt;


&lt;p&gt;Good work, but please do better next time.&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - Wouter</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170305</link>
    <guid isPermaLink="false">urn:md5:d9adc9cc40b006e1b9bbfb1dc4f87108</guid>
    <pubDate>Mon, 06 Jul 2009 17:44:27 +0200</pubDate>
    <dc:creator>Wouter</dc:creator>
    
    <description>&lt;p&gt;My server came back online rather quickly, but the data disk wasn't mounted anymore. A reboot didn't help either, I had to mount it manually and add it to /etc/fstab. I run my mail and web daemons from there, and they didn't start at reboot. To those that have problems with services that fail to restart, check if your data drive is mounted, even when the web interface tells you the drive is attached to your server.&lt;/p&gt;


&lt;p&gt;Every hosting provider I have experience with, either professionally or personally, has had these sorts of problems to some extent, so it's not that big a deal in my opinion... But Gandi needs to learn from it, and I hope all the things that went wrong are properly analysed so that next time something big goes down there are fewer side effects.&lt;/p&gt;


&lt;p&gt;After all, there are a lot of bad electricians and plumbers on the market... &lt;img src=&quot;/themes/default/smilies/smile.png&quot; alt=&quot;:)&quot; class=&quot;smiley&quot; /&gt;&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - Nicolas (Gandi)</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170154</link>
    <guid isPermaLink="false">urn:md5:9af4e80b84889627dcda6bcfd323c00d</guid>
    <pubDate>Sat, 04 Jul 2009 12:52:16 +0200</pubDate>
    <dc:creator>Nicolas (Gandi)</dc:creator>
    
    <description>&lt;p&gt;Jordi: Have you contacted our customer care service? The servers are all up, but services on some servers are having difficulties. We have run an automatic fsck on Gandi AI servers, but some Expert servers are stuck on it. If you are in Expert mode, read this post: &lt;a href=&quot;http://www.gandibar.net/post/2009/06/02/What-to-do-if-your-server-stops-responding&quot; title=&quot;http://www.gandibar.net/post/2009/06/02/What-to-do-if-your-server-stops-responding&quot; rel=&quot;nofollow&quot;&gt;http://www.gandibar.net/post/2009/0...&lt;/a&gt;&lt;br /&gt;
If you are in AI mode, our customer care team is working today and will take care of you.&lt;/p&gt;


&lt;p&gt;Mc: Point 1 is for this week &lt;img src=&quot;/themes/default/smilies/smile.png&quot; alt=&quot;:)&quot; class=&quot;smiley&quot; /&gt; Point 2 is currently being tested. Clicking on &amp;quot;my server is locked&amp;quot; when you contact customer care duplicates your request to the emergency line.&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - mc</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170152</link>
    <guid isPermaLink="false">urn:md5:57eb4b66658c183821923b8ecda179aa</guid>
    <pubDate>Sat, 04 Jul 2009 08:45:59 +0200</pubDate>
    <dc:creator>mc</dc:creator>
    
    <description>&lt;p&gt;Stories of extraterrestrials intervening aside, the claims made by Gandi cannot be backed up. It's like promising 100% uptime, unless too many servers go down, or safe backup of 1 TB, only to find that the backup disk is 500 GB. Those promises are then a gamble that you'll never actually need a double set of servers and/or equivalent disk space. Of course, this is how much of the hosting industry, by competitive necessity, works (with its &amp;quot;unlimited&amp;quot; resources on offer), but those companies are usually very careful about fronting words like reliability and redundancy in the blurb.&lt;/p&gt;


&lt;p&gt;It would be more accurate to say that every system has a set degree of those factors, which is directly proportional to the &amp;quot;spares&amp;quot; at hand. Given that you have loyal and paying customers (I have been with you since the first week of the beta), I sincerely hope that all those free servers you cite as a contributing factor for the meltdown did not tip the odds for an event like this to happen the way it did. Now, that would actually upset me.&lt;/p&gt;


&lt;p&gt;I hope for two things:&lt;/p&gt;


&lt;p&gt;1. Monitoring of services. This has been on the wishlist since the start with zero progress. There is obviously a need.&lt;/p&gt;


&lt;p&gt;2. A red-alert channel for support. I have only contacted support two or three times. Replies have taken more than 24 hours. Clearly, that equals inadequate support when servers refuse to respond. A filter that channels those really urgent issues into much faster responses would be appreciated. Even plumbers have a hotline for water leaks.&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - Jordi</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170150</link>
    <guid isPermaLink="false">urn:md5:dc4edb427a1d542cf0f1c29e7035d745</guid>
    <pubDate>Sat, 04 Jul 2009 01:07:24 +0200</pubDate>
    <dc:creator>Jordi</dc:creator>
    
    <description>&lt;p&gt;My virtual server is STILL down - 2 days later.  This is insane.  Can't restart from the console either.  Rock on Amazon EC2&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - Nicolas (Gandi)</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170149</link>
    <guid isPermaLink="false">urn:md5:294e3210e9e7f1160c1be14f24af2594</guid>
    <pubDate>Fri, 03 Jul 2009 22:55:18 +0200</pubDate>
    <dc:creator>Nicolas (Gandi)</dc:creator>
    
    <description>&lt;p&gt;OK, the part of the presentation about UFOs is probably a little bit exaggerated. I consider the UPS expert who came and switched off, in perfect harmony, the power to our datacenter 1 to be an Alien.&lt;/p&gt;


&lt;p&gt;After this kind of event, we check the RAID status before restarting any server. Part of the affected customers were automatically and quickly transferred to room 2. The other part (still a lot, about 3000 servers) tried to start simultaneously in room 1. This took far too much time at the beginning, but we found where the problem was, and the last 80% was done in about 30 minutes.&lt;br /&gt;
Now we are ready to react faster to the next Alien assault.&lt;/p&gt;


&lt;p&gt;So yes, we had never tested this kind of crash with so many servers before, and we didn't anticipate the problem that blocked the full restart. In a way, I fully admit we failed on that.&lt;/p&gt;


&lt;p&gt;But the promise of reliability, flexibility and redundancy is still in this technology.&lt;br /&gt;
We can lose a machine, a rack, or several racks, but not yet the whole of the biggest room we have.&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - dasein</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170146</link>
    <guid isPermaLink="false">urn:md5:bba37c9bc77ca0ec436ca319e10f2212</guid>
    <pubDate>Fri, 03 Jul 2009 22:02:51 +0200</pubDate>
    <dc:creator>dasein</dc:creator>
    
    <description>&lt;p&gt;I appreciate the communications and transparency regarding the situation, and the full explanation. But this has really given me pause about hosting any major sites on Gandi, which I was almost ready to do.&lt;/p&gt;


&lt;p&gt;Your promotional explanations on the Gandi site regarding hosting suggest that the chain of events you've described should not have happened at all. In other words, the promise of reliability and redundancy, so prominent on the Gandi site, clearly cannot be delivered. A huge disappointment, and Gandi's hosting is not ready for prime time at all. Really unhappy that infrastructure stability was emphasised when the reality was a very unstable hardware architecture and configuration.&lt;/p&gt;


&lt;p&gt;FAIL.&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - Nicolas (Gandi)</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170144</link>
    <guid isPermaLink="false">urn:md5:01e3f1108b7b5f7b1e6e77f6dea27bd2</guid>
    <pubDate>Fri, 03 Jul 2009 17:14:19 +0200</pubDate>
    <dc:creator>Nicolas (Gandi)</dc:creator>
    
    <description>&lt;p&gt;Almost everybody who reported an issue with their server has been handled since 9 this morning, as soon as their request arrived. If you have sent an email to our customer care, you should have received an answer by now.&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - Will</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170138</link>
    <guid isPermaLink="false">urn:md5:651626ec83ff534f888eccf96c045eaf</guid>
    <pubDate>Fri, 03 Jul 2009 10:41:26 +0200</pubDate>
    <dc:creator>Will</dc:creator>
    
    <description>&lt;p&gt;My server has been down for nearly 24 hours now, it's just not good enough. This should have been fixed long ago. I was expecting more from Gandi, I am very angry about it.&lt;/p&gt;</description>
  </item>
      
    
    <item>
    <title>Explanation of the outage that lasted 4 hours - unsatisfied</title>
    <link>http://www.gandibar.net/post/2009/07/02/Explanation-of-the-outage-that-lasted-4-hours#c170129</link>
    <guid isPermaLink="false">urn:md5:05abc3a4bcdcc74539d595135bd29337</guid>
    <pubDate>Fri, 03 Jul 2009 02:19:52 +0200</pubDate>
    <dc:creator>unsatisfied</dc:creator>
    
    <description>&lt;p&gt;Guys - my virtual Xen site is still down in spite of various attempts to reboot from the website CP - This is the Nth time this year that I've experienced downtime with Gandi due to hardware or other problems - I think that I'm moving to Slicehost&lt;/p&gt;</description>
  </item>
      
</channel>
</rss>