is there an ongoing network outage again -.-
same symptoms as yesterday at least
hm
they are back?
What's up with the internet connection down there the last couple days
twice now I've lost my connection in the morning
same here
as you can see from all those ping timeouts as well
no updates on twitter
Not acceptable.
My monitoring software shows it was down yesterday at 6:55 until 7:10, then it was spotty for about 15 minutes.
mind you the resolution on those times kinda sucks, I think it is 3 or so minutes, but still
my monitoring software doesn't show an outage yesterday. Are you sure it's not a routing issue between you and them?
mine did too...
I'm sure quite a few other people here saw it as well
I too had a problem yesterday at that time
so whatever is being monitored isn't capturing the user experience fully
my nagios couldn't ping *itself*.
that's insane
when maayan reports "maayan is unpingable"...
maayan also couldn't reach chris, another machine on a different kvm
hrm, are you two on the same host, by chance? maybe it's something specific to that host?
if a host can't ping itself.. that's a bit extreme.
are you monitoring the external IP? and internal IP? or loopback?
and is this a common thing?
i'd be really interested in knowing if internal pings fail, and if loopback pings fail.. if it's a recurring thing.
I think I'm on 13
RandalSchwartz: another question. does your 'uptime' change? I'm wondering if you're seeing a hardware watchdog/restart thing?
maybe watchdog reset is failing, and the machine is restarting even though the hosts are responsive (I've actually seen that before.) and as the host is restarting, the networking isn't available at the host level, even though the host is running (in the process of shutting down)
err, starting up rather.
hmm, actually I'm on kvr08 - my memory is not what it should be
I'm on kvr02 also
also saw some Internet "pausing" earlier. I waited it out and it came back within 10 minutes.
But I don't really do anything but email and IRC from this server.
We run our ticket system on a VPS there, not sure what hardware that one is on, my personal VPS is on kvr09 and I lost connection
jpalmer: server wasn't restarted, I would have had a notification for that, also I would have really felt it on my vps as I have some things just running in tmux not auto started (since I am deving on them)
no uptime restart
10:08AM up 47 days, 16:09, 5 users, load averages: 0.00, 0.00, 0.00
some irc connections don't ping out
screen sessions still alive
I saw the same thing yesterday
from 7:50 to 8:15 my internet went away, but the box was otherwise unaffected.
as in "went away", I mean I couldn't even ping my own address.
it was very weird.
yup
Hmm. I thought you could change the CD in the tray via the console already
kvr12 had some load issues for a bit (yesterday)
I guess that part isn't automated.
i just reset a bgp session now (packetexchange peer disappeared, wth?)
:o
up_the_irons++
diamondshakes++
RandalSchwartz: "own address" meaning external IP? internal IP? loopback ip?
my public IP
jpalmer: www.arpnetworks.com was unreachable during the time as well
that's what I'm testing in nagios
maayan couldn't ping its own public IP
nor could it reach chris (.insightcruises.com)
RandalSchwartz: any chance you could set your monitoring to ping a local IP? as well as loopback? I wonder if it's a deeper issue.
jpalmer: well, maybe up_the_irons's reset might help things?
diamondshakes: if he can't ping from his box TO his box, I doubt it's a BGP peering issue.
diamondshakes: he's saying his nagios box can't ping itself.
o
yeah - I've seen this before too.
yeah that's not a bgp issue
about once every six weeks
lasts about 10 minutes
basically, net connections die, and the box can't ping itself
and as we see, it's not just my boxes. seems to be rather systemic
it's odd though. because if you're pinging the same IP (from that box) the traffic shouldn't even be hitting the switch.
so it's not an ARP table corruption issue or something..
but it still has to go through the VM's drivers
so maybe there's a VM problem
as in, I'm pinging the "emulated" em0
that's kinda the direction I'm headed.
you're right in that a good test to add would be to ping the lo0 interface too
and see if they both die at the same time
I'll get on that
I think you should add 2 more "hosts" to your nagios. bring up an RFC1918 address (as an alias on em0) and also monitor the loopback address as a separate host. this will tell you if it's specific to "em0" or something else.
ooh. yeah
more things to add for this weekend's "hours billable to Neil" weekend :)
I wonder if there is a newer version of the virtualization software than ARP is using. might be worth investigating the Changelog from that version on.
you don't have to do rfc1918
ok - in transit... back soon.
just ping the link local v6 address
toddf: good call. also, it's probably worth finding out if it's isolated to ipv6, or ipv4.
given the description, I'd lean towards os specific handling of the emulated nic. would be interesting if any openbsd people had the same issue, or other os's .. I suspect as has been stated it is an 'emulated nic stops responding properly' maybe interrupts stop happening properly or something .. for 10mins .. then resumes again
toddf: completely different versions of the linux kernel having this problem? Seems unlikely. especially given that we use kvm/qemu at work and I've never seen that issue on our own hardware.
Wraithan: linux? I thought guests were generally *BSD
I'm talking guest os support for the emulated nics here
toddf: that's a good call. all my VM's are centos, and I haven't noticed my monitoring complaining.
I guess from context you use loonix inside as the guest os. interesting to know.
more info is useful until it is determined what is exactly happening.
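[Editor's sketch] The test suggested above (probe the loopback address, the box's own interface address, and something outside the box, then see which ones die together) can be roughed out in Python. This is only a minimal sketch, not anyone's actual nagios setup: the addresses are placeholders from the documentation range, the names `ping` and `classify` are made up for illustration, and it assumes a Unix-like guest where `ping -c 1 <host>` works. It also skips the link-local v6 variant mentioned in the channel.

```python
import subprocess

# Hypothetical targets; substitute your own addresses.
LOOPBACK = "127.0.0.1"
OWN_ADDRESS = "192.0.2.10"  # placeholder standing in for the address on em0
REMOTE = "192.0.2.1"        # placeholder for a machine outside this guest


def ping(host, timeout_s=3):
    """Return True if a single ICMP echo to host gets a reply in time."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Treat a hung or missing ping binary as "no reply".
        return False
    return result.returncode == 0


def classify(loopback_ok, own_ok, remote_ok):
    """Narrow down where a failure sits, from the innermost probe outwards."""
    if not loopback_ok:
        return "guest-stack"   # even loopback is dead: look inside the guest
    if not own_ok:
        return "interface"     # loopback fine, own address dead: NIC or its driver
    if not remote_ok:
        return "beyond-guest"  # the box reaches itself: host, switch, or routing
    return "ok"
```

Running `classify(ping(LOOPBACK), ping(OWN_ADDRESS), ping(REMOTE))` on a short interval and logging the result with a timestamp would show whether the three probes fail at the same moment, which is the question raised in the channel.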
(I suspect one vps going berserk for a bit then settling down, but that's just a wild stab in the dark)
toddf: The two VPSs on ARP that I have are linux, one Ubuntu 10.10 the other Arch Linux
I have a total of 3 VPS's, all are centos. (2 5.x, 1 6.x) and I haven't seen this issue.
toddf: and it has been 2 different hardware machines, since LT said he is on 09 and I am on 08
murphy berserk can be something that affects networking in general, since vlans can be (and are for some clients) available on multiple host systems
got to promote arp on the freebsd mailing list a moment ago
awesomeness
oo, i see the email
13 minutes ago freebsd-questions@ and it's already archived in the web
http://tinyurl.com/3gdwzdm
:P
http://v.gd/arp_plug
comeon, create one with some style? ;-)
>.>
it doesn't go direct
>.>
though I will take exception, last I checked, elastic hosts is not what I would consider a similar pricepoint.
oo, qr code too
it's in the 'cheaper than ec2' bucket perhaps but ..
I've yet to find anything similar to ARP until it gets to be a very large machine and then why not just host real hardware at that point? ;-)
(I look when people try to convince me otherwise, and always walk away knowing I've got the best deal around)
maybe i am like 3 years behind, but is "cloud" another word for "virtual server" ?
it wasn't "triple the price of ARP" like the other vendors I found
no
and people are just selling the same old thing under a new name?
so it made it closer to "similar"
but you can treat it as such
similar but more :)
tons more
and no ipv6 yet
haha indeed
cheapest one i can find there is $4/mo
*$44
yeah - but once you start pricing the $75-ish and up VPS, it gets closer
randalschwartz: ah, I hadn't remembered that, yeah the size of vps's you are using makes them similar indeed. I'm looking at the low end ;-)
gamarco: "cloud" is a useless marketing term, that has no real meaning or value. ask 100 people what "cloud" is, and you'll get 85 different answers.
it's a vps, no it's a colocated server, no, it's a web app. no, it's virtualization on the local lan, no it's.. all bullshit.
the only 'concrete' definition for cloud is "it's a real, or virtual thing, where you can host your data or application, either on the internet, or locally" which essentially means.. every server, website, application, or database is a component of this nebulous 'cloud' idea.
whenever i hear cloud computing i think of people's heads up in the clouds being blinded by them, cluelessly saying the word clouds over and over
and only apple could take such a useless term, and make it not only useless, but redundant by calling it 'icloud!'
lol
what have those clouders been smoking
(AND successfully SELL it, no less.)
i want some
o/~ I've looked at clouds from both sides now... o/~
It's no shocker that magazines such as PC world are pushing the term cloud
just another buzzword
HAH! I just quick polled the office. asked 18 people. guess what? no TWO people answered "when I say something about using 'the cloud' what exactly am I referring to" no two people said the same answer.
lol
it's because it's so ridiculously vague
yes - everyone has their own individual cloud! that's the beauty!
My Personal Cloud
Jean Cloud van Damme
I hope my cloud does not rain away
ok, finally.. 2 similar answers (I'm at 33 people polled now) and similar in a "white and black are similar, in that they are both colors" kinda way. (one said "internet" one said "anything on a network, or not on your local machine")
That's about as vague as possible; seems like the right answer.
hehe
by the second definition.. if bobby and sue are in an office, and bobby saves a word doc to his own machine.. to bobby, it's not a cloud document. but to sue, it is!
love that cloud
it's got everything in it
i wanted to say thanks to the folks at arpnetworks! just set up a new vps and was really happy to see ipv6 configured by default!
it was also trivial to reinstall my node to freebsd-8.2, so thanks guys!
yeah the ipv6 is nice
gonna spend tonight migrating my rootbsd server over to my new arpnetworks node :)
that's always good fun
tho i guess almost anything with FreeBSD is good fun
lol - good point
nomadlogic welcome to Arp
are you using zfs as / on your node?
if not, may I strongly suggest it? :)
thanks!
there are wiki entries about how to do it here
or I can give you the URL I have bookmarked
my plan was to purchase more space from you all (or get a second instance) and run zfs on there
i'm gonna need to buy more space probably at some point :(
zfs as root is easily extended though
echo "please increase my disk to 120GB" | mail support@arpnetworks.com
(time passes)
heh
boot from DVD, type "gpart recover" "gpart resize blah" and reboot
can you do that live or no
and now you have more ZFS disk
I don't think so
or not sure with ZFS
i have boring UFS / =/
i was initially going to run 9.0-CURRENT on this guy for some testing, but there is a bug with the installer that ships with the 9.0-BETA1 iso :(
I tend to avoid bleeding edge
in fact I stay off a release for at least a few weeks
just call me conservative :)
heh - yea, for prod systems no doubt :)
i was hacking on the freebsd amazon port that colin percival is doing for a while and was hoping to do more testing on my own
guess i'm not a good tester though since i didn't report the installer bug i ran into :p
heh - I interviewed him about that yesterday
twit.tv/floss176
nice!
"it's what I do"
jpalmer: i like the way you think
is there a place to check how much bandwidth I've used so far this month?
cricket graphs, from your dashboard
thanks!
RandalSchwartz: you could have easily gotten $25 consult fee for that
for the interview, or for pointing me at bandwidth graph? :)
graph. didn't see the interview.
$50 easy for that =)
yeah i mighta gone another month or 3 before actually asking support@
but irc helps me fight my own laziness
A number of your customers complain about inability to access their machines, I have data, others have data, showing it happened yesterday as well. Please, please, tell me what to call that, other than down.
Don't be dicks and try to hide when bad shit happens to multiple customers, address it, even if you are still investigating just say so, don't call me out as a liar instead.
And if you aren't investigating it, say so, so I can take mine, my company, and several friends to some other service who cares.
Wraithan: have you created a support ticket?
G: I had assumed that due to multiple @'d people in this channel replying to the multiple people in this channel reporting problems that it was being taken care of
Wraithan: as the topic says, all bar one @ are long-time customers, and generally IRC isn't really an official communication method
As this is a channel with a single # that means it is official, unless there is a violation of the freenode rules. Under that assumption, reporting something here and having an @ say something in return, especially one like up_the_irons who I know for a fact works for the company.
I only had one report of an issue and my monitoring did not alert me to any issues either; I reported this morning that I bounced a BGP session as well
up_the_irons: should I assume you don't read anything when you are typing in here as the problem this morning was reported right before you said you rebooted that
So, apparently, the "outage" was enough to have people talk about it in here, but not enough to email support. That doesn't sound like much of an outage.
up_the_irons: well the day before the network really went nuts
up_the_irons: If I talk about it in here, when you are around, I'd assume you are paying attention. Should I not assume that?
yeah, this morning i saw some people talk about an issue, so I checked my bandwidth graphs, noticed a session down. So I bounced it. I do not monitor IRC 24/7, but I check the support desk quite often
Wraithan: do not assume that at all, I run IRC in a screen session. I'm *always* logged in, but don't assume I'm actually at the computer.
up_the_irons: I assumed because you spoke during the conversation not because you were online
I am online all the time too. similar setup.
if I speak, then I'm here
fuck it, not worth the energy I'll file a ticket
i may not necessarily read all the scrollback, but i usually do
eek
what the... cloudkick deleted my node from its monitoring
ouch
I have no idea either
Wraithan: for the record, I don't hide when bad things happen. I was *not* investigating anything either, since I had no reports of problems and my monitoring did not page me (this is a weakness on my part for not having geographically dispersed nagios setups). So if this makes you uncomfortable, I'm more than happy to admit it now so you can find a provider that suits you better. I value honesty and transparency; I don't want to hold anyone here if they don't want to be here.
I was moving into a new house all day yesterday, so I wasn't online at all except for checking emails (e.g. support) from my android
and with that, I must leave the office
cd $home
hey up_the_irons at some point, we shall meet :)
I'm in pasadena this weekend
Wraithan - up_the_irons is pretty transparent, but ARP may not be the right place to host you. lots of nines, but maybe not enough. :)
it's lots of cheap nines, but if you want more, you'll pay
tuv - remember that xen = paravirtualized
RandalSchwartz: hmm.. didn't know that. what's (or where can i learn about) the difference?
try http://lmgtfy.com/?q=xen
RandalSchwartz: xen != always paravirtualised. Xen can do full virtualisation as well. HVM.
not originally
Yeah
RandalSchwartz: IMO 9's are practically worthless.
The simple fact is that you have to architect your own availability, not your hoster.
Not even Amazon is bulletproof - whole regions *have* gone unavailable before.
recently.
They basically DDOSed themselves.
Not DDOS, just DOS.
too bad chunkhost is so persnickety with their setup, supposedly full HVM, but 'one partition filesystem' doesn't convince me.
in any case, $7 for a 512MB xen vm is pretty damn cheap, and googling lovevps didn't yield any complaints. though they are relatively new
Interesting. They could be overcommitting. who knows.
but they do have 'NO OVERSELLING' on their home page header
heh yeah I just noticed that.
they seem to be reselling from a few other services: burst.net, hostdime, ... (they have an openvz option)
RandalSchwartz: ah cool :)
heh, 'NO OVERSELLING'. 1) don't believe everything you read, 2) "overselling" is relative ;)