^ grrrr bad times in bryce land?
Yes. The VPS seems to be having some issues with its underlying storage. Or that's my hypothesis.
Load average suddenly spiked (over 100) and eventually it went "dead".
Now it's sort of up (haven't rebooted yet, would rather not), kind of responding on the VNC terminal.
Actually I think it's very slowly rebooting (from when I accidentally clicked the "Send CtrlAltDel" button thinking it was the "Send key sequence..." menu I'm familiar with in other VM stuff).
Yep, exactly what I did. Well damn. Was hoping I could "recover" it.
usually if load average spikes heaps it's due to lots of swapping
mercutio: Indeed, there appears to have been a bunch of swapping that began around that time, from the spotty information I have. And if swapping isn't keeping up, or worse is blocking, then it all goes to shit.
on desktops i would normally be led to believe it was chrome's fault
well, swap by definition blocks :)
(Right, but I meant blocks and never returns...)
oh right
did you get a process list? you might be able to figure out what was causing it from that
Remarkably, yes - as things sorted themselves out on the [accidental] shutdown, my top refreshed haha
Unfortunately, it says *everybody* was swapping. Because they were. Because it was 2+ hours since it started.
was anything using lots of ram?
Just the daily backup job, which has never caused this before, but other variables may be at play.
it may be lots of apache processes or such - none using that much on their own, but with lots of them it adds up...
LOL Apache... although i suspect you may not use apache :)
well, apache seems to be one of the common examples for that behaviour.
Actually I do, but it's a single small instance for handling DAV traffic to a single private vhost.
Anyhow, from iotop: Actual DISK READ: 3.85 M/s | Actual DISK WRITE: 80.04 M/s
that's decent speed for swap.
It's one of the newer nodes, no less :) kct03
i gathered that the old nodes wouldn't swap that quick..
haha, exactly
but yeah, it doesn't sound like a disk performance issue so much as some kind of extra ram utilisation.
Indeed. It swapped itself to death. Some sort of race condition or the like that caused it to swap harder and harder and harder; the OOM-killer never did a thing, it seems :(
(oh, this explains part of the load - it was time for the full backup, not the incremental)
i have never had much luck with the oom killer. it often kills the wrong thing
like, apache is spawning way too many processes that are bloating up, so it kills mysql
lol, whatever it takes to keep people from browsing your LAMP website, right? :P
I wouldn't say I rely on the OOM-killer, I was just surprised it didn't kick in.
tbh, i haven't had any exposure to the OOM killer in a long time.
There we go. Broke it again.
[ 8160.104100] INFO: task jbd2/dm-4-8:404 blocked for more than 120 seconds.
That's why I wonder if there are issues with the underlying storage... Or Linux. Or the host. Or some driver.
(Yes, I slammed it with some relatively high-load activity - copying from one large database to another, so a big chunk of RAM getting used, I imagine.)
At present, it appears the kernel's waiting for data to get written to disk / filesystem writes to commit, and it's just blocking.
http://sprunge.us/PBgJ
anyone here use mutt?
A bit in the past. I liked it, but I was never "great" at it.
trying to run down an annoying bug. was hoping someone was using the latest with the integrated sidebar and running a mac client.
integrated sidebar?
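(Going back to the "was anything using lots of ram?" question above: a minimal sketch, assuming a Linux /proc filesystem with per-process swap accounting, of pulling the VmSwap field from /proc/<pid>/status to spot the worst offenders. The function name and the 15-process cut-off are just illustrative.)

```python
#!/usr/bin/env python3
"""List processes by swap usage, largest first (hypothetical diagnostic sketch).

Reads the VmSwap field from /proc/<pid>/status, which is present on Linux
kernels that do per-process swap accounting.
"""
import os

def swap_by_process():
    usage = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        name, swap_kb = "?", 0
        try:
            with open(f"/proc/{pid}/status") as f:
                for line in f:
                    if line.startswith("Name:"):
                        name = line.split(None, 1)[1].strip()
                    elif line.startswith("VmSwap:"):
                        swap_kb = int(line.split()[1])  # reported in kB
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process exited (or is hidden) while we were looking
        if swap_kb:
            usage.append((swap_kb, int(pid), name))
    return sorted(usage, reverse=True)

if __name__ == "__main__":
    for swap_kb, pid, name in swap_by_process()[:15]:
        print(f"{swap_kb / 1024:8.1f} MiB  pid {pid:>6}  {name}")
```

(The smem tool reports much the same per-process swap figures, if it's available; this is just the raw /proc view.)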
(the fact I have no idea what that is should give you an idea how minimal my experience has been)
yeah, in 1.7 they merged in the popular sidebar patch
makes for a nice addition to the interface
Damn... Ended up having to hard power-off/on my VPS :(
brycec: do you think something is up with the disks presented to kct0* hosts via ceph?
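(On those jbd2 "blocked for more than 120 seconds" messages: they come from the kernel's hung-task detector, which flags tasks stuck in uninterruptible sleep, state D, usually waiting on I/O; the 120-second threshold is the usual default in /proc/sys/kernel/hung_task_timeout_secs. A rough sketch, again assuming Linux /proc, of watching for D-state tasks while reproducing the load; the 5-second polling interval is arbitrary.)

```python
#!/usr/bin/env python3
"""Periodically report tasks stuck in uninterruptible sleep (state D).

State D almost always means the task is blocked waiting on I/O, which is
what the kernel's hung-task warnings are complaining about.
"""
import os
import time

def d_state_tasks():
    stuck = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/stat") as f:
                data = f.read()
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process went away or is hidden from us
        # comm is parenthesised and may contain spaces; the state flag
        # is the first field after the closing parenthesis.
        comm = data[data.index("(") + 1:data.rindex(")")]
        state = data[data.rindex(")") + 1:].split()[0]
        if state == "D":
            stuck.append((pid, comm))
    return stuck

if __name__ == "__main__":
    while True:
        tasks = d_state_tasks()
        if tasks:
            print(time.strftime("%H:%M:%S"),
                  ", ".join(f"{comm}({pid})" for pid, comm in tasks))
        time.sleep(5)
```

(If the sysrq interface is enabled, `echo w > /proc/sysrq-trigger` dumps the same blocked tasks with kernel stack traces, which is more useful for deciding whether to blame ceph, dm, or jbd2.)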