^ grrrr bad times in bryce land?
Yes. The VPS seems to be having some issues with its underlying storage. Or that's my hypothesis.
Load average suddenly spiked (over 100) and eventually it went "dead".
Now it's sort of up (haven't rebooted yet, would rather not), kind of responding on the VNC terminal.
Actually I think it's very slowly rebooting (from when I accidentally clicked the "Send CtrlAltDel" button thinking it was the "Send key sequence..." menu I'm familiar with in other VM stuff).
Yep, exactly what I did. Well damn. Was hoping I could "recover" it.
usually if load average spikes heaps it's due to lots of swapping
mercutio: Indeed, there appears to have been a bunch of swapping that began around that time, from the spotty information I have. And if swapping isn't keeping up, or worse is blocking, then it all goes to shit.
on desktops i would normally be led to believe it was chrome's fault
well, swap by definition blocks :)
(Right, but I meant blocks and never returns...)
oh right
did you get a process list? you might be able to figure out what was causing it from that
Remarkably, yes - as things sorted themselves out on the [accidental] shutdown, my top refreshed haha
Unfortunately, it says *everybody* was swapping. Because they were. Because it was 2+ hours since it started.
was anything using lots of ram?
Just the daily backup job, which has never caused this before, but other variables may be at play.
it may be lots of apache processes or such - none using that much on their own, but with lots of them it adds up...
LOL Apache... although i suspect you may not use apache :)
well, apache seems to be one of the common examples for that behaviour.
Actually I do, but it's a single small instance for handling DAV traffic to a single private vhost.
Anyhow, from iotop: Actual DISK READ: 3.85 M/s | Actual DISK WRITE: 80.04 M/s
that's decent speed for swap.
It's one of the newer nodes, no less :) kct03
i gathered that the old nodes wouldn't swap that quick..
haha, exactly
but yeah, it doesn't sound like a disk performance issue so much as some kind of extra ram utilisation.
Indeed. It swapped itself to death. Some sort of race condition or the like that caused it to swap harder and harder and harder; the OOM-killer never did a thing, it seems :(
(oh, this explains part of the load - it was time for the full backup, not the incremental)
i have never had much luck with the oom killer. it often kills the wrong thing
like, apache is spawning way too many processes that are bloating up, so it kills mysql
lol, whatever it takes to keep people from browsing your LAMP website, right? :P
I wouldn't say I rely on the OOM-killer, I was just surprised it didn't kick in.
tbh, i haven't had any exposure to the OOM killer in a long time.
There we go. Broke it again.
[ 8160.104100] INFO: task jbd2/dm-4-8:404 blocked for more than 120 seconds.
That's why I wonder if there are issues with the underlying storage... Or Linux. Or the host. Or some driver.
(Yes, I slammed it with some relatively high-load activity - copying from one large database to another, so a big chunk of RAM getting used, I imagine.)
At present, it appears the kernel's waiting for data to get written to disk / filesystem writes to commit, and it's just blocking.
http://sprunge.us/PBgJ
anyone here use mutt?
A bit in the past. I liked it, but I was never "great" at it.
trying to run down an annoying bug. was hoping someone was using the latest with the integrated sidebar and running a mac client.
integrated sidebar?
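(Going back to the "was anything using lots of ram?" question above: a minimal sketch, assuming a Linux /proc filesystem with per-process swap accounting, of pulling the VmSwap field from /proc/<pid>/status to spot the worst offenders. The function name and the 15-process cut-off are just illustrative.)

```python
#!/usr/bin/env python3
"""List processes by swap usage, largest first (hypothetical diagnostic sketch).

Reads the VmSwap field from /proc/<pid>/status, which is present on Linux
kernels that do per-process swap accounting.
"""
import os

def swap_by_process():
    usage = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        name, swap_kb = "?", 0
        try:
            with open(f"/proc/{pid}/status") as f:
                for line in f:
                    if line.startswith("Name:"):
                        name = line.split(None, 1)[1].strip()
                    elif line.startswith("VmSwap:"):
                        swap_kb = int(line.split()[1])  # reported in kB
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process exited (or is hidden) while we were looking
        if swap_kb:
            usage.append((swap_kb, int(pid), name))
    return sorted(usage, reverse=True)

if __name__ == "__main__":
    for swap_kb, pid, name in swap_by_process()[:15]:
        print(f"{swap_kb / 1024:8.1f} MiB  pid {pid:>6}  {name}")
```

(The smem tool reports much the same per-process swap figures, if it's available; this is just the raw /proc view.)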
(the fact I have no idea what that is should give you an idea how minimal my experience has been)
yeah, in 1.7 they merged in the popular sidebar patch
makes for a nice addition to the interface
Damn... Ended up having to hard power-off/on my VPS :(
brycec: do you think something is up with the disks presented to kct0* hosts via ceph?
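(On those jbd2 "blocked for more than 120 seconds" messages: they come from the kernel's hung-task detector, which flags tasks stuck in uninterruptible sleep, state D, usually waiting on I/O; the 120-second threshold is the usual default in /proc/sys/kernel/hung_task_timeout_secs. A rough sketch, again assuming Linux /proc, of watching for D-state tasks while reproducing the load; the 5-second polling interval is arbitrary.)

```python
#!/usr/bin/env python3
"""Periodically report tasks stuck in uninterruptible sleep (state D).

State D almost always means the task is blocked waiting on I/O, which is
what the kernel's hung-task warnings are complaining about.
"""
import os
import time

def d_state_tasks():
    stuck = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/stat") as f:
                data = f.read()
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process went away or is hidden from us
        # comm is parenthesised and may contain spaces; the state flag
        # is the first field after the closing parenthesis.
        comm = data[data.index("(") + 1:data.rindex(")")]
        state = data[data.rindex(")") + 1:].split()[0]
        if state == "D":
            stuck.append((pid, comm))
    return stuck

if __name__ == "__main__":
    while True:
        tasks = d_state_tasks()
        if tasks:
            print(time.strftime("%H:%M:%S"),
                  ", ".join(f"{comm}({pid})" for pid, comm in tasks))
        time.sleep(5)
```

(If the sysrq interface is enabled, `echo w > /proc/sysrq-trigger` dumps the same blocked tasks with kernel stack traces, which is more useful for deciding whether to blame ceph, dm, or jbd2.)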