heh, thats a pretty great domain visinin haha, thanks i tried to get fre.sh while i was at it, but no luck heh it looks like a .sh is like 100 usd forcefollow: i'm here Yawn up_the_irons: My VM clock is all over the place and ntpd seems not to be doing anything. Ideas? mmm... strange what ntpd is it exactly? openntpd? i know openntpd will only slowly update a bad clock (so it doesn't jump) Ah, perhaps I'm an idiot. I would have expected an error message if the config file did not exist, but instead it was running and doing nothing! So I changed that. Let's see if my clock syncs up now. :) haha :) mhoran: tried sending you a maintenance advisory (actually, to everyone): A8566E8C59 1542 Thu Sep 10 05:24:31 garry@rails1.arpnetworks.com (host vroom.ilikemydata.com[71.174.73.69] said: 450 4.7.1 : Recipient address rejected: Greylisted for 5 minutes (in reply to RCPT TO command)) mhoran: so I hope you still get it Should go through when the mailserver retries to send it We do the same thing Thorgrimr: ah, gotcha, cool The idea being that spammers won't waste the time to come back and try again, but any decent MTA will Dovecot killed itself when I synced my clock and Postfix got confused. Secondary MX, which does greylisting, answered because primary was down. Fun! mhoran: at least your secondary picked it up mike-burns: should be something like 03:00 EDT Thorgrimr: gotcha, interesting whats the syntax to make unreal ircd bind to a range of ports. I found a Yahoo Answers thread that converted 11:00 PST to EDT, amusingly. mike-burns: was it correct? obsidieth: not sure... Yup! nice doh. that was easy No email for me :( Thorgrimr: it's coming real soon (about 10 minutes) for the record, i could not be more pleased with how this is working so far up_the_irons RAD obsidieth: glad you like it :) up_the_irons: No worries, I'm at work now anyway, and teaching this afternoon, so no play for me ah shucks ;) Thorgrimr: ...and it's off! 
Alrighty then :) up_the_irons: So this ntpd issue is related to a Cacti issue I'm trying to track down. I thought ntpd would keep it in sync, and now that it's set up correctly, it seems to be. mhoran: your cacti or my cacti? :) However, my cron tasks aren't running on time at all. My cacti. ah 5 minute tasks are running, sometimes, over a minute late. sounds like a cron issue So my graphs are basically useless. I figured it was because my clock wasn't synced and it was getting all confused, but it seems something else may be up. Wondering if you've seen anything similar. if the time is right, but cron doesn't execute on time, suspect cron. perhaps it needs to be restarted i've seen time drift on VMs (pretty much across the board: Xen, KVM, VMware, etc...) but ntpd pretty much keeps it in line i haven't had any issues with cron though, as long as time is sync'd Yeah. I've never seen this cron issue before. Didn't have it when running VMware, and my Xen boxes at work seem to be doing just fine. Huh. mhoran: what time does your VM show? Thu Sep 10 09:18:22 EDT 2009 as of now, the host says: Thu Sep 10 06:19:20 PDT 2009 i'm not sure how cron gets its time, from hardware clock, OS, or what.. probably through whatever stdlib C call provides that Okay. I restarted cron, let's see what that does. roger heavysixer: how's it hangin up_the_irons: yo just getting ready to start working on digisynd's site again., you? heavysixer: provisioning VMs gotcha you are getting quite a few clients now huh? Nope, still screwed up. Huh. heavysixer: it's picking up up_the_irons: That's quite the upgrade the server is getting! mhoran: logs don't show anything? mhoran: yeah, 16 GB of RAM is going in, and another quad-core Xeon @ 2.66GHz bad boy heavysixer: haven't done Amber's VPS yet, had two orders in front of her; tell her not to hate me ;) up_the_irons: no worries we are not at the point where we are ready to deploy. 
heavysixer: cool
up_the_irons: soon though ;-)
up_the_irons: Will this bring it to dual quad core or quad quad core?
so no slacking
heavysixer: oh it's gonna be up today, for sure. :)
up_the_irons: cool
mhoran: dual quad
That's exciting.
mhoran: really interested to see how load avg goes down with the addition of more cores to distribute the load
My static Web site will certainly benefit from the power! LOL
Yeah.
if the load avg drops in half, that'd be awesome utilization of the cores
Oh totally.
omg, now I know how to say "Sent from my iPhone" in Japanese
iPhoneから送信
Hahaha.
saw that on the bottom of a new customer's email (who is in Japan)
Huh. So my clock seems to be fine, but these tasks are not running 5 minutes apart. They're all over the place.
It's almost like the scheduler is not synced with the clock or something.
mhoran: you should do like:
*/1 * * * * root uptime
erm */2 or w/e it is
Yeah.
so it emails you something simple
see if you get the emails on-time and at regular intervals
Got it in there now. We'll see. :)
cool
That's a good way to make me feel loved! Send myself e-mail. LOL
So I have this set to run every minute. It ran at 9:46, then 9:49. Skipped everything in between. Interesting.
Sep 10 09:46:20 friction /usr/sbin/cron[25023]: (mhoran) CMD (/bin/date)
Sep 10 09:49:05 friction /usr/sbin/cron[25051]: (mhoran) CMD (/bin/date)
Didn't even try to run it in between.
mhoran: did you use "*/2", i think that's every other minute
* * * * *
heh
let's see, on one of my Linux VMs, I have:
Sep 9 06:20:01 ice /USR/SBIN/CRON[12324]: (root) CMD ([ -x /usr/sbin/update-motd ] && /usr/sbin/update-motd 2>/dev/null)
Sep 9 06:25:02 ice /USR/SBIN/CRON[12400]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))
Sep 9 06:30:01 ice /USR/SBIN/CRON[12430]: (root) CMD ([ -x /usr/sbin/update-motd ] && /usr/sbin/update-motd 2>/dev/null)
so that's pretty much right on every 5 minutes
let me find a FreeBSD one...
Yeah.
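The spacing between the cron log entries pasted above can be checked mechanically rather than by eye. This is only a minimal sketch: the helper name `gaps_seconds` is made up for illustration, the year is assumed to be 2009 because syslog lines carry none, and the first timestamp from the Linux VM is read as "Sep 9" to match the two lines after it.

```python
from datetime import datetime

def gaps_seconds(stamps, year=2009):
    """Return the gaps, in seconds, between consecutive syslog timestamps."""
    times = [datetime.strptime(f"{year} {s}", "%Y %b %d %H:%M:%S") for s in stamps]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]

# Timestamps copied from the log lines pasted in the chat.
every_minute_job = ["Sep 10 09:46:20", "Sep 10 09:49:05"]                 # host "friction"
linux_vm_jobs = ["Sep 9 06:20:01", "Sep 9 06:25:02", "Sep 9 06:30:01"]   # host "ice"

print(gaps_seconds(every_minute_job))  # [165]
print(gaps_seconds(linux_vm_jobs))     # [301, 299]
```

A job scheduled every minute that shows a 165-second gap has missed runs, while the Linux VM stays within a second or two of its 5-minute spacing.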
I'm drifting way past the minute. Interesting.
Sep 10 06:35:36 freebsd-ha /usr/sbin/cron[19804]: (root) CMD (/usr/libexec/atrun)
Sep 10 06:41:08 freebsd-ha /usr/sbin/cron[19807]: (root) CMD (/usr/libexec/atrun)
Sep 10 06:46:25 freebsd-ha /usr/sbin/cron[19823]: (root) CMD (/usr/libexec/atrun)
Sep 10 06:50:36 freebsd-ha /usr/sbin/cron[19826]: (root) CMD (/usr/libexec/atrun)
Sep 10 06:56:08 freebsd-ha /usr/sbin/cron[19833]: (root) CMD (/usr/libexec/atrun)
atrun is supposed to run every 5 minutes
and look at that, cron is like being lazy about it
it's "about" every 5 minutes
the time delta is pretty close to 5 minutes, but it's not executing "on the dot"
Yeah. That's what's upsetting cacti.
Hrm.
Thu Sep 10 09:56:52 EDT 2009
That one was almost a minute late!
i wonder if cron is seeing a different time
[25430] TargetTime=1252591500, sec-to-wait=36
[25430] sleeping for 36 seconds
[25430] TargetTime=1252591500, sec-to-wait=-115
Interesting.
whoa, weird
I received two maintenance notices, identical, except for 'Message-Id:...@rails1.arpnetworks.com>' vs 'Message-ID: ..@garry-thinkpad.arpnetworks.com>\nUser-Agent: Mutt/1.5.16..'
just fwiw ;-)
toddf: There was an error when sending the first one, so to be safe I sent it again from just my regular client (mutt)
toddf: looks like you got them both anyway :)
absolutely
time for me to expire, thankfully i slept a little already
cd $bed ;-)
[mike@jack] ~% date; sleep 1; date
Thu Sep 10 11:30:44 EDT 2009
Thu Sep 10 11:30:47 EDT 2009
10 minutes later,
[mhoran@friction] ~% date; sleep 300; date
Thu Sep 10 11:20:10 EDT 2009
On my work laptop,
[mhoran@mhoran-thinkpad] ~% date; sleep 60; date
Thu Sep 10 11:29:22 EDT 2009
Thu Sep 10 11:30:22 EDT 2009
So, something is up.
[mhoran@friction] ~% date; sleep 300; date
Thu Sep 10 11:20:10 EDT 2009
Thu Sep 10 11:41:10 EDT 2009
hello!
30 minutes of downtime eh?
ballen: You running FreeBSD?
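The `date; sleep N; date` check used above can be scripted so the overshoot is printed directly. This is a minimal sketch: the function name `measure_sleep` is an illustrative assumption, and it reads the wall clock on purpose, just as `date` does, so it is only meaningful while ntpd is keeping that clock honest.

```python
import time

def measure_sleep(requested):
    """Sleep for `requested` seconds and return how far the wall clock advanced."""
    start = time.time()
    time.sleep(requested)
    return time.time() - start

if __name__ == "__main__":
    for requested in (1, 2):
        elapsed = measure_sleep(requested)
        print(f"asked for {requested}s, wall clock advanced {elapsed:.2f}s "
              f"(overshoot {elapsed - requested:+.2f}s)")
```

On a healthy machine the overshoot should be a few milliseconds. The FreeBSD guests in this log would show seconds or minutes.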
ya
Been trying to diagnose some issues with cron, which seem to trace back to sleep(), which may even be scheduler related.
Does date; sleep 60; date act as expected for you?
rgr chking
When I ran it, sleep 300 took 21 minutes.
up_the_irons: when you get in let me know, would like to discuss expectations of uptime, etc and so forth
mhoran: yea its all sorts of messed up
[ballen@arp ~]$ date; sleep 10; date
Thu Sep 10 12:46:29 EDT 2009
Thu Sep 10 12:47:09 EDT 2009
7.2?
ya
Yeah. Something is definitely borked. I noticed because my 5 minute Cacti cron has been complaining for months. :)
does 7.2 use the new scheduler
I ran cron in debug mode and saw that it had a negative sec-to-wait. So then I tested sleep, which is exhibiting the same behavior.
Yes, it does.
Not totally sure if it's scheduler related, or something else. But, something is definitely busted.
yep
Hopefully up_the_irons can help us figure it out. May need to mail the FreeBSD lists as well. Probably after work. It's crazy today.
yea just woke up from working on thesis till 4am last night
hopefully no one at work misses me
what's sleep use to tell time?
nanosleep() is the syscall.
hmm, yea really don't feel like figuring this one out at the moment. Let me know if you figure out anything.
My gut feeling is it has to do with KVM/Qemu
likely how nanosleep is counting time and how KVM is sharing cycles
brb need coffeee
mhoran: here's what I have from a Linux VM:
garry@ice:~ $ date; sleep 1; date
Thu Sep 10 12:12:07 PDT 2009
Thu Sep 10 12:12:08 PDT 2009
garry@ice:~ $ date; sleep 20; date
Thu Sep 10 12:12:15 PDT 2009
Thu Sep 10 12:12:35 PDT 2009
the host box is the same
on FreeBSD it is jacked:
[arpnetworks@freebsd-ha ~]$ date; sleep 1; date
Thu Sep 10 12:15:37 PDT 2009
Thu Sep 10 12:15:40 PDT 2009
Good to know. Looks like all of us on FreeBSD are experiencing this.
Do you have something that's not 7.2 (the old scheduler)?
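The negative sec-to-wait from the cron debug output pasted earlier can be worked through by hand. The sketch below is a simplified model, not cron's actual source: it assumes sec-to-wait is simply the target time minus the current time, and `sec_to_wait` is a hypothetical helper name.

```python
from datetime import datetime, timezone

def sec_to_wait(target_time, now):
    """Seconds left until target_time; negative once the target has passed."""
    return target_time - now

TARGET = 1252591500  # TargetTime from the pasted debug output

# Before sleeping, cron reported 36 seconds to go, so under this model its clock read:
now_before = TARGET - 36
# After the sleep returned, it reported -115, so its clock read:
now_after = TARGET + 115

print(datetime.fromtimestamp(TARGET, tz=timezone.utc))  # 2009-09-10 14:05:00+00:00
print(sec_to_wait(TARGET, now_before))                  # 36
print(sec_to_wait(TARGET, now_after))                   # -115
print(now_after - now_before)                           # 151
```

Under that assumption, a sleep that was asked to last 36 seconds returned about 151 seconds later, which is the same kind of overshoot the `date; sleep N; date` tests show.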
mhoran: i believe I do, but it's stopped right now cuz i ran out of RAM (hence the maintenance tonight :)
w00t, OpenBSD still rocks it:
Heh. Okay.
s3.lax:~> date; sleep 20; date
Thu Sep 10 12:01:14 PDT 2009
Thu Sep 10 12:01:34 PDT 2009
s3.lax:~> date; sleep 1; date
Thu Sep 10 12:01:39 PDT 2009
Thu Sep 10 12:01:40 PDT 2009
s3.lax:~> date; sleep 1; date
Thu Sep 10 12:01:41 PDT 2009
Thu Sep 10 12:01:42 PDT 2009
Interesting.
given OpenBSD is probably the least virtualized OS, and it is working, I'd have to point the finger at FreeBSD on this one, instead of KVM/QEMU.
however, it probably has to do with the interaction of the two
Yeah. I did not have this problem with VMware.
that OpenBSD VM is on the same host too
mhoran: you tried 7.2 w/ VMware?
Ooh, this is 7.1. I think I have a 7.2 running somewhere.
7.1/ESXi --
vps% date; sleep 20; date
Thu Sep 10 15:19:15 EDT 2009
Thu Sep 10 15:19:35 EDT 2009
vps% date; sleep 1; date
Thu Sep 10 15:19:38 EDT 2009
Thu Sep 10 15:19:39 EDT 2009
vps% date; sleep 1; date
Thu Sep 10 15:19:40 EDT 2009
Thu Sep 10 15:19:41 EDT 2009
So that's good.
I'll play with it more tonight around the maintenance window; I'll have a lot of time to kill then
Yeah, 7.2/ESX is fine. Same as 7.1.
ah ok
greg_dolley: welcome to IRC
thx :-)
greg_dolley: haha, yo greg
this is andy from revver, not sure if you remember me
up_the_irons: Do you have a machine you can test this on? Apparently adding hint.apic.0.disabled="1" may fix this.
mhoran: machine = FreeBSD VM?
cablehead: he must be at lunch...
up_the_irons: Yes.
mhoran: sure, where should I put that? in sysctl.conf?
Oh, I left out -- adding ... to /boot/loader.conf
ah
ah
up_the_irons: either that or rocking out to some thumping metal
cablehead: true!
mhoran: does this look right:
[arpnetworks@freebsd-ha ~]$ cat /boot/loader.conf
hint.apic.0.disabled="1"
[arpnetworks@freebsd-ha ~]$
That should be it.
ok, rebooting...
cablehead: hey!
I remember you ;-) man i dont thin i'll ever get a freebsd vps Don't say that! We'll get to the bottom of this ... Aside from that, it works great! Yeah, no complaints from me. I don't do a lot of sleep-related work. noo not cause of that when i use bsd, i use it for serious thing.. i build things by hand, or ports never packages. jeev: so what's the prob? ;) with ports, you can install everything from source, that's one cool thing about it yea mhoran: ok, so, had trouble with that line. it won't find the disk for some reason. I had to go into boot loader and do 'unset hint.apic.0.disabled' and then it booted fine except, vps.. = slow ;) Interesting. mhoran: you want to play with my test VM? i could give you login and console Sure, probably have some time after work ... jeev: Haven't found my VPS to be slow. At work, we run everything virtualized, and it's fine. ok, just don't trash it too much, I use it for DRBD testing at the moment :) Heh, okay. No worries. jeev: My ARP Networks VPS is faster than my laptop much of the time. faster? w00t eh i mean like how often can you build world. i want a peer1 LA colo for cheap. jeev: peer1 ain't cheap i hear yea there was someone on wht who did colo for like 70 or somthing i forgot what bandwidth but he told me he's leaving it soon up_the_irons: you on ballen: yo 30 minutes of downtime seems a bit long no ? ballen: it is, but what can I do; I have RAM and a CPU to install, and if I rush it, something could break, and then the downtime would be much greater ballen: if it was just RAM, it'd be a lot quicker no way to migrate vm's? ballen: it would take longer than 30 minutes ;) a 'dd' from LVM to disk, then transfer to another server, and 'dd' from disk back to LVM <-- takes some time as well, and you're down the whole time sigh.... 
ballen: my ultimate goal is to have the VM disk images to be on a DRBD volume, if performance turns out to be still good ballen: i'm currently testing this SAN storage...that fix it all so centrally store the images, and do diskless botting on the Qemu booting even ballen: and then it would be possible to "live" migrate and if everything goes right, there would be no downtime at all Nat_UB: SAN storage is very expensive Tell me about it...doing that at two sites at $WORK doesn't DRBL use NFS ? ballen: DRBD would be "central" store (more like, two boxes get paired), and it is trivial to boot off it. That's a solved problem, but there are performance issues to account for yea I've used DRBL a year ago to deploy a 80+ machine lab ballen: DRBD is a distributed block device; not related to NFS awwww Diskless Remote Booting Linux hah LOL anywho n/m ;) ballen: trust me, I feel your pain, I have several important VMs of my own that are going down (arpnetworks.com site itself, pledgie.com, my shared hosting server) whats the need for the new hardware, obviously other than increasing capacity. Couldn't just buy a new server? ballen: I'm going to try to be as quick as possible; and once I certify the DRBD setup I'm testing currently, I will those who want to be on that, go on it. ballen: Giving him hell huh? a little just 30 minutes downtime suuuucks :) but understandable I suppose In this case I'm still building....so downtime no concern for me hehehehe I work in the 'NO DOWNTIME' field...so I've heard all the griping before....Irons, keep up the good work! ballen: the "just" in "just buy a new server" is the hard part ;) I don't buy cheap boxes, I have to shell out about $6K, and that just isn't gonna happen given I can double the cores on the current box *and* double the RAM on the current box up_the_irons: does DRBD do synchronous writes? 
up_the_irons: well you should have thought of that ahead of time :-p ballen: "DRBD® refers to block devices designed as a building block to form high availability (HA) clusters. This is done by mirroring a whole block device via an assigned network. DRBD can be understood as network based" yes yes ballen: dude, to be honest, I was like this: when you write to one side of mirror does it wait till the other side to finish "I don't know how well my new VPS offering will sell, so let's not buy both CPUs and 32 GB of RAM off the bat, start with 1 CPU and 16 GB of RAM and upgrade later" <-- now that bites me in the ass :) wondering if that is the source of performance issues ballen: OK, gotcha. that part of it is configurable ballen: I configure it to wait for the write on the other end; it has to be very consistent up_the_irons: yea I figured that was your train of thought. Just giving you a hard time yea async without conflict resolution is a piss poor idea ballen: my thinking is, i'd rather sacrifice performance than have a catastrophic failure althought although* if you think of it in a master -> slave configuration where the slave will never write ballen: where it won't matter much is in reads, cuz reads will come off the local disk; which is kinda an advantage over an external SAN setup whats wrong with doing writes asynchronously ballen: well, master box crashes while writing to local disk, yet that block isn't replicated on the slave? I think that would be a problem hmm I guess its a matter of not allowing it to get too far out of sync and allowing a certain amount of time for lost data whatever one would be confortable with ballen: yeah, I think the #1 goal with DRBD is for the data to never go out of sync; but it gives you some knobs to play with so how much performance loss are you seeing using it i'm not to the hard benchmark part yet; more like "play with the machine; does it feel slow?" 
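The trade-off being discussed, waiting for the peer versus acknowledging a write before the peer has it, can be illustrated with a toy model. This is a sketch under stated assumptions only: the `Mirror` class and its methods are invented for illustration and are not DRBD's API or wire protocol. DRBD's own documentation calls its fully synchronous mode protocol C and its asynchronous mode protocol A.

```python
class Mirror:
    """Toy two-node mirror; illustrative only, not how DRBD is implemented."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = []      # blocks on the primary's disk
        self.secondary = []    # blocks on the peer's disk
        self.in_flight = []    # acknowledged locally, not yet on the peer

    def write(self, block):
        self.primary.append(block)
        if self.synchronous:
            self.secondary.append(block)   # wait for the peer before acknowledging
        else:
            self.in_flight.append(block)   # acknowledge now, ship to the peer later

    def flush(self):
        """Deliver queued blocks to the peer (asynchronous mode only)."""
        self.secondary.extend(self.in_flight)
        self.in_flight.clear()

    def primary_crashes(self):
        """Return blocks that were acknowledged but never reached the peer."""
        lost = list(self.in_flight)
        self.in_flight.clear()
        return lost


sync_mirror = Mirror(synchronous=True)
sync_mirror.write("a")
sync_mirror.write("b")
print(sync_mirror.primary_crashes())    # []

async_mirror = Mirror(synchronous=False)
async_mirror.write("a")
async_mirror.flush()
async_mirror.write("b")
print(async_mirror.primary_crashes())   # ['b']
```

In the asynchronous case the block written just before the crash exists only on the failed primary, which is the failure mode raised in the chat. The synchronous case pays for that safety with a network round trip on every write.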
so far, i can't really tell the different ah *difference cool, what kind of network do you have between the machines He's got 10gig... :) ballen: 1G just one link per box? ah dedicated to task? ballen: right now i actually have two boxes physically linked together, no switch in between ah ballen: if I got more NICs, I could bond them, and I hear the network speed would be faster than disk write speed and then performance issues are mute; but I want to *see* this in action before I certify it Nat_UB: i wish i had 10G :) the intel ones are like $2K a pop yea good plan. There may be some overhead in bonding. Does DRBD run over TCP/IP ballen: yes, it does run over TCP/IP Haven't tried 10g but done some bonded stuff AoE (ATA over Ethernet) is another alternative, that runs on layer 2, but is pretty much feature-less and does not afford much protection to someone accidentally writing to the volume from two boxes at the same time (which will instantly corrupt it) I use AoE for backup images only but that said, AoE is pretty cool in its simplicity yea AoE is pretty neat i got this from a 'dd' test on my FreeBSD DRBD testing VM: 1048576000 bytes transferred in 76.882014 secs (13638769 bytes/sec) so like 13 MBps not all that great but those are real writes, not cached yea, whats that same benchmark on a typical FreeBSD VM let me see this is what i'm running: dd if=/dev/zero of=delete-me bs=1M count=2000 you want to make sure 'count' is about double your RAM, so caching goes away now, the performance of 'dd' will give some raw numbers that may or may not correlate with how the VM actually performance during normal use; that would depend on a lot of other factors. Even if dd has lower speeds on DRBD, the trade-off for uptime and easy of VM migration may well be worth it true I'll be on later, and will be around during the maintance window. Let me know if you make any breakthroughs with DRBL. ballen: will do! Nat_UB: thanks for the support up there, BTW ;) Sure thing! 
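The "like 13 MBps" figure follows directly from the two numbers in the dd summary line quoted above. A short worked calculation, using only the values from the chat (the variable names are illustrative):

```python
bytes_written = 1048576000   # from the dd summary line
seconds = 76.882014          # from the dd summary line

rate = bytes_written / seconds          # bytes per second
print(f"{rate / 1e6:.1f} MB/s")         # 13.6 MB/s  (decimal megabytes)
print(f"{rate / 2**20:.1f} MiB/s")      # 13.0 MiB/s (binary mebibytes)
print(bytes_written / 2**20)            # 1000.0 MiB written in total
```

So "13" matches the binary figure, and the rate dd reports (13638769 bytes/sec) is 13.6 in decimal megabytes. The byte count works out to exactly 1000 blocks of 1 MiB.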
U'r the man! :) we need more IRC'ers build bots ;) no, real people :) i think i might have a recruit for you obsidieth: nice :) obsidieth: someone you know? or found on a message board? btw, guys, let me know if there are other boards out there besides WHT that have an advertising section; I post weekly on WHT and recently found webhostingboard.net. If there are others, I'd like to know :) cd $data-center i will advertise you for $1000/month on my google PR1 jeev: LOL ;) palm pre is so lame ping yeah someone i know people on efnet arent used to servers that actually stay up:p huh efnet, heh back in the day efnet sucks lol, DRBD is now following you on Twitter! "DRBD is now following you on Twitter!" that is heh looking at espresso machines, and grinders is dropping a grand on an espresso machine + grinder a good thing? wow i'd rather buy a good 48 port GB switch for that and for a grand, even that is hard to find heh have a good gig switch already don't use it I keep my computer equipment at home very light to keep power down yeah, i don't have much at home i keep it all at the data center :) its just a netgear shit i just bought a dell poweredge 48 port gig and i haven't even sent it to the datacenter there's some new cage here where the fool has like 30 power circuits in 200 sq. ft. i was like "wwhhhhhhhhaaaaa?" that must be me LOL any idea what amperage? 
i was a hazard at uscolo they would detour people around my stuff during tours i hate dell management interface on their f*cking switches, but they are priced well yea i aint gonna use the management stuff just cli http://www.netgear.com/Products/Switches/FullyManaged10_100_1000Switches/GSM7212.aspx it's some ieee standard my home switch heh ballen: looks like 20 ampers each; they must be in redundant A/B pair, cuz I can't imagine they actually gave him all that power damn ballen: i've heard some good things about the GSM ironically brb k yea its a solid switch, but I really haven't had to do much with it Grinder: http://www.visionsespresso.com/node/73 Espresso Machine: http://www.wholelattelove.com/Rancilio/ra_silvia_2009.cfm why do you need that its more of a question of why do I not need that as well as a small coffee addiction lol I really enjoy espresso drinks, and it would save me money in the long run if I don't goto any cafe $3 latte once a day that's the point of coffee or drinks LOL to go into the place is 1095 bucks and see how bitches hot hahaha up_the_irons wishes he could get the girls from glendale! lmao well some are fugly but some are hot the chicks in glendale, yeah some are pretty hot sexUAL def some hot females at some various cafes I've been into also cafe is like 15 minutes away and in the morning, F that! shit sounds like you're from Charlevoix, MI when you say it's 15 min away ya wow, what a guess! hah ballen, gto a linux or bsd router on your cable ? 
so this is a fun thing actually I'm behind a WISP who uses AT&T, and Charter ahh i was gonna say sniff me some mac addresses i steal cable internet sometimes hah although i have a primary isp i prefer stealing charter, 20 megs sometimes their network sucks T - 15 minutes ironically, the time right before a maintenance window is a time where I actually wait and do nothing (I've already prepared), so now I'm just waiting for the clock to strike the right time weird what are you gonna do i didn't read it yep, always annoying period of time cuz I could start now, but I told everyone 11, so I must wait jeev: RAM + CPU upgrade on everything ? jeev: just one box, but it holds the majority of the guys in here is arpnetworks just one box >? :) i dont mind really so far, stable as fark jeev: no, i have several, but the one in question is my newest cool so how is CPU split is it burst for everyone ? no burst if you ordered 1 cpu, you get 1 cpu then what ahh how many cpu's in the box i'm on some guys are running SMP, but not many jeev: 1 CPU, quad core so i'm considered a one cpu user since cpuinfo shows me a single cpu so I actually have a core all my to myself? so the server i'm on has 4 cores right now so 4 users? heh ok guys, T minus 1 minute i'll get disconnected k have fun, don't break anything ;-) see I just had to swear heh is that personal stop logging ? hmm I'd assume so [FBI]: off and hes back w00000000000000000000000000000000t what did I miss? I'll show you what you guys missed: total used free shared buffers cached Mem: 31856 15133 16723 0 2905 64 look at "free" :) nice up_the_irons, i cancelled while you were gone. lol just kidding lmao ahahahah up_the_irons, so the server im' on has only 4 cores? 
and now this:
garry@kvr02:~$ cat /proc/cpuinfo | grep 'model name'
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
jeev: now it has 8 :)
[FBI]: welcome back
but... it had 4, so 4 customers?
jeev: no no, customers can share cores
oh
so how many cores do i have
just one, is it dedicated
jeev: there is no setting that says "this VM gets this core", although I *can* do that, just haven't. I let the Linux KVM/QEMU scheduler put the VMs on the least loaded core in real time
ok
jeev: nobody has a dedicated core; i don't think my business model could support that. cores are the least numerous resource
RAM is easier
disk is easiest
yea i dont care realy
41 minutes
cutting it a bit close weren't we :-p
heh, i just took a video of all the blinking lights on this box, w/ my iphone
ballen: oh yeah man, I failed that one hard
ballen: the RAM did not take at first, still registered 16GB, not 32
ah, tough day
ballen: I had to unrack box and put them in different channels :(
ballen: i'm just happy everything went OK; ya never know when opening boxes and tinkering around
cutsman: whoa, who are you? :)
my Nagios is all green!
yea, I tend to avoid doing such things to production boxes
ballen: I really try not to also; but sometimes is unavoidable. This will be the last time major maintenance is done on a box before I get DRBD live migrations working; then it will be a moot point
sounds good
it is
i don don don
wonder who cutsman was
obsidieth: yo