down again? mercutio: no alarms yet, but i'll check it seems back again it was short short enough to not be able to debug properly :( wouldn't be the s7.lax router cuz it doesn't reboot that fast ;) yeh i wonder if it was ntt ya know, actually, my home twcable is down right now it was up via another host which is nlayer hmm maybe related, dunno well when i was first looking there was that small ntt packet loss i'm using my android and tmobile ahh ok damn debugging is annoying :/ srsly i got alerts too though i got alerts then debugged s7.lax uptime is 1 hour, 27 minutes woo ;) heh (and the alerts are on diff network) but i assume it was probably ntt issue is your home connection over ntt? did it come back? haven't checked it looks like it didn't go out, but hit 90% packet loss for about 5 minutes but it wasn't so bad when it first hit so i'm going with ddos or ntt fault oh interesting, my route to comcast from new zealand screwed up for same period too might be co-incidental though @wa why did the chicken cross the road Why did the chicken cross the road?;To get to the other side., (ha, ha) @wa why is the sky blue? Why is the sky blue?;The sky's blue color is a result of the effect of Rayleigh scattering. Shorter-wavelength blue light is more strongly scattered in the earth's atmosphere than longer-wavelength red light; the human eye perceives the color blue when looking at the sky as a result. @wa why do we exist? Couldn't grab results from json stringified precioussss. @wa when will ipv6 reach widespread adoption Couldn't grab results from json stringified precioussss. well it wasn't on arp but from my local monitoring quite a few things experienced packet loss then without me noticing @wa will there come a time when ISPs will hand out IPv6 addresses and tunnel IPv4 traffic over it haha Couldn't grab results from json stringified precioussss.
@wa tldr TLDR (acronym);too long didn't read mmm whisky mercutio: nah, we've not done anything with netamp we're looking to have a play with netfpga in the near future though dunno what gave me the idea that you did hmm we used to run a thing called nettest and our main software we're developing at the moment is called amp (http://amp.wand.net.nz/) nettest + amp = netamp? :P nice i want some nice way to trace from multiple different places and record what networks it traverses and identify where things break when arp had issues with ntt and any2ix i didn't have much luck finding a path that didn't use those two in a short space of time manually we can pretty much do that cool. assuming you build your test schedule properly well routes can change over time i see path length interesting best way to visualise an outage like that would be our matrix - http://amp.wand.net.nz/matrix/absolute-latency/both/nzamp/nzamp/ oh it still hard to tell if it changes by a little the site is kind of slow from here dunno if it backend or what sorry for speed. i have a new major database improvement to push out shortly ahh ok lambda is broken? lambda is in the US :) how often does it update? everything else is in NZ oh red is < 300 msec and > 160 rather than > 300 msec if you look at relative latency rather than absolute, you'll get a better picture of changes our postgres database is a bit broken at the moment ~200 million rows of data in a very naive schema mada doesn't measure that well on that either I've got all that fixed, just need to convert all the data and push out ahh right so does this need lots of resources to run? actually the loss thing is prob fine nah i see vocus is the borken one hah vocus is really broken I'm not sure what they do to our AMP monitor machine but we see packet loss to everywhere rate limit icmp? forwarded ICMP? maybe as we see packet loss end to end on icmp? yeah or tcp/udp/etc?
this is my favourite graph type - http://amp.wand.net.nz/view/amp-traceroute-rainbow/9238/1393323647/1393496447 i saw a route going via cogent drop icmp completely once from a host doign monitoring shows up broken routing quite quickly it was very bizzare :/ trademe bounces between wgtn/akdl a lot note - trademe have the worst high availablity setup ever a lot? contsantly try every 10 minutes http://202.49.71.24:24/smokeping/smokeping.fcgi?target=Wider.trademe yes i.e their TTL on their A record 1 msec to akld 10 msec to wgtn http://amp.wand.net.nz/view/amp-icmp/2415/1393384891/1393476724 but it pretty consistent 3i think it similar behaviour really yeah, their DNS server just randomly selects from AKL or WLG whenever you query it then it gets cached for the TTL yeh i know it's mental they should just do anycast imo i'm amazed they can share state so well between their datacentres if they have a real outage on one of them, then people will reload page anyway they don't afaik how do they keep you logged in? it's basically just a proxy type setup because it goes to the same backend constantly just with a cookie i think not based on ip or anything yeah, but the cookie is a session id that the backend needs to know about hence they must do constant replication between each site to keep session IDs in sync yeh well it goes back to the db probably what do they need to keep state on anyway? 
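The DNS behaviour described above (the server picks AKL or WLG at random on each query, and the answer is then cached for the TTL) can be sketched as a toy simulation. This is only an illustration: the site labels, TTL values, and probe intervals are assumptions, not trademe's real settings.

```python
import random

SITES = ["AKL", "WLG"]  # hypothetical labels for the two datacentres


class CachingResolver:
    """Toy resolver: keeps one cached answer until its TTL expires."""

    def __init__(self, ttl_seconds, rng=None):
        self.ttl = ttl_seconds
        self.rng = rng or random.Random()
        self.cached_site = None
        self.expires_at = None

    def resolve(self, now):
        # Only "query the authoritative server" when the cache is empty or stale.
        if self.expires_at is None or now >= self.expires_at:
            self.cached_site = self.rng.choice(SITES)  # random pick per query
            self.expires_at = now + self.ttl
        return self.cached_site


def count_site_changes(ttl_seconds, probe_interval, duration, seed=0):
    """Count how often consecutive probes land on a different site."""
    resolver = CachingResolver(ttl_seconds, random.Random(seed))
    previous = None
    changes = 0
    for now in range(0, duration, probe_interval):
        site = resolver.resolve(now)
        if previous is not None and site != previous:
            changes += 1
        previous = site
    return changes
```

With a short TTL a monitor sees the target flip between sites at TTL boundaries, which would match the latency bouncing between roughly 1 msec and 10 msec in the graphs; with a TTL longer than the measurement window it never flips.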
that's why it's generally advisable to direct the same group of people to the same set of frontend/backend servers your session hmm because the cookie doesn't have your username/pass in it it has some secondary auth (like a session ID) to verify you've logged in maybe they replicate yeah they will i dunno really very very often i had a slow image loading issue on one of them once it was fine on one bad on the other i tried contacting them to no avail :/ but i don't see why they don't just cdn their images and direct all their normal web traffic to one location yeah did you watch any of the NZNOG videos btw? they don't even really need to cdn their traffic probably from janurary a couple the GCSB one? i watched fincham's one on rpki a+++++ for the GCSB one so funny and the apnic one i dunno there was terrible video/sound quality and most of the talks sounded pretty uninteresting :/ yeah the richard naylor couldn't be there this year (the dude who normally does the streaming) i liked the idea of 1.2.3.4 as standard anycast dns the GCSB one was awesome and i think the idea of rpki is slightly interesting watch the Q&A geoff houston's talk was awesome as usual too was chatting to him at the pub the night before i dunno bad audio quality is common for most talks oh god i watched some of the one by that guy about not keeping state i think that's hwen i stopped heh i already know about state issues etc haha, Roland? yes I feel asleep during that one dibbins or something? 
heh it was a good message but it didn't need to go on for what felt like 60 minutes well before hand i ntoiced he'd give the same exact talk before and then i noticed the slides were the same it was kind of amusing he is mr stateless firewall i'm in between you need state for NAT :/ i hate nat but snyhcronising is a pita and arp's route just changed did ntt go down again (i left mtr running) any2ix and ntt are down it looks like but it maybe up_the_irons doing ios things :/ cos it didn't generate alerts etc. so this monitoring thingy is it light on clients? netamp and does it require lots of ram on server? mercutio: yup we have plans to run on embedded devices cool it's just a little c program sweet and you can tack on rabbitmq if you want proper persistance (but requires 50mb of memory) is it ready for testing now? on the server or client? the server is closed source at the moment (because of funding requirements / commercialisation potential) but the client app is pretty open oh but is it available for other people to run? we try and run it on as much as we can but we have a few trials soon on a few NZ ISPs i suppose it prob not then to trial monitoring a large residential ISP core interesting mercutio: http://wand.net.nz/amp/ this is the link we gave out at NZNOG for large ISPs we're giving them 1U dell servers to run as monitors for smaller ISPs we're looking to give out embedded boxes to handle the monitoring and we should have all the performance issues fixed up by the end of march so it'll stick all of the isp's in a list? we have a 4 year MBIE grant to develop the software to the stage of being able to monitor NZ's internet rather than individuals being able to monitor different sites? 
we're working towards ISPs being able to pick and choose what the monitor tbh, nz internet seems better than overseas internet for domestic traffic yeah, it's really not too bad but i mean one user can't have 5 different sites that all monitor each other though we find some pretty broken stuff now and then other than silly people like you rate limiting to 10 megabit :/ you mean like routing via australia? So, the most recent funny one i suppose there's less infrastructure to go wrong in nz REANNZ <--> Callplus didn't work that should be able to go over ape their routers both said they should talk to each other over APE in both directions but on the APE layer 2, they couldn't talk what why not so it was a blackhole between them we logged into reannz's APE router bloody route servers ran ping [callplus ape ip] should directly connect :/ and it started working :) oh god yeah.... ubergroup had an issue like that before except it'd start working it hasn't fallen over again yet they're running vpls or something from there? probably because we run probes once every 30 seconds and make sure the ARP table never expires :P ubergroup would drop the first few pings to them always then start to work also, callplus had a few ipv6 related issues we got fixed up callplus have ipv6? yeah did you hear about the i217-lm issues? nah? 
apparently i217-lm have bugs and disabling ipv6 makes them stop breaking switches s/i217-lm/everything apparently everything have bugs and i217-lm are on standard hp and dell hosts these days apparently disabling power saving stuff might fix it http://www.edugeek.net/forums/hardware/132287-optiplex-9020-systems-spewing-ipv6-multicast-traffic-while-asleep-causing-havok.html stuff like that :( sigh there's a few things floating around we use a lot of dell at work haven't hit that one yet older ones probably i217-lm is on haswell ones afaik all our dells come with broadcom chips but I buy proper intel server NICs for all of them i217-v is integrated on consumer desktop boasrds with haswel onwards broadcom isn't terible it really is it's a lot better than it used to be for a desktop? depends on the broadcom chipset I guess my windows computer at home has onboard broadcom we have some benchmarks of consumer broadcom vs consumer intel NICs for iperf i can do 972 megabit/sec iirc intel requires 20% less CPU usage (less spurious soft interrupts) for a given iperf run it's something close to that if not that hmm mercutio: you probably didn't see: https://twitter.com/arpnetworks/status/438949834670612481 TWITTER: We will be upgrading the IOS on our s7.lax router between 02:00-02:30 PST; expect NTT and Any2 IX to go down (Thu Feb 27 08:12:50 +0000 2014) up_the_irons: it's only 12am here! :) ahh no i didn't up_the_irons but i remember you said something about ios yeah and doing osmething when the first issue came up happy to say s7.lax is now running a shiny new IOS is it going to crash less now? :) crash less or more, hard to say :) need more data points gizmoguy: 20% less cpu usage on iperf? i give it 50 / 50 chance. 
this will rule out IOS issues, but there is still a possibility of bad hardware 20% cpu usage on iperf for gigabit is pretty high that's what my core2duo does :/ mercutio: I believe it is because intel do a lot better with coalescing (http://www.intel.com/support/network/adapter/pro100/sb/CS-032546.htm) i actually ordered a new sup for s7 today gizmoguy: i dunno they both suck with udp traffic :/ if everything ends up being OK, then it'll just be a spare otherwise, it'll replace the current one but yeah it could be i went from 40% to 20% cpu usage from i7-3770 to i7-4770 for infiniband well ip over infiniband at over 10 gigabit yeah i was surprised at the difference it is quite amazing how much poor interrupt handling can screw you but it's the same card in both machines just better interrupt sharing? i dunno, the intel coalescing is better though yeh these infiniband careds are terrible for that i can generate 80,000 interrupts a sec iirc but at high interrupt loads different cpus/cache etc can mkae more diff we tend to tune coalescing / MSI-X / irqbalance on any machine we care about doing throughput tests on and tcp offload stuff is way better than udp offload stuff i tend to disable irqbalance what tuning do you do? yeah it's all about sending irqs to the right CPU cores my notes are at work i was trying to figure out if there was anyway i could reduce cpu for infiniband rather - the student I make tune all my machines is at work :) yeah single cpu here so it shouldn't matter afaik so all the pci-e lanes on the same cpu but yeah, i think i was going somewhere but i can't remember where gigabit isn't too complicated it's 10 gigabit+ where things get complicated anyway. i should sleep. got a meeting in 9 hours from now and I need my beauty sleep :) heh people who schedule meetings for 9am friday should be shot what's that embedded monitoring device btw? 
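On "sending irqs to the right CPU cores": on Linux the affinity of an IRQ is written as a hexadecimal CPU bitmask to /proc/irq/<n>/smp_affinity. A minimal sketch of building that mask follows. The function name is made up for illustration, and it assumes a machine with at most 32 CPUs (larger systems use comma-separated 32-bit groups, which this does not handle). It is not the tuning recipe anyone in the channel actually uses.

```python
def smp_affinity_mask(cpus):
    """Build the hex bitmask string for a set of CPU numbers.

    Hypothetical helper: bit N set means the IRQ may be delivered to CPU N.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


# CPUs 0 and 2 -> binary 101 -> hex "5"
print(smp_affinity_mask([0, 2]))
```

Note that a running irqbalance may rewrite these masks, which is one reason to disable it when pinning IRQs by hand.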
heh we're trialing a few i should stick app in :/ I have one of these on my desk - http://www.freescale.com/webapp/sps/site/taxonomy.jsp?code=IMX6X_SERIES but it'll be small and we've also ordered one of those new intel embedded thingys but it hasn't arrived yet ahh ok this is a better link for the freescale thing - http://www.element14.com/community/community/knode/single-board_computers/sabrelite i dunno why can't just do a virtual machine we're working on it just have a few timing related issues to solve first ahh ok time drift is a big problem for us and time drift occurs rather frequently in vm environments i see well if it's small it fine anyway so, we're just building a list of common "correct time settings" for all the popular VM environments to serve as a minimum set of requirements to host our VM image i've only seen clock drift on vmware and parallels oh that the windows one macosx :/ we had some people running it it was soo bad haha virtualbox! i'll let you in on a secret (lambda is already AMP running on a VM) ok because I don't care about accurate timing myself haha i guessed it would be but the funders do but unfortunately working for an academic instute, we get in trouble for not considering the 'science' aspect :P i see smokeping has way higher latency with ping to localhost on xen so we just have to do some timing verification to prove it's not bad then we can deploy it it jumps from like 10 usec to 50 usec yeah, generally if you use virtualised clock drives you're fine not that 50 usec is a long time but it's about as much as most ethernet coalescing is for like KVMs pvclock - https://rwmj.wordpress.com/2010/10/15/kvm-pvclock/ ahh ok xen in pv mode is probably fine too? i'll admit i've not looked at xen i'd rather real hardware with coalescing disabled most embedded hardware doesn't even support coalescing though http://wiki.xen.org/wiki/Xen_FAQ_DomU#How_can_i_synchronize_a_dom0_clock.3F maybe? ^ "I have problem with domU clock. 
It lose 30 minutes each day. How can i synchronize it with dom0 clock? " yeah, this is why we want to run our tests... maye it does have issues heh some VM software is terrible at keeping time i use ntp anyway yeah NTP mostly insulates you but i only care for second accuracy for logs mmm ok we're technically only worried about drift we don't care if we have an accurate time well i'll chat with you when you don't need to sleep i suppose :) just that 1 second == 1 second ahh so you need way to measure it some thernet cards can timestamp too yup PTP yeah, it's pretty cool Precision Time Protocol hmm i think you should use udp :/ lol with traffic volumes slingshot prioritise icmp i heard :/ because otherwise people complain about high pings lol i dunno if it true our AMP monitor is in slingshot/callplus's core but if anyone is doing that it'd be slingshot so we don't have too many issues also turns out when you tell people you're going to publish your results on a public webpage i think it's more that they rate limit some traffic and icmp and http aren't rate limited as much they give you the best possible connectiviy they can yeh sounds right but that isn't err typical 10 gigabit to the core! our machines only do 1gig :( telstraclears performance is shocking http://amp.wand.net.nz/view/amp-icmp/1779/1393326283/1393499083 but they actually advertise being fast at broadband or at least they were based on that silly testing thingy callplus probably have the best google connectivity from everyone we monitor (because we're two P nodes away from their google cache :)) what is that gmail youtube.com red line == ipv6 blue line == ipv4 oh umm you mean to their local cache? 
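To put the "1 second == 1 second" requirement in numbers: drift is the difference between how much time the local clock thinks has passed and how much a reference clock says has passed, relative to the reference interval. A small sketch, using the Xen FAQ figure of a domU losing 30 minutes a day as the worked example (the function name is just illustrative):

```python
def drift_ppm(reference_elapsed_s, local_elapsed_s):
    """Drift of a local clock relative to a reference, in parts per million.

    Negative values mean the local clock runs slow.
    """
    if reference_elapsed_s <= 0:
        raise ValueError("reference interval must be positive")
    return (local_elapsed_s - reference_elapsed_s) / reference_elapsed_s * 1e6


day = 24 * 60 * 60   # 86400 s on the reference clock
lost = 30 * 60       # the domU clock falls 1800 s behind over that day
print(drift_ppm(day, day - lost))  # roughly -20833 ppm, i.e. about 2.1% slow
```

For comparison, ordinary crystal oscillators are usually quoted in tens of ppm, so a clock losing 30 minutes a day is off by several hundred times that.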
yeah google are terrible at deploying ipv6 yeah their local cache www.gmail.com is better cos it still in sydney it's bloody difficult to measure these things so is www.google.com www.google.com isn't if you have questions about google yes it is www.google.co.nz is a cname to www.google.com and is in auckland you won't find www.google.com hosts in NZ usually yes you do err what IP do you see it hosted on? err you did it's the syedney one atm but it's been in nz before exactly nah, it won't I've had lengthy chats to google frontend engineers about this before nah it did for a while they can't serve www.google.com from a cache at the moment google.com and google.co.nz they can but now the www. varients s/now/not but not the www. varients weird well it was only temp i think the short answer is windows XP is the problem stuff shifts around a lot the longer answer is SSL is the problem at least vocus and orcon were hosting nz google mirrors last i knew for a while on their normal youtube caches the GGC (https://peering.google.com/about/ggc.html) nodes I know about in NZ: callplus / reannz / FX there are others as well, I just haven't seen them myself but they can't serve all of google from those cache nodes there is vocus, fx, orcon, snap, with ip's at least oh yes, sorry i've seen vocus and snap too i thnk telstraclear had one too and telecom didn't? 
but maybe telecom do now telecom's an interesting beast i assume vodafone probably do I have no visibility inside it really well they still have terrible email % host www.google.co.nz alien.xtra.co.nz Using domain server: Name: alien.xtra.co.nz Address: 202.27.184.3#53 Aliases: Host www.google.co.nz.meh.net.nz not found: 5(REFUSED) woot they fixed that at least :/ lol from a telecom residential connection telstraclears is still open actually that's strange google takes me straight to SYD www.google.co.nz has address 74.125.237.191 Host www.google.co.nz not found: 3(NXDOMAIN) with some stuff in between google sometimes bounces it to somewhere distant too alright sleepy time, catcha later night if you want an amp monitor just fill out the form oh and one of us can get back to you % host www.google.co.nz 8.8.8.8 Using domain server: Name: 8.8.8.8 Address: 8.8.8.8#53 Aliases: www.google.co.nz has address 60.234.81.176 for some reason 8.8.8.8 gives me orcon :/ try www.google.com same diff it's also a LOT of addresses not for google nah i did it i mean it showed the same thing oh right % host www.google.com 8.8.8.8 | wc -l 22 and try a curl --head http://www.google.com ? 
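One caveat on `host www.google.com 8.8.8.8 | wc -l`: when a server is given, the header lines shown above ("Using domain server:", "Name:", "Address:", "Aliases:") are counted as well, so the line count is a bit higher than the number of addresses. A rough sketch that counts only the "has address" lines, assuming the output has the shape pasted in the log:

```python
def addresses_from_host_output(text):
    """Return the IPv4 addresses found on '<name> has address <ip>' lines."""
    addresses = []
    for line in text.splitlines():
        parts = line.split()
        # Header lines and blank lines do not match this four-word shape.
        if len(parts) == 4 and parts[1:3] == ["has", "address"]:
            addresses.append(parts[3])
    return addresses


# Sample built from the output quoted in the log.
sample = """Using domain server:
Name: 8.8.8.8
Address: 8.8.8.8#53
Aliases:

www.google.co.nz has address 60.234.81.176
"""
print(addresses_from_host_output(sample))
```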
can't transparently proxied cause it might just 302 you to a different server ah, right i suppose i could but another time i'd have to bounce somewhere google can be pretty messy :) yes and email comes from ages away yes so do 8.8.8.8 dns requests actually it you dump on the server unfortunately NZ is too small for them to properly fix our connectivity yeh i understand which is suprising considering I have two friends who work for the google frontend/traffic team and both are from NZ if they just fixed sydney it'd be ok searching is faster int he US than sydney they just sigh and grunt everytime I report issues if you really want to speed up browsing route to google via US err searching lol it's about 50% faster or more right I assume SYD datacentre doesn't actually have much searching cached yeah so requests go to SYD, then SYD forwards to the US and it goes to somewhre in asia that prob has less cached than in the US or yeah, asia yeh either or it's faster to go to US from NZ the internet is too big :( than from AU and i think it goes via asia not the US but yeah monitoring shit is good i'd like to see perforamnce measuring of google too :/ of real queries time curl -v 'http://www.google.co.nz/search?q=nzehrald&oq=nzehrald&aqs=chrome..69i57l2j69i59l2.2448j0j7&sourceid=chrome&espv=2&es_sm=122&ie=UTF-8#nfpr=1&q=nzherald' > /dev/null like that or such just evacuated two more wasps out of my server room haha "evacuated"? That conjures a specific mental image yeah, threw them in a dvd-r spindle cover and threw them in the parking garage ohlawd they're lethargic because it's cold in there hahaha so they move verrrrrrrry slow Nice! 
so i just took 'em out one at a time one had flown away after a few mins in the sun i'm just happy the guy who pulled this gear from outside put it in the server room and not our closet because it's 40-something in the server room, but it's 70 in the closet sweet, good guy My new 2560x1440 27" display is gorgeous, and just freaking enormous :D my 27" is kind of small twss Okay! twss! 'my 27" is kind of small' well text i stiny is what kind of monitor is it bryce? like ips or pls or what IPS iirc http://www.monoprice.com/Product?c_id=114&cp_id=11401&cs_id=1130703&p_id=10509&seq=1&format=2 yeh ips The thing is exceptional. Great picture. The whole backside is solid metal. etc i have one ips one pls whoa, monoprice sells monitors? funny lol m0unds yep Bought this one through Massdrop for $345 too funny m0unds: They're known for very good S. Korean monitors, with an American warranty. (and shipping source, etc) i've seen all the cat-named ones mentioned all over the place (hardforum, overclockers.net etc) but those are ebay purchases shipped from sk EDID vendor "KJT", prod id 30917 brycec: i just have cheap korean ones but yeah once i had one i wanted two :) so i have one for windows one for linux i wonder what my edid is my EDID is ACB neither mean anyhting to me m0unds: you should get one don't really need one, haha one issue is they're dual-link dvi, which means most video cards can only support one they're also 16:9 so it not much bigger than 24" 16:10 people really struggle to read my monitors :/ 283x77 console size if full screen terminal brycec: so you'd recommend that monitor? i sometimes desire a new monitor capable of more than 1920x1080 yet both my laptops can't drive more than that, so.. boo... need newer gear my laptop can do 1920x1200 i think get a desktop? 
i don't think many laptops can drive high res screens, mac ones can but you need an expensive adawpter well it less expensive than mac screens macbooks can via thunderbolt port and the thunderbolt -> displayport cables are like $10 USD or less that gets you up to like 2560x1600 i think ugh, this wind sucks up_the_irons: So far, yes. I'm still getting used to it, adjusting to having so much space. But the screen quality definitely seems to be top-notch. brycec: cool m0unds: you need dual dvi rather than displayport for the cheap screens though which need to be active which are more like $80 USD that's displayport to dual-link dvi, so maybe you need both on newer macs ah. My smokeping is up to 6 slaves. This is a PITA to disseminate updates. heh brycec: automation needed