just closed chrome http://i.imgur.com/MuzOtkW.png
Ram usage went from like 22 GB to 14 GBGB
heh
mine uses more than that
my unbound instances use like 2gb of ram fwiw
s/GBGB/GB
Ram usage went from like 22 GB to 14 GB
but you can do tuning etc.
i wonder how much ram most resolvers use
mercutio: have you worked with asterisk at all?
mercutio: One data point - a node in a cluster of unbound boxes here, acting as customer-facing resolvers and answering about 1500 queries a second, has 2G of RAM and uses about 1G of it as cache
plett: and not hitting cache limits?
partially got blocked dns records taking up lots of ram too
mercutio: Nope, not hitting limits at all. There is unallocated RAM on the machine which unbound could expand into if it needed more space for its cache pool (I think. It's been a while since I configured unbound on those machines)
plett: defaults don't set much cache pool
unbound-control stats_noreset has the various things it's using
mem.cache.rrset=1330235008
mem.cache.message=872415584
rrset-cache-size: 200m
msg-cache-size: 100m
yeah
num-threads: 4
you're running those numbers really low
I think those are the pertinent config options
Not with 4 threads I'm not
it's still kind of low
you shouldn't need 4 threads too
it's hard to bench though, because how do you decide how much caching less-accessed sites helps
The comment in the config file says I followed http://unbound.net/documentation/howto_optimise.html
do you enable prefetch btw?
yeah the rrset to message cache size ratio is about right
Yes, we have prefetch. That made a big difference to perceived customer performance
but bumping it up can get a higher hit rate
that said, not restarting the server makes a bigger difference
restarting dns resolvers tends to tank performance for a little bit
Yes, it would :)
and yeah prefetch is damn awesome.
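The options quoted above all live in the `server:` clause of unbound.conf; a minimal sketch using the values mentioned in the conversation (these are that site's settings, not general recommendations):

```
# unbound.conf - cache/thread settings discussed above
server:
    num-threads: 4
    # convention is to keep the rrset cache about twice the message cache
    rrset-cache-size: 200m
    msg-cache-size: 100m
    # refresh popular cache entries shortly before they expire
    prefetch: yes
```

Bumping the two cache sizes up trades RAM for a higher hit rate, which is the tuning trade-off being debated above.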
We might be able to tweak those numbers slightly, but I'm leaving it well alone while it's working :)
i wonder how many requests/sec arp does
heh
you can probably make it work on 32mb
i used to have problems with bind, like 15 years ago?, where it would use up more and more memory over time
back when ram wasn't as plentiful
Having more threads does help when you're doing dnssec validation. We found that with a single thread it was getting bogged down with cpu operations.
ahh
yeah i thought 2 threads would be ok for 1500/sec
so did dnssec raise cpu usage for you quite a lot?
there's a cacti plugin for monitoring unbound which is kind of cool
Yes, I'm looking at the Cacti graphs as we speak :)
ahh :)
And yes, doing dnssec validation gave a noticeable jump in cpu usage. I don't have exact figures for it any more though
And cpu load has slowly increased over time, presumably as more zones enable dnssec signing. Nothing drastic though; this one box appears to use 15-20% cpu as a VM on an i3
But it's hard to get good stats on cpu usage because it's virtualised. So long as it's not at 100% then I'm happy :)
yeah
The i3 it's running on is starting to look a little bit underpowered. It was bought as the lowest-power server we could find, to run local services like dns resolvers and radius servers in the same rack as the comms equipment. But we've gradually found more 'essential' services which need to be right next to where the customers connect, and have put more load on it and its identical brother
removing virtualisation could fix it probably. but yeah i3s kind of suck
e3-1230 is probably fine
you can probably upgrade the same host with a replacement cpu
although 2gb of ram?
oh that's 2gb virtual
Machine has 32G I think, it's only a small box
i3s can only address 32gb
it's kind of cool that there's a trend to move more services close to users
We did it because we wanted internet connectivity to keep working if our internal backhaul links went down.
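The hit rate being discussed can be read straight out of `unbound-control stats_noreset`, which the conversation already references; a small sketch of computing it (the counter names `total.num.queries` and `total.num.cachehits` are unbound's standard ones, but the sample values here are invented for illustration):

```python
# Compute the cache hit rate from `unbound-control stats_noreset` output.
# Counter names are unbound's standard ones; the sample values are made up.

def hit_rate(stats_text: str) -> float:
    """Parse key=value stats output and return the cache hit ratio."""
    stats = {}
    for line in stats_text.splitlines():
        key, _, value = line.partition("=")
        if value:
            stats[key.strip()] = float(value)
    return stats["total.num.cachehits"] / stats["total.num.queries"]

sample = """\
total.num.queries=1500000
total.num.cachehits=1275000
total.num.cachemiss=225000
mem.cache.rrset=1330235008
mem.cache.message=872415584"""

print(f"cache hit rate: {hit_rate(sample):.1%}")  # cache hit rate: 85.0%
```

In practice you would feed it `subprocess.check_output(["unbound-control", "stats_noreset"], text=True)` instead of the canned sample.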
So each pop has its own local radius server and customer database, and things like dns resolvers are local too.
that's pretty nifty
self-contained
do you use anycast?
Yes. We run exabgp on the dns resolver nodes to inject the customer-facing IP addresses into the network, with a test script to check that it can talk to the local resolver and withdraw the route if the node is down
sweet
that sounds great :)
There are two customer-facing IPs, and 6 nodes (a pair of nodes in three locations). In normal running, each location routes queries to the local resolvers, but we can (and have tested) run everything with only a single location's servers running
harra
http://pastebin.com/RZLbeWfM
is that sufficient to say ubuntu uses systemd
don't care for thereg, but here: http://www.theregister.co.uk/2015/07/31/incident_managers_pagerduty_pwned/ in case anyone hadn't seen it & uses pager duty
Title: "PagerDuty hacked ... and finally comes clean 21 days later. Cheers • The Register"
Ouch. If anyone finds a technical description of the exploit I'd be very interested in it
i doubt you will
most companies don't share that stuff
Especially the ones that don't come clean about it happening in the first place.
theregister is useful so much more often than i want to admit to
they at least seem to include something quickly.
they use php?
"The attacker gained unauthorised access to an administrative panel provided by one of our hosting providers. At the request of law enforcement, we are not able to provide additional information."
i think that is the key thing to take away from this hack
if you host with a provider with an insecure control panel then everything could be compromised from there.
you're only as strong as your weakest link.
which is one of the reasons that i never trust any provider who uses solusvm
lololol solusvm
Very well known for being insecure. And having a dumb IPv6 deployment methodology (assigning a /128)
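The anycast setup described above can be sketched as an exabgp health-check process: exabgp's process API reads `announce route` / `withdraw route` lines from a child process's stdout. This is only an illustration of the idea, not the script plett runs; the service route and probe name are placeholders, and the DNS probe is hand-built from the stdlib:

```python
import socket
import struct
import sys
import time

SERVICE_ROUTE = "192.0.2.53/32"   # placeholder anycast address, not from the log
RESOLVER = ("127.0.0.1", 53)      # local unbound instance to probe


def build_query(name: str, qid: int = 0x1234) -> bytes:
    """Build a minimal DNS A query by hand (header + question section)."""
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)  # RD set, 1 question
    labels = b"".join(bytes([len(l)]) + l.encode() for l in name.split("."))
    return header + labels + b"\x00" + struct.pack(">HH", 1, 1)  # QTYPE=A, IN


def resolver_alive(name: str = "example.com") -> bool:
    """True if the local resolver answers our query with a matching ID."""
    query = build_query(name)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(2.0)
            s.sendto(query, RESOLVER)
            reply, _ = s.recvfrom(512)
        # same transaction ID and QR (response) bit set
        return reply[:2] == query[:2] and reply[2] & 0x80 != 0
    except OSError:
        return False


def watch_loop() -> None:
    """Run under exabgp: announce while healthy, withdraw when not."""
    announced = False
    while True:
        healthy = resolver_alive()
        if healthy and not announced:
            sys.stdout.write(f"announce route {SERVICE_ROUTE} next-hop self\n")
            announced = True
        elif not healthy and announced:
            sys.stdout.write(f"withdraw route {SERVICE_ROUTE} next-hop self\n")
            announced = False
        sys.stdout.flush()
        time.sleep(5)

# Under exabgp, configure this script as a `process` and call watch_loop().
```

When the node dies (or unbound stops answering), the route is withdrawn and the network reconverges onto the surviving locations' resolvers, which is exactly the failover behaviour described above.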