mhoran: If I wasn't interested in doing anything 4k, would the USB-C dongle suffice?
mnathani: do you know which generation of X1 you have?
brycec: I've been reading the Ceph thread... I second mercutio's comments. I also dig that you dig the expandability of Ceph; I really like it as well. Right now we're not really adding new Ceph nodes, but replacing disks with bigger disks as they fail or as we need more space. With over 200 disks in our cluster, failures are rather common. There's an IG pic of many failed disks somewhere lol
brycec: you mentioned something like "but if I'm no longer at the company, I don't want them to be SOL"; this is something big to consider... Ceph really needs someone "manning the ship"; if there isn't such a person, an off-the-shelf NAS would be better. I should mention we do Ceph consulting ;)
a lot of things in a company really need someone who can deal with them when the original person leaves; it's one of the biggest IT issues, and probably one of the reasons outsourcing is popular with some. as far as becoming dependent on something like Ceph goes, it can be outsourced if you leave, since it's reasonably generic. it's not nearly as bad as writing uncommented code...
brycec: re: "Is there such a thing as "resellable" Ceph / Ceph as-a-service?" -- you're referring to "multi-tenant Ceph", and this has been discussed a lot on the mailing lists and such; I haven't kept up with it, though. But google that phrase and you can probably find your answer. I've thought of offering Ceph-as-a-Service too ;)
i suspect that one day, not very far away, people will offer remote file systems for normal company operations, now that fibre can get down to 1 msec latency
re: brycec | (In my defense, I've glossed over the CRUSH stuff as being a level of configuration I don't need to know/care about yet)
the CRUSH map is critical to Ceph and you'll need to understand it if you run Ceph ;) But, by default, the CRUSH map should do what "you would think." Like, if you have 5 OSD hosts and a replication factor of 3, it's not going to put all 3 copies on the same host. That would just be stupid. But what you can do, like mercutio said, is define your own "failure domains". If all your hosts are in 1 rack, the topic of failure domains is moot, because you basically only have one. But if you have your Ceph cluster in multiple cabinets, or multiple cages, or multiple suites, or ... you get it ... then you can define a CRUSH map that lets one cabinet go down, or even one cage, and you're still good. Of course, the cost of doing this increases tremendously.
ceph was kind of designed for stupidly large configs, right, where running out of DC space is a real concern and where budgets are sporadic/big rather than gradual
they've actually just added some downsizing support, but before that, if you enabled too many "PGs" it placed data in too many locations and you couldn't reduce it. that has been one of the major gotchas, though there aren't too many. the other common issue, like i said before, is people undersizing the network
yeah it takes a bit to set up a test bed, but you may get more of a feel for things...
up_the_irons: yeah the dongles are fine for non-4k. Even 4k at 30Hz. It's the additional bandwidth of 4k at 60Hz that is the problem.
Thanks up_the_irons, I'll give the consultation some serious thought. Would be prudent to have someone more experienced guide me and check my work.
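A rough Python sketch of the failure-domain idea discussed above. This is not the real CRUSH algorithm (CRUSH is a deterministic pseudo-random mapping over a weighted hierarchy), and the host/rack names are made up for illustration; it only shows why replicas never share a host or rack when the rule says not to.

```python
# Illustrative only: pick replica locations so that no two copies share
# the same failure domain (host or rack). Mirrors the intent of a CRUSH
# rule's `step chooseleaf firstn 0 type <domain>`, not its mechanics.
# Topology below is a made-up example.

import random

# Cluster topology: osd -> (host, rack)
TOPOLOGY = {
    "osd.0": ("host-a", "rack-1"),
    "osd.1": ("host-a", "rack-1"),
    "osd.2": ("host-b", "rack-1"),
    "osd.3": ("host-c", "rack-2"),
    "osd.4": ("host-d", "rack-2"),
    "osd.5": ("host-e", "rack-3"),
}

def place_replicas(size: int, failure_domain: str) -> list[str]:
    """Pick `size` OSDs such that no two share the same failure domain."""
    idx = 0 if failure_domain == "host" else 1
    chosen, used_domains = [], set()
    for osd in random.sample(list(TOPOLOGY), len(TOPOLOGY)):
        domain = TOPOLOGY[osd][idx]
        if domain not in used_domains:
            chosen.append(osd)
            used_domains.add(domain)
        if len(chosen) == size:
            return chosen
    raise RuntimeError("not enough distinct failure domains for that size")

print(place_replicas(3, "host"))  # 3 copies, each on a different host
print(place_replicas(3, "rack"))  # 3 copies, each in a different rack
```

With the failure domain set to "rack", losing a whole cabinet costs you at most one copy of each object, which is the trade-off (and extra hardware cost) described above.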
Would really hate to make decisions/mistakes early on that are next-to-impossible to recover from (I'm already hamstrung by several of those from previous so-called admins here, such as RAID5 for VM disks :'( Just been gradually shifting things around, working towards undoing that... Down to just 1 host like that at least.)
I will say that Ceph's feature set is *enormous*; it can do _so many things_ it's almost overwhelming. I'm starting with RBD and CephFS (naturally) but there are things I'd love to play with... like object storage (a la S3), that could be fun.
I set up a little test cluster yesterday with 4 hosts and learned the hard way that OSDs really care about disk latency (they were all VMs under another VM - hey, it's what I could manage). But hey, it gave me a little bit of practice :) And 60 whole raw GB! (because 4x 15GB OSDs) (felt silly having a whopping 60GB, but hey, it's a test)
up_the_irons: You might consider grabbing http://www.aprnetworks.com and redirecting to ARP (if it's available). I've made that typo too many times...
up_the_irons: https://www.instagram.com/p/BwMFaf_A4Uu/ is the photo with the failed drives. That seems like a lot (16), albeit not *too* horrible given you run 200, and I don't know how long those drives have been in service (but I'd be curious to know more about ARP's hardware failure rates)
Instagram: "Spinning disks suck 😑\n(These are all failed drives)\n#datacenter #datacentre #datacenters #server #servers #cloudcomp #serverroom #serverrack #sysadmin #tech #technology #techgeek #wiring #cat5 #cabling #cables #wires #behindthescenes #networking #networkengineer #cableporn #datacenterporn #switches #arpnetworks #hp" by arpnetworks
...I just noticed their failure dates are written on them, so that seems less bad
I'm also curious at what point ARP "fails" a disk - a slow drive response? an approaching-limit SMART value? a SMART health check? the drive just no longer identifying?
I recall there was a visually-impaired chap that hung out in here, so I'm mentioning this thread, hoping they're still hanging out here: https://marc.info/?l=openbsd-misc&m=155749719520339&w=2
I'm tempted to reply to that despite my refusal to engage with misc@. My eyesight is only getting worse, but the three times I casually thought "I'd like to set up screen readers on OpenBSD now so I know how to do it when I actually need it" were met with complete failure. OK, I'll reply.
mhoran: ah OK
brycec: yeah I also wanted to play with the S3-like storage
brycec: that's cool, 4x 15GB OSDs haha
nice tip on the typo ;)
brycec: that box is 2 layers deep, so 32 bad disks ;)
brycec: Ceph will usually send an error about a scrub failing. I keep records of the OSDs that fail scrubs, and after a few times, I investigate them all. If I see any having problems, we pull them
yeah we have a couple blind customers; serial console support has been critical (for obvious reasons)
yesterday's topic: is the T480s battery not removable? (the EMEA spec says "Integrated Li-ion 57 Wh battery"... so what does "integrated" mean?) and what's better... Li-ion or Li-Po? X1s seem to use LiPo. and... is 2560x1440 on a 14" screen too high a res?
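A minimal Python sketch of the "keep records of OSDs that fail scrubs, investigate the repeat offenders" routine described above. The input format and the threshold are assumptions for illustration, not ARP's actual tooling; in practice the records would come from `ceph health detail` / cluster log output.

```python
# Tally scrub errors per OSD and flag repeat offenders for inspection.
# Input format (one "osd.N" entry per scrub error) and FAIL_THRESHOLD
# are assumptions, not the workflow's actual implementation.

from collections import Counter

FAIL_THRESHOLD = 3  # investigate an OSD after this many scrub errors

def repeat_offenders(scrub_error_osds: list[str]) -> list[str]:
    """Return OSD ids whose scrub-error count has reached the threshold."""
    counts = Counter(scrub_error_osds)
    return [osd for osd, n in counts.items() if n >= FAIL_THRESHOLD]

# Example: accumulated scrub-error records over a few weeks
history = ["osd.12", "osd.7", "osd.12", "osd.33", "osd.12", "osd.7"]
print(repeat_offenders(history))  # ['osd.12'] -> pull/inspect this disk
```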
I would think fonts could be quite small if you actually want to use all that space
their regular 1920x1080 screen is cheaper (and more common) and still a higher res than my 1600x900 on my T520 (wow, this machine is old)
battery is removable but "not customer replaceable" (according to Lenovo)
https://www.ifixit.com/Teardown/Lenovo+ThinkPad+T480s+Unboxing+&+Quick+Teardown/113690
https://download.lenovo.com/pccbbs/mobiles_pdf/t480s-hmm_en.pdf
Ah OK
doesn't seem that hard - pages 67 & 74 of the maintenance manual
my tablet is 2560x1600 @ 10"
having higher resolution doesn't normally hurt in linux/android. in windows it gets messed up sometimes. you can basically set your dpi huge, and just get better fonts
mercutio: ah OK
ime, it doesn't really cause too much trouble on win10 (i have a 4k display on my laptop); it depends on the app, it's legacy ones that have issues
some apps are bad even on 1440p on a desktop
sure, but that's the fault of the app not supporting modern displays. i had a single app that looked funky, and i just told windows to scale it instead of letting the app decide, and that took care of it til the dev updated it
up_the_irons: Mine is X1 Carbon 5th Gen - Kabylake (Type 20HR, 20HQ) Laptop (ThinkPad) - Type 20HR
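A quick back-of-the-envelope in Python for the "is 2560x1440 on a 14" screen too high?" question, comparing the pixel densities of the screens mentioned above (panel sizes assumed: 14" T480s, 15.6" T520, 10" tablet):

```python
# Pixel-density comparison for the displays discussed above.
from math import hypot

def ppi(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixels per inch along the diagonal."""
    return hypot(width_px, height_px) / diagonal_in

for w, h, d, label in [
    (2560, 1440, 14.0, "T480s WQHD option"),
    (1920, 1080, 14.0, "T480s FHD option"),
    (1600,  900, 15.6, "T520"),
    (2560, 1600, 10.0, "tablet"),
]:
    print(f"{label}: {ppi(w, h, d):.0f} PPI")

# ~210 vs ~157 vs ~118 vs ~302 PPI -- at ~210 PPI you either scale the UI
# (and get crisper fonts) or run tiny fonts to actually use the extra space.
```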