Older blog entries for etbe (starting at number 968)

BTRFS and ZFS as Layering Violations

LWN has an interesting article comparing recent developments in the Linux world to the “Unix Wars” that essentially killed every proprietary Unix system [1]. I recommend reading it. It’s probably only available to subscribers at the moment but should be generally available in a week or so (I used my Debian access sponsored by HP to read it).

A comment on that article cites my previous post about the reliability of RAID [2] and then goes on to disagree with my conclusion that using the filesystem for everything is the right thing to do.

The Benefits of Layers

I don’t believe as strongly in the BTRFS/ZFS design as the commentator probably thinks. The way my servers (and a huge number of other Linux systems) currently work is a good thing: RAID forms a reliable array from a set of cheap disks, for reliability and often also for capacity or performance. I have storage on top of the RAID array and can fix the RAID without bothering about the filesystem(s), and I have done so in the past. I can also test the RAID array without involving any filesystem specific code. Then I have LVM running on top of the RAID array in exactly the same way that it runs on top of a single hard drive or SSD in the case of a laptop or netbook. So Linux on a laptop is much the same as Linux on a server in terms of storage once we get past the issue of whether a single disk or a RAID array is used for the LVM PV. Among other things this means that the same code paths are used and I’m less likely to encounter a bug when I install a new system.

LVM provides multiple LVs which can be used for filesystems, swap, or anything else that uses storage. So if a filesystem gets badly corrupted I can umount it, create an LVM snapshot, and then take appropriate measures to try and fix it – without interfering with other filesystems.
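As a rough sketch of that recovery sequence, the Python below only builds the command lines and does not run anything. The volume group, LV, and mount point names are hypothetical, the snapshot size is an arbitrary assumption, and the read-only e2fsck pass assumes an Ext filesystem.

```python
from typing import List


def snapshot_then_check_commands(
    vg: str, lv: str, mount_point: str, snap_size: str = "5G"
) -> List[List[str]]:
    """Build (but do not run) the commands for the recovery steps above.

    All names passed in are illustrative; nothing here touches real devices.
    """
    origin = f"/dev/{vg}/{lv}"
    snap_name = f"{lv}-prerepair"  # hypothetical naming convention
    return [
        # 1. Take the damaged filesystem offline; other LVs stay in use.
        ["umount", mount_point],
        # 2. Preserve the current state so a failed repair attempt can be undone.
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap_name, origin],
        # 3. Start with a read-only check (-f forces it, -n answers "no" to all).
        ["e2fsck", "-f", "-n", origin],
    ]


if __name__ == "__main__":
    for argv in snapshot_then_check_commands("vg0", "home", "/home"):
        print(" ".join(argv))
```

Only the one LV is named in any of the commands, which is the point of the layering: the RAID array underneath and the other filesystems are not involved.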

When using layered storage I can easily add or change layers when it’s appropriate. For example I have encryption on only some LVs on my laptop and netbook systems (there is no point encrypting the filesystem used for .iso files of Linux distributions) and on some servers I use RAID-0 for cached data.

When using a filesystem like BTRFS or ZFS which includes subvolumes (similar in result to LVM in some cases) and internal RAID you can’t separate the layers. So if something gets corrupted then you have to deal with all the complexity of BTRFS or ZFS instead of just fixing the one layer that has a problem.

Update: One thing I forgot to mention when I first published this is the benefits of layering for some uncommon cases such as network devices. I can run an Ext4 filesystem over a RAID-1 array which has one device on NBD on another system. That’s a bit unusual but it is apparently working well for some people. The internal RAID on ZFS and BTRFS doesn’t support such things and using software RAID underneath ZFS or BTRFS loses some features.

When using DRBD you might have two servers with local RAID arrays, DRBD on top of that, and then an Ext4 filesystem. As any form of RAID other than internal RAID loses reliability features for ZFS and BTRFS, it seems that no matter how you implement those filesystems with DRBD you will lose somehow. It seems that neither BTRFS nor ZFS supports a disconnected RAID mode (like a Linux software RAID with a bitmap so it can resync only the parts that changed while a device was disconnected) so it’s not possible to use BTRFS or ZFS RAID-1 with an NBD device.
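The idea behind a bitmap resync can be sketched in a few lines of Python. This is a toy model with made-up names, not how the Linux md driver is implemented (a real bitmap lives on disk and is updated before the data is written). It only shows why a reconnect after a short outage needs to copy a few chunks and not the whole device.

```python
class BitmapMirror:
    """Toy RAID-1 pair with a write-intent bitmap (one flag per chunk)."""

    def __init__(self, chunks: int) -> None:
        self.local = [b""] * chunks
        self.remote = [b""] * chunks   # e.g. the half of the mirror on NBD
        self.remote_online = True
        self.dirty = set()             # chunks written while the remote was away

    def write(self, chunk: int, data: bytes) -> None:
        self.local[chunk] = data
        if self.remote_online:
            self.remote[chunk] = data
        else:
            self.dirty.add(chunk)      # remember what needs resyncing later

    def reconnect(self) -> int:
        """Bring the remote back and copy only dirty chunks; return the count."""
        self.remote_online = True
        copied = 0
        for chunk in sorted(self.dirty):
            self.remote[chunk] = self.local[chunk]
            copied += 1
        self.dirty.clear()
        return copied
```

With 1000 chunks and three writes to two distinct chunks during an outage, reconnect() copies two chunks. Without the bitmap the whole device would need to be copied, which is impractical over a slow network link.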

The only viable way of combining ZFS data integrity features with DRBD replication seems to be using a zvol for DRBD and then running Ext4 on top of that.

The Benefits of Integration

When RAID and the filesystem are separate things (with some added abstraction from LVM) it’s difficult to optimise the filesystem for RAID performance at the best of times and impossible in many cases. When the filesystem manages RAID it can optimise its operation to match the details of the RAID layout. I believe that in some situations ZFS will use mirroring instead of RAID-Z for small writes to reduce the load and that ZFS will combine writes into a single RAID-Z stripe (or set of contiguous RAID-Z stripes) to improve write performance.
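To illustrate why combining writes into whole stripes matters, here is a small arithmetic sketch using the textbook model of a conventional single-parity array (RAID-5 style). It is not a description of ZFS internals, and the function names are invented for the example.

```python
def raid5_small_write_ios() -> int:
    """Updating one block in place: read old data and old parity, write both."""
    reads, writes = 2, 2
    return reads + writes


def raid5_full_stripe_ios(disks: int) -> int:
    """Full-stripe write: parity is computed from the new data, so no reads."""
    return disks  # (disks - 1) data blocks plus 1 parity block


def ios_per_data_block(disks: int, full_stripe: bool) -> float:
    """Device I/O operations needed per block of user data written."""
    if full_stripe:
        return raid5_full_stripe_ios(disks) / (disks - 1)
    return float(raid5_small_write_ios())
```

For a five disk array this model gives 4 I/Os per data block for isolated small writes and 1.25 for full-stripe writes. A filesystem that knows the stripe layout can arrange for the second case far more often than one that only sees an opaque block device.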

It would be possible to have a RAID driver that includes checksums for all blocks. It could then read from another device when a checksum fails and give some of the reliability features that ZFS and BTRFS offer. Then to provide all the reliability benefits of ZFS you would at least need a filesystem that stores multiple copies of the data which would of course need checksums (because the filesystem could be used on a less reliable block device) and therefore you would end up with two checksums on the same data. Note that if you want to have a RAID array with checksums on all blocks then ZFS has a volume management feature (which is well described by Mark Round) [3]. Such a zvol could be used for a block device in a virtual machine and in an ideal world it would be possible to use one as swap space. But the zvol is apparently managed with all the regular ZFS mechanisms so it’s not a direct list of blocks on disk and thus can’t be extracted if there is a problem with ZFS.
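A minimal sketch of what such a checksumming RAID driver would do is below. It is an in-memory toy with hypothetical names, using SHA-256 purely for convenience. A real driver would have to store the checksums on disk separately from the data and handle far more failure modes.

```python
import hashlib
from typing import List


class ChecksummedMirror:
    """Toy mirror that verifies every read and repairs bad copies."""

    def __init__(self, blocks: int, copies: int = 2) -> None:
        self.devices: List[List[bytes]] = [[b""] * blocks for _ in range(copies)]
        self.sums: List[bytes] = [hashlib.sha256(b"").digest()] * blocks

    def write(self, block: int, data: bytes) -> None:
        self.sums[block] = hashlib.sha256(data).digest()
        for dev in self.devices:
            dev[block] = data

    def read(self, block: int) -> bytes:
        expected = self.sums[block]
        for dev in self.devices:
            data = dev[block]
            if hashlib.sha256(data).digest() == expected:
                # Found a good copy: overwrite any copy that differs from it.
                for other in self.devices:
                    if other[block] != data:
                        other[block] = data
                return data
        raise IOError(f"block {block}: no copy matches its checksum")
```

Plain RAID-1 can only tell that two copies differ, not which one is right. With a checksum the driver can pick the good copy and repair the bad one during a normal read.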

Snapshots are an essential feature by today’s standards. The ability to create lots of snapshots with low overhead is a significant feature of filesystems like BTRFS and ZFS. Now it is possible to run BTRFS or ZFS on top of a volume manager like LVM which does snapshots to cover the case of the filesystem getting corrupted. But again that would end up with two sets of overhead.

The way that ZFS supports snapshots which inherit encryption keys is also interesting.

Conclusion

It’s technically possible to implement some of the ZFS features as separate layers, such as a software RAID implementation that puts checksums on all blocks. But it appears that there isn’t much interest in developing such things. So while people would use it (and people are using ZFS ZVols as block devices for other filesystems as described in a comment on Mark Round’s blog) it’s probably not going to be implemented.

Therefore we have a choice of all the complexity and features of BTRFS or ZFS, or the current RAID+LVM+Ext4 option. While the complexity of BTRFS and ZFS is a concern for me (particularly as BTRFS is new and ZFS is really complex and not well supported on Linux) it seems that there is no other option for certain types of large storage at the moment.

ZFS on Linux isn’t a great option for me, but for some of my clients it seems to be the only option. ZFS on Solaris would be a better option in some ways, but that’s not possible when you have important Linux software that needs fast access to the storage.

Related posts:

  1. Starting with BTRFS Based on my investigation of RAID reliability [1] I have...
  2. ZFS vs BTRFS on Cheap Dell Servers I previously wrote about my first experiences with BTRFS [1]....
  3. Reliability of RAID ZDNet has an insightful article by Robin Harris predicting the...

Syndicated 2012-04-27 08:40:05 from etbe - Russell Coker

Links April 2012

Karen Tse gave an interesting TED talk about how to stop police torture as an investigative tool [1]. Mostly it’s about training and empowering public defenders.

Phil Plait gave an interesting TED talk about how to defend the Earth from asteroids [2].

Julian Baggini wrote an interesting article for the Financial Times about the persecution of Atheists in the US [3].

Charlie Todd of Improv Everywhere gave an amusing TED talk about Improv events that he has run [4]. He is most famous for organising people to wear blue shirts and khaki pants in Best Buy, but he’s done lots of other funny things.

Paul Zak gave an interesting TED talk about trust, morality, and oxytocin [5]. One of the many interesting facts that he shared is that oxytocin levels can significantly increase when using social networking sites. So people who use Facebook etc are likely to be more trustworthy as well as more trusting.

The Occupy the Judge Rotenberg Center movement aims to stop the torture of Autistic children in the US [6], Anonymous is involved in that too.

Paul Lewis gave an insightful TED talk about the use of crowdsourced data in news reporting [7]. A lot of the analysis of “citizen journalism” is based on comparing bloggers with full-time paid journalists, but Paul describes how professional full-time journalistic work can be greatly assisted by random people filming and photographing things that seem noteworthy. Make sure your next phone has the best possible camera – phone cameras will never be great but the quality of the camera you have with you is what matters.

Sam Harris published an interesting interview with Tim Prowse who is a Baptist minister who faked belief for two years after becoming an atheist [8A]. He also references The Clergy Project – a support group for atheists who are current or former members of the clergy [8B].

Cracked has an insightful article about 6 things that rich people need to stop saying [9]. How do the 1% not understand these things?

Barack Obama and Nichelle Nichols (who played Lt Uhura in the original Star Trek) give the Vulcan Salute in the White House [10].

Gabriel Arana wrote an insightful article about his experiences with the ex-gay movement [11]. The “therapist” who hurt him so much is still doing the same to other victims.

S#!T Ignorant People Say To Autistics is an interesting youtube video about ignorant and annoying people [12]. Strangely I’ve received little of that myself, I wonder whether women on the Autism Spectrum get a lot more of that than men.

At GManCaseFile an ex-FBI agent has written an informative post about how the TSA is failing [13].

The Nieder Family has an interesting article about how patents are being used to prevent the creation of assisted communication (AAC) devices for children [14]. Apparently the company that has the patents wants all AAC devices to be really expensive and profitable for them. This is yet another example of patents doing harm not good.

Renew Economy has an informative article by Giles Parkinson about the effect that solar power generation will have on power prices [15]. In short, as solar systems produce power when it’s most needed (during the day and at the hottest time of the day for warm climates) they will dramatically reduce the auction price for wholesale power. That will hurt the business of the power companies and also allow lower prices on the retail market.

Related posts:

  1. Links March 2012 Washington’s Blog has an informative summary of recent articles about...
  2. Links February 2012 Sociological Images has an interesting article about the attempts to...
  3. Links April 2010 Sam Harris gave an interesting TED talk about whether there...

Syndicated 2012-04-23 14:59:31 from etbe - Russell Coker

Neighborhood Watch

While writing my previous post I heard a huge noise at the front of my house. I found one man being restrained in a seated position on the ground at my front door, the man who was holding him down was accusing him of theft and asking me to call the police, and a woman was hanging around and crying.

When calling the police I discovered that Optus (the Telco that provides the virtual service which Virgin Mobile uses) doesn’t accept 112 as an emergency number! This combined with the fact that CyanogenMod 7 on my phone doesn’t accept 000 as an emergency number meant that I had to unlock my phone before calling the police. Unlocking your phone late at night when there’s a situation that needs police attention isn’t as easy as you would hope. As an aside there are usually no penalties for testing the emergency service on your phone, people who install PABX systems and other significant telephony devices test emergency services calls as a matter of routine, so testing emergency calls from your phone is a really good idea. If anyone knows how to configure CyanogenMod 7 to support 000 as an emergency call then please let me know!

Anyway the man who was held down claimed that a friend of his had given him a bag containing tools that he had lugged from some place not particularly near my house. The man who was holding him down said that he witnessed the other man stealing the tools from his neighbor – not far from my house. The woman was apparently the girlfriend of the man who was accused of burglary.

The end result was that the police arrested the man who was accused of burglary and his girlfriend. He didn’t have any obvious injuries and the police said that the man who detained him did them a favor, so it seems unlikely that there will be any assault charges filed. Presumably the man who detained the burglar is explaining it all at the police station now, I hope the police gave him a chance to put on pants and shoes first.

The man who made the burglary accusation said that his house was robbed last night which is why he was more observant than usual tonight.

This makes me glad of my policy of rejecting every job offer which involves moving to the US. In Australia hand guns are really hard to get so there’s no way that a house burglary will involve a gun and there’s also no way that someone who wants to help the police will have a gun. So while it was unpleasant to have this happen at my front door it didn’t involve any risk to me. It could have ended up with someone other than me getting a beating but the probability of serious injury or death for them was quite low. As everyone knew that no-one had a gun and no-one wanted to be charged with assault it made sense for everyone to avoid excessive force. From what I saw no excessive force was used.

The police arrived fairly quickly and EVERYONE was glad to see them. All up it took a bit more than 30 minutes from the first noise to the police departing after arresting both suspects and filling out a bunch of paperwork. I was impressed by that!

Related posts:

  1. CyanogenMod and the Galaxy S Thanks to some advice from Philipp Kern I have now...

Syndicated 2012-04-22 16:00:28 from etbe - Russell Coker

Autism as an Excuse

A Polish geek going by the handle of mmemuar has recently written a blog post claiming that people use Autism as an excuse for bad behavior [1]. He gets enough things wrong in one short post to make it worth debunking. It seems that Google’s translation of Polish isn’t as good as its translation of some other languages, but unless Google mistranslated about a dozen sections such that they had the exact opposite meaning, mmemuar’s post has a lot of wrong ideas.

Does Anyone use Autism as an Excuse?

I’ve read a lot of blog posts written by people on the Autism Spectrum, read many forum discussions, and talked to more than a few people in person. So far I haven’t encountered any evidence of people using an Autism Spectrum Disorder (ASD) diagnosis as an excuse. There are probably about 70,000,000 people who meet the diagnostic criteria (most of whom have not been diagnosed due to not having access to anyone who is qualified to do an assessment). The number of people who have been diagnosed is large enough that I couldn’t claim that none of them have ever used it as an excuse. The number of self-diagnosed people is also large enough that there have to be some people who wouldn’t get diagnosed if professionally assessed. But I don’t see any evidence that using an ASD as an excuse is at all common.

I imagine that some people would take someone merely mentioning the fact that they have an ASD diagnosis in a public place (EG a mailing list or a blog post) as some sort of an excuse. One problem with such an interpretation is that for every way in which people on the Autism Spectrum annoy other people it’s the ones who aren’t diagnosed (or who reject a diagnosis) that will do it the most. Being diagnosed with an ASD is correlated with annoying other people less. Another problem is that keeping quiet about such things when they get raised for discussion so often takes a psychological toll.

When someone is diagnosed as an adult a fairly common reaction is to study Psychology and Sociology (usually through web sites such as Sociological Images – which I highly recommend [2]) and try to get a better understanding of other people. Any time you assume that everyone else thinks like you, you will get things wrong. When someone gets an ASD diagnosis they will probably take more care to avoid making such mistakes.

Is Autism Obvious?

Mmemuar says “if you were full of autistic, which is very easy to overheat the brain from excess signals at the input, it would be very obvious to all“. There are some people who are utterly incapable of acting like an NT. But the majority of people on the Autism Spectrum have some ability to pass as NT, it just takes a lot of effort. So if someone is spending all their effort to walk, talk, and make eye contact in a way that most people expect then they will have little spare effort for other things. This can result in them having little patience for other people. The solution to this is to not require people to look average.

Even apart from Autism there are people who fidget, don’t make eye contact in the way you expect, and do other things a little differently. Being tolerant of such things won’t hurt you and will generally make things easier for everyone.

TheAnMish gave a good Youtube presentation about the way that she has to act “normal” [3]. Note that while her video represents her own personal experiences (which differ slightly from those of other Aspies – particularly male Aspies) they are regarded as representative enough for her video to be shown by Tony Attwood (a world renowned expert on Asperger Syndrome) at a conference about women on the Autism Spectrum. As an aside I disagree with her use of the word “normal” without scare-quotes.

Learning about Psychology through Sci-Fi

At the end of the blog post and in some of the comments there is discussion about learning to understand people through reading sci-fi books. The first problem with this is that fiction books generally have a range of characters that is determined by the author’s understanding of people. If you read multiple books by an author then you will usually notice the same character types. The second problem is that characters in fiction books are simplified to fit into a reasonable sized book. In real life people have lots of really boring reasons for doing what they do, while in fiction only stuff that is interesting ends up in print.

But the biggest problem is that fiction books just aren’t a good way of learning about people. Whatever lessons might be in a fiction book will probably be missed by a reader who is concentrating on the plot. You can learn things about people by reading an analysis of literature by a Psychologist or a Sociologist, but apart from that you probably won’t be able to learn things if you don’t already know them.

As an aside, my experience of reading sci-fi books suggests that some of the popular sci-fi authors have such a poor understanding of people that it impairs their ability to write believable fiction about human characters. If I was going to try and learn about people by reading fiction I’d choose something that’s been popular for a long time. If a book has been popular for more than 100 years and sold well in different times, cultures, and languages then it probably has something to say about the human condition.

Catching up on Youth

Mmemuar says “absolutely nothing prevents you to the age of twenty he began to catch up on some ‘of youth’“. Actually one significant thing is that the human brain develops in particular stages, the older you get the more difficult it becomes to learn things. So even if it was just a matter of learning things someone who was behind at age 20 would have some significant difficulty in catching up. But some things just can’t be learned, for example someone who has extreme discomfort in making eye contact can’t just learn to be happy with it.

Also there are some things which people would have learned if it was possible. For example people who have Prosopagnosia (an inability to recognise faces) suffer extreme bullying in school, if they could just learn to recognise people then they surely would do so, so if they complete school without learning then it’s probably going to be impossible for them. Prosopagnosia is one of many conditions which can contribute to social difficulties and therefore contribute to an ASD diagnosis.

Can Aspies get Married?

Some people think that people on the Autism Spectrum can’t get married. In fact this belief is so widely held that some people who seem very obviously Autistic are convinced that they are NT simply because they are married! The fact that there are more than a few books offering advice to people who have married someone who is on the Autism Spectrum is clear proof that such claims are bogus.

The Relevance to Geek Communities

People who meet the diagnostic criteria for Asperger Syndrome will almost certainly be Geeks due to the “Restricted, repetitive patterns of behavior, interests, or activities” section of the diagnostic criteria (the proposed revision for the DSM-V is the best reference I know for this [4]) as the modern use of the term Geek applies to anyone who has an extreme interest in something. The more Geeky a community is the greater the incidence of people who could be diagnosed with an ASD.

Spreading ideas such as those of mmemuar will lead to people not being assessed for an ASD. I would have been assessed earlier if it wasn’t for hearing an influential member of the Linux community say some things which were similar in concept (although not as deliberate). Having people not be assessed is bad for the individuals in question and bad for the community.

Related posts:

  1. Autism Awareness and the Free Software Community It’s Autism Awareness Month April is Autism Awareness month, there...
  2. Autism vs Asperger Syndrome Diagnostic Changes for Autism Spectrum Disorders Currently Asperger Syndrome (AS)...
  3. Communication Shutdown and Autism The AEIOU Foundation The AEIOU Foundation [1] is a support...

Syndicated 2012-04-22 15:03:46 from etbe - Russell Coker

The Most Important things for running a Reliable Internet Service

One of my clients is currently investigating new hosting arrangements. It’s a bit of a complex process because there are lots of architectural issues relating to things such as the storage and backup of some terabytes of data and some serious computation on the data. Among other options we are considering cheap servers in the EX range from Hetzner [1] which provide 3TB of RAID-1 storage per server along with reasonable CPU power and RAM and Amazon EC2 [2]. Hetzner and Amazon aren’t the only companies providing services that can be used to solve my client’s problems, but they both provide good value for what they provide and we have prior experience with them.

To add an extra complication my client did some web research on hosting companies and found that Hetzner wasn’t even in the list of reliable hosting companies (whichever list that was). This is in some ways not particularly surprising: Hetzner offers servers without a full management interface (you can’t see a serial console or a KVM, you merely get access to reset it) and the best value servers (the only servers to consider for many terabytes of data) have SATA disks which presumably have a lower MTBF than SAS disks.

But I don’t think that this is a real problem. Even when hardware that’s designed for the desktop is run in a server room the reliability tends to be reasonable. My experience is that a desktop PC with two hard drives in a RAID-1 array will give a level of reliability in practice that compares very well to an expensive server with ECC RAM, redundant fans, redundant PSUs, etc.

My experience is that the most critical factor for server reliability is management. A server that is designed to be reliable can give very poor uptime if poorly maintained or if there is no rapid way of discovering and fixing problems. But a system that is designed to be cheap can give quite good uptime if it is well maintained and problems can be rapidly discovered and fixed.

A Brief Overview of Managing Servers

There are text books about how to manage servers, so obviously I can’t cover the topic in detail in a blog post. But here are some quick points. Note that I’m not claiming that this list includes everything, please add comments about anything particularly noteworthy that you think I’ve missed.

  1. For a server to be well managed it needs to be kept up to date. It’s probably a good idea for management to have this on the list of things to do. A plan to check for necessary updates and apply them at fixed times (at least once a week) would be a good thing. My experience is that usually managers don’t have anything to do with this and sysadmins either apply patches or not at their own whim.
  2. It is really ideal for people to know how all the software works. For every piece of software that’s running it should either have come from a source that provides some degree of support (EG a Linux distribution) or be maintained by someone who knows it well. When you install custom software from people who become unavailable then it puts the reliability of the entire system at risk – if anything breaks then you won’t be able to get it fixed quickly.
  3. It should be possible to rapidly discover problems. Having a client phone you to tell you that your web site is offline is a bad thing. Ideally you will have software like Nagios monitoring the network and reporting problems via an SMS gateway service such as ClickaTell.com. I am not sure that Nagios is the best network monitoring system or that ClickaTell is the best SMS gateway, but they have both worked well in my experience. If you think that there are better options for either of those then please write a comment.
  4. It should be possible to rapidly fix problems. That means that a sysadmin must be available 24*7 to respond to SMS and you must have a backup sysadmin for when the main person takes a holiday, or ideally two backup sysadmins so that if one is on holiday and another has an emergency then problems can still be fixed. Another thing to consider is that an increasing number of hotels, resorts, and cruise ships are providing net access. So you could decrease your need for backup sysadmins if you give a holiday bonus to a sysadmin who uses a hotel, resort, or cruise ship that has good net access. ;)
  5. If it seems likely that there may be some staff changes then it’s a really good idea to hire a potential replacement on a casual basis so that they can learn how things work. There have been a few occasions when I started a sysadmin contract after the old sysadmin ceased being on speaking terms with the company owner. This made it difficult for me to learn what’s going on.
  6. If your network is in any way complex (IE it’s something that needs some skill to manage) then it will probably be impossible to hire someone who has experience in all the areas of technology at a salary you are prepared to pay. So you should assume that whoever you hire will do some learning on the job. This isn’t necessarily a problem but is something that needs to be considered. If you use some unusual hardware or software and want it to run reliably then you should have a spare system for testing so that the types of mistake which are typically made in the learning process are not made on your production network.
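
A minimal sketch of the kind of check that point 3 describes: a script that probes a web site and reports the result using the usual Nagios plugin exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL). The function names, the thresholds, and the example URL are assumptions made for illustration; sending the alert by SMS is left to the monitoring system and is not shown.

```python
import sys
import time
import urllib.error
import urllib.request

# Exit codes follow the usual Nagios plugin convention.
OK, WARNING, CRITICAL = 0, 1, 2

def classify(http_status, seconds, warn_after=2.0, crit_after=10.0):
    """Map one probe result to an (exit_code, message) pair.

    http_status is None when no HTTP response was received at all.
    The thresholds are illustrative defaults, not recommendations.
    """
    if http_status is None:
        return CRITICAL, "CRITICAL: no response"
    if http_status >= 400:
        return CRITICAL, f"CRITICAL: HTTP {http_status}"
    if seconds >= crit_after:
        return CRITICAL, f"CRITICAL: {seconds:.1f}s response"
    if seconds >= warn_after:
        return WARNING, f"WARNING: {seconds:.1f}s response"
    return OK, f"OK: HTTP {http_status} in {seconds:.1f}s"

def probe(url, timeout=15.0):
    """Fetch url once and return (http_status or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code          # the server answered, but with an error
    except (urllib.error.URLError, OSError):
        status = None              # DNS failure, refused connection, timeout
    return status, time.monotonic() - start

# Hypothetical usage: python check_site.py http://www.example.com/
if __name__ == "__main__" and len(sys.argv) > 1:
    code, message = classify(*probe(sys.argv[1]))
    print(message)
    sys.exit(code)
```

Keeping the classification separate from the network probe means the alerting rules can be tested on the spare system mentioned in point 6 without touching the production network.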

Conclusion

If you have a business which depends on running servers on the Internet and you don’t do all the things in the above list then the reliability of a service like Hetzner probably isn’t going to be an issue at all.

Syndicated 2012-04-17 12:55:11 from etbe - Russell Coker

ZFS vs BTRFS on Cheap Dell Servers

I previously wrote about my first experiences with BTRFS [1]. Since then I’ve been using BTRFS on more systems and have had good results. The main problem I want to address is with the reliability of RAID [2].

Requirements for a File Server

Now one of my clients has a need for a new fileserver. They need to reliably store terabytes of data (currently 6TB and growing) which is mostly comprised of data files in the 10MB – 15MB size range. The data files will almost never be re-written and I anticipate that the main bottleneck will be the latency of NFS and other network file sharing protocols. I would hope that saturating a GigE network when sending 10MB data files from SATA disks via NFS, AFS, or SMB wouldn’t be a technical challenge.

It seems that BTRFS is the way of the future. But it’s still rather new and the lack of RAID-5 and RAID-6 is a serious issue when you need to store 10TB with today’s technology (that would be 8*3TB disks for RAID-10 vs 5*3TB disks for RAID-5). Also the case of two disks entirely failing in a short period of time requires RAID-6 (or RAID-Z2 as the ZFS variant of RAID-6 is known). With BTRFS at its current stage of development it seems that to recover from two disks failing you need to have BTRFS on another RAID-6 (maybe Linux software RAID-6). But for filesystems based on concepts similar to ZFS and BTRFS you want to have the filesystem run the RAID so that if a block has a filesystem hash mismatch then the correct copy can be reconstructed from parity.
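
A toy sketch of why the filesystem wants to run the RAID itself: with a checksum per block the reader knows which block in a stripe is wrong, so it can rebuild that block from the others plus parity. This is only an illustration of the idea, assuming single XOR parity (RAID-5 style) and CRC32 as a stand-in for the filesystem hash; it is not the on-disk format or algorithm of ZFS or BTRFS, and all function names are made up.

```python
import zlib
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def write_stripe(data_blocks):
    """Return (parity, checksums) for a stripe of equal-length data blocks."""
    parity = xor_blocks(data_blocks)
    checksums = [zlib.crc32(block) for block in data_blocks]
    return parity, checksums

def read_stripe(stored_blocks, parity, checksums):
    """Return the data blocks, repairing at most one that fails its checksum."""
    bad = [i for i, block in enumerate(stored_blocks)
           if zlib.crc32(block) != checksums[i]]
    if not bad:
        return list(stored_blocks)
    if len(bad) > 1:
        raise IOError("single parity cannot repair more than one bad block")
    index = bad[0]
    # XOR of the good data blocks and the parity block gives the missing one.
    others = [b for i, b in enumerate(stored_blocks) if i != index]
    repaired = xor_blocks(others + [parity])
    if zlib.crc32(repaired) != checksums[index]:
        raise IOError("reconstruction failed; parity may be damaged too")
    result = list(stored_blocks)
    result[index] = repaired
    return result

blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity, sums = write_stripe(blocks)
damaged = [b"AAAA", b"BxBB", b"CCCC"]   # silent corruption of one block
print(read_stripe(damaged, parity, sums) == blocks)
```

A RAID layer underneath the filesystem has the parity but not the checksums, so when the disks return data without reporting an error it has no way of knowing which block to rebuild.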

ZFS seems to be a lot more complex than BTRFS. While having more features is a good thing (BTRFS seems to be missing some sysadmin friendly features at this stage) complexity means that I need to learn more and test more before going live.

But it seems that the built in RAID-5 and RAID-6 is the killer issue. Servers start becoming a lot more expensive if you want more than 8 disks and even going past 6 disks is a significant price point. As 3TB disks are available an 8 disk RAID-6 gives something like 18TB usable space vs 12TB on a RAID-10 and a 6 disk RAID-6 gives about 12TB vs 9TB on a RAID-10. With RAID-10 (IE BTRFS) my client couldn’t use a 6 disk server such as the Dell PowerEdge T410 for $1500 as 9TB of usable storage isn’t adequate and the Dell PowerEdge T610 which can support 8 disks and costs $2100 would be barely adequate for the near future with only 12TB of usable storage. Dell does sell significantly larger servers such that any of my client’s needs could be covered by RAID-10, but in addition to costing more there are issues of power use and noise. When comparing a T610 and a T410 with a full set of disks the price difference is $1000 (assuming $200 per disk) which is probably worth paying to delay any future need for upgrades.
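
The capacity and price figures above can be checked with a few lines of arithmetic. This is a rough sketch that ignores filesystem overhead and the difference between TB and TiB; the function names are made up for illustration and the prices are the ones quoted in the text.

```python
def usable_tb(level, disks, disk_tb):
    """Approximate usable capacity in TB, ignoring filesystem overhead."""
    if level == "raid10":
        return disks * disk_tb / 2      # every block is stored twice
    if level == "raid6":
        return (disks - 2) * disk_tb    # two disks' worth of parity
    raise ValueError(f"unknown RAID level: {level}")

def system_price(chassis, disks, price_per_disk):
    """Chassis price plus a full set of disks."""
    return chassis + disks * price_per_disk

for disks in (6, 8):
    r6 = usable_tb("raid6", disks, 3)
    r10 = usable_tb("raid10", disks, 3)
    print(f"{disks} x 3TB: RAID-6 {r6:g}TB, RAID-10 {r10:g}TB")

t410 = system_price(1500, 6, 200)   # PowerEdge T410 with 6 disks
t610 = system_price(2100, 8, 200)   # PowerEdge T610 with 8 disks
print(f"T410 ${t410}, T610 ${t610}, difference ${t610 - t410}")
```

With 3TB disks this gives 12TB vs 9TB for 6 disks and 18TB vs 12TB for 8 disks, and a difference of $1000 between the two fully populated servers.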

Buying Disks

The problem with the PowerEdge T610 server is that it uses hot-swap disks and the biggest disks available are 2TB for $586.30! 2TB*8 in RAID-6 gives 12TB of usable space for $4690.40! This compares poorly to the PowerEdge T410 which supports non-hot-swap disks so I can buy 6*3TB disks for something less than $200 each and get 12TB of usable space for $1200. If I could get hot-swap trays for Dell disks at a reasonable price then the T610 would be worth considering. But as 12TB of storage should do for at least the next 18 months it seems that the T410 is clearly the better option.
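
To make the disk price comparison concrete, here is a small sketch of the cost per usable TB for the two options, both in RAID-6. The function name is made up for illustration, and $200 is used as an upper bound for the 3TB disks since the text only says they cost something less than that.

```python
def cost_per_usable_tb(disks, disk_tb, price_per_disk, parity_disks=2):
    """Return (usable TB, total disk cost, cost per usable TB) for RAID-6."""
    usable = (disks - parity_disks) * disk_tb
    total = disks * price_per_disk
    return usable, total, total / usable

options = {
    "T610, 8 x 2TB hot-swap": (8, 2, 586.30),
    "T410, 6 x 3TB non-hot-swap": (6, 3, 200.00),
}
for name, (disks, disk_tb, price) in options.items():
    usable, total, per_tb = cost_per_usable_tb(disks, disk_tb, price)
    print(f"{name}: {usable}TB usable, ${total:.2f} total, ${per_tb:.2f}/TB")
```

Both options give 12TB usable, but the hot-swap disks work out to almost four times the price per TB.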

Does anyone know how to get cheap disk trays for Dell servers?

Implementation

In mailing list discussions some people suggest using Solaris or FreeBSD for a ZFS server. ZFS was designed for and implemented on Solaris, and FreeBSD was the first port. However Solaris and FreeBSD aren’t commonly used systems so it’s harder to find skilled people to work with them and there is less of a guarantee that the desired software will work. Among other things it’s really convenient to be able to run software for embedded Linux i386 systems on the server.

The first port of ZFS to Linux was based on FUSE [3]. This allows a clean separation of ZFS code from the Linux kernel code to avoid license issues but does have some performance problems. I don’t think that I will have any performance issues on this server as the data files are reasonably large, are received via an ADSL link, and require quite a bit of CPU time to process when they are accessed. But ZFS-FUSE doesn’t seem to be particularly popular.

The ZFS On Linux project provides source for a ZFS kernel module which you can compile and load [4]. As the module isn’t distributed with or statically linked to the kernel the license conflict of the CDDL ZFS code and the GPL Linux kernel code is apparently solved. I’ve read some positive reports from people who use this so it will be my preferred option.

Syndicated 2012-04-17 07:14:07 from etbe - Russell Coker

Guest/Link Post Spam

I’ve been getting a lot of spam recently from people wanting to write guest posts or have their site included in a future links post.

Guest Posts

For guest posts the social convention for the planets which aggregate my blog seems to be that random guest posts are unacceptable. I could change my blog feed to have some posts excluded from the planet feeds but that’s too much effort – and I don’t want random guest posts anyway.

The only situations in which I will accept guest posts are when someone is writing what I might have written (IE they generally agree with me) or if they are a member of the free software community who doesn’t have their own blog but has something relevant to say. Any applicant for a guest post who runs a business that is useless and/or evil IMHO (EG anything related to the TSA) is going to get rejected firmly. Any applicant who tells me that they can “write on a wide variety of topics” probably isn’t capable of making me an offer that I would accept. Tell me that you can write about Linux programming or computer security and I’ll be interested, but you have to provide links to your previous work.

An application to write a guest post that starts with something like “I liked your recent post at the above URL but I think you missed some important points, as I have some experience in that field I think I could write a guest post that would help educate your readers” may be accepted. Probably the best thing to do however is to write comments on my blog, if you can write informative comments and offer to make a longer comment into a guest post then there’s a good chance that I’ll be interested.

Generally though I will only offer the opportunity to write a guest post to someone who writes something really informative in private email or on a closed mailing list. If you can solve a technical problem about Linux that has me stumped then I will almost certainly be willing to accept a guest post about it!

Links Posts

There are many sites which consist of nothing but links to other sites. In the 90s such sites were really useful but since the rise of Google their value has declined dramatically. As an aside when I ran an Internet Cafe in the 90s I had every web browser start with the main page of my “Hot Stuff” list, the purpose of this was to increase the hit rate of my Squid cache by having most of the customers visit the same sites. Even back then I probably wouldn’t have bothered if it wasn’t for Squid hits.

It must seem like an easy way to make money to create pages of links to articles with short summaries and then hope for advertising revenue, a domain sale, or the launch of some sort of profitable business once people start using the site. I wonder whether that ever works out for people. It doesn’t seem like a good business model, doing something that requires little skill, which can be done by almost anyone, and which is done by many people. Maybe there are sweat-shops dedicated to this.

Anyway my links posts are unlikely to ever link to any such links pages, I don’t think that the people who read my blog want double indirection. I won’t entirely rule out linking to a links page, but it would have to be of very high quality and related to something very technical about computers. Basically anyone who reads this should give up on the idea of submitting a links page to me, if it’s good enough to make me break my policy of not linking to such things then I’ll probably find it myself.

Conclusion

As a general rule if you want someone to publish your work then you need to look at what they are publishing and make sure that your work fits. With the recent requests for links and guest posts I’ve been getting I have to wonder whether the people making the requests have even read my blog. That method of operating is unlikely to give any success at a blog that has any reasonable number of readers.

I’ll probably link to this from my about page or something. It might discourage some of the spammers.

Syndicated 2012-04-09 15:52:21 from etbe - Russell Coker

Flash Storage Update

Last month I wrote about using USB flash storage devices for my firewall and Squid proxy [1]. 4 days ago it failed: the USB device used for the root filesystem stopped accepting write requests. The USB device which is used for /var/spool/squid is still going well after almost 5 months of intensive use while the USB device for the root filesystem failed after 24 days of light use. Both USB devices were of the same model and were obtained at the same time. Presumably one of them was just defective.

I’m now using an old 1GB USB device for the root filesystem. When using it on less ancient systems with USB 2.0 there was an obvious speed difference between the 1GB and 4GB devices. But when run on USB 1.1 they can both support the maximum speed of the port so performance probably isn’t any worse. Not that it really matters for the root filesystem: the server is supposed to run without a break for long periods of time so if boot time becomes a performance issue then whatever is causing the reboots will be a much bigger problem.

It’s annoying to have a device fail and the failure rate for USB flash devices running 24*7 is looking rather bad at the moment. But I’m confident that things will run well from now on.

Syndicated 2012-04-08 09:48:27 from etbe - Russell Coker

The Security Benefits of Automation

Some Random WTFs

The Daily WTF is an educational and amusing site that recounts anecdotes about failed computer projects. One of their stories titled “Remotely Incompetent” concerns someone who breaks networking on a server and is then granted administrative access to someone else’s server by the Data Center staff [1]!

In one of the discussions about that I saw people make various claims about Data Center security, such as claiming that having their own locked room helps. My experience indicates that such things don’t do much good, I have often been granted access to server rooms without appropriate checks.

My experience is that security guards on site generally don’t directly do any good. I once had a guard hold a door for me when I was removing a server from a DC without even bothering to ask for ID! On another occasion in the Netherlands I had a security guard who didn’t speak English unlock the wrong server room for me, I used hand gestures to inform him that I needed access to the room with the big computers and he gave me the access I needed! It seems that the benefit of security guards is solely based on scaring people who don’t have the confidence needed to bluff their way in. Preventing children from thieving is a good thing.

On another occasion I showed ID and signed in for access to a DC owned by my employer and I used my security key to go through a locked door with a sign that promised many bad consequences if I failed to lock the door behind me. Then I discovered that the back door was wide open for the benefit of some electricians who were working in the building. Presumably the electricians who had no security training were expected to act as ad-hoc security guards if someone tried to enter through the back door – presumably they would not have been good at it.

When a company uses part of their own office for a server room then many of these problems disappear. But a common issue in such ad-hoc DCs is the lack of planning and procedures, I have lost count of the number of times I’ve seen doors (and even windows) propped open to allow ventilation because there were too many servers for the air-conditioning to cope. The most ironic example of this is the company that had a walk-in safe (think of a small bank vault with concrete walls and thick solid steel door) used for storing servers but with its door propped open to allow cooling. The advantage of a serious hosting company is that they will have procedures for cooling etc and will be very unlikely to do strange and silly things.

Having a locked room in a DC makes some sense, but if security guards have the master keys and are allowed to use them then it might not do much good. The one time I locked my keys in such a room I had a guard let me in without verifying my ID or the claim that there were actually keys locked in the room. Presumably anyone could just claim to have forgotten their keys and get the door unlocked – just like a cheap hotel.

Locking a rack sounds like a good idea, but the racks I’ve seen have had locks which are quite easy to pick. On the one occasion when I had to pick a lock on a rack (due to keys being too difficult to manage for the relevant people) the security guards didn’t investigate, so either the security cameras were not supervised or they just didn’t care about people picking locks in a shared server room. Also if you allow people to do things freely in a shared server room they could install devices to monitor network traffic.

A locked cage in a server room should work well. In the one case where I worked for a company that used such a cage I found it to mostly work well – apart from the few weeks when the lock was broken.

One company that I worked for had scales before the door between a server room and the car-park to prevent people from stealing heavy servers. Of course that wouldn’t stop people stealing hard drives full of data which is worth more than the servers! Also an over-weight colleague had to have the scales disabled for him (as they were based on absolute mass not unexpected changes in an individual’s mass) which presumably means that any skinny employee could steal a 2RU server and still be below the mass threshold.

How to Solve some of these Problems

Computers are subject to all manner of security problems. But they tend not to do arbitrary things for no apparent reason and they will never give in to someone who is charming, attractive, or aggressive – unlike humans.

I have servers running on Hetzner, Linode, and the Rackspace Cloud. I am always concerned about possible security compromises. But I am not worried about someone climbing in a window of a server room or convincing a security guard to let them in through the door. All three of those hosting companies have the vast majority of interactions automated. I can change many aspects of the servers without involving ANY human interaction. Out of the three of those companies I have had some human interaction with Hetzner (who provide managed servers) when a hard drive needed to be replaced – obviously replacing a disk in the wrong server would have been a significant system integrity issue even though everyone would be running RAID-1 and if Hetzner improperly disposed of the broken disk then there could be security issues – but this is an unlikely mistake in the face of a rare occurrence. With Linode and the Rackspace Cloud (and the previous Slicehost hosting that was purchased by Rackspace) the most common interactions I have with employees of those companies are when my clients don’t pay their bills on time – and that’s an administrative not a technical issue. When I do have to contact the support people about a technical issue it’s usually something that’s not immediately connected to the virtual server (EG a loss of routing to the DC).

It seems most likely that there are a fairly small number of people who are allowed in the DCs for companies like Hetzner, Linode, and Rackspace. Those people would probably be recognised by the security guards and their work would be restricted to replacing failing hardware and not involve granting access requests. There are some unusual requests that they can process (EG one of my clients recently transferred a virtual server between business units) but even in those cases the administrative software controls who gets access. This is much better than just handing hardware access to what seems to be the correct physical server to a client.

If you have software running a few computers and operating correctly then you can probably scale it up to run thousands of computers and have it still work correctly. But if you have a team of people controlling access requests and want to scale it up significantly then there are huge problems in hiring skilled people and training them correctly. There is a real risk of security flaws in such administrative software, if someone managed to exploit the automated management system for one of those three companies then they could probably gain access to the private data of any of their customers. But the risk of this seems a lot less than the risk of general incompetence among humans who perform routine and boring tasks which have the potential for great errors.

Syndicated 2012-03-30 12:42:29 from etbe - Russell Coker
