Background fsck

Wed Jan 21 21:37:27 PST 2004


Chris Pressey wrote:

> On Wed, 21 Jan 2004 15:20:10 -0500
> Gary Thorpe <gathorpe79 at xxxxxxxxx> wrote:
> 
> 
>>There was a thread on tech-kern at xxxxxxxxxx regarding relocation of bad
>>sectors and caching and the informal observations were that some IDE 
>>drives LIE about this when the cache is enabled. That's such a
>>wonderful improvement don't you think?
>>
> 
> Right up there with fake-parity memory, I'd have to say.
> 
> 
>>In fact, without hardware being able to reliably and truthfully inform
>>the OS of what is happening, no filesystem can guarantee anything.
>>
> 
> Which is kind of what I was trying to get at.  If drive manufacturers
> want to go this route and be taken seriously, they're going to have to
> start adding stuff that's normally in the filesystem to the drive
> itself.
> 
> (More likely they don't fear not being taken seriously, I grant you.)
> 
> 
>>But ATA gets high sequential transfers so I guess that's all that
>>matters.
>>
> 
> Not to me or you, obviously, but most people apparently have a high
> tolerance for crap.

I have never used a PC without an ATA disk: they are cheap, ubiquitous 
and reliable enough for home use. However, even a cheap disk should 
operate properly and not lie about what it is doing. If it told the OS 
that cache flushing was ineffective for this particular drive, that 
would be better than pretending it actually did something. A drive that 
pretends just seems like an incomplete product. Not all IDE disks are 
that crappy, so it looks like plain sloppy work by the company.

Of course, I have never heard of modern SCSI disks doing any of this 
regardless.

> 
> 
>>Existing designs may be MORE complex and harder to maintain than
>>softdeps: do you want that?
>>
> 
> Hell no.  That's like the last thing I want.
> 
> But if someone else wants to use (say) EXT3 or ReiserFS with DragonFly,
> I wouldn't want to stop them.  Especially considering these are already
> maintained by other parties.

I think ext3 would be easiest (ext2 is already available across the 
BSDs), but not technically the best. I think ReiserFS wants to grow 
into a database or some unified name space so..... What I would be 
interested to know is whether implementing some of them under a BSD 
license might peeve the original owners?

> 
> 
>>If it is necessary, design one from scratch using principles the
>>others explored/built on. It's not impossible.
>>
> 
> No, but it's extra work, and it isn't necessary.

> 
> 
>>Since the VFS is NOT the major obstacle to supporting journaling
>>
> 
> (As I said, I don't really care about journalling.)
> 
> 
>>(almost all discussions I have seen end with "let's get LFS working
>>right instead", which implies that the people who will actually decide
>>want to keep it BSD-based) and the VFS systems in all the BSDs
>>already support multiple file systems, I don't see where you are going?
>>Do you mean to make them more modular/flexible to allow module loading
>>unloading and dynamic addition of filesystems?
>>
> 
> I mean, make it easier to port other filesystems to it.
> 
> As it stands, the VFS does support other filesystems, but poorly.  I'm
> not sure how much of this has to do with UFS being regarded, in large
> part, as the One True Filesystem for FreeBSD, and how much of it is
> strictly technical.

This is what I am hinting at: is the lack of file system choices 
technical or philosophical? If it is the latter, it won't matter how the 
VFS is....

> 
>>Do you honestly think Linux has a good design for this or is it a hack
>>(I don't know I am asking)?
>>
> 
> I don't really know either, but my impression is that there's way less
> cruft in it, even if the design isn't any better.
> 
> 
>>>Sure.  But journalling != atomicity, and I don't care nearly as much
>>>about the former as I do the latter.
>>>
>>Yes journaling is atomicity:
>>
> 
> No.  Journalling implies atomicity, but atomicity doesn't imply
> journalling.
> 
> 
>>either a change makes it into the log or
>>it doesn't, or at least that's the impression I get on HOW they
>>_should_ be designed (with journaling commits being atomic).
>>The other alternative is to try and get EVERY file system operation to
>>be atomic, which will probably be infeasible or completely destroy
>>performance. Disks can only guarantee that small blocks are
>>read/written atomically, so could you please elaborate on how this
>>would work?
>>
> 
> I have no idea.  But I have no reason to believe journalling is the only
> feasible option.  And as I said, I don't really care.
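
To make the "either a change makes it into the log or it doesn't" idea 
concrete, here is a minimal sketch in Python. It is not taken from any 
particular filesystem. The record layout (length, payload, CRC32) and 
the names encode_record and replay are assumptions made up for 
illustration. The point is only that a checksum over the whole record 
lets recovery treat a multi-block log write as all-or-nothing, even 
though the disk itself only writes small blocks atomically.

```python
import struct
import zlib

# Hypothetical on-disk framing: 4-byte length, payload, 4-byte CRC32.
HEADER = struct.Struct("<I")
TRAILER = struct.Struct("<I")


def encode_record(payload: bytes) -> bytes:
    """Frame one journal record so a torn write can be detected later."""
    return HEADER.pack(len(payload)) + payload + TRAILER.pack(zlib.crc32(payload))


def replay(log: bytes) -> list:
    """Return payloads of all complete records; stop at the first bad one."""
    committed = []
    pos = 0
    while pos + HEADER.size <= len(log):
        (length,) = HEADER.unpack_from(log, pos)
        end = pos + HEADER.size + length + TRAILER.size
        if end > len(log):
            break  # record was cut short, e.g. by a crash mid-write
        start = pos + HEADER.size
        payload = log[start:start + length]
        (crc,) = TRAILER.unpack_from(log, end - TRAILER.size)
        if crc != zlib.crc32(payload):
            break  # checksum mismatch: treat the record as never committed
        committed.append(payload)
        pos = end
    return committed
```

A record that was only partly written before a crash is simply ignored 
on replay, so the change it describes either happened completely or not 
at all. None of this helps if the drive lies about what has reached the 
platter, which is the earlier complaint.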

> 
>>>>Suppose the driver has a bug which causes the kernel to use an
>>>>invalid pointer: since most OS's are still monolithic, you are more
>>>>unsure about what you may have just corrupted (including FS code).
>>>
>>>Or suppose the kernel just refuses to use the invalid pointer.
>>>
>>
>>Or suppose it IS valid, but it points to the wrong data and you 
>>overwrite something and it is only caught later? What error handling
>>can you do: the error is asynchronous as it will either go undetected 
>>immediately and be revealed later OR it will cause a trap. Unless you 
>>want to add exception handling to a kernel, there is not much else you
>>can do if the error occurs in the same module as the core kernel (as
>>in a monolithic and not in a microkernel, although faults within a 
>>microkernel and not one of the servers would have the same result).
>>
> 
> So why *not* add some form of exception handling to the kernel?  At
> least for the things on the border between the kernel and the rest of
> the world, like device drivers and filesystems.
> 
> The answer is usually "because we don't want to take that sort of
> performance hit."
> 
> Which is fine as long as some level of performance is acknowledged as
> being a higher priority than some level of reliability.

Exceptions are supposed to be exceptional :-) I don't think it matters 
if it takes a longer time to recover from what would otherwise be a 
fatal error. Exceptions are _supposed_ to be implemented in such a way 
that the common, normal code path sees a minimal performance hit.

Actually, wouldn't it probably be faster than doing rigorous 
assertions/checks on each pointer?
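
As a rough illustration of the idea (a user-level Python analogy, not 
kernel code), the sketch below puts a fault boundary around calls into a 
driver. FaultBoundary and read_byte are hypothetical names, and Python's 
IndexError and TypeError stand in for a hardware trap. The driver 
routine does no per-access checking of its own, and the recovery code 
runs only when something actually goes wrong.

```python
class FaultBoundary:
    """Hypothetical wrapper between the core kernel and one driver."""

    def __init__(self, name):
        self.name = name
        self.disabled = False
        self.last_fault = None

    def call(self, entry_point, *args):
        """Call into the driver; on a fault, disable it instead of panicking."""
        if self.disabled:
            return None
        try:
            # Normal path: the driver runs without per-access validation.
            return entry_point(*args)
        except (IndexError, TypeError) as fault:
            # Exceptional path: stand-in for catching a trap at the boundary.
            self.disabled = True
            self.last_fault = repr(fault)
            return None


def read_byte(memory, offset):
    """Toy driver routine that 'dereferences' offset without checking it."""
    return memory[offset]
```

For example, FaultBoundary("toy-disk").call(read_byte, bytearray(16), 4096) 
returns None and marks the driver disabled rather than taking the caller 
down with it. This only covers the case where the bad access traps. A 
pointer that is valid but points at the wrong data would still go 
undetected, as noted above.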

> 
> 
>>I suppose it would be interesting to people working on fault 
>>tolerance/corrections in things like space exploration, but I doubt 
>>there is enough will to get it working on even commercial systems.
>>
> 
> Heck, if there isn't even enough will to manufacture "honest" hard
> drives and memory for commercial systems, then there certainly isn't
> going to be enough will to build reliable software for them, right?

To be fair, not all IDE disks lie, and IDE is really a consumer product. 
The fake parity memory is a very valid example though...so I have to 
agree. However, languages do support exception handling (Java, C++ and 
Ada are some). The problem is how to move the concept from applications 
to system-level software. Ada is probably already being used this way, 
but no one outside of aerospace/military seems to use it.

>>Why does this matter? Who needs a cheap slogan anyway?
>>
> 
> I think you missed my point - the trite sig was only to illustrate. 
> Cheap slogans don't matter, but philosophy does, and IMO DragonFlyBSD's
> philosophy could stand to be clearer.
> 
> -Chris
> 

I get the impression the philosophy is to actually use DragonFlyBSD to 
test new concepts and develop new approaches to produce a better BSD 
solution. Best of breed? Research platform?