PATCH preadv/pwritev

Sun Apr 30 07:16:59 PDT 2006

On Sat, Apr 29, 2006 at 12:30:49PM -0700, Matthew Dillon wrote:
> :In that case define a global upper limit for valid offsets and use
> :number above that for the special cases. Just the number of checks for
> :negative offsets and overlarge offsets would be simplified. Beside, if
> :you insist on keeping it signed, make it ssize_t.
> 
>     The problem here is that this is used all over the internals of the
>     operating system.  It makes no sense to obfuscate code that would 
>     otherwise be clearly readable by using 'ssize_t' instead of 'int'
>     when half the code in question has severe limitations on the data
>     size being represented anyhow.

So fix the code to deal with large sizes correctly instead of relaying
on arbitrary types to hide it. Heck, if it is meant as enforced API,
make it a (u)int32_t and not an int.

>     For example, when a UIO is translated into buffer cache ops, a code
>     loop is taking the UIO/IOV lengths and doing calculations to produce
>     BUFfer cache sizes, which are far more limited and clearly should not
>     be using ssize_t.

So does the size of a mbuf. That doesn't mean that int is a good type
nor does it mean that it is set in stone. Talking about sizes, I can
think of a lot of applications which might want to push out more than
INT_MAX, just think of a web server full of DVD images.

>     I am not going to create a mess of code casting size_t or ssize_t to
>     int or vise-versa for the half dozen (or more) places in the code
>     where such translations take place.  Even if we were to do that, basic
>     assumptions have to be made on what 'ssize_t' actually is for the casts
>     to do what is expected... effectively, it MUST be an int anyway or
>     the code blows up.  Because of that, it's really awefully silly to
>     try to abstract the kernel code internally when 'int' works perfectly
>     well.

That's exactly the mess we already have in the ioctl code with longs.
Sure, long == int on every VAX, so why bother to handle it correctly?
There should *never* be a cast in the kernel without value range and in
many places such checks happen implicitly or are downright missing.

About the read example, it is moot since the kernel is allowed to reject
nbytes >= SSIZE_MAX (and actually should do so). It doesn't change the
simple fact that coherent types make coding a lot easier and more
portable. Keep in mind that AMD64 is a popular LP64 platform, which
breaks a lot code making stupid assumptions like above.

You've argued that negative values are useful for "special" values. I
call that interface abuse. The field is a size, not a flag bit. The
times of scarce memory where every bit counts are over. Semantic
overloading is error-prone and a lot of have been uncovered by ensuring
that non-negative values are unsigned and checked for upper-bounds
instead of doing negative checks all over the place, but missing a few.

The mess exists because it was never followed by a real refactoring of
the kernel code.

Joerg