Intro
News
Humor
Sysadmin
Programming
Books
Screenshots
Firefox Extensions
Kitty
Links!
Employment
Al Viro is a brilliant programmer. His words of wisdom are collected here for the enlightenment of all!

On practicallity

> i wanna to rewrite a version of linux kernel from scratch in assembly
> for intel 386+ fo speed and a libc also in assembly for speed
> what do u think guys

That you are utterly unoriginal wanker, what else?

On verification

> Here are some more static analysis reports from Coverity.  These are
> places where the kernel gets a scalar from the user and then uses it as an
> array index with out bounds checking it.
> 
> All of the reports below deal with the isspace macro.  It expands to an
> access to a static array with 256 entries.  If we use an unsigned char to
> index into the array, there are no problems.  However, when that char is
> signed, we can index off the left of the array.
> 
> It seems like this isn't a big deal, but if the isspace array is located
> after some important data structure, we could leak information.

#define __ismask(x) (_ctype[(int)(unsigned char)(x)])
#define isspace(c)      ((__ismask(c)&(_S)) != 0)

Figuring out why the reports mentioned in the quoted text are bullshit
is left as an exercise for readers.

On the Ivory Tower of Academia

> These are some of the types of problems engineers at REAL software shops have 
> to solve to be able to ship REAL product for REAL money.  If you haven't HAD 
> to produce code like this yourself at some point in your carrier then you've 
> lived a sheltered life.
>
> Its disingenuous for you to get on your ivory tower to point and laugh.

	Well, you see, after spending years cleaning up the excrements
of self-styled "REAL engineers" it's either get on the tower to point and
laugh or get on the tower to point and shoot.

On layering and subsystem violation

> Hi Christoph,
> -		skb->dev = (void *) &bfusb->hdev;
> +		skb->dev = (void *) bfusb->hdev;

Wait a bloody minute.  skb->dev is supposed to be net_device; what's going
on here?

On verbose explanations

> My question, I suppose, is:  what are the chances that a
> parallel-port device can be automatically detected

0

On the "Good ideas" people keep spitting out.

> So give us a design.  Make a concrete proposal.  Tell us what the
> API's --- with C function prototypes --- should look like, and how we
> should migrate what we have to your new beautiful nirvana.

But... but... that requires *work*!  How can you demand that?!?

To original toss^Wposter: it's an old observation that in order to be
useful hypothesis has to be falsifiable.  Similar principle applies
to design proposals - to be worth of any attention they have to be
detailed enough to allow meaningful criticism.

What you have done so far is equivalent to coming to a hospital and
saying "aseptic good, infection bad".  That would get pretty much
the same reactions, varying from "yes, we know" to "do you have any
specific suggestions?" and "stop wasting our time"[1].

In short: get lost and do not come back until you have something less
vague.

[1]  If you are insistent enough, you might also earn a free referral
to psychiatrist.  You would have to work harder than you have done
so far, though...

On how to properly fix code incrementally

> @@ -846,6 +846,16 @@ int presto_permission(struct inode *inod
>  
>          cache = presto_get_cache(inode);
>  
> +        /* Nobody gets write access to a read-only fs */
> +        if ((mask & MAY_WRITE) &&
> +                (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)) &&
> +                (IS_RDONLY(inode) || (nd && MNT_IS_RDONLY(nd->mnt))))
> +                return -EROFS;
> +
> +        /* Nobody gets write access to an immutable file */
> +        if ((mask & MAY_WRITE) && IS_IMMUTABLE(inode))
> +                return -EACCES;
> +

That is gratitious, since the only way presto_setattr() is ever called is
as ->permission().

> @@ -132,21 +132,6 @@ int jfs_permission(struct inode * inode,
>  	umode_t mode = inode->i_mode;
>  	struct jfs_inode_info *ji = JFS_IP(inode);
>  
> -	if (mask & MAY_WRITE) {
> -		/*

... and that is broken, since jfs_permission() can be called directly.

FWIW, I would start with
	1) split out simple_permission() - vfs_permission() sans the
r/o checks; vfs_permission() would call it and all in-tree calls of
vfs_permission() would get expanded.

	2) prove that all instances of ->permission() honour r/o checks.
Fix the broken ones (and yes, we do have them - e.g. hfs_permission()
or bogus return values in proc_permission()), after we'd shown that
it's safe.  Note that it's not obvious - e.g. anything around ACLs or
 XFS ioctls is not just fscking ugly - it's brittle as hell and
will require very careful treatment.

	3) once that is done, put r/o checks into the beginning of
permission(9)

	4) for all instances of ->permission(), move r/o checks in
the places that call that instance directly.  Remove them from method
itself.

	And yes, #2 will hurt.  Badly.

BTW, IS_RDONLY() part of that stuff will really hit the fan when you start
touching the FPOS in fs/ext2/xattr.c and around it.  Have fun...

Note that it's not enough to bring relevant vfsmount to every caller of
IS_RDONLY() - if we are calling it to make sure that fs is not r/o,
we _really_ want to make sure that it doesn't get remounted r/o just as
IS_RDONLY() returns.  And yes, there are real bugs in that area.

On expounding on the previous

> > FWIW, I would start with
> > 	1) split out simple_permission() - vfs_permission() sans the
> > r/o checks; vfs_permission() would call it and all in-tree calls of
> > vfs_permission() would get expanded.
> 
> 	 - vfs_permission()
> 	  - hfs_permission()
> 	  - hfsplus_permission()
> 	  - nfs_permission()
> 	  -* permission()
> 	  - presto_permission()
> 	  '- proc_permission()
> 
> please elaborate ...

First chunk: replace calls of vfs_permission() with r/o checks +
simple_permission(), _leaving_ vfs_permission() _itself_ _as-is_.
 
> > 	2) prove that all instances of ->permission() honour r/o checks.
> > Fix the broken ones (and yes, we do have them - e.g. hfs_permission()
> > or bogus return values in proc_permission()), after we'd shown that
> > it's safe.  Note that it's not obvious - e.g. anything around ACLs or
> >  XFS ioctls is not just fscking ugly - it's brittle as hell and
> > will require very careful treatment.
> 
> okay, but why not do it in permission(), and make
> the few callers of vfs_permission use that one instead?

Filesystem code might be calling permission(9) and expecting (broken)
behaviour from its own ->permission().  Switch to uniform behaviour
would break such places, forcing us to compensate there.  And analysis
won't be fun.

> except for jfs_permission() and nfsd_permission()
> (which is hell enough) there are no direct calls
> of *_permission() ...

fs code calling permission() is basically equivalent to direct call.

> > 	3) once that is done, put r/o checks into the beginning of
> > permission(9)
> 
> hmm, obviously you didn't read the patch at all, as that 
> is exactly what I did, so please have a look at it ...

Yes, I have read it.  Try to read the reply - what I'm saying is that it
should be a separate chunk in the series _and_ it will need some preparations
before it can go in.

> > 	4) for all instances of ->permission(), move r/o checks in
> > the places that call that instance directly.  Remove them from method
> > itself.
> > 
> > 	And yes, #2 will hurt.  Badly.
> 
> I don't see why, please enlighten me ...

See above.

> > BTW, IS_RDONLY() part of that stuff will really hit the fan when you start
> > touching the FPOS in fs/ext2/xattr.c and around it.  Have fun...
> > 
> > Note that it's not enough to bring relevant vfsmount to every caller of
> > IS_RDONLY() - if we are calling it to make sure that fs is not r/o,
> > we _really_ want to make sure that it doesn't get remounted r/o just as
> > IS_RDONLY() returns.  And yes, there are real bugs in that area.
> 
> hmm, well but how would moving the ro checks around
> or adding them for the vfs mounts influence that?
> 
> I agree that this _is_ an issue but why should this _not_
> happen with mount -o remount,ro /xy ?

Once the checks are moved up, we split them into pairs (want_write_access(),
drop_write_access()) where the former can fail.  Then the second call gets
moved past the call of fs method.  IOW, we go from

	start fs activity
	deep in the bowels of fs code check for r/o and possibly fail
	do the rest

to
	check for r/o and possibly fail
	do fs work

_and_ to

	request write access and possibly fail
	do fs work
	drop write access

Transition between (2) and (3) is simple and relatively easy to verify
_IF_ the checks are lifted up.  And result will give sane protection
for remount - switch to r/o would have to be atomic wrt requesting
write access and would fail if we had writers.

Note that per-mountpoint r/o will take pretty much the same amount of work -
propagating vfsmounts down to the IS_RDONLY checks only to have that reverted
when we lift the checks up would mean doing more or less the same twice.

It can be done and it can be merged gradually in small and simple chunks,
so that they could be (a) well-understood and (b) well-tested.

The way we split the series *does* matter - composition of these patches
will be large and will touch quite a few places (there's extra fun with
delayed writes - e.g. removal of last link in foo_unlink() will mean
extra request for write access, dropped when inode is finally deleted,
possibly hours after unlink(2) had returned).  Doing that in one or even
3-4 patches would be insane even in 2.7; in 2.6 it's so out of question
that it's not even funny.

On picking the right tool for the task

> >5. Well known. So there would be people around who
> >already know this language and expectations are clear.
> >And there are books around about this language.
> 
> LISP completely violates this requirement.  While I appreciate the power 
> of LISP for abstraction, list processing, and how it lends itself 
> towards many AI-related tasks, it's not a commonly-used language.

Whether it's commonly-used or not, there's another killer problem with LISP -
it's fragmented worse than even Pascal.  And "which subset and extensions
do we have in $IMPLEMENTATION" is worth "which language are we dealing with".
Worse, actually.  If you want a functional language - at least pick a
well-defined one.

On common-sense data structure usage

> Since when did we ever assume that renaming a kobject would rename the
> symlinks that might point to it?  Renaming kobjects are a hack that way,
> if you use them, you need to be aware of this limitation.

Since we assume that these symlinks actually reflect some relationship
between the objects and are really needed for something.  If they are
not - why the hell do we keep them at all?

On terminology

> While we're on all of this, are we going to change "tained" to some 
> other less alarmist word?

"screwed"

On incremental bugfixing

> [please cc: responses to me, have 10k message backlog in l-k folder)
> 
> Is there a good reason why the fs permission function operates on the
> inode instead of the dentry? It would seem if the dentry was passed into
> the function instead of the inode, one would have a better structure to
> play with, such as being able to use d_put() to get the real path name. 
> The inode is still readily accessible from the dentry.

man grep.

Then use the resulting knowledge to find the callers of said function in
the tree.

Then think where you would get dentry (and vfsmount, since you want path)
for each of these.  Exclude ones that have them available.  See which
functions contain the rest of calls.

Repeat the entire thing for each of these functions, until the set is
empty.  At that point you have a sequence of changes that need to be
done.  Start moving from the end of list, changing the prototypes and
updating callers.  You will get a sequence of patches ending with
what you want.  Look at their sizes.  If they are tolerably small and
straightforward - start posting them to fsdevel, one by one.  With
summaries.  Start with posting the list of changes (step 1 propagates
..., step 2...,  step n gives what we want).

Get that stuff merged, one by one.  Since it won't go in one release,
repeat the searches to verify that your analysis is still correct and
no new paths had appeared.

That's how it's done - there's nothing more to it.  And yes, this one
will end up in a moderately long chain of patches - it appeared to be
doable the last time I'd looked, but would result in 10-15 steps _if_
nothing tricky would crop up.