Relocation (Re: Package system)

sander sander at haldjas.folklore.ee
Wed Sep 3 12:23:24 PDT 2003


On Wed, 3 Sep 2003, Matthew Dillon wrote:

> :ibotty <me at xxxxxxxxxx> wrote:
> :>
> :> btw: is someone interested in prelinking? on freebsd5, there is an effort to
> :> prelink. but i do not know, how far this is.
> :> if there is interest, i may dedicate some of my spare time to it (in one
> :> month, more or less).
> :
> :prelinking is a really bad and ugly hack.
> :--
> :	Sander
>
>     It kinda reminds me of Amiga shared libraries... in the UNIX address
>     space model, though, it's even easier.  I'm not sure what this
>     so-called 'prelinking' actually is, but I know how I would implement it
>     in Dragonfly:
>
>     The kernel manages a small section of reserved VM address space and
>     generates pre-loaded library images within that space.  Library
>     dependencies are also preloaded and linked.  When a user process
>     requests a library, if the user's VM space corresponding to
>     the kernel managed VM space is not in use, the kernel can simply map
>     its pre-loaded version (+ dependencies) into the user's VM space.

So what prelinking does is store symbol->address mappings persistently to
save the dynamic linker time. The problem is that when any of the
libraries change, the stored data becomes wrong, and the change can be in
any of the libraries or in any of their dependencies.
Oh, and the prelink map needs to be per executable, not per library.
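
To make the invalidation problem concrete, here is a minimal sketch of
such a store. PrelinkCache and fingerprint are made-up names for
illustration, not how any real prelinker does it; the point is only that
the map is keyed per executable and is thrown away when any library
changes.

```python
import hashlib


def fingerprint(libs):
    """Hash the name and contents of every library, in load order."""
    h = hashlib.sha256()
    for name, content in libs:
        h.update(name.encode())
        h.update(b"\0")
        h.update(content)
        h.update(b"\0")
    return h.hexdigest()


class PrelinkCache:
    """Hypothetical per-executable store of symbol -> address mappings."""

    def __init__(self):
        self._entries = {}

    def store(self, exe, libs, symbols):
        # Remember which exact set of libraries the mappings were built for.
        self._entries[exe] = (fingerprint(libs), dict(symbols))

    def lookup(self, exe, libs):
        entry = self._entries.get(exe)
        if entry is None or entry[0] != fingerprint(libs):
            return None  # missing or stale: fall back to normal linking
        return entry[1]
```

A lookup with even one changed library returns None, which is the
"regenerate a lot" case.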

When a program DSO[1] is loaded, typically because it was a dependency of
some other object (possibly also a DSO) or dlopen() was called:
	a) it will contain new symbols which need to be added to the list
	   of potential symbols (unless these were known already) which
	   will be used in future symbol lookups, including lookups from
	   the same library
	b) it may contain new unresolved symbols which will need to be
	   resolved / go to the lazy binding list
	c) it may contain a list of dependencies that need to be loaded
	d) it may contain .init sections that need to be run in the
	   correct order and before main()
	e) duplicate dependencies mentioned by more than one object need
	   to be skipped and loaded only once (*and* at the correct point)
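
A toy model of points c) to e), assuming a breadth-first walk for
loading and "dependencies first" for the initializers. This is a
simplified sketch, not the algorithm of any particular dynamic linker;
load_order, init_order and the deps table are invented for the example.

```python
from collections import deque


def load_order(root, deps):
    """Breadth-first walk of the dependency graph, loading each object once."""
    loaded = []
    seen = {root}
    queue = deque([root])
    while queue:
        obj = queue.popleft()
        loaded.append(obj)
        for dep in deps.get(obj, []):
            if dep not in seen:  # duplicate dependency: skip it
                seen.add(dep)
                queue.append(dep)
    return loaded


def init_order(root, deps):
    """Order initializers so a dependency runs before the objects needing it."""
    order = []
    done = set()

    def visit(obj):
        if obj in done:
            return
        done.add(obj)  # mark on entry so dependency cycles terminate
        for dep in deps.get(obj, []):
            visit(dep)
        order.append(obj)

    visit(root)
    return order
```

With deps = {"prog": ["libA", "libB"], "libA": ["libC"], "libB": ["libC"]},
libC is loaded once even though two objects ask for it, and its
initializer runs before those of libA and libB.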

It all starts with loading the program, and having a symbol table with
(probably) lots of unresolveds and a list of dependencies to be loaded.
*Unless* there was LD_PRELOAD, in which case that is loaded first, which
means that some symbols are going to be in the resolved state before the
program itself or any dependencies (and their symbols) are loaded. Which
library a particular symbol comes from depends on which ones were loaded
before it. And in general, you cannot predict what address a library
ends up at, and it will differ among different processes, esp. on 32 bit
machines. Similarly, what library (not to mention address) a symbol
points at depends on the order the libraries were loaded.
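
The order dependence is easy to show with a toy lookup, assuming a
plain "first definition in search order wins" rule. The resolve function
and the library names are made up for the example.

```python
def resolve(symbol, search_order, exports):
    """Return the first object in search order that defines the symbol."""
    for obj in search_order:
        if symbol in exports.get(obj, ()):
            return obj
    return None  # still unresolved


# Hypothetical export tables: libfake.so interposes malloc.
exports = {
    "libfake.so": {"malloc"},
    "libc.so": {"malloc", "printf"},
}

normal = ["prog", "libc.so"]
preloaded = ["prog", "libfake.so", "libc.so"]  # as if LD_PRELOAD=libfake.so
```

Here resolve("malloc", normal, exports) gives libc.so, while the same
lookup with the preloaded order gives libfake.so, so a map recorded in
one environment is wrong in the other.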

So for prelinking you play the process through once and store the final
symbol table and the libraries loaded (and where) in the file system.  If
something in the libraries or environment changes, then depending on
implementation:
	* things occasionally go terribly wrong
or
	* things are sometimes much slower than usual (or possibly much
	  slower for some users/in cron jobs etc) and you have a hard
	  time knowing why that was the case.

and even in the best cases you get to regenerate the prelinking
information a lot. If this was inside the executable, you regenerate the
checksum information a lot, too.

>
>     Not that I am going to actually do this any time soon.  I do not
>     consider it to be all that important of an issue.  There are no
>     significant performance gains, even for scripts, and memory is only
>     saved in certain particular situations (lots of non-forking
>     separately-exec'd instances).  But it seems to me that it *can*
>     be done in a fashion that is totally transparent and fully compatible.
>

Unless you have a sparse 64 bit address space and the *same* libraries are
always loaded at the *same* virtual addresses, I'm not sure it is
worthwhile. Conversely, if you have such an environment, you may be able to
just COW the data pages optimistically and in most cases get the result
much more cheaply.

> 					-Matt
> 					Matthew Dillon
> 					<dillon at xxxxxxxxxxxxx>
>

	Sander

+++ Out of cheese error +++






