[GSoC 2012] Implement inotify and filesystem indexing service

Vishesh Yadav vishesh3y at gmail.com
Wed Apr 25 23:20:05 PDT 2012


Hello everyone,

I'm Vishesh Yadav, a 3rd year Computer Science and Engineering student
from India and a Google Summer of Code 2012 student. I'm interested in
Computer Architecture and Operating System development, which led me
to apply to DFBSD.

The goal of my project is to implement Linux's inotify interface and to
write an indexing service for locate that will use inotify to keep the
locate database more up to date. The new inotify system calls will be
exposed to the Linux compatibility layer as well. I've appended my GSoC
proposal to this message.

This is the first time I'll be doing kernel programming, so I would
really appreciate any advice that you may have. I will do my best to
deliver what I promised, and I am looking forward to hacking on DFBSD.

[Proposal]
http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/vishesh/20002

inotify System Calls and Indexing Service for Filesystem
========================================================

Name     : Vishesh Yadav
Email    : vishesh3y at gmail.com
Address  : A2/637, Himsagar Aptts
           Pocket P4
           Greater Noida (U.P)
           India - 201310

Abstract
--------

The goal of this project is to provide file system monitoring facilities
in DragonFly BSD. This project is divided into two parts -

1. Implement inotify
2. Implement a Filesystem indexing service for 'locate' over inotify.

### inotify ###

DragonFly BSD provides the kqueue/kevent interface to monitor file
events, which works very well. However, it comes with the overhead of
needing a unique file descriptor for each watched file. Also, to watch
changes in a directory, each file inside that directory has to be opened
and watched separately using kqueue/kevent. A system may reach the
global/user file descriptor limit when watching a large number of files.

To solve this problem, I propose to implement Linux's inotify interface.
The inotify interface can be used to monitor files and directories. Each
inotify instance uses one file descriptor, however many watches it holds.
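
To illustrate the interface from an application's point of view, here is
a minimal sketch that uses the Linux inotify calls as they exist today
(inotify_init, inotify_add_watch, read, inotify_rm_watch). The helper
name saw_create_event is made up for this example, and the code assumes
a system that already ships <sys/inotify.h>; it says nothing about how
the DragonFly implementation would look internally.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical helper: watch `dir` through one inotify descriptor,
 * create `name` inside it, and report whether an IN_CREATE event for
 * `name` arrived. Returns 1 if seen, 0 if not, -1 on error.
 */
int saw_create_event(const char *dir, const char *name)
{
    /* Aligned buffer, as suggested by the Linux inotify(7) man page. */
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    char path[4096];
    int found = 0;

    assert(dir != NULL && name != NULL);
    if (snprintf(path, sizeof(path), "%s/%s", dir, name) >= (int)sizeof(path))
        return -1;

    int ifd = inotify_init();       /* one descriptor per instance */
    if (ifd < 0)
        return -1;
    int wd = inotify_add_watch(ifd, dir, IN_CREATE | IN_DELETE);
    if (wd < 0) {
        close(ifd);
        return -1;
    }

    /* Trigger an event: the watched directory gains a new entry. */
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0) {
        close(ifd);
        return -1;
    }
    close(fd);

    /* One read() may return several variable-length events. */
    ssize_t len = read(ifd, buf, sizeof(buf));
    for (char *p = buf; len > 0 && p < buf + len; ) {
        const struct inotify_event *ev = (const struct inotify_event *)p;
        if (ev->wd == wd && (ev->mask & IN_CREATE) && ev->len > 0 &&
            strcmp(ev->name, name) == 0)
            found = 1;
        p += sizeof(*ev) + ev->len;
    }

    unlink(path);
    inotify_rm_watch(ifd, wd);
    close(ifd);
    return len < 0 ? -1 : found;
}
```

The whole directory is covered by a single watch on a single descriptor;
no file inside it has to be opened just to be monitored.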

Implementing inotify will benefit various applications that use inotify,
e.g. Gamin, GIO and KIO. It will benefit developers who want to
implement file indexers, semantic desktops and malware scanners, and
end users who are looking for application compatibility with Linux.

### Filesystem indexing service ###

The second part of this project proposes to implement a filesystem
indexing service. The service will prepare the database that is used by
the locate utility. Unlike the traditional updatedb program, it will
listen to filesystem changes and update the database immediately, so
the database will be more accurate and up to date.

If time allows (or after GSoC), I propose to extend the locate utility
to store additional information about files such as size, owner,
permissions etc.

Project Goals and Deliverables
------------------------------

During the GSoC period the goal is to -

### inotify ###
    * Implement inotify system calls.
    * Write manpages for inotify.
    * Write a few tests.
    * Check well-known software that uses inotify, such as Gamin, GIO
      and KIO.
    * Expose the new system calls through the Linux ABI and test a few
      native Linux binaries against it.

### Filesystem Indexing Service ###
    * Write an indexing service based on the new inotify interface.
    * Extend locate to store additional information about the files.
    * Write tests for the new indexing service.
    * Update the man pages.

Implementation Details
----------------------

### inotify ###

There are essentially two ways in which we can have inotify on
DragonFly BSD -

* Emulate over kqueue - This has been tried earlier in NetBSD (but not
in-kernel). However, it still doesn't solve the problem of having an
open file descriptor for each monitored file (in kernel). Secondly,
inotify provides a few notifications that are not provided by kqueue.
On the other hand, this approach avoids much complexity, changes to the
VNODE structure and hooking of VNOPS. To avoid bloating the vnode
structure, and subject to review from the project mentor, this is the
preferred method.

* Develop from scratch - In this approach we will have to add a couple
of members to the VNODE structure and call inotify event functions in
VNOPS. VNOPS will forward the events to the inotify system. Each vnode
will keep a count and a list of its watches. The approach is essentially
the same as the one implemented in Linux, where all filesystem events
are sent to fsnotify, which is used by inotify/dnotify.

Overall -

* Three new system calls, viz. inotify_init, inotify_add_watch and
inotify_rm_watch, will be implemented.
* Each open inotify instance is associated with an open inotify_device.
It keeps a list of watches and queued events.
* inotify_watch represents a watch request on a specific node and is
associated with an inotify device and a vnode.
* The lifetime of inotify_device and inotify_watch is managed by
reference counting.
* inotify_kernel_event is an inotify event. A list of these is
associated with each inotify device.
* A basic pseudo filesystem called 'inotifyfs' will be created. It will
implement the basic file operations such as read, ioctl and release.
Once the inotify system is implemented, it will be made available to the
Linux emulation layer.

### Filesystem Indexing Service ###

The filesystem indexing service will run as a daemon. We will maintain a
temporary database that contains all changes since the last update. For
every search, this temporary database will be checked first for changes.
Whenever updatedb is run, the changes in the temporary database will be
committed to the main database. This lets us benefit from the already
mature updatedb, use the new inotify interface to make recent changes
available, and avoid rewriting a huge database file every time.
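
The lookup order described above can be sketched as follows. This is a
toy in-memory model, not the locate database format: path_db stands in
for the main database, change_log for the temporary one, and all names
(log_record, overlay_contains, overlay_commit, DB_MAX) are hypothetical.
Paths are borrowed pointers, so callers must keep the strings alive.

```c
#include <assert.h>
#include <string.h>

#define DB_MAX 64   /* arbitrary capacity for this sketch */

enum change_kind { CHANGE_ADDED, CHANGE_REMOVED };

struct path_db {            /* stands in for the main locate database */
    const char *paths[DB_MAX];
    int count;
};

struct change_log {         /* stands in for the temporary database */
    struct { const char *path; enum change_kind kind; } entries[DB_MAX];
    int count;
};

static int db_find(const struct path_db *db, const char *path)
{
    for (int i = 0; i < db->count; i++)
        if (strcmp(db->paths[i], path) == 0)
            return i;
    return -1;
}

/* Record a change reported by the watcher; -1 if the log is full. */
int log_record(struct change_log *log, const char *path,
               enum change_kind kind)
{
    assert(log != NULL && path != NULL);
    if (log->count == DB_MAX)
        return -1;
    log->entries[log->count].path = path;
    log->entries[log->count].kind = kind;
    log->count++;
    return 0;
}

/* Search: the newest logged change wins; otherwise ask the main db. */
int overlay_contains(const struct path_db *db,
                     const struct change_log *log, const char *path)
{
    for (int i = log->count - 1; i >= 0; i--)
        if (strcmp(log->entries[i].path, path) == 0)
            return log->entries[i].kind == CHANGE_ADDED;
    return db_find(db, path) >= 0;
}

/* "updatedb" step: replay the log in order, then empty it. */
int overlay_commit(struct path_db *db, struct change_log *log)
{
    for (int i = 0; i < log->count; i++) {
        int idx = db_find(db, log->entries[i].path);
        if (log->entries[i].kind == CHANGE_ADDED && idx < 0) {
            if (db->count == DB_MAX)
                return -1;      /* main db full; log is left untouched */
            db->paths[db->count++] = log->entries[i].path;
        } else if (log->entries[i].kind == CHANGE_REMOVED && idx >= 0) {
            db->paths[idx] = db->paths[--db->count];
        }
    }
    log->count = 0;
    return 0;
}
```

Until overlay_commit runs, the main database is never written; a search
sees recent changes only through the log, which is the property the
design relies on.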

Optionally, if time allows (or after the GSoC period), the locate utility
could be further extended to store some additional information about
files, such as owner, permissions, file type and other attributes. The
user could then query and filter according to these attributes.

Milestones
----------

### Week 1 -
* Write headers and interface.
* Make dummy system calls (inotify_*)
* Implement inotify device and filesystem.
### Week 2-3 -
* Implement the in-kernel API to add/remove watches and queue events.
### Week 4-7 -
* Implement system calls inotify_init, inotify_add_watch, inotify_rm_watch.
* Complete the implementation of inotify.
* Test the new interface. Write some tests.
### Week 8 -
* Write man pages.
* Document code.
* Test Gamin, GIO (and KIO if possible) against new interface.
* Prepare code for mid-term evaluation and review.
### Week 8 -
* Start working on the filesystem indexing service.
* Write basic interfaces and data types.
* Read config files and setup the service accordingly.
### Week 9-10 -
* Listen to filesystem changes.
* Write these changes to temporary database.
* Commit temporary database to master database whenever updatedb is run.
### Week 11-12 -
* Test the service.
* Write tests.
* Document code.
* Write/Update man pages.
* Prepare code for master.
* Get the code reviewed.
### Week 13-14 -
* Buffer period for any possible delays or weird bugs and more testing.

About Me
--------

I'm a 3rd year B.Tech Computer Science and Engineering student from
India. I have a deep interest in Operating Systems, System Programming
and Open Source development. I'm well versed in C and the UNIX C API and
have been working with them for quite a few years. Apart from C, I also
program in C++, Python and Scheme.

Though I've never worked directly on a kernel before, I have a good
understanding of Operating System internals. I've also taken an
Operating Systems course at university. I have spent the last few weeks
studying the architecture and implementation of the DragonFly BSD and
Linux kernel codebases, as well as the basic kernel
development/debugging workflow.

I know string matching algorithms and have a rudimentary understanding
of n-grams, which will be helpful while working with locate and the
indexing service.

My interest in Operating Systems and Open Source development made me
apply to DragonFly BSD. I'm very excited and looking forward to being
part of the team, contributing code and doing my best.

I previously participated in GSoC 2011 for KDE and successfully
completed my project and am still maintaining it.

---------end proposal--------

Regards,
Vishesh Yadav




