JSFileApi

From MozillaWiki
Revision as of 19:06, 3 October 2011 by Zack (talk | contribs)

This page details a proposal for bug 563742 - Efficient ctypes API for file handling.

The general idea of this API is to provide low-level, cross-platform, fast access to file management functions. For this reason, it does not implement primitives that differ widely between platforms, e.g. chmod, mmap, epoll.

Conventions

This document uses the Google Closure Compiler conventions for type annotations.


Module FileUtilities

This module is the first access point to the file API. It contains constructors, functions to copy, move files, etc. as well as the constants used in the API.


Access a file or a directory

    /**
     * Create a temporary directory.
     *
     * Note: For the time being, there is no guarantee as to *when* the temporary directory is cleaned up.
     *
     * @returns {DirectoryDescriptor} a descriptor which may be used to access this directory
     */    
    createTempDirectory: function() {
	//Unix:    [FileUtilities.tmpdir] followed by [mkdtemp] and [FileUtilities.openDirectory]
	//Windows: [FileUtilities.tmpdir] followed by [GetTempFileName] and [FileUtilities.openDirectory]
    },

    /**
     * Open a copy of a file.
     *
     * If neither [directory] nor [name] is provided, the destination file is first created as with [createTemp]
     *
     * @param {DirectoryDescriptor=} directory Optionally, the directory in which to place the file.
     * @param {string=} name Optionally, the name of the file in the directory.
     * @return {FileDescriptor} The copy.
     */
    openFileCopy: function(directory, name) {
	//Unix:     need to implement -- note that some file systems support an [ioctl] for copy-on-write.
	//Windows:  maps to [CopyFile] http://msdn.microsoft.com/en-us/library/aa363851(v=VS.85).aspx
    },
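
The Unix strategy noted for createTempDirectory ([FileUtilities.tmpdir] followed by [mkdtemp]) can be sketched as follows. Since the proposed API does not exist yet, Node.js is used here purely as a stand-in runtime; the function name `createTempDirectorySketch` and the `jsfileapi-` prefix are hypothetical, and the sketch returns a path rather than a DirectoryDescriptor.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical stand-in for FileUtilities.createTempDirectory.
// fs.mkdtempSync appends six random characters to the prefix, much like mkdtemp(3).
function createTempDirectorySketch(prefix = "jsfileapi-") {
  const base = os.tmpdir();                       // plays the role of FileUtilities.tmpdir
  return fs.mkdtempSync(path.join(base, prefix)); // path of the newly created directory
}

const dir = createTempDirectorySketch();
console.log(fs.statSync(dir).isDirectory());
// Nothing cleans the directory up automatically, so remove it explicitly.
fs.rmdirSync(dir);
```

As in the proposal, cleanup is left to the caller in this sketch.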


General utilities

    /**
     * Copy a file.
     *
     * Note: OS-accelerated on some platforms.
     *
     * @param {string} source The name of the file/directory to copy.
     * @param {string} target The name of the file/directory to be created.
     * @param {boolean} overwrite If [false] and if the target already exists, fail.
     * @throws FileDescriptorError
     */
    copy: function(source, target, overwrite)
    {
	//Unix:     need to implement with [open], [read], [write], [close]
       //          (check for existing library routines in Glib / Qt)
	//Windows:  maps to [CopyFile] http://msdn.microsoft.com/en-us/library/aa363851(v=VS.85).aspx
            //Check if it works with directories
    },

    /**
     * Move a file.
     *
     * Note: OS-accelerated whenever possible.
     *
     * @param {string} source The name of the file/directory to move.
     * @param {string} target The name of the file/directory to be created.
     * @param {boolean} overwrite If [false] and if the target already exists, fail.
     * @throws FileDescriptorError
     */
    move: function(source, target, overwrite)
    {
	//Unix:     maps to [rename]; when [rename] fails with EXDEV, falls back on [FileUtilities.copy]+[FileUtilities.remove]
	//Windows:  maps to [MoveFile] http://msdn.microsoft.com/en-us/library/aa365239(v=VS.85).aspx
            //Check if it works with directories
    },
 
    /**
     * Remove a file/directory.
     */
    remove: function(name)
    {
	//Unix:    maps to [unlink] for files, [rmdir] for directories
	//Windows: maps to [DeleteFile] for files, [RemoveDirectory] for directories
    },
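
The Unix notes for copy and move above can be sketched as follows, for regular files only. This is a minimal illustration using Node.js as a stand-in runtime, not the proposed implementation; `copySketch`, `moveSketch` and the injectable `rename` parameter are hypothetical names introduced for the example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical sketch of FileUtilities.copy for regular files only.
// The "wx" flag makes open fail with EEXIST if the target exists, which
// implements overwrite === false without a separate existence check.
function copySketch(source, target, overwrite) {
  const input = fs.openSync(source, "r");
  try {
    const output = fs.openSync(target, overwrite ? "w" : "wx");
    try {
      const chunk = Buffer.alloc(64 * 1024);
      let n;
      // readSync returns 0 at end of file.
      while ((n = fs.readSync(input, chunk, 0, chunk.length, null)) > 0) {
        let written = 0;
        while (written < n) {
          written += fs.writeSync(output, chunk, written, n - written);
        }
      }
    } finally {
      fs.closeSync(output);
    }
  } finally {
    fs.closeSync(input);
  }
}

// Hypothetical sketch of FileUtilities.move: try rename, fall back on
// copy + remove when the two paths live on different file systems.
// `rename` is injectable only so that the fallback can be exercised.
function moveSketch(source, target, overwrite, rename = fs.renameSync) {
  // Note: this existence check is racy; it is good enough for a sketch.
  if (!overwrite && fs.existsSync(target)) {
    throw new Error("target already exists: " + target);
  }
  try {
    rename(source, target);
  } catch (err) {
    if (err.code !== "EXDEV") throw err;
    copySketch(source, target, overwrite);
    fs.unlinkSync(source);
  }
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "move-"));
const a = path.join(dir, "a.txt");
const b = path.join(dir, "b.txt");
fs.writeFileSync(a, "hello");
moveSketch(a, b, false);
console.log(fs.readFileSync(b, "utf8")); // prints "hello"
```

Directories are deliberately left out, since the notes above still list directory support as something to check.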

Constants

    /**
     * Return the location of the directory/folder used to store temporary files.
     *
     * Computed lazily.
     *
     * @return {DirectoryDescriptor}
     */
    get tmpdir() {
	//All platforms: use [nsIDirectoryService] [NS_OS_TEMP_DIR] to get the directory the first time

	//Alternative solution:
	//Unix minus Android: maps to [getenv] for "TMPDIR"
	//Android:            TODO - probably somewhere in the Moz preferences directory - check with nsIDirectoryService
	//Windows:            maps to [GetTempPath] http://msdn.microsoft.com/en-us/library/windows/desktop/aa364992%28v=vs.85%29.aspx
	   //Note: Perhaps we should check with nsIDirectoryService
    },

    /**
     * Return a well-known directory, such as the user profile directory, etc.
     *
     * Results are cached.
     *
     * Note: for performance and usability reasons, we will probably progressively add functions such as [get tmpdir] for other well-known directories.
     *
     * @param {string} key The key of a well-known directory. The list of keys is defined only in http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsDirectoryServiceDefs.h .
     */
    getDirectory: function(key) {
        //All platforms: use [nsIDirectoryService], cache result
    },
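
The "computed lazily" and "results are cached" behaviour described above could look roughly like this. The resolver below is a hypothetical stand-in for [nsIDirectoryService], the two keys are illustrative only, and plain path strings are returned instead of DirectoryDescriptor objects.

```javascript
const os = require("os");

// Hypothetical resolver standing in for nsIDirectoryService; only two
// illustrative keys are supported in this sketch.
let lookups = 0;
function resolveDirectory(key) {
  lookups++;
  if (key === "TmpD") return os.tmpdir();
  if (key === "Home") return os.homedir();
  throw new Error("unknown directory key: " + key);
}

const FileUtilitiesSketch = {
  _cache: new Map(),

  // Resolved on first access only, then served from the cache.
  get tmpdir() {
    return this.getDirectory("TmpD");
  },

  getDirectory(key) {
    if (!this._cache.has(key)) {
      this._cache.set(key, resolveDirectory(key));
    }
    return this._cache.get(key);
  },
};

FileUtilitiesSketch.tmpdir;
FileUtilitiesSketch.tmpdir;
console.log(lookups); // prints 1: the second access hit the cache
```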


Flags

Flags for file opening

Note that these flags are split into categories for performance and portability reasons. Within each category, flags are meant to be OR-ed together.

    Open: {
	/**
        * Open for reading, writing or both.
        *
	 * @enum {number}
	 */
	Access: {
           /** Open file for reading */
	    READ:   ...,
           /** Open file for writing */
	    WRITE:  ...,
	},

	/**
	 * @enum {number}
	 */
	Content: {
           /** Create file if it doesn't exist*/
	    MAY_CREATE:...,
           /** Create file; fail if the file already exists */
           MUST_CREATE:...,
           /** Write at the start of file if it exists. If not specified, append.*/
	    OVERWRITE: ...,
	},

	/**
	 * @enum {number}
	 */
	Pragma: {
           /** Windows-specific pragma: use Posix-style file names, i.e. two file names that differ only in case should not be collapsed*/
	    POSIX_SEMANTICS:     ...,

           /** Windows-specific pragma: optimize cache for sequential access*/
	    SEQUENTIAL_ACCESS:   ...,

           /** Windows-specific pragma: optimize cache for random access*/
	    RANDOM_ACCESS:       ...,

           /** Windows-specific pragma: do not buffer writes*/
	    WRITE_THROUGH:       ...
	}
    },
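
To illustrate how the separate flag categories might be combined on Unix, here is a sketch that translates an access mode and a content mode into [open] flags. The numeric values are placeholders (the proposal leaves them unspecified), `toPosixFlags` is a hypothetical helper, and the Windows-specific Pragma category is ignored. Node.js is used only to obtain the platform's O_* constants.

```javascript
const { constants: C } = require("fs");

// Placeholder bit values: the proposal leaves the actual numbers unspecified.
const Open = {
  Access:  { READ: 1, WRITE: 2 },
  Content: { MAY_CREATE: 1, MUST_CREATE: 2, OVERWRITE: 4 },
};

// Hypothetical translation of (accessMode, contentMode) to open(2) flags.
function toPosixFlags(accessMode, contentMode) {
  const read = (accessMode & Open.Access.READ) !== 0;
  const write = (accessMode & Open.Access.WRITE) !== 0;
  if (!read && !write) throw new Error("accessMode must include READ or WRITE");

  let flags = read && write ? C.O_RDWR : write ? C.O_WRONLY : C.O_RDONLY;
  if (contentMode & Open.Content.MUST_CREATE) {
    flags |= C.O_CREAT | C.O_EXCL; // fail if the file already exists
  } else if (contentMode & Open.Content.MAY_CREATE) {
    flags |= C.O_CREAT;
  }
  // Per the Content documentation, writes append unless OVERWRITE is given.
  if (write && !(contentMode & Open.Content.OVERWRITE)) {
    flags |= C.O_APPEND;
  }
  return flags;
}

const flags = toPosixFlags(
  Open.Access.READ | Open.Access.WRITE,
  Open.Content.MAY_CREATE | Open.Content.OVERWRITE
);
console.log(flags === (C.O_RDWR | C.O_CREAT)); // prints true
```

Keeping the categories apart means each one can be validated and translated independently, which is one plausible reading of the portability argument above.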

Flags for seeking in a file

    Seek: {
	/**
	 * Possible methods for seeking.
	 *
	 * @enum {number}
	 */
	Method: {
	    /**
	     * Seek from file start
	     */
	    SET: ...,

	    /**
	     * Seek from current position
	     */
	    CUR: ...,

	    /**
	     * Seek from file end
	     */
	    END: ...
	},
    },

Interfaces

    /**
     * The kind of information that can be found by calling [FileDescriptor.info] or [DirectoryDescriptor.forEachFile].
     *
     * Note that some or all fields may be computed lazily.
     *
     * @interface
     */
    FileInfo: { 
	/**
	 * @return {number} milliseconds
	 */
	get lastModificationTime() {
	    ...
	},
	/**
	 * @return {number} bytes
	 */
	get size() {
	    ...
	},

	/**
	 * @return {boolean}
	 */
	get isExecutable() {
	    //Windows: [GetBinaryType] http://msdn.microsoft.com/en-us/library/windows/desktop/aa364819(v=VS.85).aspx
	    //Unix:    contents of [stat]
	},
       /**
        * Note: this property is OS-accelerated for entries returned by [forEachFile] or by enumerating files in a directory.
        *
        * @return {boolean}
        */
       get isDirectory() {
            
       }
    },

    /**
     * An exception thrown by this module.
     *
     * @constructor
     * @extends {Error}
     */
    Error: function(){}
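
The note that FileInfo fields "may be computed lazily" can be illustrated with getters that share a single, deferred stat call. This is only a sketch: `LazyFileInfo` and its `statCalls` counter are hypothetical, and Node.js's `fs.statSync` stands in for the platform calls.

```javascript
const fs = require("fs");
const os = require("os");

// Hypothetical FileInfo whose fields are computed on first use.
// All getters share one stat call, made lazily and then cached.
class LazyFileInfo {
  constructor(path) {
    this._path = path;
    this._stat = null;
    this.statCalls = 0; // instrumentation for the demo only
  }
  _ensureStat() {
    if (this._stat === null) {
      this._stat = fs.statSync(this._path);
      this.statCalls++;
    }
    return this._stat;
  }
  get lastModificationTime() { return this._ensureStat().mtimeMs; } // milliseconds
  get size() { return this._ensureStat().size; }                     // bytes
  get isDirectory() { return this._ensureStat().isDirectory(); }
}

const info = new LazyFileInfo(os.tmpdir());
console.log(info.statCalls);   // prints 0: nothing has been read yet
console.log(info.isDirectory); // prints true
console.log(info.statCalls);   // prints 1
```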

Instances of FileDescriptor

A FileDescriptor is a low-level object wrapping a native file descriptor (under variants of Unix) or a file handle (under Windows).

Reading

    /**
     * Read some content from a file, starting at the current position, and advance that position.
     *
     * @param {ArrayBuffer} buf The buffer which will receive the data.
     * @param {number} offset The position in the array at which to start putting data, in bytes.
     * @param {number} size The maximum number of bytes to read. This method can read fewer bytes if the file is shorter.
     * @return {number} The number of bytes read.
     * @throws {FileDescriptorException} In case of error.
     */
    read: function(buf, offset, size) {
	//Unix:    [read]
	//Windows: [ReadFile]  http://msdn.microsoft.com/en-us/library/windows/desktop/aa365467%28v=VS.85%29.aspx
    },
 
    /**
     * As [read], but read from a given position and do not advance.
     *
     * @param {number} fileOffset The position in the file from which to read.
     * @param {ArrayBuffer} buf The buffer which will receive the data.
     * @param {number} offset The position in the array at which to start putting data, in bytes.
     * @param {number} size The maximum number of bytes to read. This method can read fewer bytes if the file is shorter.
     * @return {number} The number of bytes read.
     * @throws {FileDescriptorException} In case of error.
     */
    pread: function(fileOffset, buf, offset, size) {
	//Unix:     [pread]
	//Windows:  [ReadFile] + [SetFilePointer] http://msdn.microsoft.com/en-us/library/windows/desktop/aa365541(v=VS.85).aspx
    },
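
The difference between [read] (advances the position) and [pread] (reads at a given offset and leaves the position alone) can be demonstrated with Node.js's `fs.readSync`, whose `position` argument has the same two behaviours. The free functions below are hypothetical wrappers that take the descriptor as an extra first argument; they are not the proposed methods themselves.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical wrappers with the proposed argument order, backed by fs.readSync.
// A null position reads at the current file position and advances it;
// a numeric position reads there and leaves the current position untouched.
function read(fd, buf, offset, size) {
  return fs.readSync(fd, new Uint8Array(buf), offset, size, null);
}
function pread(fd, fileOffset, buf, offset, size) {
  return fs.readSync(fd, new Uint8Array(buf), offset, size, fileOffset);
}

const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "read-")), "data.txt");
fs.writeFileSync(file, "abcdefgh");
const fd = fs.openSync(file, "r");
const buf = new ArrayBuffer(8);
const text = (n) => Buffer.from(buf, 0, n).toString("latin1");

console.log(read(fd, buf, 0, 3), text(3));     // prints: 3 abc
console.log(pread(fd, 6, buf, 0, 4), text(2)); // prints: 2 gh  (short read at end of file)
console.log(read(fd, buf, 0, 3), text(3));     // prints: 3 def (pread did not move the position)
fs.closeSync(fd);
```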

Writing

    /**
     * Write some content to a file at the current position and advance that position.
     *
     * @param {ArrayBuffer} buf The buffer containing the data.
     * @param {number} offset The position in the array at which the data starts, in bytes.
     * @param {number} size The maximum number of bytes to write. This method can write fewer bytes, depending on buffering.
     *
     * @return {number} The number of bytes written.
     * @throws {FileDescriptorException} In case of error.
     */
    write: function(buf, offset, size) {
	//Unix:    [write]
	//Windows: [WriteFile] http://msdn.microsoft.com/en-us/library/windows/desktop/aa365747%28v=VS.85%29.aspx
    },
 
    /**
     * As [write], but write to a specific position and do not advance
     */
    pwrite: function(fileOffset, buf, offset, size) {
	//Unix: [pwrite]
	//Windows: [WriteFile] + [SetFilePointer]
    },
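
Because [write] may write fewer bytes than requested, callers that need the whole buffer on disk typically loop. The helper below is a sketch of that loop; `writeAll` and `shortWrite` are hypothetical names, and Node.js's `fs.writeSync` stands in for the proposed method, artificially capped at 3 bytes per call to simulate short writes.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: keep calling a write primitive until `size` bytes
// from `buf` (starting at `offset`) have been written.
// `writeOnce(buf, offset, size)` must return the number of bytes it wrote.
function writeAll(writeOnce, buf, offset, size) {
  let done = 0;
  while (done < size) {
    const n = writeOnce(buf, offset + done, size - done);
    if (n <= 0) throw new Error("write made no progress");
    done += n;
  }
  return done;
}

const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "write-")), "out.txt");
const fd = fs.openSync(file, "w");
// Simulate short writes by never writing more than 3 bytes per call.
const shortWrite = (buf, offset, size) =>
  fs.writeSync(fd, new Uint8Array(buf), offset, Math.min(size, 3));

// Uint8Array.from allocates a fresh ArrayBuffer that holds exactly these bytes.
const data = Uint8Array.from(Buffer.from("hello, world", "utf8"));
writeAll(shortWrite, data.buffer, 0, data.byteLength);
fs.closeSync(fd);
console.log(fs.readFileSync(file, "utf8")); // prints "hello, world"
```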

Attributes

    /**
     * Gather information about the file
     *
     * @return {FileUtilities.FileInfo} information about the file.
     */
    stat: function() {
	//Unix:    [fstat]
	//Windows: [GetFileInformationByHandle] http://msdn.microsoft.com/en-us/library/windows/desktop/aa364952(v=VS.85).aspx
    },

    /**
     * Set the size of the file
     *
     * @param {number} newSize The size to give to the file.
     */
    setSize: function(newSize) {
	//Unix:    [ftruncate]
	//Windows: [SetFileValidData] http://msdn.microsoft.com/en-us/library/windows/desktop/aa365544%28v=VS.85%29.aspx
    },

Misc

    /**
     * Change the position in the current file
     *
     * @param {number} delta Number of bytes. Can be positive or negative.
     * @param {FileDescriptor.Seek.Method} method Determines whether [delta] is to be taken from the start of the file, from the end or from the current position.
     */
    seek: function(delta, method) {
	//Unix:    [lseek]
	//Windows: [SetFilePointer]
    },

    /**
     * Close a file descriptor.
     *
     * Any further operation on that file descriptor will throw an exception.
     */
    close: function() {
	//Unix:    [close]
	//Windows: [CloseHandle]
    },

    /**
     * Flush the buffer
     */
    flush: function() {
	//Unix:    [fsync]
	//Windows: [FlushFileBuffers] http://msdn.microsoft.com/en-us/library/windows/desktop/aa364439(v=VS.85).aspx
    },
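
The semantics of [seek] and [close] described above can be modelled in memory, without touching a real file. In this sketch the numeric values of Seek.Method are placeholders, `PositionModel` is a hypothetical class, and returning the new position from `seek` is an assumption (the proposal does not document a return value).

```javascript
// Placeholder values: the proposal does not fix the numbers behind Seek.Method.
const Seek = { Method: { SET: 0, CUR: 1, END: 2 } };

// Hypothetical in-memory model of a descriptor's position and open/closed state.
class PositionModel {
  constructor(size) {
    this.size = size;
    this.position = 0;
    this.closed = false;
  }

  seek(delta, method) {
    if (this.closed) throw new Error("descriptor is closed");
    let base;
    switch (method) {
      case Seek.Method.SET: base = 0; break;
      case Seek.Method.CUR: base = this.position; break;
      case Seek.Method.END: base = this.size; break;
      default: throw new Error("unknown seek method");
    }
    const target = base + delta;
    if (target < 0) throw new Error("cannot seek before the start of the file");
    this.position = target; // seeking past the end is allowed, as with lseek
    return target;
  }

  close() {
    this.closed = true;
  }
}

const f = new PositionModel(100);
console.log(f.seek(10, Seek.Method.SET)); // prints 10
console.log(f.seek(-4, Seek.Method.CUR)); // prints 6
console.log(f.seek(-1, Seek.Method.END)); // prints 99
f.close();
```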

Instances of DirectoryDescriptor

A DirectoryDescriptor is a slightly higher-level object wrapping a directory _name_ (for reasons of portability and iteration, this seemed more appropriate than _opening_ the directory during construction). On Unix, some of the methods rely on functions from recent versions of POSIX, such as openat, and have to reimplement them on systems that lack them.

Opening/creating

    /**
     * Open a file from a directory
     *
     * @param {string} leafName The name of the file.
     * @param {number} accessMode An OR-ing of flags, as specified by [FileDescriptor.Open.Access].
     * @param {number} contentMode An OR-ing of flags, as specified by [FileDescriptor.Open.Content].
     * @param {number} pragmaMode An OR-ing of flags, as specified by [FileDescriptor.Open.Pragma].
     * @return {FileDescriptor} a FileDescriptor
     *
     * @throws FileDescriptorError
     */
    openFile: function(leafName, accessMode, contentMode, pragmaMode) {
	//Linux:     [openat]
       //Unix:      decide between gnulib [openat] and simply [open]
	//Windows:  cf. [FileDescriptor.open]
    },

    /**
     * Create a temporary file in this directory. This file is deleted when it is closed or when the process exits.
     */
    createTempFile: function() {
	//Unix:        uses [mkstemp] and [this.openFile]
	//Windows:     maps to [GetTempFileName] + [CreateFile] http://msdn.microsoft.com/en-us/library/windows/desktop/aa363875%28v=vs.85%29.aspx
    },

    /**
     * Open a subdirectory of this directory.
     *
     * @param {string} leafName The platform-specific name of the directory.
     * @param {number=} accessMode An OR-ing of flags, as specified by [FileDescriptor.OpenDir.Access]
     *
     * @returns {DirectoryDescriptor} a descriptor which may be used to access this directory
     */
    openDirectory: function(leafName, accessMode) {
	//Unix:    lazy -- may call [openat]
	//Windows: lazy
    },

    /**
     * Create a temporary directory.
     *
     * Note: For the time being, there is no guarantee that the temporary directory will be cleaned
     *
     * @returns {DirectoryDescriptor} a descriptor which may be used to access this directory
     */    
    createTempDirectory: function()
    {
    },
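
A DirectoryDescriptor that wraps a name and opens things lazily might be sketched as below. `DirectorySketch` is a hypothetical class; it emulates [openat] by joining paths (which is not race-free the way a real [openat] is), and it takes a Node.js flag string in place of the accessMode/contentMode/pragmaMode triple.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical DirectoryDescriptor: stores a name and touches the file
// system only when a method needs it.
class DirectorySketch {
  constructor(dirPath) {
    this.path = dirPath;
  }

  // Reject anything that is not a single path component, so callers cannot
  // escape the directory or pass in a full path.
  _resolve(leafName) {
    if (leafName === "" || leafName === "." || leafName === ".." ||
        leafName.includes("/") || leafName.includes("\\")) {
      throw new Error("not a leaf name: " + leafName);
    }
    return path.join(this.path, leafName);
  }

  // Lazy: no system call is made here.
  openDirectory(leafName) {
    return new DirectorySketch(this._resolve(leafName));
  }

  // `flags` is a Node flag string such as "r" or "wx", standing in for the
  // flag arguments of the proposal.
  openFile(leafName, flags) {
    return fs.openSync(this._resolve(leafName), flags);
  }
}

const root = new DirectorySketch(fs.mkdtempSync(path.join(os.tmpdir(), "dir-")));
const fd = root.openFile("notes.txt", "wx"); // creates the file, fails if it exists
fs.writeSync(fd, "hi");
fs.closeSync(fd);
console.log(fs.readFileSync(path.join(root.path, "notes.txt"), "utf8")); // prints "hi"
```

Restricting arguments to leaf names is in line with the decision, listed under "Not implemented", not to open files from full paths.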

Information

    /**
     * Gather information about the directory
     *
     * @return {FileUtilities.FileInfo} information about the directory.
     */
    stat: function() {
	//Unix:    [lstat]
	//Windows: [GetFileInformationByHandle] http://msdn.microsoft.com/en-us/library/windows/desktop/aa364952(v=VS.85).aspx
    },

Browsing contents

This API provides two ways of browsing contents:

  • using the directory as an iterator;
  • using the more powerful function [forEachFile], which also provides fast filtering on platforms where it is possible.

    /**
     * Apply a treatment to all files in the directory.
     *
     * Note: objects of type DirectoryDescriptor are iterable. Therefore, you can also loop through them using a standard [for..in].
     *
     * @param {string=} filter If provided, uses OS-accelerated, platform-specific filtering, where available.
     * @param {function(string, FileUtilities.FileInfo, number, function(): FileDescriptor)} onFile A function called for each file in the directory, with the name of the file, a (lazy) file info for that file, a file number and a function that opens the file on demand. If the function returns anything other than [null], the loop stops immediately and returns the value returned by that function.
     *
     * @returns The first value returned by [onFile], or [undefined] otherwise.
     */
    forEachFile: function(filter, onFile) {
	//Unix:    maps to [opendir], [dirfd], [readdir]/[readdir64], lazy calls to [stat], lazy calls to [openat]/[open], [closedir]
	//Windows: maps to [FindFirstFile], [FindNextFile], [FindClose]
    }
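
The early-exit contract of [forEachFile] can be sketched with Node.js standing in for the platform calls. Several points here are assumptions made for the example: the function takes the directory path as a first argument instead of being a method, `filter` is reduced to a plain suffix test, and a callback result of either `null` or `undefined` means "keep going".

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical sketch of forEachFile. Real OS-level filtering
// (e.g. FindFirstFile patterns) is not modelled.
function forEachFile(dirPath, filter, onFile) {
  const entries = fs.readdirSync(dirPath, { withFileTypes: true });
  let index = 0;
  for (const entry of entries) {
    if (filter && !entry.name.endsWith(filter)) continue;
    const full = path.join(dirPath, entry.name);
    let stat = null; // filled in on first access to size
    const info = {
      get isDirectory() { return entry.isDirectory(); }, // no stat needed
      get size() { return (stat = stat || fs.statSync(full)).size; },
    };
    const open = () => fs.openSync(full, "r"); // lazy open, only if requested
    const result = onFile(entry.name, info, index++, open);
    if (result !== null && result !== undefined) return result;
  }
  return undefined;
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "each-"));
fs.writeFileSync(path.join(dir, "a.txt"), "12345");
fs.writeFileSync(path.join(dir, "b.log"), "x");
fs.mkdirSync(path.join(dir, "sub.txt"));

// Find the first regular .txt file and report its size.
const found = forEachFile(dir, ".txt", (name, info) =>
  info.isDirectory ? null : name + ":" + info.size);
console.log(found); // prints "a.txt:5"
```

The directory test uses the entry type returned by the directory listing itself, which mirrors the note that isDirectory is OS-accelerated for enumerated entries.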

Not implemented

  • chmod, chown -- very different between platforms - might implement platform-specific functions
  • select, poll, ... -- very different between platforms, higher level
  • mmap -- probably feasible, just might require additional API
  • locking -- very different between platforms, most likely deserves its own API
  • linking -- very different between platforms
  • readString, writeString -- ArrayBuffer <-> String conversion most likely deserves its own API
  • opening a file or directory from a full path -- error-prone, difficult to optimize, favors hardcoding non-portable paths -- also, we intend to use this API mostly to access files in well-known directories.

Implementation notes

  • For the moment, the JS team does not recommend using js-ctypes for performance-critical code. Rather, they recommend using the JSAPI, so this is probably the right way to go.
  • This is JS code, so by definition not thread-safe.
  • Depending on demands by API users, a C++ version may be produced. In this case, we will probably want to make it MT-safe.