ServerJS/Filesystem API/A: Difference between revisions

From MozillaWiki
Jump to navigation Jump to search
(Oops. Moved to ServerJS/Filesystem_API/A, the more established name space.)
 
m (Add link the ML thread)
 
(8 intermediate revisions by 2 users not shown)
Line 1: Line 1:
= Tier 1 =
Proposal A, Draft 5 (in development)


This tier establishes a basis with binary IO and basic string manipulation for paths.
The "file" module supports the File System API, providing an interface for path, directory, file, link, file stat, and file stream manipulation.


The require('file') module would provide support the File System API.  In turn, the File System API would provide all methods that deal with path strings.  File System API functions would return file stream objects, directory objects, and file stat objects. File stream objects would, for securability, be unaware of their corresponding path, and keep their underlying file system connection, albeit a file descriptor or FILE*, and their parent directory link, secret.
For the purpose of this documentation, "fs" is a variable name for any object that implements the File System API, including the exports of the "file" module, and objects returned by "fs.chroot(path)".


[http://groups.google.com/group/commonjs/browse_thread/thread/c35d10239fade9ab# Discussion Thread].


The File System API would provide the following constants:


* SEPARATOR "/" or system specific analog, like ":" or "\".
=== Security ===


The File System API would provide the following methods:
Objects implementing the File System API, including the "file" module object, are capability bearing objects that carry and mediate authority to read and write to the underlying storage.  As such, the "file" module can return other objects that implement and attenuate the File System API for sandboxing.  Furthermore, streams returned by the file system object are implicitly attenuated to only give the receiver authority to manipulate the given file, without knowledge of the path on which it resides or access to references that would permit it to manipulate other parts of the file system.


* join(...paths:Strings...): takes a variadic list of path Strings, joins them on the directory separator, and normalizes the result.


* split(path:String): returns a string split on the file system's directory separator. The split() method will return an empty string as the first element for absolute paths that do not contain drives. For file systems with drives (such as windows), the first element will be the drive with colon if the path contained a drive specification. The intent is so that join with the SEPARATOR will correctly join the elements to reconstitute the path.
=== Interface ===


* resolve(...paths:Strings...): marches through a list of variadic absolute or relative path Strings, resolving the next path relative to the previous and returning the ultimate destination path.  This is analogous to navigating to the fully qualified URL of a relative URL on a given page.  Unlike "join", which presumes that the base path is always a directory, "resolve" considers a directory separator at the end of a path an indication that the path must be resolved off of the directory rather than the leaf "file".  For example, resolve("a", "b") returns "b", but resolve("a/", "b") returns "a/b" on a system with "/" as its directory separator.  Resolving a fully qualified path relative to any base path returns the fully qualified path, like resolve("a", "/") == "/".  Resolve is purely a string manipulation routine and does not use information about the underlying file system.
The "file" module must export the following:


* relative(from, to): returns the relative path from one path to another using only ".." to traverse up to the two paths' common ancestor.
==== Files ====


* normal(path:String): removes '.' path components and simplifies '..' paths, if possible for a given path.  If the file system is case sensitive, transforms all letters to lower-case to make them unambiguous. "normal" may be implemented in terms of "resolve".
; open(path String, mode|options)
: returns a stream object that supports an appropriate interface for the given options and mode, which include reading, writing, updating, byte, character, unbuffered, buffered, and line buffered streams.  More details follow in the [#Stream] section of this document.
* path
* mode: "rwa+bxc", or
* options
** mode String
** charset String
** read Boolean
** write Boolean
** append Boolean
** update Boolean
** binary Boolean
** exclusive Boolean
** canonical Boolean
; read(path String, [mode|options])
: opens, reads, and closes a file, returning its content.
; write(path String, content String|Binary, [mode|options])
: opens, writes, flushes, and closes a file with the given content.  If the content is a ByteArray or ByteString, the binary mode is implied.
; copy(source String, target String)
: reads one file and writes another in byte mode.
; move(from String, to String)
:
; remove(path String)
:
; rename(path String, name String)
:
; touch(path, [mtime Date])
:


* absolute(path:String): returns the absolute path, starting with the root of this file system object, for the given path, resolved relative to the current working directory.  If the file system supports home directory aliases, absolute resolves those from the root of the file system.  The resulting path is in normal form.  On most systems, this is equivalent to expanding any user directory alias, joining the path to the current working directory, and normalizing the result.  "absolute" can be implemented in terms of "resolve" and "cwd".
==== Directories ====


* canonical(path:String): returns the canonical path to a given abstract path.  Canonical paths are both absolute and intrinsic, such that all paths that refer to a given file (whether it exists or not) have the same corresponding canonical path.  This function may not communicate information about the true parent directories of files in chroot environments.  This function is equivalent to expanding a user directory alias, joining the given path to the current working directory, joining all symbolic links along the path, and normalizing the result.  "canonical" can be implemented in terms of "cwd", "resolve", and "readlink".
('''TODO''' Should the methods in this section use "mkdir" or "makeDir" style names? [http://groups.google.com/group/serverjs/browse_thread/thread/851fea66620bae13 discussion])


* exists(path): whether a file exists at a given path: receives a path and returns whether that path, joined on the current working directory, corresponds to a file that exists. If the file is a broken symbolic link, returns false.
; list(path String) Iterator
: returns an iterator that yields the names of the entries in a directory.  Throws an error if the directory is inaccessible or does not exist.
; mkdir(path String)
: Create a single directory specified by ''path''. If the directory cannot be created for any reason an exception will be thrown. This includes if the parent directories of "path" are not present.
:
; mktree(path String)
: '''TODO''' Decide upon the name of this function. "mkdirs" or "mktree" or "makeTree"
: Will create the directory specified by "path" including any missing parent directories.
:
; rmdir(path String)
: Removes a directory if it is empty. If it is not, or cannot be removed for another reason an exception will be thrown.
:
; rmtree(path String)
:
: This does its best to remove the directory "path", akin to "rm -r" on Unix or "del /s" on Windows. That is it will recursively delete all directories and files under "path".


* stat(path): returns an object that represents a snapshot of the information about given file.
==== Links ====
*# The stat file must contain the following properties:
*## mtime: the time that the file was last modified
*## other properties will be specified in a future specification, including other times, ownership, permissions, and size.
*# The path is resolved relative to the current working directory.
*# Throws an error if the corresponding file does not exist or is inaccessible.


* list(path): returns an Array of file name strings.
==== Paths ====
*# The returned object may be immutable.
*# The path is resolved relative to the current working directory.
*# Throws an error if the directory does not exist or is inaccessible.


* open(path, mode) or open({options})
; [new] Path(path String|Path|Array, [fs FileSystem]) Path
*# options may include {path, mode} and more in a later, more detailed specification.
: returns a Path object that closes on a File System object and a "path" representation. The path object is a chainable shorthand for working with paths in the context of the "file" module.  "Path" objects have no more or less authority to manipulate the file system than the FileSystem object that they are attached to, as any path string is reachable by chaining operations on a path instanceThe FileSystem object defaults to the "file" module if the argument is omitted or undefined. More details follow in the [#Path Path] section of this document.
*# The mode is a String and may contain:
; path(path String|Path, fs FileSystem) Path
*## "r" that means that the stream may be read, and implies that the "read", "readLine", "readLines", "next", and "iter" methods must be supported by the returned stream object.  "readLine" returns "" on EOF, and "next" throws an error instead.  "iter" returns the stream object itself.  "readLine" returns a string excluding the newline character.
: "fs.path(path)" is a shorthand for "new fs.Path(path, fs)".
*## "w" that means that the stream may be written to, and implies that the "write", "writeLine", and "writeLines" methods must be supported by the stream objectAlso implies that the file will be created or truncated if necessary, if not overridden by the "a" or "+" flags.
*## "a" that means that the stream's position will begin at the end of the stream, and that the file will be created but not truncated.
*## "+" that means that the stream will not be truncated.
*# The path is resolved relative to the current working directory.
*# Throws an error if the specified stream cannot be created.
*# close(): Closes the stream. Should be called automatically when the mode is changed, and when the garbage collector collects its last reference.


* read(path, [mode, [options]]): opens, reads, and closes a file, returning its content.
===== Working Path =====


* write(path, content, [mode, [options]]): opens, writes, flushes, and closes a file with the given content.
The current working directory is used by all routines that resolve relative paths to absolute file system paths, including "open", and "absolute".


= Tier 2 =
; cwd() String
: returns the current working directory.
; chdir(path String)
: changes the current working directory.


This tier adds support for encoded and buffered text IO and a chainable Path type.
===== Traditional =====


* open(path, mode, {options}) or open({options})
; join(...)
*# options may include {path, mode, charset, recordSeparator, fieldSeparator} and more in a later, more detailed specification.
: takes a variadic list of path Strings, joins them on the file system's path separator, and normalizes the result. [[ServerJS/Filesystem_API/Join|details]]
*# The mode is a String and may additionally contain:
; split(path String) Array
*## "b" for ByteArray and ByteString streams.
: returns an array of path components.  If the path is absolute, the first component will be an indicator of the root of the file system; for file systems with drives (such as Windows), this is the drive identifier with a colon, like "c:"; on Unix, this is an empty string "". The intent is that calling "join.apply" with the result of "split" as arguments will reconstruct the path.
*## "t" for String and Array streams (default)
; normal(path String)
*# "charset" may be any IANA charset identifier, case-insensitive"open" may throw an error if the charset is not supported.  "charset" defaults to a system-specific encoding if the stream is in text mode with no charset provided.
: removes '.' path components and simplifies '..' paths, if possible, for a given path.
*# returns a stream typeWhile the type names need not correspond to these name, these are used as an interal reference in this document for what the minimum interface of those streams must contain.
; absolute(path String)
*## Returns a BinaryReadStream for "b" and "r" modes.
: returns the absolute path, starting with the root of this file system object, for the given path, resolved from the current working directoryIf the file system supports home directory aliases, absolute resolves those from the root of the file system.  The resulting path is in normal form.  On most systems, this is equivalent to expanding any user directory alias, joining the path to the current working directory, and normalizing the result.  "absolute" can be implemented in terms of "cwd", "join", and "normal".
*## Returns a BinaryWriteStream for "b" and "w" or "a" modes.
; canonical(path String)
*## Returns a BinaryRandomStream for "b" and "+" modes.
: returns the canonical path to a given abstract pathCanonical paths are both absolute and intrinsic, such that all paths that refer to a given file (whether it exists or not) have the same corresponding canonical path.  This function must not communicate information about the true parent directories of files in chroot environments. This function is equivalent to expanding a user directory alias, joining the given path to the current working directory, joining all symbolic links along the path, and normalizing the result. "canonical" can be implemented in terms of "cwd", "join", "normal" and "readlink".
*## Returns a TextReadStream wrapper for "t" and "r" modes.
; dirname(path String) String
*## Returns a TextWriteStream wrapper for "t" and "w" modes.
: returns the path of a file's containing directory, albeit the parent directory if the file is a directory.  A terminal directory separator is ignored.
; basename(path String, [extension String]) String
: returns the part of the path that is after the last directory separator. If an extension is provided and is equal to the file's extension, the extension is removed from the result.
; extension(path String) String
: returns the extension of a file.  The extension of a file is the last dot (excluding any number of initial dots) followed by one or more non-dot characters. Returns an empty string if no valid extension exists.  [http://github.com/kriskowal/narwhal-test/blob/master/src/test/file/extension.js unit test].


* BinaryReadStream
===== URL-like =====
** read() -> all:ByteString
** read(max:Number) -> actual:ByteString
** readInto(buffer:Array|ByteArray, [begin:Number, [end:Number]]) -> actual:Number
** available() -> Number -- how many bytes are ready to be read (buffered) without blocking.
** skip(n:Number) -> advance the read head past these bytes.
* BinaryWriteStream
** write(buffer:Array|ByteArray|ByteString)
** truncate(length:Number=0)
** flush()
* BinaryRandomStream < BinaryReadStream < BinaryWriteStream
** tell() -> position:Number)
** seek(position:Number, whence:enumerated)
** rewind() -- returns the read/write head to the beginning of the file
** truncate([size:Number]) -- sets the length of the file.  size defaults to the current position as reported by tell().  If a size is explicated, truncate also seeks to the new end.
* TextReadStream
** raw -> a BinaryReadStream
** read() -> String -- returns all remaining characters
** read(max:Number) -> String -- returns up to max characters, with the length of the returned string reflecting the actual number read.
** readLine() -> String -- returns "" if no data is available before EOF.  Otherwise, includes the "recordSeparator".
** readLines() -> Array * String
** next() -> String -- throws a StopIteration if no data is available before EOF.
** input() -> returns "readLine" without the "recordSeparator".
** available() -> Number -- how many characters are ready to be read (buffered) without blocking.
** skip(n:Number) -> advance the read head past these characters.
* TextWriteStream
** raw -> a BinaryWriteStream
** write(buffer:String)
** writeLine(buffer:String)
** print(...String...) -- writes a "fieldSeparator" delimited and "recordSeparator" terminated line.
** flush()


The File System API would provide the following additional methods:
; resolve(...)
: a function like "join" except that it treats each argument as as either an absolute or relative path and, as is the convention with URL's, treats everything up to the final directory separator as a location, and everything afterward as an entry in that directory, even if the entry refers to a directory in the underlying storage.  Resolve starts at the location "" and walks to the locations referenced by each path, and returns the path of the last file.  Thus, resolve(file, "") idempotently refers to the location containing a file or directory entry, and resolve(file, neighbor) always gives the path of a file in the same directory.  "resolve" is useful for finding paths in the "neighborhood" of a given file, while gracefully accepting both absolute and relative paths at each stage. [http://github.com/kriskowal/narwhal-test/blob/master/src/test/file/resolve.js unit test].
; relative(from, to)
: returns the relative path from one path to another using only ".." to traverse up to the two paths' common ancestor.


* basename(path:String) -> String
==== Tests ====
* copy(from:String, to:String) -> copies a file by reading one and writing the other in binary modes.
* dirname(path:String) -> String
* extname(path:String) -> String
* isDirectory(path:String) -> Boolean
* isFile(path:String) -> Boolean
* isLink(path:String) -> Boolean
* isReadable(path:String) -> Boolean
* isWritable(path:String) -> Boolean
* mkdir(path:String)
* mkdirs(path:String)
* move(from:String, to:String)
* mtime() -> lastModification:Date|null
* remove(path:String)
* rename(path:String, name:String)
* rmdir(path:String)
* rmtree(path:String)
* same(from:String, to:String) -> Boolean -- whether the two files are identical, in that they come from the same file system, same device, and have the same node and corresponding storage, such that modifying one would modify the other.
* size(path:String) -> bytes:Number
* touch(path:String, [mtime:Date])


The Path object closes on both a file system and a path StringThe file system mediates all interaction with the underlying storage, so a Path only provides a convenient interface for chaining operations that manipulate a path string.  All of the methods of path are polymorphic (meaning late-bound) and curried (meaning the enclosed path String is passed to the corresponding file system method).
; exists(path)
:whether a file exists at a given path: receives a path and returns whether that path, joined on the current working directory, corresponds to a file that existsIf the file is a broken symbolic link, returns false.
; isFile(path)
:returns whether a path exists and that it corresponds to a file.
; isDirectory(path)
:returns whether a path exists and that it corresponds to a directory.
; isLink(path)
:returns whether a path exists and that it corresponds to a symbolic link (TODO or shortcut?).
; isReadable(path)
:returns whether a path exists, that it corresponds to a file, and that it can be opened for reading by "fs.open".
; isWritable(path)
:If a path exists, returns whether a file may be opened for writing, or entries added or removed from an  existing directory.  If the path does not exist, returns whether entries for files, directories, or links can be created at its location.


* path(path:String) -> path:Path
==== Metadata ====
* new Path(path:String, [fs:FileSystem]) -> path:Path
*# absolute() -> path:Path
*# basename() -> path:Path
*# canonical() -> path:Path
*# copy(to:Path)
*# dirname() -> path:Path
*# exists() -> Boolean
*# extname() -> String
*# from(path:String|Path) -> path:Path -- an alias for "relative" that finds the relative path from the given path to this one.
*# isDirectory() -> Boolean
*# isFile() -> Boolean
*# isLink() -> Boolean
*# isReadable() -> Boolean
*# isWritable() -> Boolean
*# join(...paths...) -> path:Path
*# list() -> Array * Path
*# mkdir()
*# mkdirs()
*# move(to:String)
*# mtime() -> Date
*# normal() -> path:Path
*# open(mode, [options]) -> stream:{Text,Binary}{Read,Write,RW,Random}Stream
*# read([mode, [options]]) -> {Byte,}String
*# remove()
*# rename(name:String) -- rename is distinct from move only in that the new name is resolved relative to the former path, rather than the current working directory.
*# resolve(...paths...) -> path:Path
*# rmdir()
*# rmtree()
*# same(as:Path|String) -> Boolean
*# size() -> Number
*# split() -> parts:(Array * String)
*# stat() -> stat:Object {mtime:Date, size:Number}
*# to(path:String|Path) -> path:Path -- an alias for "relative" that finds the relative path from this path to the given one.
*# toString() -> String
*# touch([mtime:Date])
*# write(data:ByteString|ByteArray|Array|String, [mode, [options]])


; stat(path String)
:Returns an object that contains the file's metadata, including all of the following that are applicable in the target platform's file system
;; device Number
:device number of the file system
;; inode Number
:virtual node number
;; mode Number
:type and permissions, numeric
;; linkCount Number
:number of hard links to the file
;; uid Number
:numeric id of the owner user
;; rdev Number
:the device identifier for special files
;; size NUmber
:total size in bytes
;; blockSize Number
:preferred block size for file system IO, in bytes
;; blockCount Number
:number of blocks allocated
;; mtime Date
:time of last modification (write)
;; atime Date
:time of last access (read, write, update)
;; ctime Date
:(TODO created vs. stat changed.  is /.time/ really the best pattern for expressing these times?)
;; xattrs
:extended attributes (reserved)
;; acls
:access control lists (reserved)


= Tier 3 =
; size(path String)
:Number
; mtime(path String)
:Date
; atime(path String)
:Date
; ctime(path String)
:Date
; same(pathA String, pathB String) Boolean
:whether the two files are identical, in that they come from the same file system, same device, and have the same node and corresponding storage, such that modifying one would implicitly and atomically modify the other.


This tier adds support for random access IO, locks, canonical IO (non-blocking), more comprehensive stat access and mutation, and temporary files and directories, and symbolic links.
==== Security ====


* open(path, mode, {options}) or open({options})
; chroot(path String)
*# options may include {path, mode, charset, permissions, owner, groupOwner} and more in a later, more detailed specification.
: returns an object that conforms to the File System API, like the "file" module or the "system.fs" variable, that can be safely passed into a sandbox as the "file" module and "system.fs" objects.
*# The permissions must be an integer of Unix style permissions. Other objects may be permitted in a later specification, like a Stat object or duck-type thereof.


* copyStat(from:String, to:String)


Path:
== Path ==


* copyStat(to:String|Path)
; [new] Path(path String|Path|Array, [fs FileSystem]) Path
:


In progressMore information about other potential additions, please refer to the [https://wiki.mozilla.org/ServerJS/API/file/Names proposed names].
The prototype for the Path constructor is a String object.
 
The path constructor accepts as its first argument either a String, Path, or Array.  If the path is an Array (as tested by Array.isArray, not merely typeof path == "array"), it must conform to the specification for values returned by "fs.split".
 
Every path object has the members "normal", "absolute", "canonical", "dirname", "basename", "join", and "resolve".  All of these return new Path objects constructed by converting the path to a string, passing it through the likewise named method of "fs", and converting it back to a Path.  Thus, all of these methods are chainable.  In addition, "join" and "resolve" are variadic, so additional paths can be passed as arguments in either String, Path, or Array form.
 
Every path object has "chroot", "copy", "exists", "extension", "isDirectory", "isFile", "isLink", "isReadable", "isWritable", "mkdir", "mkdirs", "move", "mtime", "open", "read", "remove", "rename", "rmdir", "rmtree", "same", "size", "split", "stat", "touch", and "write".  All of these functions convert themselves to strings and pass the results through the likewise named method of "fs".
 
In addition, paths implement:
 
; toString()
:
; to(path)
: uses "fs.relative" to return a Path from this path to another one.
; from(path)
: uses "fs.relative" to return a Path to this path from another one.
; list()
: returns an iterator of Path objects for the contained directory entries.
 
(TODO resolve whether it's more proper to make "Path" foundational and eliminate "fs".  Wrapping the "Path" object around a central "fs" object will be necessary for "chroot" whether we expose that level of the API or not, and having routines that work with strings on one architectural layer and paths on the next up gives the programmer an oppoertunity to program at the level that makes sense for their task.  It also, however, gives the programmer more to learn.)
 
 
== Streams ==
 
The "open" function mediates the construction of various kinds of streams.  As "open" is the only method with the authority to manipulate files, it constructs these types on behalf of a potentially unpriviledged caller.  Stream constructors are not directly callable in a secure sandbox, so where and how these stream types are implemented is beyond the necessary scope of this specification.  The "open" function always creates a byte level stream, and by default wraps that in a textual IO wrapper.
 
* create a "raw" byte stream.  If "x" mode (with either "w" or "a" mode), only open if the file does not already existCreate the file and open the stream atomically.
** if "r" mode, make "raw" a ByteReader
** if "w" mode, make "raw" a ByteWriter
** if "u" mode, make "raw" a ByteUpdater
* if "+", seek to end.
* if not "b" mode, return "raw".
* return a wrapper encoded/decoded string stream around "raw" with specified buffering, line buffering, and charset.
** if "r" mode, wrap "raw" in a TextReader
** if "w" mode, wrap "raw" in a TextWriter
** if "u" mode, wrap "raw" in a TextUpdater
 
; ByteReader
: appropriate for reading binary opaque struct records or other binary data or protocols
; ByteWriter
: appropriate for writing binary opaque struct records or other binary data or protocols
; ByteUpdater
: appropriate for a database
; ByteReaderWriter
: appropriate for sockets
; TextReader
: appropriate for standard input
; TextWriter
: appropriate for standard output
; TextUpdater
:
; TextReaderWriter
: appropriate for a TTY
 
(TODO: resolve whether it is necessary for all stream types to have a common prototype, or whether it makes sense for them to be arranged in a class hierarchy.)
 
=== *Reader, *Updater ===
 
TextReader, ByteReader, TextUpdater, ByteUpdater, TextReaderWriter, and ByteReaderWriter have the following properties and methods:
 
; read() *String
: reads and returns all of the file until EOF is encountered.  Return ByteStrings for Byte* types, and Strings for Text* types.
; read(max Number) *String
:
; readInto(buffer *Array, [begin Number], [end Number]) Number
:
; canRead() Boolean
:
; skip(n Number) Number
:
 
=== *Writer, *Updater ===
 
TextWriter, ByteWriter, TextUpdater, ByteUpdater, TextReaderWriter, and ByteReaderWriter have the following properties and methods:
 
; canWrite() Boolean
:
; flush()
:
 
=== ByteUpdater ===
 
; tell() Number
:
; seek(position Number, whence Number)
:
; truncate([length Number=0])
:
; rewind()
: a shortcut for seek(0)
 
=== Text* ===
 
TextReader, TextWriter, TextUpdater, and TextReaderWriter have the following properties and methods:
 
; raw Byte*
: the underlying byte stream, ByteReader for TextReaders, ByteWriter for TextWriters, and so on.
 
=== TextReader ===
 
; readLine() String
: reads a line from the reader.  If EOF is encountered before any data is gathered, returns "".  Otherwise, returns the line including the "newLine".
; readLines() Array*String
: returns an Array of Strings accumulated by calling readLine until an empty string turns up.  Does not include the final empty string, and does include "newLine" at the end of every line.
; next() String or throws StopIteration
: returns the next line of input without its "newLine".  Throws StopIteration if EOF is encountered.
; iterator() Iterator
: returns the reader itself
 
=== TextWriter ===
 
; writeLine(line String)
:
; print(...)
: writes a "delimiter" delimited array of Strings terminated with a "newLine"
 
=== TextUpdater ===
 
; tell() OpaqueCookie
:
; seek(position OpaqueCookie)
:
; truncate([position OpaqueCookie)
:
; rewind()
: seeks to the beginning of the file.
 
 
== Notes ==
 
* Path separators, and other file-system-specific constants are not included in this specification.
 
= Todo =
 
* <s>random access IO</s>
* locks
* <s>canonical IO (non-blocking)</s>
* comprehensive stat <s>access</s> and modification
* temporary files and directories
* symbolic links
* more open options: permissions, owner, groupOwner
* copy with metadata
* glob
* other [https://wiki.mozilla.org/ServerJS/API/file/Names proposed names].

Latest revision as of 10:45, 28 August 2009

Proposal A, Draft 5 (in development)

The "file" module supports the File System API, providing an interface for path, directory, file, link, file stat, and file stream manipulation.

For the purpose of this documentation, "fs" is a variable name for any object that implements the File System API, including the exports of the "file" module, and objects returned by "fs.chroot(path)".

Discussion Thread.


Security

Objects implementing the File System API, including the "file" module object, are capability bearing objects that carry and mediate authority to read and write to the underlying storage. As such, the "file" module can return other objects that implement and attenuate the File System API for sandboxing. Furthermore, streams returned by the file system object are implicitly attenuated to only give the receiver authority to manipulate the given file, without knowledge of the path on which it resides or access to references that would permit it to manipulate other parts of the file system.


Interface

The "file" module must export the following:

Files

open(path String, mode|options)
returns a stream object that supports an appropriate interface for the given options and mode, which include reading, writing, updating, byte, character, unbuffered, buffered, and line buffered streams. More details follow in the [#Stream] section of this document.
  • path
  • mode: "rwa+bxc", or
  • options
    • mode String
    • charset String
    • read Boolean
    • write Boolean
    • append Boolean
    • update Boolean
    • binary Boolean
    • exclusive Boolean
    • canonical Boolean
read(path String, [mode|options])
opens, reads, and closes a file, returning its content.
write(path String, content String|Binary, [mode|options])
opens, writes, flushes, and closes a file with the given content. If the content is a ByteArray or ByteString, the binary mode is implied.
copy(source String, target String)
reads one file and writes another in byte mode.
move(from String, to String)
remove(path String)
rename(path String, name String)
touch(path, [mtime Date])

Directories

(TODO Should the methods in this section use "mkdir" or "makeDir" style names? discussion)

list(path String) Iterator
returns an iterator that yields the names of the entries in a directory. Throws an error if the directory is inaccessible or does not exist.
mkdir(path String)
Create a single directory specified by path. If the directory cannot be created for any reason an exception will be thrown. This includes if the parent directories of "path" are not present.
mktree(path String)
TODO Decide upon the name of this function. "mkdirs" or "mktree" or "makeTree"
Will create the directory specified by "path" including any missing parent directories.
rmdir(path String)
Removes a directory if it is empty. If it is not, or cannot be removed for another reason an exception will be thrown.
rmtree(path String)
This does its best to remove the directory "path", akin to "rm -r" on Unix or "del /s" on Windows. That is it will recursively delete all directories and files under "path".

Links

Paths

[new] Path(path String|Path|Array, [fs FileSystem]) Path
returns a Path object that closes on a File System object and a "path" representation. The path object is a chainable shorthand for working with paths in the context of the "file" module. "Path" objects have no more or less authority to manipulate the file system than the FileSystem object that they are attached to, as any path string is reachable by chaining operations on a path instance. The FileSystem object defaults to the "file" module if the argument is omitted or undefined. More details follow in the [#Path Path] section of this document.
path(path String|Path, fs FileSystem) Path
"fs.path(path)" is a shorthand for "new fs.Path(path, fs)".
Working Path

The current working directory is used by all routines that resolve relative paths to absolute file system paths, including "open", and "absolute".

cwd() String
returns the current working directory.
chdir(path String)
changes the current working directory.
Traditional
join(...)
takes a variadic list of path Strings, joins them on the file system's path separator, and normalizes the result. details
split(path String) Array
returns an array of path components. If the path is absolute, the first component will be an indicator of the root of the file system; for file systems with drives (such as Windows), this is the drive identifier with a colon, like "c:"; on Unix, this is an empty string "". The intent is that calling "join.apply" with the result of "split" as arguments will reconstruct the path.
normal(path String)
removes '.' path components and simplifies '..' paths, if possible, for a given path.
absolute(path String)
returns the absolute path, starting with the root of this file system object, for the given path, resolved from the current working directory. If the file system supports home directory aliases, absolute resolves those from the root of the file system. The resulting path is in normal form. On most systems, this is equivalent to expanding any user directory alias, joining the path to the current working directory, and normalizing the result. "absolute" can be implemented in terms of "cwd", "join", and "normal".
canonical(path String)
returns the canonical path to a given abstract path. Canonical paths are both absolute and intrinsic, such that all paths that refer to a given file (whether it exists or not) have the same corresponding canonical path. This function must not communicate information about the true parent directories of files in chroot environments. This function is equivalent to expanding a user directory alias, joining the given path to the current working directory, joining all symbolic links along the path, and normalizing the result. "canonical" can be implemented in terms of "cwd", "join", "normal" and "readlink".
dirname(path String) String
returns the path of a file's containing directory, albeit the parent directory if the file is a directory. A terminal directory separator is ignored.
basename(path String, [extension String]) String
returns the part of the path that is after the last directory separator. If an extension is provided and is equal to the file's extension, the extension is removed from the result.
extension(path String) String
returns the extension of a file. The extension of a file is the last dot (excluding any number of initial dots) followed by one or more non-dot characters. Returns an empty string if no valid extension exists. unit test.
URL-like
resolve(...)
a function like "join" except that it treats each argument as as either an absolute or relative path and, as is the convention with URL's, treats everything up to the final directory separator as a location, and everything afterward as an entry in that directory, even if the entry refers to a directory in the underlying storage. Resolve starts at the location "" and walks to the locations referenced by each path, and returns the path of the last file. Thus, resolve(file, "") idempotently refers to the location containing a file or directory entry, and resolve(file, neighbor) always gives the path of a file in the same directory. "resolve" is useful for finding paths in the "neighborhood" of a given file, while gracefully accepting both absolute and relative paths at each stage. unit test.
relative(from, to)
returns the relative path from one path to another using only ".." to traverse up to the two paths' common ancestor.

Tests

exists(path)
whether a file exists at a given path: receives a path and returns whether that path, joined on the current working directory, corresponds to a file that exists. If the file is a broken symbolic link, returns false.
isFile(path)
returns whether a path exists and that it corresponds to a file.
isDirectory(path)
returns whether a path exists and that it corresponds to a directory.
isLink(path)
returns whether a path exists and that it corresponds to a symbolic link (TODO or shortcut?).
isReadable(path)
returns whether a path exists, that it corresponds to a file, and that it can be opened for reading by "fs.open".
isWritable(path)
If a path exists, returns whether a file may be opened for writing, or entries added or removed from an existing directory. If the path does not exist, returns whether entries for files, directories, or links can be created at its location.

Metadata

stat(path String)
Returns an object that contains the file's metadata, including all of the following that are applicable in the target platform's file system
device Number
device number of the file system
inode Number
virtual node number
mode Number
type and permissions, numeric
linkCount Number
number of hard links to the file
uid Number
numeric id of the owner user
rdev Number
the device identifier for special files
size NUmber
total size in bytes
blockSize Number
preferred block size for file system IO, in bytes
blockCount Number
number of blocks allocated
mtime Date
time of last modification (write)
atime Date
time of last access (read, write, update)
ctime Date
(TODO created vs. stat changed. is /.time/ really the best pattern for expressing these times?)
xattrs
extended attributes (reserved)
acls
access control lists (reserved)
size(path String)
Number
mtime(path String)
Date
atime(path String)
Date
ctime(path String)
Date
same(pathA String, pathB String) Boolean
whether the two files are identical, in that they come from the same file system, same device, and have the same node and corresponding storage, such that modifying one would implicitly and atomically modify the other.

Security

chroot(path String)
returns an object that conforms to the File System API, like the "file" module or the "system.fs" variable, that can be safely passed into a sandbox as the "file" module and "system.fs" objects.


Path

[new] Path(path String|Path|Array, [fs FileSystem]) Path

The prototype for the Path constructor is a String object.

The path constructor accepts as its first argument either a String, Path, or Array. If the path is an Array (as tested by Array.isArray, not merely typeof path == "array"), it must conform to the specification for values returned by "fs.split".

Every path object has the members "normal", "absolute", "canonical", "dirname", "basename", "join", and "resolve". All of these return new Path objects constructed by converting the path to a string, passing it through the likewise named method of "fs", and converting it back to a Path. Thus, all of these methods are chainable. In addition, "join" and "resolve" are variadic, so additional paths can be passed as arguments in either String, Path, or Array form.

Every path object has "chroot", "copy", "exists", "extension", "isDirectory", "isFile", "isLink", "isReadable", "isWritable", "mkdir", "mkdirs", "move", "mtime", "open", "read", "remove", "rename", "rmdir", "rmtree", "same", "size", "split", "stat", "touch", and "write". All of these functions convert themselves to strings and pass the results through the likewise named method of "fs".

In addition, paths implement:

toString()
to(path)
uses "fs.relative" to return a Path from this path to another one.
from(path)
uses "fs.relative" to return a Path to this path from another one.
list()
returns an iterator of Path objects for the contained directory entries.

(TODO resolve whether it's more proper to make "Path" foundational and eliminate "fs". Wrapping the "Path" object around a central "fs" object will be necessary for "chroot" whether we expose that level of the API or not, and having routines that work with strings on one architectural layer and paths on the next up gives the programmer an oppoertunity to program at the level that makes sense for their task. It also, however, gives the programmer more to learn.)


Streams

The "open" function mediates the construction of various kinds of streams. As "open" is the only method with the authority to manipulate files, it constructs these types on behalf of a potentially unpriviledged caller. Stream constructors are not directly callable in a secure sandbox, so where and how these stream types are implemented is beyond the necessary scope of this specification. The "open" function always creates a byte level stream, and by default wraps that in a textual IO wrapper.

  • create a "raw" byte stream. If "x" mode (with either "w" or "a" mode), only open if the file does not already exist. Create the file and open the stream atomically.
    • if "r" mode, make "raw" a ByteReader
    • if "w" mode, make "raw" a ByteWriter
    • if "u" mode, make "raw" a ByteUpdater
  • if "+", seek to end.
  • if not "b" mode, return "raw".
  • return a wrapper encoded/decoded string stream around "raw" with specified buffering, line buffering, and charset.
    • if "r" mode, wrap "raw" in a TextReader
    • if "w" mode, wrap "raw" in a TextWriter
    • if "u" mode, wrap "raw" in a TextUpdater
ByteReader
appropriate for reading binary opaque struct records or other binary data or protocols
ByteWriter
appropriate for writing binary opaque struct records or other binary data or protocols
ByteUpdater
appropriate for a database
ByteReaderWriter
appropriate for sockets
TextReader
appropriate for standard input
TextWriter
appropriate for standard output
TextUpdater
TextReaderWriter
appropriate for a TTY

(TODO: resolve whether it is necessary for all stream types to have a common prototype, or whether it makes sense for them to be arranged in a class hierarchy.)

*Reader, *Updater

TextReader, ByteReader, TextUpdater, ByteUpdater, TextReaderWriter, and ByteReaderWriter have the following properties and methods:

read() *String
reads and returns all of the file until EOF is encountered. Return ByteStrings for Byte* types, and Strings for Text* types.
read(max Number) *String
readInto(buffer *Array, [begin Number], [end Number]) Number
canRead() Boolean
skip(n Number) Number

*Writer, *Updater

TextWriter, ByteWriter, TextUpdater, ByteUpdater, TextReaderWriter, and ByteReaderWriter have the following properties and methods:

canWrite() Boolean
flush()

ByteUpdater

tell() Number
seek(position Number, whence Number)
truncate([length Number=0])
rewind()
a shortcut for seek(0)

Text*

TextReader, TextWriter, TextUpdater, and TextReaderWriter have the following properties and methods:

raw Byte*
the underlying byte stream, ByteReader for TextReaders, ByteWriter for TextWriters, and so on.

TextReader

readLine() String
reads a line from the reader. If EOF is encountered before any data is gathered, returns "". Otherwise, returns the line including the "newLine".
readLines() Array*String
returns an Array of Strings accumulated by calling readLine until an empty string turns up. Does not include the final empty string, and does include "newLine" at the end of every line.
next() String or throws StopIteration
returns the next line of input without its "newLine". Throws StopIteration if EOF is encountered.
iterator() Iterator
returns the reader itself

TextWriter

writeLine(line String)
print(...)
writes a "delimiter" delimited array of Strings terminated with a "newLine"

TextUpdater

tell() OpaqueCookie
seek(position OpaqueCookie)
truncate([position OpaqueCookie)
rewind()
seeks to the beginning of the file.


Notes

  • Path separators, and other file-system-specific constants are not included in this specification.

Todo

  • random access IO
  • locks
  • canonical IO (non-blocking)
  • comprehensive stat access and modification
  • temporary files and directories
  • symbolic links
  • more open options: permissions, owner, groupOwner
  • copy with metadata
  • glob
  • other proposed names.