ServerJS/Filesystem API/A: Difference between revisions

Draft 5 beta
(Oops. Moved to ServerJS/Filesystem_API/A, the more established name space.)
 
(Draft 5 beta)
= Tier 1 =
Proposal A, Draft 5


This tier establishes a basis with binary IO and basic string manipulation for paths.
The "file" module supports the File System API, providing an interface for path, directory, file, link, file stat, and file stream manipulation.


The require('file') module would provide support for the File System API.  In turn, the File System API would provide all methods that deal with path strings.  File System API functions would return file stream objects, directory objects, and file stat objects. File stream objects would, for securability, be unaware of their corresponding path, and keep their underlying file system connection, whether a file descriptor or FILE*, and their parent directory link, secret.
For the purpose of this documentation, "fs" is a variable name for any object that implements the File System API, including the exports of the "file" module, and objects returned by "fs.chroot(path)".




The File System API would provide the following constants:
=== Security ===


* SEPARATOR "/" or system specific analog, like ":" or "\".
Objects implementing the File System API, including the "file" module object, are capability bearing objects that carry and mediate authority to read and write to the underlying storage.  As such, the "file" module can return other objects that implement and attenuate the File System API for sandboxing.  Furthermore, streams returned by the file system object are implicitly attenuated to only give the receiver authority to manipulate the given file, without knowledge of the path on which it resides or access to references that would permit it to manipulate other parts of the file system.


The File System API would provide the following methods:


* join(...paths:Strings...): takes a variadic list of path Strings, joins them on the directory separator, and normalizes the result.
=== Interface ===


* split(path:String): returns a string split on the file system's directory separator. The split() method will return an empty string as the first element for absolute paths that do not contain drives. For file systems with drives (such as Windows), the first element will be the drive with colon if the path contained a drive specification. The intent is that joining the elements with the SEPARATOR will correctly reconstitute the path.
The "file" module must export the following:


* resolve(...paths:Strings...): marches through a list of variadic absolute or relative path Strings, resolving the next path relative to the previous and returning the ultimate destination path.  This is analogous to navigating to the fully qualified URL of a relative URL on a given page.  Unlike "join", which presumes that the base path is always a directory, "resolve" considers a directory separator at the end of a path an indication that the path must be resolved off of the directory rather than the leaf "file".  For example, resolve("a", "b") returns "b", but resolve("a/", "b") returns "a/b" on a system with "/" as its directory separator.  Resolving a fully qualified path relative to any base path returns the fully qualified path, like resolve("a", "/") == "/".  Resolve is purely a string manipulation routine and does not use information about the underlying file system.
Files:


* relative(from, to): returns the relative path from one path to another using only ".." to traverse up to the two paths' common ancestor.
; open(path (String)|options Object, [mode (String|Array|Object)], [options Object|mode (String|Array)]): returns a stream object that supports an appropriate interface for the given options and mode, which include reading, writing, updating, byte, character, unbuffered, buffered, and line buffered streams.  More details follow in the [#Streams Streams] section of this document.
* path
* mode: "rwa+bxc", ["read", "write", "append", "update", "binary", "exclusive", "canonical"], {read, write, append, update, binary, exclusive, canonical}
* options
** path String
** mode Object
** charset String
** newLine String
** delimiter String
; read(path String|options Object, [options (Object)|mode (String|Array)]): opens, reads, and closes a file, returning its content.
; write(path String|options Object, content String|ByteString|ByteArray, [options (Object)|mode (String|Array)]): opens, writes, flushes, and closes a file with the given content. If the content is a ByteArray or ByteString, the binary mode is implied.
; copy(source String, target String): reads one file and writes another in byte mode.
; move(from String, to String)
; remove(path String)
; rename(path String, name String)
; touch(path, [mtime Date])
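
The three accepted spellings of "mode" (flag string, Array of names, Object of booleans) can be reduced to one form before "open" acts on them.  The following is only a sketch of that idea: "normalizeMode", "MODE_FLAGS", and the choice to reject unknown flags are assumptions for illustration, not part of this proposal.

```javascript
// Hypothetical helper, not part of the proposal: maps each single-character
// flag of the "rwa+bxc" form to the long name used by the Array/Object forms.
const MODE_FLAGS = {
  "r": "read", "w": "write", "a": "append", "+": "update",
  "b": "binary", "x": "exclusive", "c": "canonical"
};
const MODE_NAMES = Object.keys(MODE_FLAGS).map((flag) => MODE_FLAGS[flag]);
const hasOwn = (object, key) =>
  Object.prototype.hasOwnProperty.call(object, key);

// Returns an Object with one boolean per mode name, whatever form was given.
function normalizeMode(mode) {
  const result = {};
  for (const name of MODE_NAMES) result[name] = false;
  if (typeof mode === "string") {
    for (const flag of mode) {
      if (!hasOwn(MODE_FLAGS, flag)) throw new Error("unknown mode flag: " + flag);
      result[MODE_FLAGS[flag]] = true;
    }
  } else if (Array.isArray(mode)) {
    for (const name of mode) {
      if (!hasOwn(result, name)) throw new Error("unknown mode name: " + name);
      result[name] = true;
    }
  } else if (mode !== null && typeof mode === "object") {
    for (const name of Object.keys(mode)) {
      if (!hasOwn(result, name)) throw new Error("unknown mode name: " + name);
      result[name] = Boolean(mode[name]);
    }
  }
  return result; // an omitted mode leaves every flag false (assumption)
}
```

With this sketch, normalizeMode("rb"), normalizeMode(["read", "binary"]), and normalizeMode({read: true, binary: true}) all produce the same Object.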


* normal(path:String): removes '.' path components and simplifies '..' paths, if possible for a given path.  If the file system is case insensitive, transforms all letters to lower-case to make them unambiguous.  "normal" may be implemented in terms of "resolve".
Directories:


* absolute(path:String): returns the absolute path, starting with the root of this file system object, for the given path, resolved relative to the current working directory.  If the file system supports home directory aliases, absolute resolves those from the root of the file system. The resulting path is in normal form.  On most systems, this is equivalent to expanding any user directory alias, joining the path to the current working directory, and normalizing the result.  "absolute" can be implemented in terms of "resolve" and "cwd".
; list(path String) Iterator: returns an iterator that yields the names of the entries in a directory.  Throws an error if the directory is inaccessible or does not exist.
; mkdir(path String)
; mkdirs(path String)
; rmdir(path String)
; rmtree(path String)


* canonical(path:String): returns the canonical path to a given abstract path.  Canonical paths are both absolute and intrinsic, such that all paths that refer to a given file (whether it exists or not) have the same corresponding canonical path.  This function may not communicate information about the true parent directories of files in chroot environments.  This function is equivalent to expanding a user directory alias, joining the given path to the current working directory, joining all symbolic links along the path, and normalizing the result.  "canonical" can be implemented in terms of "cwd", "resolve", and "readlink".
Links:


* exists(path): whether a file exists at a given path: receives a path and returns whether that path, joined on the current working directory, corresponds to a file that exists.  If the file is a broken symbolic link, returns false.
Paths:


* stat(path): returns an object that represents a snapshot of the information about given file.
; [new] Path(path String|Path|Array, [fs FileSystem]) Path
*# The stat file must contain the following properties:
: returns a Path object that closes on a File System object and a "path" representation.  The path object is a chainable shorthand for working with paths in the context of the "file" module. "Path" objects have no more or less authority to manipulate the file system than the FileSystem object that they are attached to, as any path string is reachable by chaining operations on a path instance. The FileSystem object defaults to the "file" module if the argument is omitted or undefined.  More details follow in the [#Path Path] section of this document.
*## mtime: the time that the file was last modified
; path(path String|Path, fs FileSystem) Path: "fs.path(path)" is a shorthand for "new fs.Path(path, fs)".
*## other properties will be specified in a future specification, including other times, ownership, permissions, and size.
*# The path is resolved relative to the current working directory.
*# Throws an error if the corresponding file does not exist or is inaccessible.


* list(path): returns an Array of file name strings.
; cwd() String: returns the current working directory.
*# The returned object may be immutable.
; chdir(path String): changes the current working directory.
*# The path is resolved relative to the current working directory.
*# Throws an error if the directory does not exist or is inaccessible.


* open(path, mode) or open({options})
Traditional path manipulation:
*# options may include {path, mode} and more in a later, more detailed specification.
*# The mode is a String and may contain:
*## "r" that means that the stream may be read, and implies that the "read", "readLine", "readLines", "next", and "iter" methods must be supported by the returned stream object.  "readLine" returns "" on EOF, and "next" throws an error instead.  "iter" returns the stream object itself.  "readLine" returns a string excluding the newline character.
*## "w" that means that the stream may be written to, and implies that the "write", "writeLine", and "writeLines" methods must be supported by the stream object.  Also implies that the file will be created or truncated if necessary, if not overridden by the "a" or "+" flags.
*## "a" that means that the stream's position will begin at the end of the stream, and that the file will be created but not truncated.
*## "+" that means that the stream will not be truncated.
*# The path is resolved relative to the current working directory.
*# Throws an error if the specified stream cannot be created.
*# close(): Closes the stream. Should be called automatically when the mode is changed, and when the garbage collector collects its last reference.


* read(path, [mode, [options]]): opens, reads, and closes a file, returning its content.
; join(...): takes a variadic list of path Strings, joins them on the file system's path separator, and normalizes the result.
; split(path String) Array: returns an array of path components.  If the path is absolute, the first component will be an indicator of the root of the file system; for file systems with drives (such as Windows), this is the drive identifier with a colon, like "c:"; on Unix, this is an empty string "".  The intent is that calling "join.apply" with the result of "split" as arguments will reconstruct the path.
; normal(path String): removes '.' path components and simplifies '..' paths, if possible, for a given path.
; absolute(path String): returns the absolute path, starting with the root of this file system object, for the given path, resolved from the current working directory.  If the file system supports home directory aliases, absolute resolves those from the root of the file system.  The resulting path is in normal form.  On most systems, this is equivalent to expanding any user directory alias, joining the path to the current working directory, and normalizing the result.  "absolute" can be implemented in terms of "cwd", "join", and "normal".
; canonical(path String): returns the canonical path to a given abstract path.  Canonical paths are both absolute and intrinsic, such that all paths that refer to a given file (whether it exists or not) have the same corresponding canonical path.  This function must not communicate information about the true parent directories of files in chroot environments.  This function is equivalent to expanding a user directory alias, joining the given path to the current working directory, joining all symbolic links along the path, and normalizing the result.  "canonical" can be implemented in terms of "cwd", "join", "normal" and "readlink".
; dirname(path String) String: returns the path of a file's containing directory, or the parent directory if the path refers to a directory.  A terminal directory separator is ignored.
; basename(path String, [extension String]) String: returns the part of the path that is after the last directory separator.  If an extension is provided and is equal to the file's extension, the extension is removed from the result.
; extension(path String) String: returns the extension of a file.  The extension of a file is the last dot (excluding any number of initial dots) followed by one or more non-dot characters. Returns an empty string if no valid extension exists.  [http://github.com/kriskowal/narwhal-test/blob/master/src/test/file/extension.js unit test].
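
The string-only routines above can be sketched without touching the underlying storage.  The following is a minimal illustration, assuming "/" as the separator, no drive prefixes, and no case folding; the function bodies are one reading of the descriptions above, not a reference implementation.

```javascript
// Assumes "/" as the directory separator and no drive prefixes.
function split(path) {
  return path.split("/"); // "/a/b" yields ["", "a", "b"]
}

// Removes "." components and folds ".." into the preceding component.
function normal(path) {
  const absolute = path.startsWith("/");
  const out = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      if (out.length && out[out.length - 1] !== "..") out.pop();
      else if (!absolute) out.push(".."); // cannot climb above the root
    } else {
      out.push(part);
    }
  }
  return (absolute ? "/" : "") + out.join("/");
}

// Joins on the separator, then normalizes, so join(...split(p)) rebuilds p.
function join(...parts) {
  return normal(parts.join("/"));
}

// Last dot (ignoring leading dots) followed by one or more non-dot characters.
function extension(path) {
  const name = basename(path).replace(/^\.+/, "");
  const match = /\.[^.]+$/.exec(name);
  return match ? match[0] : "";
}

// Part after the last separator, minus the extension when it matches "ext".
function basename(path, ext) {
  const name = path.slice(path.lastIndexOf("/") + 1);
  if (ext && extension(name) === ext) {
    return name.slice(0, name.length - ext.length);
  }
  return name;
}
```

In this sketch the extension includes its leading dot, so basename("/a/b/archive.tar.gz", ".gz") strips only the final ".gz".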


* write(path, content, [mode, [options]]): opens, writes, flushes, and closes a file with the given content.
URL-like path manipulation:


= Tier 2 =
; resolve(...)
: a function like "join" except that it treats each argument as either an absolute or relative path and, as is the convention with URLs, treats everything up to the final directory separator as a location, and everything afterward as an entry in that directory, even if the entry refers to a directory in the underlying storage.  Resolve starts at the location "" and walks to the locations referenced by each path, and returns the path of the last file.  Thus, resolve(file, "") idempotently refers to the location containing a file or directory entry, and resolve(file, neighbor) always gives the path of a file in the same directory.  "resolve" is useful for finding paths in the "neighborhood" of a given file, while gracefully accepting both absolute and relative paths at each stage. [http://github.com/kriskowal/narwhal-test/blob/master/src/test/file/resolve.js unit test].
; relative(from, to): returns the relative path from one path to another using only ".." to traverse up to the two paths' common ancestor.
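
As an illustration of the URL-like walk that "resolve" describes, here is a minimal sketch.  It assumes "/" as the separator and no drive prefixes, and its handling of "." and ".." is an assumption; the linked unit test remains the authority on edge cases.

```javascript
// Sketch of URL-like resolution: each path is split into a location (all
// components up to the last "/") and a leaf (whatever follows it).
function resolve(...paths) {
  let root = "";   // becomes "/" once an absolute path is seen
  let dirs = [];   // directory components of the current location
  let leaf = "";   // entry name within that location, "" if none
  const up = () => {
    if (dirs.length && dirs[dirs.length - 1] !== "..") dirs.pop();
    else if (!root) dirs.push(".."); // relative paths may climb further
  };
  for (const path of paths) {
    const parts = String(path).split("/");
    if (parts.length > 1 && parts[0] === "") {
      // An absolute path discards everything walked so far.
      root = "/";
      dirs = [];
      parts.shift();
    }
    leaf = parts.pop(); // replaces the previous leaf, as with URLs
    if (leaf === ".") {
      leaf = "";
    } else if (leaf === "..") {
      parts.push("..");
      leaf = "";
    }
    for (const part of parts) {
      if (part === "" || part === ".") continue;
      if (part === "..") up();
      else dirs.push(part);
    }
  }
  return root + dirs.map((dir) => dir + "/").join("") + leaf;
}
```

Under these assumptions resolve("a", "b") gives "b" while resolve("a/", "b") gives "a/b", and resolve("/usr/lib/x.js", "y.js") gives the neighboring "/usr/lib/y.js".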


This tier adds support for encoded and buffered text IO and a chainable Path type.
Tests:


* open(path, mode, {options}) or open({options})
; exists(path): whether a file exists at a given path: receives a path and returns whether that path, joined on the current working directory, corresponds to a file that exists. If the file is a broken symbolic link, returns false.
*# options may include {path, mode, charset, recordSeparator, fieldSeparator} and more in a later, more detailed specification.
; isFile(path): returns whether a path exists and that it corresponds to a file.
*# The mode is a String and may additionally contain:
; isDirectory(path): returns whether a path exists and that it corresponds to a directory.
*## "b" for ByteArray and ByteString streams.
; isLink(path): returns whether a path exists and that it corresponds to a symbolic link (TODO or shortcut?).
*## "t" for String and Array streams (default)
; isReadable(path): returns whether a path exists, that it corresponds to a file, and that it can be opened for reading by "fs.open".
*# "charset" may be any IANA charset identifier, case-insensitive.  "open" may throw an error if the charset is not supported.  "charset" defaults to a system-specific encoding if the stream is in text mode with no charset provided.
; isWritable(path): If a path exists, returns whether a file may be opened for writing, or entries added or removed from an existing directory. If the path does not exist, returns whether entries for files, directories, or links can be created at its location.
*# returns a stream type.  While the type names need not correspond to these names, they are used as an internal reference in this document for the minimum interface that those streams must provide.
*## Returns a BinaryReadStream for "b" and "r" modes.
*## Returns a BinaryWriteStream for "b" and "w" or "a" modes.
*## Returns a BinaryRandomStream for "b" and "+" modes.
*## Returns a TextReadStream wrapper for "t" and "r" modes.
*## Returns a TextWriteStream wrapper for "t" and "w" modes.


* BinaryReadStream
Metadata:
** read() -> all:ByteString
** read(max:Number) -> actual:ByteString
** readInto(buffer:Array|ByteArray, [begin:Number, [end:Number]]) -> actual:Number
** available() -> Number -- how many bytes are ready to be read (buffered) without blocking.
** skip(n:Number) -> advance the read head past these bytes.
* BinaryWriteStream
** write(buffer:Array|ByteArray|ByteString)
** truncate(length:Number=0)
** flush()
* BinaryRandomStream < BinaryReadStream < BinaryWriteStream
** tell() -> position:Number
** seek(position:Number, whence:enumerated)
** rewind() -- returns the read/write head to the beginning of the file
** truncate([size:Number]) -- sets the length of the file.  size defaults to the current position as reported by tell().  If a size is given explicitly, truncate also seeks to the new end.
* TextReadStream
** raw -> a BinaryReadStream
** read() -> String -- returns all remaining characters
** read(max:Number) -> String -- returns up to max characters, with the length of the returned string reflecting the actual number read.
** readLine() -> String -- returns "" if no data is available before EOF.  Otherwise, includes the "recordSeparator".
** readLines() -> Array * String
** next() -> String -- throws a StopIteration if no data is available before EOF.
** input() -> returns "readLine" without the "recordSeparator".
** available() -> Number -- how many characters are ready to be read (buffered) without blocking.
** skip(n:Number) -> advance the read head past these characters.
* TextWriteStream
** raw -> a BinaryWriteStream
** write(buffer:String)
** writeLine(buffer:String)
** print(...String...) -- writes a "fieldSeparator" delimited and "recordSeparator" terminated line.
** flush()


The File System API would provide the following additional methods:
; stat(path String): Returns an object that contains the file's metadata, including all of the following that are applicable in the target platform's file system
;; device Number: device number of the file system
;; inode Number: virtual node number
;; mode Number: type and permissions, numeric
;; linkCount Number: number of hard links to the file
;; uid Number: numeric id of the owner user
;; rdev Number: the device identifier for special files
;; size Number: total size in bytes
;; blockSize Number: preferred block size for file system IO, in bytes
;; blockCount Number: number of blocks allocated
;; mtime Date: time of last modification (write)
;; atime Date: time of last access (read, write, update)
;; ctime Date: (TODO created vs. stat changed.  is /.time/ really the best pattern for expressing these times?)
;; xattrs: extended attributes (reserved)
;; acls: access control lists (reserved)


* basename(path:String) -> String
; size(path String):Number
* copy(from:String, to:String) -> copies a file by reading one and writing the other in binary modes.
; mtime(path String):Date
* dirname(path:String) -> String
; atime(path String):Date
* extname(path:String) -> String
; ctime(path String):Date
* isDirectory(path:String) -> Boolean
; same(pathA String, pathB String) Boolean: whether the two files are identical, in that they come from the same file system, same device, and have the same node and corresponding storage, such that modifying one would implicitly and atomically modify the other.
* isFile(path:String) -> Boolean
* isLink(path:String) -> Boolean
* isReadable(path:String) -> Boolean
* isWritable(path:String) -> Boolean
* mkdir(path:String)
* mkdirs(path:String)
* move(from:String, to:String)
* mtime() -> lastModification:Date|null
* remove(path:String)
* rename(path:String, name:String)
* rmdir(path:String)
* rmtree(path:String)
* same(from:String, to:String) -> Boolean -- whether the two files are identical, in that they come from the same file system, same device, and have the same node and corresponding storage, such that modifying one would modify the other.
* size(path:String) -> bytes:Number
* touch(path:String, [mtime:Date])


The Path object closes on both a file system and a path String.  The file system mediates all interaction with the underlying storage, so a Path only provides a convenient interface for chaining operations that manipulate a path string.  All of the methods of path are polymorphic (meaning late-bound) and curried (meaning the enclosed path String is passed to the corresponding file system method).
Security:


* path(path:String) -> path:Path
; chroot(path String)
* new Path(path:String, [fs:FileSystem]) -> path:Path
*# absolute() -> path:Path
*# basename() -> path:Path
*# canonical() -> path:Path
*# copy(to:Path)
*# dirname() -> path:Path
*# exists() -> Boolean
*# extname() -> String
*# from(path:String|Path) -> path:Path -- an alias for "relative" that finds the relative path from the given path to this one.
*# isDirectory() -> Boolean
*# isFile() -> Boolean
*# isLink() -> Boolean
*# isReadable() -> Boolean
*# isWritable() -> Boolean
*# join(...paths...) -> path:Path
*# list() -> Array * Path
*# mkdir()
*# mkdirs()
*# move(to:String)
*# mtime() -> Date
*# normal() -> path:Path
*# open(mode, [options]) -> stream:{Text,Binary}{Read,Write,RW,Random}Stream
*# read([mode, [options]]) -> {Byte,}String
*# remove()
*# rename(name:String) -- rename is distinct from move only in that the new name is resolved relative to the former path, rather than the current working directory.
*# resolve(...paths...) -> path:Path
*# rmdir()
*# rmtree()
*# same(as:Path|String) -> Boolean
*# size() -> Number
*# split() -> parts:(Array * String)
*# stat() -> stat:Object {mtime:Date, size:Number}
*# to(path:String|Path) -> path:Path -- an alias for "relative" that finds the relative path from this path to the given one.
*# toString() -> String
*# touch([mtime:Date])
*# write(data:ByteString|ByteArray|Array|String, [mode, [options]])




= Tier 3 =
== Path ==


This tier adds support for random access IO, locks, canonical IO (non-blocking), more comprehensive stat access and mutation, and temporary files and directories, and symbolic links.
; [new] Path(path String|Path|Array, [fs FileSystem]) Path


* open(path, mode, {options}) or open({options})
The prototype for the Path constructor is a String object.
*# options may include {path, mode, charset, permissions, owner, groupOwner} and more in a later, more detailed specification.
*# The permissions must be an integer of Unix style permissions.  Other objects may be permitted in a later specification, like a Stat object or duck-type thereof.


* copyStat(from:String, to:String)
The path constructor accepts as its first argument either a String, Path, or Array.  If the path is an Array (as tested by Array.isArray, since typeof reports "object" for arrays), it must conform to the specification for values returned by "fs.split".


Path:
Every path object has the members "normal", "absolute", "canonical", "dirname", "basename", "join", and "resolve".  All of these return new Path objects constructed by converting the path to a string, passing it through the likewise named method of "fs", and converting it back to a Path.  Thus, all of these methods are chainable.  In addition, "join" and "resolve" are variadic, so additional paths can be passed as arguments in either String, Path, or Array form.


* copyStat(to:String|Path)
Every path object has "chroot", "copy", "exists", "extname", "isDirectory", "isFile", "isLink", "isReadable", "isWritable", "mkdir", "mkdirs", "move", "mtime", "open", "read", "remove", "rename", "rmdir", "rmtree", "same", "size", "split", "stat", "touch", and "write".  All of these functions convert themselves to strings and pass the results through the likewise named method of "fs".


In progress.  For more information about other potential additions, please refer to the [https://wiki.mozilla.org/ServerJS/API/file/Names proposed names].
In addition, paths implement:
 
; toString()
; to(path): uses "fs.relative" to return a Path from this path to another one.
; from(path): uses "fs.relative" to return a Path to this path from another one.
; list(): returns an iterator of Path objects for the contained directory entries.
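
The delegation described above, where each Path method converts the path to a string and calls the likewise named method of "fs", might be sketched as follows.  "makePath" and "stubFs" are hypothetical names introduced for the example, and the stub implements only enough of the File System API to show the chaining; it does not make Path inherit from String as the proposal describes.

```javascript
// Hypothetical factory: builds a Path type bound to one file system object.
function makePath(fs) {
  // Accepts String, Path, or Array (in the form returned by fs.split).
  const toPathString = (value) =>
    Array.isArray(value) ? fs.join(...value) : String(value);

  function Path(path) {
    if (!(this instanceof Path)) return new Path(path); // "new" is optional
    this.value = toPathString(path);
  }
  Path.prototype.toString = function () { return this.value; };

  // Methods that return a new Path, and are therefore chainable.
  for (const name of ["normal", "absolute", "canonical", "dirname",
                      "basename", "join", "resolve"]) {
    Path.prototype[name] = function (...args) {
      return new Path(fs[name](this.toString(), ...args.map(toPathString)));
    };
  }
  // Methods that pass the fs result through unchanged (a partial list).
  for (const name of ["exists", "isFile", "isDirectory", "read", "write",
                      "remove", "mkdir", "stat", "size", "mtime"]) {
    Path.prototype[name] = function (...args) {
      return fs[name](this.toString(), ...args);
    };
  }
  Path.prototype.to = function (other) {
    return new Path(fs.relative(this.toString(), toPathString(other)));
  };
  Path.prototype.from = function (other) {
    return new Path(fs.relative(toPathString(other), this.toString()));
  };
  return Path;
}

// Stub file system for demonstration only; a real one would touch storage.
const stubFs = {
  join: (...parts) => parts.join("/"),
  dirname: (path) => path.slice(0, path.lastIndexOf("/")),
  exists: (path) => path === "a/b/c.txt"
};
const Path = makePath(stubFs);
const examplePath = new Path("a").join("b", "c.txt");
```

Because every call goes through the enclosed "fs", a Path built over an attenuated file system object carries no more authority than that object.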
 
(TODO resolve whether it's more proper to make "Path" foundational and eliminate "fs".  Wrapping the "Path" object around a central "fs" object will be necessary for "chroot" whether we expose that level of the API or not, and having routines that work with strings on one architectural layer and paths on the next up gives the programmer an opportunity to program at the level that makes sense for their task.  It also, however, gives the programmer more to learn.)
 
 
== Streams ==
 
The "open" function mediates the construction of various kinds of streams.  As "open" is the only method with the authority to manipulate files, it constructs these types on behalf of a potentially unprivileged caller.  Stream constructors are not directly callable in a secure sandbox, so where and how these stream types are implemented is beyond the necessary scope of this specification.  The "open" function always creates a byte level stream, and by default wraps that in a textual IO wrapper.
 
* create a "raw" byte stream.  If "x" mode (with either "w" or "a" mode), only open if the file does not already exist.  Create the file and open the stream atomically.
** if "r" mode, make "raw" a ByteReader
** if "w" mode, make "raw" a ByteWriter
** if "u" mode, make "raw" a ByteUpdater
* if "a" mode, seek to end.
* if "b" mode, return "raw".
* return a wrapper encoded/decoded string stream around "raw" with specified buffering, line buffering, and charset.
** if "r" mode, wrap "raw" in a TextReader
** if "w" mode, wrap "raw" in a TextWriter
** if "u" mode, wrap "raw" in a TextUpdater
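
The selection of stream types in the steps above can be summarized as a small decision function.  This sketch rests on several assumptions: the mode is given in Object form, "update" takes precedence over "write" and "append", and the binary flag selects the raw byte stream, in line with the statement that text wrapping is the default.  "planOpen" is a hypothetical name; it only names the types and opens nothing.

```javascript
// Hypothetical planner: reports which stream types "open" would construct
// for a mode given in Object form, without touching any file.
function planOpen(mode) {
  let kind;
  if (mode.update) kind = "Updater";                    // assumed precedence
  else if (mode.write || mode.append) kind = "Writer";
  else kind = "Reader";                                 // read is the fallback
  return {
    raw: "Byte" + kind,                                 // always created
    wrapper: mode.binary ? null : "Text" + kind,        // text unless binary
    exclusive: Boolean(mode.exclusive)                  // atomic create-and-open
  };
}
```

For example, planOpen({read: true}) names a ByteReader wrapped in a TextReader, while planOpen({write: true, binary: true}) names a bare ByteWriter.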
 
Types:
 
; ByteReader
: appropriate for standard input
; ByteWriter
: appropriate for standard output
; ByteUpdater
: appropriate for a database
; ByteReaderWriter
: appropriate for sockets
; TextReader
; TextWriter
; TextUpdater
; TextReaderWriter
: appropriate for a TTY
 
*Reader, *Updater, *ReaderWriter:
 
; read():*String
; read(max Number) *String
; readInto(buffer *Array, [begin Number], [end Number]) Number
; canRead():Boolean
; skip(n Number) Number
 
*Writer, *Updater, *ReaderWriter:
 
; canWrite() Boolean
; flush()
 
ByteUpdater:
 
; tell() Number
; seek(position Number, whence Number)
; truncate([length Number=0])
; rewind(): a shortcut for seek(0)
 
TextUpdater:
 
; tell() OpaqueCookie
; seek(position OpaqueCookie)
; truncate([position OpaqueCookie])
; rewind(): seeks to the beginning of the file.
 
Text*:
 
; raw Byte*
 
TextReader:
 
; readLine() String
: reads a line from the reader.  If EOF is encountered before any data is gathered, returns "".  Otherwise, returns the line including the "newLine".
; readLines() Array*String: returns an Array of Strings accumulated by calling readLine until an empty string turns up.  Does not include the final empty string, and does include "newLine" at the end of every line.
; next() String or throws StopIteration: returns the next line of input without its "newLine".  Throws StopIteration if EOF is encountered.
; iterator() Iterator: returns the reader itself
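
The difference between "readLine", "readLines", and "next" is easiest to see on an in-memory source.  The sketch below assumes a reader over a String rather than a real byte stream; "StringLineReader" and the locally defined "StopIteration" are stand-ins introduced for the example.

```javascript
// Stand-in for the StopIteration object that the proposal refers to.
class StopIteration extends Error {}

// Hypothetical in-memory reader illustrating the TextReader line methods.
class StringLineReader {
  constructor(text, newLine = "\n") {
    this.text = text;
    this.newLine = newLine;
    this.position = 0;
  }
  // Returns the next line including its "newLine", or "" at EOF.
  readLine() {
    if (this.position >= this.text.length) return "";
    const index = this.text.indexOf(this.newLine, this.position);
    const end = index < 0 ? this.text.length : index + this.newLine.length;
    const line = this.text.slice(this.position, end);
    this.position = end;
    return line;
  }
  // Collects lines until readLine reports EOF with an empty string.
  readLines() {
    const lines = [];
    for (let line = this.readLine(); line !== ""; line = this.readLine()) {
      lines.push(line);
    }
    return lines;
  }
  // Returns the next line without its "newLine"; throws at EOF.
  next() {
    const line = this.readLine();
    if (line === "") throw new StopIteration();
    return line.endsWith(this.newLine)
      ? line.slice(0, line.length - this.newLine.length)
      : line;
  }
  iterator() {
    return this;
  }
}
```

An empty line in the input is returned by "readLine" as the bare "newLine", so it stays distinguishable from the empty string that signals EOF.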
 
TextWriter:
 
; writeLine(line String)
; print(...): writes a "delimiter" delimited array of Strings terminated with a "newLine"
 
 
== Deliberate Omissions ==
 
* Path separators, and other file-system-specific constants are not included in this specification.
 
 
= Todo =
 
* <s>random access IO</s>
* locks
* <s>canonical IO (non-blocking)</s>
* comprehensive stat <s>access</s> and modification
* temporary files and directories
* symbolic links
* more open options: permissions, owner, groupOwner
* copy with metadata
* other [https://wiki.mozilla.org/ServerJS/API/file/Names proposed names].