Domesday/Directory Design: Difference between revisions

 
Line 1: Line 1:
The current plan is to make the back-end of Domesday an LDAP directory. This page lists the requirements.
The back-end of Domesday is an LDAP directory. This page lists the requirements for its schema and configuration.


===Overall Design===
===Overall Design===


We would like to make it so that all the security and access control is enforced by the directory rather than by any client software.
All security and access control should be enforced by the directory rather than by any client software.


This allows us to expose the directory server to the Internet and offer direct LDAP access so that other people can write clients which consume directory information.
This allows us to expose the directory server to the Internet and offer direct LDAP access so that other people can write clients which consume directory information, or authenticate against it.


This would also mean that the web interface code can operate only with the permissions of the user currently querying it, and does not even know the RootDN password. This makes it impossible for badly-written web front end code to have a privacy leak.
This also means that the web interface code can operate only with the permissions of the user currently using it, and does not even know the RootDN password. Respecting the Principle of Least Privilege in this way makes it much harder for badly-written web front end code to cause a privacy leak or security problem.


However, users have the ability to change the privacy controls on their data. So therefore, they must have the ability to update the ACLs. It seems to me (Gerv) that this means we would need access control on the dynamic access control attributes! However, in OpenLDAP at least, these are all stored in the cn=config database rather than in the data tree, and are multiple instances of the same attribute. So it may not be possible to keep the web UI from knowing 'more' than the user using it knows.
===Authentication===


We may have to compromise on a system which allows other clients to read and update data, but only the web UI has the special access to alter privacy levels for a particular data item.
Users will authenticate against the directory in a fairly normal way, by specifying a unique identifier (uid or email address) and their password. Once authenticated, they will have permission to modify their own data and, if they have the necessary permissions, to view additional data about other people.


===Access Control===
===Access Control===


Access control is based on the tagging system (see the main documentation). Each user is tagged with a number of tags, and there are a handful (say 3-6; not dozens) of tags which are special - they denote privacy levels (e.g. "trusted", "community-engagement-team"). A user can move any individual item of data about themselves from one privacy level to another. "Public", i.e. world-readable, is a valid privacy level.
Access control to user data is based on the tagging system, as described in the main documentation. Each user is tagged with a number of tags, and there are a handful of them (say 3 to 6) which are special - they denote privacy levels. Examples of such tags from the Mozilla deployment might be "mozillian" or "community-engagement-team". A user can move any individual item of data about themselves from one privacy level to another. "public", i.e. world-readable, is a hard-coded privacy level which does not correspond to a tag.


The access control complexities come from the facts that:
The access control complexities come from the facts that:


* Each item of user data (attribute) can have different access control (although you could group them into sets, as there are only a handful of levels)
* Users can individually control the access to their own data items, rather than it being defined centrally or by an administrator
* Users can individually control the access to their own data items, rather than it being defined centrally or by an administrator
* Public tags have their own access control for adding and removing them
* Each item of user data (attribute) can have different access control (although I guess you could group them into sets based on their level)
* Tags have their own access control system for adding and removing them - some tags are not self-awardable, but require the awarder to have a different tag (the 'bless tag' for that tag)
* Users should be able to add new tags, but not delete existing ones, unless they are the only person tagged
* Users should be able to add new tags, but not delete existing ones, unless they are the only person tagged
* Each user has their own private set of tags on other users, which are effectively in their own namespace
* Each user has their own private set of tags on other users, which are effectively in their own namespace. (The status of this feature is in question.)
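
The read rule implied by the privacy levels can be sketched in a few lines of Python. This is only an illustration of the intended behaviour, not directory configuration: in the design described here the check would be enforced by the directory's ACLs. The function name <code>can_read</code>, the constant <code>PUBLIC</code> and the sample uids are hypothetical, and the rule that owners can always read their own data is an assumption of this sketch.

```python
PUBLIC = "public"  # hard-coded level; does not correspond to a tag


def can_read(viewer_uid, viewer_tags, owner_uid, attribute_level):
    """Return True if the viewer may read one attribute of the owner's entry.

    attribute_level is the privacy level the owner chose for that attribute:
    either PUBLIC or the name of one of the special privacy-level tags.
    viewer_uid is None for an anonymous (unauthenticated) viewer.
    """
    if attribute_level == PUBLIC:
        return True  # world-readable, even anonymously
    if viewer_uid is not None and viewer_uid == owner_uid:
        return True  # assumption: owners can always read their own data
    # Otherwise the viewer must hold the tag that names the privacy level.
    return attribute_level in viewer_tags


# Hypothetical entry: "mail" is public, "mobile" is at the "mozillian" level.
levels = {"mail": PUBLIC, "mobile": "mozillian"}
anonymous_sees_mail = can_read(None, set(), "alice", levels["mail"])
bob_sees_mobile = can_read("bob", {"mozillian"}, "alice", levels["mobile"])
carol_sees_mobile = can_read("carol", set(), "alice", levels["mobile"])
```

Because there are only a handful of levels, the check is a single set-membership test per attribute, which is what makes grouping attributes by level plausible.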


===Attributes===
===Attributes===
Line 29: Line 29:
See [[Domesday#Data_Fields|the main Domesday page]] for a list of data items to be stored, which has been annotated with possible attribute names from inetOrgPerson or elsewhere.
See [[Domesday#Data_Fields|the main Domesday page]] for a list of data items to be stored, which has been annotated with possible attribute names from inetOrgPerson or elsewhere.


We will need an additional auxiliary objectClass for our own attributes, such as "Call Me" and "Year Started".
We will need an additional auxiliary objectClass, domesdayPerson, for our own attributes, such as "Call Me" and "Year Started". This also gives us flexibility to add additional fields as required.
 
As noted above, users have the ability to change the privacy controls on each attribute independently. This might suggest that the ACLs will need to be in some way dynamic or rewritten on the fly. However, an LDAP consultant considering this design has told us that it would be possible to implement it in OpenLDAP (which has the most advanced access control facilities) using only static ACLs. Therefore, this is the goal, as using static ACLs would be significantly more robust and safe, and less complex.
 
Sometimes, legal constraints may require us to prevent certain attributes being set to certain levels (e.g. "address" can't be "public" for minors). If that were implementable at the directory level, that would be great, but we are happy to implement it at the app level. (Minors have easier ways of publicising their home address to the world than firing up an LDAP client.)


===Tagging===
===Tagging===
Line 35: Line 39:
See the full explanation of the [[Domesday#Tagging|tagging system]]; familiarity with that is essential for considering this design question.  
See the full explanation of the [[Domesday#Tagging|tagging system]]; familiarity with that is essential for considering this design question.  


We can use [http://www.openldap.org/doc/admin24/access-control.html#Managing%20access%20with%20Groups OpenLDAP "groups"] and entries with the groupOfNames objectclass to support tags. The groups system is baked into the access control mechanism, which helps a great deal.  
If a tag has no bless tag, the user is the only person who can award it to themselves. If it has a bless tag, only those with the bless tag can award it, and they can give it to or remove it from anyone.


Each user has, as it were, their own private set of tags for other users, which can be searched independently of the public ones. This would be implemented by having a subtree, of the same structure as the public tags tree, under each user's entry.
Some tags come in name:value pairs, colon-separated. The bless tag must apply only to the name. (In other words, there is one bless tag which allows someone to award "l10n:<anything>".) This is to avoid the admin hassle of having to explicitly define e.g. 100 l10n tags, one for each language, all with a bless tag.
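
The award rule, including the name:value case, might be sketched as follows. This is a minimal Python illustration under stated assumptions, not the directory-level implementation: the names <code>tag_name</code>, <code>can_award</code> and <code>bless_tags</code> are hypothetical, as is the sample bless tag "l10n-admin". The text does not say who may remove a tag that has no bless tag; this sketch treats removal the same as awarding.

```python
def tag_name(tag):
    """Return the name part of a tag; for "l10n:fr" that is "l10n"."""
    return tag.split(":", 1)[0]


def can_award(awarder_uid, awarder_tags, target_uid, tag, bless_tags):
    """Decide whether the awarder may give `tag` to (or remove it from) target.

    bless_tags maps a tag *name* to the bless tag required to award it.
    Tag names absent from the mapping have no bless tag.
    """
    bless = bless_tags.get(tag_name(tag))
    if bless is None:
        # No bless tag: users can only award the tag to themselves.
        return awarder_uid == target_uid
    # Bless tag defined: only holders may award it, and to anyone.
    return bless in awarder_tags


# Hypothetical configuration: one bless tag covers every "l10n:<anything>".
bless_tags = {"l10n": "l10n-admin"}
```

Looking up the bless tag by name only is what lets a single entry cover every language, so no per-language tag has to be defined in advance.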
 
(The status of this feature is in question.) Each user has, as it were, their own private set of tags for other users, which can be searched independently of the public ones. It is suggested that this would be implemented by having a subtree, of the same structure as the public tags tree, under each user's entry.
 
There will need to be controls on the valid character set for tag names - they cannot contain a comma or whitespace, and colon is special (see above).
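
A validity check along these lines could look like the sketch below. The function name <code>is_valid_tag</code> is hypothetical. The rules about commas and whitespace come from the text above; allowing at most one colon and requiring a non-empty name and value are assumptions of this sketch.

```python
def is_valid_tag(tag):
    """Check a tag against the character rules described above.

    Assumptions of this sketch: at most one colon is allowed, and both the
    name and (if present) the value must be non-empty.
    """
    if not tag or "," in tag or any(ch.isspace() for ch in tag):
        return False
    parts = tag.split(":")
    if len(parts) > 2:
        return False  # more than one colon
    return all(parts)  # rejects "l10n:" and ":fr"
```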
 
===New Accounts===
 
There are two possible models for the creation of new user accounts:
 
# Invitations, and then after acceptance and entry creation it is immediately visible
# Immediate entry creation, but then finding a voucher is required for it to be visible
 
It would be helpful to support either method of operation. Either way, there will need to be an "invisible" flag and a field to store the ID of the voucher. We also need to figure out how to delegate creation authority here; anonymous creation of entries needs to be disallowed at the directory level, because of the spam risk.
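
As a rough illustration, both models can share one record shape with the flag and the voucher field. The Python sketch below is not a schema proposal: the class <code>Account</code>, its field names and the helper functions are all hypothetical, and recording the inviter as the voucher in the invitation model is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    uid: str
    invisible: bool = True             # hidden until someone vouches
    voucher_uid: Optional[str] = None  # ID of the user who vouched


def create_from_invitation(uid, inviter_uid):
    """Model 1: entry is created on acceptance and is visible immediately."""
    # Assumption: the inviter is stored as the voucher.
    return Account(uid=uid, invisible=False, voucher_uid=inviter_uid)


def create_unvouched(uid):
    """Model 2: entry is created at once but stays invisible."""
    return Account(uid=uid)


def vouch(account, voucher_uid):
    """Record the voucher and make the entry visible."""
    account.voucher_uid = voucher_uid
    account.invisible = False
    return account
```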


===Queries===
===Queries===
Line 44: Line 61:


* All data (including public tags and private tags added by the user doing the search) for a particular individual, specified by an exact attribute value match (e.g. IRC nick, email address) - must be particularly fast, as will happen very, very often
* All data (including public tags and private tags added by the user doing the search) for a particular individual, specified by an exact attribute value match (e.g. IRC nick, email address) - must be particularly fast, as will happen very, very often
* All data for all individuals with a particular tag
* All data, for all individuals with a particular tag
* All data for all individuals with a certain value or range of values for an attribute or attributes (e.g. everyone in London, everyone on Facebook, everyone whose family name starts with Q)
* All data, for all individuals with a certain value or range of values for an attribute or attributes (e.g. everyone in London, everyone on Facebook, everyone whose family name starts with Q)
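
The exact-match and prefix queries above map onto standard LDAP search filters. The Python sketch below builds the filter strings, escaping special characters as RFC 4515 requires. The helper names are hypothetical, and the attribute names used in the examples (<code>mail</code>, <code>l</code>, <code>sn</code>) are standard inetOrgPerson attributes chosen for illustration; the actual mapping is the one on the main Domesday page. The tag query is left out because its filter depends on the DIT design.

```python
def escape_filter_value(value):
    """Escape a value for use in an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def exact_match(attribute, value):
    """Filter for entries whose attribute equals value exactly."""
    return "(%s=%s)" % (attribute, escape_filter_value(value))


def prefix_match(attribute, prefix):
    """Filter for entries whose attribute starts with prefix."""
    return "(%s=%s*)" % (attribute, escape_filter_value(prefix))


# Illustrative filters for the queries listed above.
by_email = exact_match("mail", "someone@example.org")
in_london = exact_match("l", "London")
surname_q = prefix_match("sn", "Q")
```

For the exact-match lookup to be fast, the attributes it uses would need equality indexes in the directory.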


===Updates===
===Updates===
Line 52: Line 69:


* Users will update their own entry and personal tags (fairly infrequent)
* Users will update their own entry and personal tags (fairly infrequent)
* Users will add user tags to other users (reasonably frequent)
* (The status of this feature is in question.) Users will add user tags to other users (reasonably frequent)
 
But I expect the number of reads will dwarf the number of writes. Hence using a directory.


===More Ideas===
I expect the number of reads to dwarf the number of writes.


Gerv speculates wildly: might it work if we had several entries for each user, one for each privacy level, under a common root named after that level? You would move attributes between entries based on the privacy level the user assigned. Then when another user did a search, we would just search inside the subtrees which matched the privacy bits that user had. Would that be more performant than having individual access control for each attribute?
===Quotes===


The difficulty is that inetOrgPerson has a few mandatory fields... we may need a placeholder "not present" value to use.
An external LDAP consultant with significant LDAP ACL expertise has suggested that designing the directory tree structure and ACLs, to the spec given here, would take him around 10 man-days. This includes provision of:


Of course, this might be premature optimization...
* DIT design
* Object classes and Attribute Types
* ACLs and other OpenLDAP config items
* Test suite (to prove the ACLs work)
* Client Programmers Guide