Tuesday, January 13, 2009

Improving Sakai's Authz Performance

Sakai's authz has much room for improvement in how it uses the database. The queries are frequent and complex, the tables are large, the indexes are redundant. I propose some changes to make this more efficient. These changes are tuned specifically for MySQL; other databases may or may not benefit from similar work.


More efficient User information

The user id is stored in the authz table which records the user membership in authz groups. This id is a varchar(99), and under utf8, can be ~300 bytes. I propose changing this to an integer key for the user.

To manage the keys, I propose adding a column to the SAKAI_USER_ID_MAP table, in which every user, internal or provided, has an entry. The new definition would be:
CREATE TABLE SAKAI_USER_ID_MAP
(
USER_KEY BIGINT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
USER_ID VARCHAR(99) NOT NULL DEFAULT '',
EID VARCHAR(255) NOT NULL DEFAULT '',
UNIQUE KEY (USER_ID),
UNIQUE KEY (EID)
)
Making the USER_KEY primary allows more efficient secondary keys (MySQL includes the primary key value in each secondary key entry). We can add this to the table to establish keys for existing users, and new users will automatically get new keys.

When the user logs in, we already lookup the user's USER_ID from the login provided EID, so we can extend this to lookup the user's key and store it in the session. I'll extend the login and session code to support setting and easy access to the current user's key.

More efficient authz role definition

Each role in each authz group is defined by a set of rows in the SAKAI_REALM_RL_FN table, one for each function allowed to the role. I propose reducing this to a single record for each authz group role, representing the allowed functions as packed bit sets in the row. This greatly reduces the size of this table, and changes the update process to operate on a single row per role per group.

Bit sets are not well supported by older MySQL versions, so I propose we just deal with them in the java code. Instead of asking the database to tell us if the user is allowed to perform a function, we ask the database for the bit set(s) that has this information. We mask the bits based on the desired function, OR together multiple sets from different authz group roles, and come up with a security answer.

The new table would look like this:
CREATE TABLE SAKAI_REALM_RL_FN
(
REALM_KEY BIGINT UNSIGNED NOT NULL DEFAULT 0,
ROLE_KEY TINYINT UNSIGNED NOT NULL DEFAULT 0,
F001 BIGINT UNSIGNED NOT NULL DEFAULT 0,
F002 BIGINT UNSIGNED NOT NULL DEFAULT 0,
F003 BIGINT UNSIGNED NOT NULL DEFAULT 0,
F004 BIGINT UNSIGNED NOT NULL DEFAULT 0,
PRIMARY KEY (REALM_KEY, ROLE_KEY)
)
The 4 "F00..." columns hold the function bits. This design supports up to 256 unique functions. It is possible that a Sakai installation could accumulate more than this number of functions; we can add more columns to hold more functions if required.


More efficient grant definitions

We can take advantage of our new user key in our grant table:
CREATE TABLE SAKAI_REALM_RL_GR
(
USER_KEY BIGINT UNSIGNED NOT NULL DEFAULT 0,
REALM_KEY BIGINT UNSIGNED NOT NULL DEFAULT 0,
ROLE_KEY TINYINT UNSIGNED NOT NULL DEFAULT 0,
ACTIVE TINYINT UNSIGNED NOT NULL DEFAULT 1,
PROVIDED TINYINT UNSIGNED NOT NULL DEFAULT 0,
PRIMARY KEY (USER_KEY, REALM_KEY),
KEY (REALM_KEY, ROLE_KEY)
)
There are some other changes reflected here, such as smaller fields for our boolean values (TINYINT instead of CHAR(1)). The primary key, which in MySQL is used to cluster records together, makes this efficient for reading all the grants for a user, or for reading the grant for a user for an authz group. (Note: final key design will be refined for access and update needs).


More efficient queries

The current query to find if a user has permission is complex and costly (I won't reproduce it here - check your slow logs, it will be there!). It is also issued every time we need to know, which can be many times per request. Security caching helps reduce the load on the database from all these queries, but we can do better.

The authz process has some functional complexity that we can preserve. Each security decision may involve a set of authz groups, logically ORed. There are also "helper" authz groups which are added to the mix, as well as functions permitted in general to any user or authorized users. Part of the complexity of our current queries is their folding all this into a single DB query.

We can improve this by unwrapping some of the functional complexity into a series of simple queries, optimized with our more efficient tables and indexes.

To answer the question "can user do function in authz group", we can issue this:
SELET GR.ROLE_KEY, FN.F001, FN.F002, FN.F003, FN.F004
FROM SAKAI_REALM_RL_GR GR
JOIN SAKAI_REALM_RL_FN FN USING (REALM_KEY, ROLE_KEY)
WHERE GR.USER_KEY = ? and GR.REALM_KEY = ?
We need to already know the authz group's key, and the user's key (see caching below). This is a very efficient and simple query.

In addition to the above, we need the function bit sets from a couple of other authz groups, for the role the user is granted in the main group. These queries are:
SELECT FN.F001, FN.F002, FN.F003, FN.F004
FROM SAKAI_REALM_RL_FN FN
WHERE FN.REALM_KEY = ? and FN.ROLE_KEY = ?
We have the role key from the first query, so all we need to know is the REALM_KEY for authz groups like "!site.helper", etc. The same query finds us the function bits for the ".anon" and ".auth" roles (usually keys 1 and 2) in the main authz group.

Once we get all the function bit sets, it is not complicated to OR them together in java and select the bit we want for the function we are checking.

These queries are simpler, more efficient to run, and also fit nicely into the MySQL query cache. The role based queries that don't include the grant table are likely to be reusable by many different users, so breaking them out to their own simple queries takes advantage of caching.


Better Caching

By breaking up the process into a set of simple queries, much of the information needed for security checks can cached to be efficiently re-used for different users and over time.

Bit masks for each defined function, role keys for each defined role, and authz group keys can all be cached. This information never changes, except for additions when authz groups are created.

The ".anon", ".auth" and other role definitions for an authz group can also be cached, with invalidation when authz group definitions change.

Caching can be done in the application or in MySQL's query cache. We need to experiment to find the right balance.


Even better caching

If we accept a new functional policy for authz, we can do even more efficient caching:

Any changes to authz made after a user has logged in are not reflected until the user's next session.

We can make an exception for the infrequent changes made by the user, such as creating a new site or setting permissions, so these are immediately reflected to the user making the change.

If we accept this, we can cache a user's grant information in the login session, so we only query once per user per authz group per session. This would greatly reduce the frequency of authz db queries.


Data redundancy for even better performance

If we cache authz information in the user's session, we can make a further change for efficiency. The grants table is clustered by user, so it is just about the same cost to read a user's full set of grants at once as it is to read a user's grants in a single authz group. If we added a redundant copy of the function bit columns to each grant record, copied from the function table, we could read the functions directly from the grant table (no join with the function table), read the user's entire set of grants at login in a single efficient db call, and reduce the frequency of database access for authz to once per user per login.

This redundancy has a cost. The grants table needs 4 extra 64 bit fields, so grows larger. The redundancy has to be written every time an authz group is changed; instead of updating a single SAKAI_REALM_RL_FN table row, we need to update that and some SAKAI_REALM_RL_GR table rows; those where a user has a grant to the modified authz group role. Not only is this more data to write, but it's all random access, since the grants table is clustered by user, not group.

I think that the benefit of virtually eliminating the constant and not inconsiderable database load of reading authz information is worth the cost of more expensive updates (which are done far less frequently).


While I'm in there...

Way back when we started this, this area of Sakai was called "Realms". We changed the java side to be called "Authz Groups", but left the database terms "realm" based. I propose the changed tables for this project change terminology to become "authz" based.

Saturday, March 8, 2008

Does anyone else have this problem? It took me forever to pick a name for my blog. The name has to have meaning to me... and also be available.

"The null pointer" was taken. Most of the good software related terms that sounded cool to me were taken.

I finally flipped through the dictionary (on my mac, of course), and came up with "galvanometer":

noun
an instrument for detecting and measuring small electric currents.

Perfect for those "small electric currents" running through my brain that make up my life.

It's a start.