Monday, March 17, 2008

_Streams of Consciousness

Unlike the confused thoughts falling from my mind, the _Streams table is quite critical to a Windows Installer database.

In fact, it is the "location" of all the binary fields in the database.

The _Streams table, consisting of two fields 'Name' and 'Data', is an abstraction of the underlying OLE structured storage data streams. It provides access to the binary streams for the Windows Installer API.

Each binary field is stored in its own OLE stream in the database file, and the _Streams table is generated when an SQL query is made for a binary field. Although the _Streams table is temporary and generated on request, changes made to it are persistent.

Every field in the MSI database that contains binary data is represented in the _Streams table by a row whose name has the format <table_name>.<row_key>.

So, for a row in the Binary table with a Name field of 'Icon', the _Streams table would contain a row with a 'Name' field of 'Binary.Icon'.
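The naming rule can be sketched in a few lines of Python. This is only an illustration of the <table_name>.<row_key> format described above; the function name stream_name is hypothetical and not part of any Windows Installer API, and the sketch assumes a single-column row key.

```python
def stream_name(table_name: str, row_key: str) -> str:
    """Build the _Streams 'Name' value for a binary field.

    Assumes the <table_name>.<row_key> format with a single key column.
    """
    return f"{table_name}.{row_key}"


# A row in the Binary table whose Name field is 'Icon':
print(stream_name("Binary", "Icon"))  # Binary.Icon
```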

This is a one-way relationship. While every binary field in a table is accessible via a row in the _Streams table, not every row in the _Streams table represents a row in another table.

This is important to understand, since critical information is often stored in the _Streams table that is not accessible via regular tables. The most common example is the cab for a merge module (msm).

The installed files that a merge module contains are stored in an _Streams row with a name of 'MergeModule.CABinet' (case-sensitive). Note that there is no 'MergeModule' table with a row called 'CABinet'.
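One way to picture the one-way relationship is to list the stream names that no table row accounts for. The sketch below works on plain Python data rather than a real database; the function unowned_streams and the shape of its inputs are assumptions made for illustration.

```python
def unowned_streams(stream_names, binary_rows):
    """Return stream names that no table row accounts for.

    stream_names: iterable of _Streams 'Name' values.
    binary_rows:  dict mapping a table name to the set of row keys
                  that have a binary field in that table (hypothetical shape).
    """
    # Every table row with a binary field owns the stream <table_name>.<row_key>.
    owned = {
        f"{table}.{key}"
        for table, keys in binary_rows.items()
        for key in keys
    }
    return sorted(name for name in stream_names if name not in owned)


streams = ["Binary.Icon", "MergeModule.CABinet"]
rows = {"Binary": {"Icon"}}
print(unowned_streams(streams, rows))  # ['MergeModule.CABinet']
```

Comparing against the names derived from actual rows, rather than splitting each stream name at a dot, avoids guessing where the table name ends.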

Other types of binary data can be stored in the _Streams table without having a corresponding table. Any internal cab file can be stored in the _Streams table without being represented in a normal table. So the Media table entry might be '#cab1', or '#cab1.cab', with no corresponding 'cab1' table.
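As a rough sketch of how such a Media table entry relates to a stream name, the helper below assumes that a leading '#' marks the cabinet as stored inside the database and that the rest of the value is the stream name. The function name embedded_cabinet_stream is hypothetical.

```python
from typing import Optional


def embedded_cabinet_stream(cabinet_value: str) -> Optional[str]:
    """Map a Media table cabinet entry to a stream name.

    Assumption: a leading '#' means the cabinet lives in a stream inside
    the database, and the remainder of the value is the stream's name.
    Returns None for entries without the prefix.
    """
    if cabinet_value.startswith("#"):
        return cabinet_value[1:]
    return None


print(embedded_cabinet_stream("#cab1"))      # cab1
print(embedded_cabinet_stream("#cab1.cab"))  # cab1.cab
print(embedded_cabinet_stream("cab1.cab"))   # None
```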

Given the importance of the _Streams table, it is curious that other tools have not provided direct access to it. InstEd provides access to it, allowing quick access to merge module cabinet files, and a central place to access all binary fields.

However, editing an _Streams table row in InstEd that represents a row in another table (the <table_name>.<row_key> format) will jump to that row in the other table. This ensures that the user is well aware that the _Streams row is backed by another table row.

Have you considered what happens if there are two tables with a binary field, where one is called (for example) 'Binary', and the other 'Binary.Table'? Can you have a row in 'Binary' called 'Table.Value' and a row in 'Binary.Table' called 'Value'?

It turns out you can, but changing one field changes the other, since the binary field for both rows is backed in the _Streams table by a single row called 'Binary.Table.Value'.
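The collision is easy to reproduce with a toy model. The sketch below uses a plain dictionary as a stand-in for the _Streams table; the names streams, write_binary, and read_binary are hypothetical and only mimic the <table_name>.<row_key> naming, not the real storage.

```python
def stream_name(table_name: str, row_key: str) -> str:
    """Build the backing stream name in <table_name>.<row_key> format."""
    return f"{table_name}.{row_key}"


# Toy stand-in for the _Streams table: stream name -> binary data.
streams = {}


def write_binary(table_name: str, row_key: str, data: bytes) -> None:
    """Store a binary field under its backing stream name."""
    streams[stream_name(table_name, row_key)] = data


def read_binary(table_name: str, row_key: str) -> bytes:
    """Read a binary field back from its backing stream."""
    return streams[stream_name(table_name, row_key)]


write_binary("Binary", "Table.Value", b"first")
write_binary("Binary.Table", "Value", b"second")

# Both rows resolve to the same stream, so the second write wins.
print(sorted(streams))                       # ['Binary.Table.Value']
print(read_binary("Binary", "Table.Value"))  # b'second'
```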
