Wednesday, September 2, 2009

Speed Demons

There is a whole genre of utilities written as quickly as possible, to satisfy a short-term need, never meant to see the light of day past next week. But somehow they persist eternally as the worst example of the author's code, simply because they are the only thing that fills that need, and no one has the motivation to rewrite them.

I suspect that Orca was probably written as such a utility. Somehow it ended up in the SDK, and now is the tool de rigueur for editing msis when you just want to make a quick change, or find some info quickly.

One unfortunate decision when it was written was to use the listview in non-virtualised mode. A virtualised listview basically means the listview is responsible for layout of the items, but not for managing the memory of the data in the listview. The big advantage is that items don't need to be added to the listview. When items are added to the listview in a non-virtualised mode, they are added one by one, and the strings for each cell are copied (i.e. memory is allocated off the heap, the data is copied, and likely the original string is de-allocated by the listview owner). This happens field by field, with the scrollbar details being updated after each row.
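To make that cost concrete, here is a minimal sketch of the non-virtualised pattern. It is a toy model, not Orca's code and not the Win32 API: the class `EagerListView`, its counters, and the sample table are all made up for illustration. It only counts the work that the row-by-row approach implies.

```python
class EagerListView:
    """Toy model of a non-virtualised list control (hypothetical, not Win32)."""

    def __init__(self):
        self.rows = []              # storage owned by the control itself
        self.cells_copied = 0       # one per cell string taken into the control
        self.scrollbar_updates = 0  # one per row added

    def add_row(self, fields):
        copied = []
        for field in fields:
            # Stands in for "allocate off the heap and copy the string".
            # (Python strings are immutable, so this only models the cost.)
            copied.append(str(field))
            self.cells_copied += 1
        self.rows.append(copied)
        # The scrollbar range is recalculated after every single row.
        self.scrollbar_updates += 1


# Hypothetical table: 1000 rows of 3 fields each.
table = [(f"File{i}", f"Component{i}", f"name{i}.dll") for i in range(1000)]

view = EagerListView()
for row in table:
    view.add_row(row)

print(view.cells_copied, view.scrollbar_updates)
```

The amount of work grows with the size of the whole table, regardless of how few rows actually fit on screen.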

If you open a table in Orca that has thousands of rows, you can often see the thumb tab in the vertical scroll bar visibly shrink as the table is loaded into the listview. This is an indication that the rows are being added one by one.

In a virtualised listview, the rows and fields aren't "added". The listview is simply told: "there are 11654 rows and 9 columns".

At that point the listview adjusts the scrollbar, and calls back to the parent window to get the strings for rendering into the visible cells. The parent doesn't have to copy the strings for the listview to render them; it can simply pass a pointer to its own copy of the string, saving on heap allocation and de-allocation.
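The callback arrangement can be sketched roughly as follows. Again this is a toy model with invented names (`VirtualListView`, `set_item_count`, `render`, `cell_text`), not real API code. In the actual Win32 control, the equivalent pieces are the LVS_OWNERDATA style, setting the item count, and the LVN_GETDISPINFO notification sent to the parent window.

```python
class VirtualListView:
    """Toy model of a virtualised list control (hypothetical, not Win32)."""

    def __init__(self, get_cell_text):
        self.get_cell_text = get_cell_text  # callback supplied by the parent
        self.row_count = 0
        self.column_count = 0
        self.scrollbar_updates = 0
        self.callbacks_made = 0

    def set_item_count(self, rows, columns):
        # The control is only told how big the data is; nothing is copied.
        self.row_count = rows
        self.column_count = columns
        self.scrollbar_updates += 1

    def render(self, first_visible, visible_rows):
        # Ask the parent for text, but only for cells currently on screen.
        last = min(first_visible + visible_rows, self.row_count)
        page = []
        for r in range(first_visible, last):
            cells = []
            for c in range(self.column_count):
                cells.append(self.get_cell_text(r, c))
                self.callbacks_made += 1
            page.append(cells)
        return page


# The parent owns the data: 11654 rows and 9 columns, as in the example above.
table = [[f"r{r}c{c}" for c in range(9)] for r in range(11654)]

def cell_text(row, col):
    # Hand back the parent's own string object; no copy is made.
    return table[row][col]

view = VirtualListView(cell_text)
view.set_item_count(len(table), 9)
page = view.render(first_visible=100, visible_rows=30)

print(view.scrollbar_updates, view.callbacks_made)
```

Here the work depends on how many rows are visible (30 in this sketch), not on how many rows the table holds, and the scrollbar is adjusted once.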

So, why am I blogging about this boring topic? While recently testing some new field editors in InstEd Plus, I pulled out my largest handy msi, and ran some speed tests in Orca and InstEd. Orca loads it very quickly, which is nice. My guess is that it simply enumerates the tables and puts them into the listview on the left.

As you scroll down the tables on the left, the memory usage increases. Presumably Orca is loading the table data from the msi to put into the listview. When I clicked on the File table, there was a 12-second wait for it to load approximately 40000 rows. However, the rub comes when clicking on another table and back onto the File table: it's another 12-second wait. So, in my test msi, moving from the File table to the FeatureComponents table and back again is a 17-second round trip. Every time you do it. That can take its toll on the "quick" editing that you want to do.

In comparison, InstEd completely loads the msi into memory, ready for near-instantaneous rendering of the tables, in less than 3.5 seconds (over a network). It calculates the entire range of relationships in 6 seconds. The soon-to-be-available InstEd Plus field editors can generate a smart editor complete with 80000 file table entries and 22000 registry table entries available for auto-completion in about 1 second.

Don't get me wrong. I understand the choice to use a non-virtualised listview when Orca was written. It was probably never intended to become such a popular tool for editing databases. But it was an unfortunate decision, one that doesn't take much work to fix, and probably needs fixing given Orca's popularity over the years.

Or just use this: InstEd.