Surgical registries for software development?

NOTE HXA7241 2011-04-17T09:13Z

Surgical practice has developed an alternative to scientific trials to build knowledge. Perhaps this approach could benefit software development.

Surgery

Surgery, by its craft nature, makes normal scientific trials difficult to perform. As an alternative way to gain good knowledge of effectiveness, ‘registries’ (or ‘registers’) for particular procedures and techniques are set up.

These are a passive form of trial. Instead of setting beforehand the proposition to be tested and the data to be gathered, registries follow work that is done anyway and record data that emerges from it.

Patterns can then be found and deductions made in retrospect or reactively. This approach is generally viewed as not as good as normal trials, but still valuable, and it compares very well with the (lack of) other practical means.

Software development

Can this registry idea be used or adapted for software development?

Software is less of a craft than surgery, though still somewhat one, as is any engineering. But as with surgery, strict scientific trials seem not to fit well, and are clearly not commonly done. And, better than in surgery, collection of data seems less limited in every way: in ease, amount, variety, and detail.

The key is that the motive is the same: we do things dependent on knowledge, but lack knowledge we might otherwise be able to have. A good source of info is actual practice, and this data is potentially fairly easy to get.

We can only build at all because we have reliable, logical information – knowledge – about the pieces we intend to assemble. But if we look at common software components, information about them is quite limited. We have their ostensible functionality and representation in the source-code. But we lack much precise info on the actual results and effects of their usage, in situ, and with others, both functionally, and representationally (i.e. as they are for human manipulation). How fast is some component, on various platforms, relative to other components? What parts of a component are used most, or changed most, in various contexts? All this kind of information could be collected – it is out there, just not being recorded yet.

Registries are a formalisation of learning from experience, one that is scalable and that fits the network structure of software. Open data about software would be good in some of the same essential ways that open-source code is good.

Basis of implementation

There are two main parts: function and representation – the performance characteristics, and the manipulation characteristics, of software.

If we use open-source software, we have pretty much all we need for the representation half. Repositories record changes to code, and that data is available for analysis. There is still room for refinement, but the basics are there.
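As a rough illustration of how repository data might feed the representation half, here is a minimal Python sketch that totals lines changed per file. It assumes the input is text in the shape produced by `git log --numstat --format=` (added count, deleted count, and path, separated by tabs). The function name `churn_by_file` and the sample data are hypothetical, made up for this example.

```python
from collections import Counter


def churn_by_file(numstat_text):
    """Sum added plus deleted lines per path from numstat-style text."""
    churn = Counter()
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank lines and anything not in three fields
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files report "-" instead of counts
        churn[path] += int(added) + int(deleted)
    return churn


# Invented sample standing in for real repository output.
sample = (
    "3\t1\tsrc/parser.c\n"
    "\n"
    "10\t0\tsrc/lexer.c\n"
    "2\t2\tsrc/parser.c\n"
    "-\t-\tlogo.png\n"
)

for path, lines_changed in churn_by_file(sample).most_common():
    print(path, lines_changed)
```

Counting changed lines is only one crude measure of how a component is manipulated; a registry would presumably record several such measures, over many repositories.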

The functional half is very much less developed. We need a great deal of instrumentation, data standardisation, and publishing. None of this exists in more than inchoate form for this purpose . . .
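To make the instrumentation and standardisation steps a little more concrete, here is a minimal Python sketch that times a component and packages the result, with platform details, as a JSON record. The record fields (`component`, `best_seconds`, and so on) and the function name `measure` are assumptions for illustration only; no such standard schema exists, and the publishing step is left out entirely.

```python
import json
import platform
import time


def measure(component, func, *args, repeats=5):
    """Time func(*args) several times and return a registry-style record.

    The field names are a hypothetical schema, not an existing standard.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return {
        "component": component,
        "platform": platform.system(),
        "machine": platform.machine(),
        "runtime": "python-" + platform.python_version(),
        "repeats": repeats,
        "best_seconds": min(timings),  # fastest run, least disturbed by noise
    }


# Example: measure the built-in sort on a reversed list.
record = measure("sorted", sorted, list(range(10000, 0, -1)))
print(json.dumps(record, sort_keys=True))
```

Records like this, gathered from ordinary use on many machines, would be the raw material for answering questions such as how fast a component is on various platforms.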