Business Matters:

Shedding Light on Building and Implementing Successful Taxonomies

For all the importance assigned to taxonomies, few studies have shed light on what taxonomy design methods are effective, how much should be spent on the design, or how to make designs more successful. A September 2010 survey, “Records Classification Systems – Taxonomies,” reached approximately 4,000 members of the United Kingdom (UK), North American, and Records Management Association of Australasia listservs. The survey results, which included more than 170 responses worldwide, provide clues about how to improve taxonomy design methods and implementation planning to significantly increase their success rate.

James "Jim" Connelly, CRM

A records taxonomy is a corporate-wide schema for the identification, retrieval, and disposition of all business records. It provides an easy-to-use system that lets people store and retrieve the documents they use in their day-to-day business. (See the sidebar below for an example.)

Although the term is often used in a limited sense to describe the hierarchical classification structure used for storing documents or records, in today's technological environment it should include more.

Taxonomies should comprise classification schemes and relationship models, as well as detailed naming conventions, as the figure shows. With the advent of enterprise content management (ECM) tools, we can now easily manage “function-based” groups of records within repositories that allow for proper disposition and, at the same time, we can display the documents or records in user-friendly structures that can include personalized taxonomies, faceted relationships, or business team-related arrangements.

Along with policy and procedures, a taxonomy is one of the most crucial elements of a records program. In fact, it is a linchpin. Once it is built, other key program elements can be established, including a records retention schedule and disposition program, a vital records program, and a program to manage personal information banks. Without a taxonomy, these elements are difficult to establish and maintain.

Providing Rationale for Building a Taxonomy

It is interesting that the most commonly selected rationales, “better access and control” and “improved lifecycle management,” are traditional and embody basic records management principles.

Access and Control

During the design of a taxonomy, it is common to identify all business records in the data-gathering process. Simply knowing their location and ownership gives an organization better control over existing records. Also, providing a structure to simplify storage and retrieval enhances access to individual records or documents.

Improved Lifecycle Management

A taxonomy is always the foundation of a records retention schedule. Legislation and policies often mandate recordkeeping for records that support specific business functions. By grouping similar business functions, it becomes easier to assess the need for retention, select an appropriate retention period, and make retention periods and “event triggers” more consistent throughout the organization.

Business Case Rationale

On the other hand, creating a business case for taxonomy development appears necessary. Expenditures on taxonomy development do not appear adequate for such a critical component of an organization’s RM infrastructure. (See the section on costs and budgeting.)

It would be logical also to play up the improvements in productivity. A simple return on investment (ROI) study would show that an intuitive taxonomy would considerably reduce the time taken to store and retrieve documents. Here is an example using assumptions:

Assume that each staff member creates about 800 documents a year (a little more than 3 documents x 250 workdays). Each document will have to be stored, so there will be 800 storage actions. If only 20% of documents are ever retrieved, there will be 160 retrievals. With an inadequate classification system, it could take as long as two minutes to name and store each document, and it could easily take 10 minutes to retrieve a document.

An intuitive system will certainly take less time to use. A conservative estimate is that for each of the 960 combined storage and retrieval actions, 15 seconds can be saved, for a total savings of 14,400 seconds – which is 240 minutes, or four hours, per year. At an average wage (including benefits) of $30 an hour, each employee would save $120 each year, so an organization of 1,000 people would potentially save a total of about $120,000.
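The arithmetic above can be sketched in a few lines of Python so the assumptions are easy to vary. This is only an illustration: the function name and parameter names are hypothetical, and the default values are the article's stated assumptions rather than survey findings.

```python
# Illustrative ROI sketch; every default is one of the article's stated assumptions.

def annual_savings(docs_per_year=800, retrieval_rate=0.20,
                   seconds_saved_per_action=15, hourly_wage=30.0,
                   staff_count=1_000):
    """Return the estimated yearly time and wage savings from a more intuitive taxonomy."""
    storage_actions = docs_per_year  # every document is stored once
    retrieval_actions = round(docs_per_year * retrieval_rate)
    total_actions = storage_actions + retrieval_actions
    seconds_saved = total_actions * seconds_saved_per_action
    hours_saved = seconds_saved / 3600
    per_employee = hours_saved * hourly_wage
    return {
        "total_actions": total_actions,
        "seconds_saved": seconds_saved,
        "hours_saved": hours_saved,
        "per_employee": per_employee,
        "organization": per_employee * staff_count,
    }


if __name__ == "__main__":
    for name, value in annual_savings().items():
        print(f"{name}: {value:,.0f}")
```

Changing a single input, such as the seconds saved per action or the share of documents retrieved, shows how sensitive the business case is to each assumption.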

Choosing a Classification System Type

It is perhaps a tribute to ISO 15489 that functional classification continues to be the choice of most organizations. But in the last few years, the survey shows a trend toward hybrid systems.

Function-Based Taxonomies

Function- and activity-based systems do assist records and archives staff in managing records over long periods of time. They also ensure that all aspects of a business are included within the taxonomy. But there are limits to strict functional systems.

An analysis of the survey results shows that functional designs are quite effective when implementing large-scale ECM projects in large organizations. It appears that the uniformity of the functional design at high levels creates consistent classification across large organizations. However, it is also evident that smaller and unique business units may not gain much advantage from the ECM project, as these broad functional classification systems appear to cause problems at more granular levels.

Hybrid Taxonomies

In the late 1990s, the limitations of functional systems became apparent in systems that had been recently introduced. This was readily apparent in matrix-based organizations (i.e., those with a number of diverse business functions), but it was even more so in case file systems where a number of business functions related to a particular project, activity, or case file. Taxonomies for projects or programs that are matrix-based may need considerable adaptation. Also, case files, which may not be linear in nature, may cause problems for strict functional designs.

For example, when a purchasing organization’s business unit needs to acquire new software, it may conduct a needs analysis; do market research; prepare a request for proposal; interview prospective vendors; analyze proposals; negotiate contracts; acquire and test software; and finally approve a purchase. Although there are eight distinct business functions, documentation is often retained by purchasing managers in a “project-based” folder for ease of reference while the software is being acquired. In such a situation, the taxonomy may need to adjust its top-level functions, while retaining function activity structures within a “case” file.

Note that ECM solutions can allow for function-based libraries while allowing individual business units to view documents from a number of functions within a case file folder. This allows for a modified or personal taxonomic display while retaining official records within a strict functional structure.

Subject-Based Taxonomies

Subject systems were popular in the 1970s and 1980s. These were based on library or encyclopedic-style systems where records are grouped by topic rather than by function. In the sidebar example, the risk management function of the organization managed vehicle insurance, accidents, and claims. In a subject-based system, the topic might be “vehicles,” and all six records series cited would be grouped together irrespective of who was using the records or how the records were being used.

The advantage of subject-based systems is that they are easier to understand. But they often lead to classification errors and poor retention schedules. Also, they tend to be arbitrary and are frequently changed by new staff who prefer to use their own terminology.

Choosing an Approach to Building Taxonomies

From these results and participant comments, it is evident that designs are following an in-house process similar to the business process analysis that was used in the 1980s to assist in the design of computer systems.

Gathering Background Data

Considerable effort is required to gather background data. Designers must have access to organization charts, job descriptions, websites, business models, and strategic plans to help them understand the business functions of the organization they are trying to assist. Several participants also commented on the support received from their IT groups as the design work parallels and often complements business architecture designs.

Shelf or high-level inventories are needed to identify all hard copy records, volumes, subject areas covered, existing identification schemes, and naming conventions. Electronic inventories should identify folder structures, document volumes, and storage space used. This volume information is crucial for planning and costing implementations.

Armed with such information, designers frequently conduct interviews to determine what business functions are supported and how documents should be stored. An interview can certainly elicit information as to business functions, but it can also introduce designer bias. The classification system often changes from supporting users to supporting the records management program, as reflected in this survey comment, which represents a common refrain of many participants:

“The classification system was designed by records managers for records managers, but unfortunately does not provide actual business value to anyone else in the organization. … It is not merely a matter of change management and helping staff learn a new way of working in adopting the classification scheme, it actually works quite actively against the information flows within the organization. We are now just starting to implement our EDRMS … systems, which use the classification scheme, and it has become a problem – the classification scheme will probably work on the EDRMS system, which is managed and controlled centrally by records managers, but on the more dispersed systems, including the file-share network drives, it will struggle.”

Insistence on meeting the needs of the records community leads inevitably to failure or, at best, tepid acceptance of the new system.

This survey shows that relying on user-based focus groups involving management, professional, and technical staff is a more effective process than one-on-one interviews. By working with three or four people from a business unit, the process becomes inclusive and, in some cases, empowers the users. A designer can control structure and format but still give the business unit the opportunity to build its own system.

Adapting Taxonomies

It was somewhat surprising that a considerable number of organizations adapted existing systems from similar organizations. Although some may see this as a time-saver, it is more likely to render the taxonomy unacceptable to staff, who may say such things as, “I wasn’t consulted,” “That’s not the way we do things in this company,” or “I don’t understand the reasoning behind this design.”

A key element of taxonomy design is obtaining buy-in. Those individuals who don’t involve users in the process and give them the opportunity for input will not have their cooperation during implementation.

On the other hand, having knowledge of other organizations’ designs should be part of the research. This allows designers to offer users alternatives, design options, or different ways of organizing their records. But, to take a model and impose it on an organization is often a recipe for failure.

Budgeting for Design and Implementation

A cross-tab review reveals that larger organizations (more than 5,000 staff) either planned to spend or did spend more on their taxonomies (greater than $100,000) and were quite successful with both design and implementation.

However, the average organization, with a staff of 1,000 to 5,000, spent less than $25,000 on either design or implementation. The caveat appears to be that organizations should understand that designing a taxonomy is not a clerical process. It includes business process analysis, use of communication strategies and tools, and legislative and compliance reviews, as well as engaging management, professional, and technical staff in a work-altering project.

Design Costs

Completing a design properly in a large organization (i.e., 1,000-plus people) may involve a project coordinator and two or three teams of designers. Each team could include a subject matter expert, as well as an assistant who can ensure that the design is transcribed properly and that individuals’ needs or concerns are addressed.

During a design process or focus group, the designer is often fixed on the task or the results that are being developed and can neglect individual concerns, which can result in an overlooked individual becoming a thorn in the side.

Each business unit may require several days of data gathering, as well as about a half-day focus session and several days to compile the design and provide follow up to portions of the design that are complex. If there are 30 to 40 business units, this could amount to 500-plus person-days of work. Using a mix of existing staff and, perhaps, consultants, could amount easily to $150,000 worth of effort.
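A rough way to see how such an estimate builds up is sketched below. The article gives only "several days" per step and does not state a team size or daily rate, so the three days per step, the two-person team, and the $300 per person-day used in the example are all hypothetical values chosen for illustration, as is the function name.

```python
# Illustrative design-effort sketch. All example inputs are hypothetical
# assumptions, not figures reported by the survey.

def design_effort(business_units, gather_days, focus_days, compile_days,
                  team_size, day_rate):
    """Return (person_days, cost) for designing a taxonomy across business units."""
    # Calendar days per business unit, multiplied by the people working on it.
    per_unit_person_days = (gather_days + focus_days + compile_days) * team_size
    person_days = per_unit_person_days * business_units
    return person_days, person_days * day_rate


if __name__ == "__main__":
    # Assumed: 40 units, 3 days gathering, a half-day focus session,
    # 3 days compiling and follow-up, a two-person team, $300 per person-day.
    days, cost = design_effort(40, 3, 0.5, 3, team_size=2, day_rate=300)
    print(f"{days:.0f} person-days, ${cost:,.0f}")
```

Under these assumed inputs, 40 business units come to 520 person-days and $156,000, which is in the same range as the figures quoted above; different assumptions will move the total considerably.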

This survey suggests that timeframes and budgets for taxonomy designs should be commensurate with the anticipated productivity gains and value-added improvements that a well-designed and user-accepted taxonomy offers. In other words, if the ROI anticipates productivity gains of $100,000, an organization should be prepared to pay at least that for design and a similar amount for implementation.

Implementation Costs

Implementation costs within a server structure are usually low. It usually takes a team of two less than a half-day to work with a business unit to convert its electronic files to a new system. Although some folder or document-naming changes may take longer, they can usually be assigned to clerical staff to ensure the completeness of the change-over. The methodology for such conversions, although technical, is usually simple to arrange with IT staff.

Hard copy implementations normally proceed at one linear foot of records per day with a good classifier at the helm. If case files are involved, speeds of three linear feet per day can be reached. Much of this estimate depends on the complexity of the records, as well as the skills of the assigned staff. So, for every 1,000 linear feet to be converted, assume at least 600 days of implementation time.
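The two conversion rates can be combined into a simple estimate, sketched below. The article does not say what mix of general records and case files lies behind its 600-day figure; the 400/600 split in the example is a hypothetical mix that happens to reproduce it, and the function and constant names are illustrative.

```python
# Illustrative hard copy conversion estimate using the article's two rates.

GENERAL_FEET_PER_DAY = 1.0    # general records, with a good classifier
CASE_FILE_FEET_PER_DAY = 3.0  # case files convert faster


def conversion_days(general_feet, case_file_feet):
    """Return the estimated working days to convert the given linear feet."""
    return (general_feet / GENERAL_FEET_PER_DAY
            + case_file_feet / CASE_FILE_FEET_PER_DAY)


if __name__ == "__main__":
    # Hypothetical mix: 400 feet of general records and 600 feet of case files.
    print(conversion_days(400, 600))
    # Bounds for 1,000 feet: all general records versus all case files.
    print(conversion_days(1000, 0), round(conversion_days(0, 1000)))
```

For 1,000 linear feet, the estimate ranges from roughly 333 days (all case files) to 1,000 days (no case files), so the records inventory gathered during design matters a great deal when costing an implementation.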

Determining how much should be converted brings up the issue of legacy data.

Legacy Data Costs

Of all the survey results, this was the most surprising: 40% of respondents said they were leaving legacy data behind. Of the remainder, 26% were spending $50,000 or more. From a simple cost analysis, however, it is understandable that organizations would feel this way and try to minimize how much information needs to be brought forward into a new taxonomy.

As long as legacy data is accessible, there may be no problem. For example, if electronic records are moved into a searchable archive drive and past hard copy is carefully boxed, listed, and labeled – and if these repositories can easily use the newly developed records retention schedule – there may be little need to bring forward all the records.

However, there are still dangers. Litigation is often a concern, and since legal discovery processes involve all records, avoiding the implementation of legacy data could be fatal to some organizations. It is clear that some form of risk assessment must be done. At minimum, each records series should be reviewed. Anything that could put the organization at risk should be assessed as to the cost of bringing it forward versus the potential for financial risk should the organization be unable to produce the document in the event of litigation.

Quite simply, the value of bringing forward documents/records to a new structure or system must be balanced against the costs of such an exercise.

Implementing a Taxonomy

Fifty-eight percent of replying organizations indicated that their plan was to deploy their taxonomy across the entire organization. Although most organizations were simply planning their roll-out, 20% of organizations said they had completed their organization-wide roll-outs.

Once a design has been completed, it is important to validate the design and roll it out to each business unit as soon as possible. Delays in implementation can adversely affect the use of the new classification system. In fact, anecdotal comments indicated that many of these roll-outs were bumpy at best. Many respondents expressed frustration at the “stop-and-start” nature of the project.

Difficulties in implementation can usually be traced to one of three things: planning, validation, or training.


Planning the Implementation

In the section “Budgeting for Design and Implementation,” there is considerable information as to time, effort, and resources required in an implementation project. Most organizations underestimate the extent of such a project.

At the outset of implementation, the designers should be able to provide a list of supportive business units. There should also be a list of those business units most in need. From these lists, plan which business units to address and in what order, and then communicate this to all staff so they know when to expect the implementation teams. Delays must also be communicated to all staff. Periodic status reports are an integral part of communication strategies.

There should also be a large project “task map” prominently displayed that identifies each business unit and the status of each implementation – from the completed design, to the validated design, to the initial training, to the support and follow-up needed to ensure a successful implementation. Delays should be noted and the reason for the delays addressed in a “lessons learned” meeting at the end of each business unit implementation.

Validating the Taxonomy

Will the taxonomy work? This is such a fundamental question that it is astounding that so few organizations focus on it. The most commonly identified validation method was to send a copy of the taxonomy to a business unit and ask them to review it. Testing the classification structure in a pilot project or sending it to managers was not as effective as a full review process.

Remember that most users are not sure what to look for in the taxonomy.

If the structure is returned for review, ask the focus group or business unit to identify perhaps three to five commonly used documents. For each document, ask the focus group or business unit whether or not it is immediately apparent where these documents would be stored or found. If difficulties are encountered, it is often a result of inadequate or incorrect naming. Adapting records series names at this juncture can significantly improve the implementation success rate.

Training Approach

Training is of fundamental importance when introducing a new taxonomy. Survey respondents used a variety of approaches, but business unit training followed by individual and personal support was the most common approach (27%).

The 12% of respondents who used the “train the trainer” approach had much more successful implementations than those who used any other approach on its own. (As with many of the statements in this article, this was determined by filtering the responses of each approach and reviewing the success rate of implementations.)

In this process, an individual from each business unit is given extensive training in the new system so he or she has in-depth knowledge of the new structure and a good understanding of:

  • Why records or documents are grouped the way they are
  • The relationships between records series
  • The retention periods for groups of records
  • The reasoning behind the selection of the retention period

That person becomes the “trainer” for that business unit, conducting the introductory session and following up with each staff member. This approach essentially creates a “super user,” a “champion,” or a “go-to” person in each business unit to help anyone who is having difficulty finding or storing documents.

Identifying Critical Success Factors

It is interesting to note that 65% of the “World” users rate their taxonomy development and implementation process as “excellent,” “very good,” or “good,” while just 55% of “North American” users did. There are a number of reasons for this.

Early Involvement in Recordkeeping

The British Empire of the 19th and 20th centuries had a tremendous impact on “records keeping” across the globe, and, in particular, the use of registry (records) offices introduced formal records keeping to many governments. As a result, global records classification system management has been slightly ahead of North American systems for some time.

More specifically, Australia was a key driving force in the development of ISO 15489-1 Information and Documentation – Records Management – Part 1: General, the international standard for recordkeeping. Both Australia and the UK were quick to adopt the standard’s strong recommendation for functional classification schema, and their longer experience with taxonomy designs and implementations has apparently led them to a better knowledge of what works and what doesn’t.

Also, the survey revealed key differences between the approach the World took in designing and implementing taxonomies and the approach North America took.

User Participation and Management Involvement

The need for user participation is clear; the global figure of 71% of users participating in the design vs. 43% in North America is telling. By filtering responses, the survey shows that organizations had a 65-70% success rate where user participation was a success factor.

By filtering responses again, the survey shows that organizations had a 75-80% success rate where management involvement was a success factor. In a more detailed question regarding management involvement, almost 60% of respondents said they had received management approval of their taxonomy outline before proceeding with design and implementation.

When both user participation and management involvement were identified as critical success factors, the success rate jumped to 85%.

Although employing a communications strategy and design validation did not raise the success rate of taxonomies much above 85%, it appears these approaches also contribute to the success of projects.

Communications Strategy

A communication strategy involves more than sending a periodic newsletter to tell people what is happening. It must include change management techniques and allow the project to respond to business units where problems occur. It should also focus on successes and the satisfaction of users. Through peer pressure, organizations can move stalled implementations forward.

Design Validation

If users are not convinced that the new system can work and is working, they will subvert the system. Sending a design for review is useful, but working with groups to look at specific examples of records and document retrieval is better. This will allow the team to adjust terminology so it reflects what users expect. Also, it allows designers to assess the ability of users to “browse” the structure.

Summing It Up

One survey respondent listed the following five critical success factors:

  1. A detailed roadmap or implementation plan
  2. An executive steering committee, chaired by an executive sponsor
  3. Committed and budgeted resources, both capital and human
  4. External and unbiased subject matter experts/consultants
  5. Detailed training/communications/change management strategies

Although reflective of an individual organization’s approach, this is a wonderfully concise explanation of how to succeed in designing and implementing a taxonomy.

James "Jim" Connelly, CRM, can be contacted at

From May - June 2011