Long-form investigation
While the short questionnaire gives us a high-level overview of challenges and potential solutions, it lacks detail. To complement this information, we asked project participants to describe how they addressed six key questions. We received detailed responses from four organizations: the UK Office for National Statistics (ONS), the Australian Bureau of Statistics (ABS), Statistics Flanders, and the U.S. Bureau of Labor Statistics (BLS), along with related comments from many others. The questions and a high-level overview of the responses are below.
Where should machine learning fit in a statistical organization?
Participants indicated four broad approaches:
- Machine learning as a branch of methodology - In Statistics Flanders, machine learning is an experimental branch of methodology. Machine learning techniques are clearly related to traditional statistical techniques, so methodology is a reasonable starting point, especially for organizations still determining whether they want to use ML. Several other NSOs reported similar models, at least early in their investigations. This is, of course, not a complete solution for production deployment, but not all organizations are at that stage yet.
- Machine learning as a multidisciplinary collaboration - The Australian Bureau of Statistics’ approach emphasizes multidisciplinary collaboration. In this model, different parts of the organization play lead roles on different aspects of the project. Methodology or research often develops initial prototypes, which are then handed off to, or co-owned by, information technology and subject matter experts. An advantage is that many different parts of the organization are involved. A frequent challenge is coordination. For example, the tools preferred by researchers and methodologists, such as R and Python, are often quite different from those preferred by software engineers. Another challenge can be aligning with the needs and interests of subject matter experts, who are often the most direct users of the technology and often must also assume key roles in creating training and evaluation data.
- Machine learning as a decentralized process - Although the Bureau of Labor Statistics generally follows the multidisciplinary approach, for machine learning it has instead adopted a largely decentralized approach. The program offices assume primary ownership of machine learning systems and consult with methodologists to verify the integrity of the system, with IT to integrate the system with existing infrastructure, and with field staff to facilitate data collection and processing activities as needed. This reduces the difficulty of aligning different divisions, but at the cost of the program office assuming a more active role in methodology, systems development, and maintenance.
- Centers of excellence - For the Office for National Statistics, a key aspect of machine learning strategy is the Data Science Campus, a separate division made up of experts in data science and machine learning that provides advice on machine learning projects not just to ONS, but to many parts of the UK government and even to other countries. This allows often limited machine learning expertise to be shared across many areas. A number of NSOs have recently developed their own versions of this approach, including INEGI (Mexico), Statistics Canada, Statistics Finland, and Statistics Sweden. One variation of this approach is the “hub and spoke” model, in which limited machine learning expertise is initially concentrated in the hub (the center of excellence) with the goal of ultimately transferring much of it to the spokes (the specific business areas).
What should the machine learning pipeline look like with regard to organizational structure? Where should projects start, and who should control which aspects, and when?
Interestingly, the responses to this question resulted in two seemingly opposite ideas. One emphasized the importance of starting with a business need, moving to R&D, producing a prototype and then bringing in other areas like IT. The other emphasized the importance of building ML experience first, which in turn allows one to identify suitable business problems which might be solved by machine learning.
In retrospect, it is clear that both are needed. An organization cannot determine whether machine learning is suitable if it knows nothing about machine learning, but it is also clear that the ultimate goal is to serve business needs.
What machine learning skills are needed and where are they needed in the organization?
On this question, there was general agreement among the responses. In organizations that distribute machine learning responsibilities across many divisions, machine learning requires new skills in many areas. Specifically:
- Everyone must understand the basics, such as the key ideas and common terminology. This allows effective communication between parties.
- Research and methodology often must become familiar with new algorithms and new tools, like R and Python, which are popular for machine learning.
- Information technology must learn how to integrate these tools and processes into existing systems. In some cases they must also support new hardware needs, such as powerful graphics processing units (GPUs) for training deep neural networks.
- Subject matter experts and clerical workers must understand their role in supporting, using, and maintaining these systems, as they often play a lead role in creating the training and evaluation data.
- Senior management must understand the needs of ML teams, including the need for careful alignment and coordination across these activities.
Because coordinating broadly distributed activities is difficult, another increasingly popular approach is to rely on positions and operational units that blur the distinctions between research, methodology, information technology, and subject matter. See, for example, “Google’s Hybrid Approach to Research” and “Data Scientist: The Sexiest Job of the 21st Century”. In some organizations, a data scientist spends some of their time researching and evaluating different machine learning solutions to a problem (R&D, methodology), some of it building and running the model in production (IT), and some of it assisting with use and maintenance (subject matter). This blurring of boundaries reduces the extent to which machine learning skills need to be distributed across the organization, but it requires individuals and teams with a broad range of skills, as well as the organizational and IT infrastructure necessary to make it work.
How can organizations efficiently acquire the ML skills they need?
Responses identified four strategies:
- Acquire and train internally - In this strategy, an outside expert is hired permanently or temporarily and used to train additional experts internally. Statistics Flanders, ONS, and ABS all report using some variant of this approach.
- External training - In the case of machine learning, many high-quality training courses are available (often for free), and many NSOs report using these extensively. Suitable training is also increasingly available through academia.
- Communities of practice - A community of practice is a group of individuals with a shared interest and a willingness to share what they know. The HLG-MOS ML project is partly a community of practice, but many NSOs also have internal communities. The BLS, for example, has a popular data science user’s group that frequently features machine learning work.
- Research projects - At some point learning requires doing. Research projects play an important role in supporting skill acquisition.
How should organizations demonstrate and communicate the value-added of ML techniques?
One of the recurring challenges of working on projects involving many parties is the need to convince others to adopt or support new techniques. This is supported both by numerous anecdotes from participants in the ML group and by questionnaire responses indicating coordination issues with, and resistance from, internal stakeholders. Responses identified three potential strategies:
- Clearly demonstrate value added - When replacing or augmenting an existing process, it is often easy to demonstrate speed and cost improvements with machine learning, but quality is also an important consideration and is frequently much harder to evaluate. In many cases the most readily available evaluation data for a machine learning project is just a subset of the data currently produced by the existing process. In this case, standard quality metrics (accuracy, mean squared error, etc.) only measure how closely the machine learning approach matches the existing process; they do not answer the more relevant question of whether one is better or worse than the other. One solution is to construct the evaluation data in such a way that it is independent of all processes being evaluated. This can be accomplished, for example, by asking a trusted panel of experts to reprocess the evaluation data without knowledge of how either the machine learning or existing processes would handle it. The resulting “gold standard” can then be used to evaluate and directly compare both the existing process and the machine learning process. In the case of the BLS injury and illness coder, this comparison played a critical role in justifying the use of the machine learning option.
- Use ML as decision support, at least initially - Replacing an existing process with something new is also potentially risky. There is always the potential for some unanticipated issue to occur, and this is especially concerning to stakeholders who might have little familiarity with machine learning. One solution is to instead use machine learning as an assistive tool, at least initially. If we are automating an occupation classification task that was previously done manually, for example, we might start by using machine learning only to provide suggestions to a human coder. This allows stakeholders to get hands-on experience working with the machine learning model in a low-risk setting.
- Use ML for things that aren’t otherwise possible - Another way to introduce machine learning is to use it for new projects where no other option is feasible. Analysis of satellite imagery is a good example: it simply is not possible to do this at scale and with the required frequency without prohibitive amounts of labor. Here, machine learning can make an otherwise impossible task possible.
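The gold standard comparison described under “Clearly demonstrate value added” can be illustrated with a minimal Python sketch. It assumes a simple coding task scored by accuracy; the function name `accuracy`, the variable names, and the six example codes are hypothetical and invented purely for illustration. They are not data from BLS or any other participant.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Share of cases where the predicted code equals the reference code."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one case is required")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical codes for six cases (made-up values, for illustration only).
gold = ["A", "B", "B", "C", "A", "C"]      # expert panel, blind to both processes
existing = ["A", "B", "C", "C", "B", "C"]  # codes from the existing process
ml = ["A", "B", "C", "C", "A", "C"]        # codes from the machine learning model

# Scoring ML against the existing process only measures agreement with it.
ml_vs_existing = accuracy(ml, existing)

# Scoring both against the independent gold standard allows a direct comparison.
existing_vs_gold = accuracy(existing, gold)
ml_vs_gold = accuracy(ml, gold)

print(f"ML agreement with existing process: {ml_vs_existing:.2f}")
print(f"Existing process vs gold standard:  {existing_vs_gold:.2f}")
print(f"ML vs gold standard:                {ml_vs_gold:.2f}")
```

In this invented example the model agrees with the existing process on five of six cases, which by itself says nothing about which is better. Only the comparison against the independently constructed gold standard shows the two processes scoring differently.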
How should statistical organizations identify the right problems for machine learning?
Our investigation uncovered three strategies:
- Learn from others. Learning from the successes and failures of others working on machine learning is a relatively cheap and easy way to identify promising areas and avoid less promising ones. By organizing and promoting the sharing of this information, the HLG-MOS ML project greatly facilitates this.
- Look for tasks that meet machine-learning-friendly criteria. Machine learning tends to be well suited to tasks that have certain characteristics. These often include the following:
  - Stable over time, i.e., the task is largely the same from year to year. This is important because machine learning learns from previously processed data, and when things change, adjustments are often required. Processes that are more stable over time will thus require less frequent adjustments to continue operating correctly.
  - Lots of training data showing all relevant inputs to a task and the desired outcomes. Ultimately, machine learning requires data to learn. The more data that is available and the better its quality, the more effective machine learning tends to be.
- Start with lightweight research projects. Pilot studies provide a relatively low cost and low risk way to explore and test initial ideas.