Artificial intelligence integration roadmap for numerical weather and environmental predictions

Executive summary

This initiative is the first element of ECCC’s strategic response to the recent swift evolution of Artificial Intelligence (AI) and Machine Learning (ML) in the domain of weather and environmental predictions. It presents an overview of how AI can be integrated into ECCC’s Research-Development-Operation (R-D-O) production chain for weather and environmental predictions. This initiative provides a path by which ECCC can embrace AI in a prompt and agile manner, considering its potential, while taking into account its limitations, current scientific assets, and the rapid pace of AI technological innovation. The integration of AI technology within the production chain is expected to go beyond the numerical prediction models, and include observation/data assimilation processes, post-processing, and expert products.

In anticipation of a growing reliance on AI, it will be essential that ECCC has sufficient capacity in specialized AI accelerators. We therefore recognize the necessity for planning the transition to a High-Performance Computing (HPC) solution that integrates AI accelerators into traditional computing infrastructure. This proactive approach is essential to address the increasing shift towards AI and to ensure that the computing infrastructure is equipped to support evolving technologies and wider adoption.

Recognizing the critical role of people in this transition, we identify the need for early staff training to facilitate smooth AI integration. Additionally, recruiting AI experts will be essential for building an AI organization based on leading edge AI expertise, technology, and best practices. Effective collaborations and partnerships also form a critical part of a prompt response. In effect, strengthening ties with sister organizations, academia, and the private sector will be imperative for realizing our AI objectives, fostering innovation, and sharing best practices.

In this overview, we have identified areas for AI application within the ECCC’s R-D-O production chain, along with key considerations for a successful outcome. Our next steps involve developing a detailed implementation plan to drive action. This first review not only charts a path for technological advancement but also signifies our commitment to remaining at the forefront of scientific innovation in our field.

Context

Artificial Intelligence (AI) has established itself as a transformative force across numerous parts of our society, including in the weather and environmental prediction domains. In light of these rapid developments, the Canadian Centre for Meteorological and Environmental Predictions (CCMEP) and the Atmospheric Science and Technology Directorate (ASTD), both from Environment and Climate Change Canada (ECCC), are working together to integrate these emerging technologies in order to maintain and extend our leadership in the weather and environmental fields, which include forecasts from minutes to seasons.

In response to the swift evolution of AI and Machine Learning (ML) [Footnote 1], a review of our current status and potential pathways forward is presented. This reflection was facilitated through two one-day workshops and an internal forum conducted in the Fall of 2023.

A focused dialogue with leading national meteorological services was held to deliberate on the integration and implications of AI in their Research, Development and Operations (R-D-O) activities. This workshop highlighted: (a) the speed of progress; (b) the re-examination of existing AI strategies in major NWP centres; and (c) opportunities for product and service improvements. It also underscored the necessity for our organization to remain agile as we integrate AI technologies.

The second dedicated workshop, engaging private technology enterprises and academia, brought to light the impressive strides made in AI technologies, such as graph neural networks and transformer-based data-driven weather models. These technologies have demonstrated capabilities that challenge, and in certain aspects surpass, some of the best Numerical Weather Prediction (NWP) models. While the private sector offers a wealth of AI expertise and superior access to requisite hardware, their need to access weather centre data and collaborate with domain experts was recognized.

Finally, an internal workshop was held to present an overview of ongoing activities related to AI as well as to reflect on and discuss external insights gathered so far. The discussions underscored:

Building on these discussions, an interdisciplinary expert team was assembled, drawing talent from the relevant R-D-O divisions and encompassing all scientific domains, including atmospheric, oceanic, and land surface disciplines, as well as Information Technologies (IT). This team was tasked with elaborating this AI roadmap to foster the integration of AI across the entire prediction production chain, from the minute to the seasonal timescale.

Roadmap objectives

AI integration perspective

ECCC’s first effort toward an AI roadmap aims to offer an exhaustive overview of how AI can be integrated into the R-D-O workflow and production chain for weather and environmental predictions. This includes the adaptation of AI methodologies to existing systems, the development of new AI-driven tools for enhanced predictive accuracy, and the seamless fusion of AI technologies with traditional meteorological and environmental prediction approaches. It aims to identify potential activities, challenges, and opportunities where AI can bring about transformative change, ensuring that every step from R&D to operational deployment is rigorously executed to benefit from AI innovations.

Listing and prioritizing potential activities

The review aims to present key areas where AI can be leveraged, such as in data assimilation, numerical prediction modelling, post-processing, as well as ensemble forecast systems. It also intends to propose prioritization considerations to guide future projects based on feasibility, impacts on services and computational resource savings.

Timeline with milestones

A timeline is proposed, with critical milestones important for AI integration. This planned timeline is intended to guide ECCC through both short-term and long-term goals. The fast pace of change in AI technology is recognized.

Resource requirements

A high-level analysis of the resources required to achieve AI integration goals is also included, with particular attention to the need for specialized AI hardware such as Graphics Processing Units (GPUs), software, and data infrastructure. Additionally, it covers the human expertise needed to drive AI initiatives and the additional staff required to support the new AI-related R&D activities.

Collaborations and partnerships

The initial reflections on implementation emphasize the importance of strategic partnerships with academic institutions, AI-related companies, and other meteorological centres. The intention behind these partnerships is to share knowledge, resources, and best practices, thereby fostering an environment of innovation and shared progress.

Communication

A plan is presented to maintain internal and external engagement by establishing a communication strategy to manage change effectively. It also intends to keep the organization’s personnel informed about the AI integration process, including sharing successes and challenges.

Integrating AI into our organizational vision

ECCC is committed to exploring and gradually employing AI, ensuring its application adds value to our prediction processes. The vision is guided by a problem-solving mindset, consistently targeting areas that can significantly enhance services through the application of AI capabilities.

Our organization is engaged in promoting open access to data, software, and algorithms, which is essential for fostering collaboration, ensuring transparency, and facilitating the replication of studies, all of which are cornerstones of public trust and scientific advancement.

In the integration of AI technologies, ECCC places paramount importance on ethical considerations and transparency. Our roadmap’s primary aim is the enhancement of services to our society while mitigating risks associated with AI. It is expected that AI integration will bring benefits in the form of computational and energy savings. In every step, the organization shall adhere to the Government recommendations, ensuring that the adoption of these technologies aligns with the government values for public services and accountability.

Strategic considerations and key priorities

By incorporating AI, ECCC aims to enhance our capacity to address key priorities and improve the methodologies and tools used in our weather and environmental forecasting production chain. Among the priorities in the integration of AI are: 1) the enhancement of the data assimilation and modelling framework, including through hybrid approaches that combine numerical methods and machine learning; 2) the acceleration of the transition towards a fully ensemble-based forecasting approach at all scales; and 3) the improvement of very short-term forecasts through nowcasting and rapid-cycling numerical prediction systems. With those goals in mind, the following will be considered:

AI prioritization strategy

To integrate AI within the R-D-O workflow, the proposed prioritization strategy balances three central factors:

  1. Degree of feasibility
  2. Added value from a service perspective
  3. Added value from a cost and energy perspective

The feasibility and expertise

The sophistication of the algorithms and computing environments needed will be assessed to ensure that they align with our current capabilities and future potential. Fostering expertise will begin with experimentation using currently available AI technologies, which will further our know-how in a way tailored to our needs. The availability and quality of datasets for training and validating our AI models will be taken into consideration to guarantee robust and reliable inference and evaluation systems. AI proficiency will be evaluated within the organization to prioritize training, mentoring and, when possible, recruiting to fill the gaps in expertise.

Potential for added value from a service perspective

The primary driver of our AI integration endeavors is the improvement of forecasts and the enhancement of our weather and environmental related products and services, aiming to improve timeliness, and to deliver greater accuracy and reliability to our users and stakeholders.

Potential benefits from a cost and energy perspective

An essential component of the strategy is the consideration of economic and environmental impacts. ECCC will strive to limit the computational costs and energy consumption of our forecasting systems to make its operations more sustainable and cost-effective, while continuing to deliver high-quality information for Government-wide mission critical activities.

In addition, to speed up experimentation and familiarization with AI, ECCC will prioritize technologies that are readily available for our domain of applications, so that AI integration into our workflow can start quickly. Some of these tools have already been developed by external organizations, and our aim is to leverage and learn from these systems. For example, as of Fall 2023, our meteorologists and scientists have started installing, validating, and evaluating open-source data-driven weather models, such as GraphCast (from Google DeepMind) and FourCastNet (from NVIDIA). This experimentation aligns with the practices of other major Numerical Weather Prediction (NWP) centres.

In recognition of the dynamic nature of AI and the evolving landscape of weather and environmental predictions, it is acknowledged that several parameters in this proposed prioritization strategy remain unknown. This necessitates a flexible approach, where the activity prioritization shall be revisited and adjusted periodically as the organization delves deeper into various AI activities. The proposed approach is thus fluid, adaptable, and responsive to new insights and technological advancements, ensuring that our operations remain at the forefront of meteorological and environmental services.
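
As a rough illustration of how the three prioritization factors described above might be combined, the sketch below ranks candidate activities by a weighted score. The activity names, the 0-to-1 scores, and the equal weights are all hypothetical placeholders and not values from this roadmap; a weighted sum is only one possible way of balancing the factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """A candidate AI activity scored from 0 (low) to 1 (high) on each factor."""
    name: str
    feasibility: float        # degree of feasibility
    service_value: float      # added value from a service perspective
    cost_energy_value: float  # added value from a cost and energy perspective


def priority_score(activity, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three factors; the weights are an assumption."""
    w_feasibility, w_service, w_cost = weights
    return (w_feasibility * activity.feasibility
            + w_service * activity.service_value
            + w_cost * activity.cost_energy_value)


def rank_activities(activities, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return the activities sorted from highest to lowest priority score."""
    return sorted(activities, key=lambda a: priority_score(a, weights), reverse=True)


# Hypothetical candidates with made-up scores, for illustration only.
candidates = [
    Activity("site-based bias correction", 0.9, 0.6, 0.3),
    Activity("AI-based nowcasting", 0.6, 0.9, 0.5),
    Activity("in-house global AI model", 0.2, 0.8, 0.9),
]

for activity in rank_activities(candidates):
    print(f"{activity.name}: {priority_score(activity):.2f}")
```

Because several parameters of the strategy remain unknown, both the scores and the weights in such a scheme would need to be revisited periodically, in line with the flexible approach described above.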

Integration of AI within our production chain

This section goes through the Numerical Weather and Environmental Prediction (NWEP) production chain and gives an overview of areas where ECCC believes improvements can be gained by the integration of AI. Additionally, recommendations are provided about how R-D-O technology transfer should be performed in the context of AI integration.

Observations and data assimilation (DA)

Earth observations are central to the accuracy of NWEP and its improvement. They allow us to determine the initial conditions of forecast models and provide, afterwards, reference variables to estimate the errors of the forecasts. Integrating AI in the NWEP production chain will not alleviate our dependence on the amount and the quality of Earth observations. AI could however play an important role in improving observation quality control and error estimation, as well as helping to estimate some key parameters that are not directly observed by instruments.

DA, the process of estimating the state of the atmosphere and/or the environment at a given time by combining Earth observations with recent numerical forecasts, shares many similarities with today’s advanced AI approaches.

Therefore, the optimal path appears to be to embed AI approaches in current and forthcoming DA algorithms. In terms of observations, in addition to the benefits previously listed, AI could help to speed up, enhance, or enable new observation operators (the component that links the observed parameters to the model parameters and which is essential for the assimilation of every observation). In terms of forecast uncertainty estimation obtained from ensemble forecasts (another key component in DA), new and computationally cheaper modelling approaches could allow us to create large ensembles and thus to reduce the noise in the forecast error estimations. If it becomes feasible to increase the ensemble size by several orders of magnitude, it could facilitate the adoption of non-traditional DA methods, such as the particle filter, which are better suited for non-linear and non-Gaussian models.
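
To illustrate why larger ensembles reduce the noise in forecast error estimates, the toy experiment below repeatedly estimates a known error variance from ensembles of increasing size. It assumes Gaussian errors with a true standard deviation of 1 and uses hypothetical ensemble sizes; it is a sketch of the sampling effect only, not of any operational DA code.

```python
import random
import statistics


def variance_estimate_spread(ensemble_size, n_trials=100, true_std=1.0, seed=0):
    """Scatter (standard deviation) of the sample-variance estimate across trials.

    Each trial draws `ensemble_size` members from a Gaussian with known spread
    and estimates the error variance from that ensemble alone.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_trials):
        members = [rng.gauss(0.0, true_std) for _ in range(ensemble_size)]
        estimates.append(statistics.variance(members))
    return statistics.stdev(estimates)


# Hypothetical ensemble sizes: a small ensemble versus much larger ones.
for size in (10, 100, 1000):
    spread = variance_estimate_spread(size)
    print(f"ensemble size {size:4d}: noise in variance estimate ~ {spread:.3f}")
```

In this idealized setting the noise shrinks roughly in proportion to one over the square root of the ensemble size, so each hundredfold increase in members buys about a tenfold reduction in sampling noise.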

On the other hand, current and future DA algorithms will struggle with significantly larger ensemble sizes. It is therefore essential to invest in code optimization. The rise of AI is facilitated by new computing architectures, such as GPUs, that speed up operations on matrices. DA also relies heavily on matrix operations and should benefit from executing some of its computations on GPUs instead of central processing units (CPUs). With DA algorithms adapted for hybrid CPU-GPU platforms, ECCC will be better equipped to handle the increasing computational demands from advanced numerical models and the growing volume of observational data.

Finally, DA approaches have mostly been used so far to create only the initial conditions for numerical models, but these approaches can be extended to also update the numerical model equations from observation information content (so-called parameter estimation). Therefore, DA could enable data-driven model or parameterization adjustments from observations in real-time, optimizing the recommended hybrid AI- and physics-based modelling approach and reducing the need to perform massive offline training as well as relaxing the requirement for reanalysis datasets.

Numerical predictions

Numerical prediction systems lie at the very core of our production chain. By evolving the current state of the Earth System into the future, they provide the data necessary to issue public services and advisories, as well as to support other Government-wide critical activities. This area is also home to the recent, high-profile advancements in artificial intelligence-based forecast models, which show that currently available AI models, with some caveats, meet or exceed the quality of physics-based models at similar resolutions, while using a fraction of the time and energy. ECCC’s mandate to provide the best service to Canadians demands that it explore these AI forecasting systems and integrate them into its production chain as their usefulness is demonstrated.

At present, and until an AI system can provide the benefits, accuracy, and dependability of current physics-based models, physics-based models will remain at the core of our forecast systems. During this transition period, research will be necessary to alleviate the various limitations of AI models without compromising the cost or quality of forecasts. It is thus advised to exercise caution in integrating AI systems into operational forecasts until they attain levels of accuracy and reliability comparable to those of their physics-based counterparts. AI-based systems rest on different foundations than traditional numerical forecasting systems, so we will need to adapt our approaches to analysis and performance evaluation, including the verification system applied to numerical predictions. As a first step, initiatives that adopt existing open-source AI systems as they are published and become publicly available will be prioritized, with the goals of:

Next, ECCC intends to adapt published AI systems to accommodate our operational needs. This will happen through “fine-tuning,” where a published and trained model is re-trained with our own data without repeating the “from scratch” training process. Fine-tuning will allow us to correct for systematic differences between our forecasting systems and those used to originally train the published models, and we intend to use fine-tuned models to enhance our ensemble prediction and data assimilation systems. In parallel, research on hybrid approaches that seamlessly integrate numerical and scientific machine learning methods will be undertaken. The aim will be to leverage all available information, from data and from scientific knowledge, to ensure optimal accuracy and performance on cutting-edge computing architectures.
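
The idea of fine-tuning can be sketched on a deliberately tiny example. Below, a one-variable linear model stands in for a published, pre-trained model, and a few gradient-descent steps on local data adjust it for a systematic difference. The "published" parameters, the local data, the learning rate and the number of steps are all made up; real fine-tuning of a data-driven weather model would rely on a deep learning framework and far larger datasets.

```python
def mean_squared_error(weight, bias, inputs, targets):
    """Mean squared error of the linear model y = weight * x + bias."""
    return sum((weight * x + bias - y) ** 2
               for x, y in zip(inputs, targets)) / len(inputs)


def fine_tune(weight, bias, inputs, targets, learning_rate=0.05, n_steps=200):
    """Continue gradient-descent training from existing (pre-trained) parameters."""
    n = len(inputs)
    for _ in range(n_steps):
        errors = [weight * x + bias - y for x, y in zip(inputs, targets)]
        grad_weight = 2.0 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_bias = 2.0 * sum(errors) / n
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


# Hypothetical "published" parameters, standing in for a model trained elsewhere.
pretrained_weight, pretrained_bias = 1.0, 0.0

# Made-up local data with a systematic difference (slope 1.1, offset 0.5).
local_inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
local_targets = [1.1 * x + 0.5 for x in local_inputs]

tuned_weight, tuned_bias = fine_tune(
    pretrained_weight, pretrained_bias, local_inputs, local_targets)

error_before = mean_squared_error(
    pretrained_weight, pretrained_bias, local_inputs, local_targets)
error_after = mean_squared_error(
    tuned_weight, tuned_bias, local_inputs, local_targets)
print(f"error before fine-tuning: {error_before:.4f}")
print(f"error after fine-tuning:  {error_after:.6f}")
```

The point of the sketch is that training starts from the existing parameters rather than from random values, so only the systematic difference has to be learned.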

Additional research opportunities related to physical parameterizations could be explored as we advance. For example, machine learning could optimize current schemes, emulate individual parameterizations, or improve forecast accuracy and efficiency while reducing computational expenses.

This may initially require only modest human and computational resources, but it is anticipated that the demand for these resources will grow over time, especially as AI becomes more deeply integrated into research and development.

To conclude this section, it is worth mentioning that the application of AI in weather and environmental prediction has been the subject of numerous scientific publications and media attention in recent months. Consequently, this aspect of the Earth System is receiving a considerable amount of effort in research and development at this moment. The integration of AI into other environmental prediction systems is nevertheless important, and such work will continue to progress as research in these fields advances.

Post-processing and expert products possibilities

Post-processing generally consists of the steps taken after the production of numerical prediction outputs to correct systematic bias or to generate extra diagnostic variables not included in the prediction step. Traditionally, post-processing techniques were based on simple multiple linear regression and decision trees of limited complexity. With the advent of AI, promising newer techniques are becoming available to greatly increase the resolution of forecasts (downscaling), to correct forecast errors (statistical calibration) based on weather regime, to generate or boost the number of ensemble members to represent forecast uncertainty, or to generate complex diagnostics for high-impact events.

Site-based systematic error adjustments using AI techniques are currently being considered to reduce biases and the variance of raw numerical weather and environmental forecasts. Similar gridded applications could further benefit products and services if additional investments are made in human resources and training.
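
As a point of reference for site-based error adjustment, the sketch below fits a simple linear correction for a single site from past forecast and observation pairs. It deliberately uses the traditional linear-regression approach mentioned above rather than an AI technique, and the temperature values are made up for a hypothetical site with a warm bias.

```python
def fit_linear_correction(past_forecasts, past_observations):
    """Least-squares fit of observation = slope * forecast + intercept for one site."""
    n = len(past_forecasts)
    mean_forecast = sum(past_forecasts) / n
    mean_observation = sum(past_observations) / n
    covariance = sum((f - mean_forecast) * (o - mean_observation)
                     for f, o in zip(past_forecasts, past_observations))
    variance = sum((f - mean_forecast) ** 2 for f in past_forecasts)
    slope = covariance / variance
    intercept = mean_observation - slope * mean_forecast
    return slope, intercept


def correct_forecast(raw_forecast, slope, intercept):
    """Apply the fitted site correction to a raw model forecast."""
    return slope * raw_forecast + intercept


# Made-up temperature pairs (deg C) for a hypothetical site with a warm bias.
past_forecasts = [2.0, 5.0, 8.0, 11.0, 14.0]
past_observations = [1.0, 3.5, 6.5, 9.0, 12.0]

slope, intercept = fit_linear_correction(past_forecasts, past_observations)
corrected = correct_forecast(10.0, slope, intercept)
print(f"raw forecast 10.0 deg C -> corrected {corrected:.2f} deg C")
```

An AI-based adjustment would replace the linear fit with a more flexible model and more predictors, but the workflow (train on past pairs, then apply to new forecasts) would stay broadly the same.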

Downscaling using AI tools has been performed with success in different organizations. It can be used, for example, to increase the resolution of ensemble members, thereby improving the accuracy of probabilistic forecasts. It could eventually be used to increase the resolution of all systems internally up to the resolution of the Canadian High Resolution Deterministic Prediction System (HRDPS; with a grid spacing of 2.5 km) or even higher. Work has started internally and, with sufficient resources, downscaling could offer important gains by reducing the compute resources needed to achieve fine resolution for our weather services.

Nowcasting can be considered as a special type of post-processing activity that also considers observations as inputs. Classical nowcasting techniques based on extrapolation of radar data generally perform better than numerical weather predictions for the first hours. The pace of development of AI-based nowcasting has been staggering, passing from blurry and unrealistic predictions to highly realistic and skillful high-resolution ensemble predictions of heavy precipitation for up to three hours. These advancements allow for the prediction of surface temperature, humidity, wind and precipitation at a 24-hour lead time with better scores (for both deterministic and ensemble predictions) than numerical predictions in the first 12 to 18 hours. Furthermore, this makes possible the production of updated forecasts every minute at high resolution. AI-based nowcasting is thus disrupting rapid-update forecasts and there are obvious advantages in integrating and evaluating these innovations into our production chain.
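
For context, the classical extrapolation technique that AI-based nowcasting is being compared against can be sketched in one dimension: estimate how far the precipitation pattern moved between two consecutive scans, then advect the latest scan forward. The rain rates, the one-dimensional grid and the search range are made up; operational systems work on two-dimensional radar fields with more elaborate motion estimation.

```python
def shift_field(field, displacement):
    """Shift a 1-D field by `displacement` cells, filling vacated cells with 0."""
    n = len(field)
    shifted = [0.0] * n
    for i, value in enumerate(field):
        j = i + displacement
        if 0 <= j < n:
            shifted[j] = value
    return shifted


def estimate_displacement(previous, latest, max_shift=5):
    """Find the shift of `previous` that best matches `latest` (least squared error)."""
    def mismatch(displacement):
        moved = shift_field(previous, displacement)
        return sum((a - b) ** 2 for a, b in zip(moved, latest))
    return min(range(-max_shift, max_shift + 1), key=mismatch)


def extrapolate(previous, latest, n_steps, max_shift=5):
    """Nowcast by advecting the latest scan with the estimated per-scan motion."""
    displacement = estimate_displacement(previous, latest, max_shift)
    return shift_field(latest, displacement * n_steps)


# Made-up rain rates (mm/h) along a line of cells; the band moves 2 cells per scan.
previous_scan = [0, 0, 1, 4, 1, 0, 0, 0, 0, 0, 0, 0]
latest_scan = [0, 0, 0, 0, 1, 4, 1, 0, 0, 0, 0, 0]

nowcast = extrapolate(previous_scan, latest_scan, n_steps=2)
print(nowcast)
```

Pure extrapolation cannot represent growth or decay of precipitation, which is one of the limitations that data-driven nowcasting methods aim to address.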

The main variables produced by NWEP systems often do not include all the variables of interest from a service delivery perspective. Additional weather elements such as lightning, hail, tornadoes, blowing snow, fog and visibility need to be derived with physics-based or semi-empirical diagnostics. In fact, ensemble decision trees, a more rudimentary machine learning technique, are already implemented in ECCC’s operational post-processing systems to produce various high-impact weather diagnostics. A challenge for this specific application of machine learning is that these phenomena are often rare and poorly observed, limiting the amount of data available. This lack of available data could be partially alleviated by using AI techniques to improve the detection of high-impact weather and other environmental conditions from remote sensing.

The last step is to translate weather into impact on society, where vigilance maps or customized weather and environmental impact products could also benefit from AI.

Datasets

AI-adapted data (DataCubes) should be made available for training, model development, testing and validation purposes. Currently, data for these applications are provided by other centres, such as ECMWF’s ERA reanalysis dataset. ECCC will need to assess its capacity to invest in data generation for its specific AI-related research and development activities, and the corresponding value of doing so.

At the same time, DataCubes from the ECCC’s NWEP operational systems may be required for a variety of AI activities. These Canadian datasets must therefore be easily accessible and quality-controlled for these applications.

It should be noted that the management of these AI-related datasets will require an appropriate data infrastructure, which is discussed later in this document.

Factors to consider for technology transfer

In the rapidly evolving landscape of forecasting science, where AI will become increasingly integrated into operations, the process of technology transfer demands careful consideration. Ensuring that forecast quality remains the top priority is essential, and this requires an operational approval process that is both robust and adaptable. The traditional process of operationalizing elements in the production chain may need fine-tuning to accommodate the unique challenges and capabilities of AI technologies.

Whilst it would be premature at this stage to decide on which AI framework should be used, it is recommended for the researchers and developers of internal AI projects to be transparent and open about their projects, so that methods and software can be reused and improved upon across projects. This will enable the definition of implementation standards that can be used for all AI systems to be seamlessly transferred to operations. The IT environment within which these AI systems operate shall be tailored and standardized to support their specific requirements. Given that these methods are heavily reliant on statistical approaches, they require extensive data processing and significant periods for training and model development. Thus, implementing a modern approach to managing their lifecycle, known as Machine Learning Operations (MLOps), becomes crucial. This approach should streamline the deployment and maintenance of AI systems, ensuring they are both reliable and efficient.

Moreover, operations teams should be trained to respond to issues swiftly and efficiently with AI components. This necessitates a well-structured technology transfer approach, including a contingency plan, to enable these systems to work smoothly in an operational environment. Additionally, seamless and ready access to support from R&D teams is vital. This close collaboration between R&D and operational groups is key to the successful integration and ongoing maintenance of AI-assisted forecasting systems.

To close this section on the integration of AI within our R-D-O workflow and production chain, we provide in the Appendix the result of a categorization and prioritization exercise of proposed thematic AI activities across the production chain and domains of application.

Infrastructure

Computing infrastructure

The current HPC infrastructure responds to the requirements of physics-driven modelling: massively parallel CPU calculations and a high-performance interconnect. There will always be a need for physics-driven workloads (non-AI code, producing training data, analysis/reanalysis, verification, …), but their footprint may eventually be smaller than our current requirements.

AI approaches require a different infrastructure, namely one based on AI accelerators. In this new paradigm, heavy processing could be displaced from the forecast (OPS) to the training (DEV), so the operational requirements would diminish greatly. The training could be done on a different, non-high-availability infrastructure, lowering the platform support cost significantly. Furthermore, because training processes are by nature infrequent and scattered over time, cloud solutions could be a good alternative, although issues relating to data storage, hosting, and availability will have to be solved or accounted for.

In 2022, ECCC acquired several GPU nodes for our supercomputers, which should be enough to provide for initial work that consists of:

However, current resources are unlikely to be sufficient for intensive machine learning development. For example, Google DeepMind used 32 accelerator-months in its final training cycle of GraphCast, and it undoubtedly spent considerable computing resources exploring model architecture choices. This level of commitment would be impossible with our current GPU resources, limiting our ability to conduct large-scale, groundbreaking research in the short term.
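
To give a sense of scale for the 32 accelerator-months quoted above, the small calculation below converts that budget into an idealized wall-clock time for a given number of accelerators. The accelerator counts are hypothetical and do not describe ECCC's systems, and perfect scaling is assumed, which real training runs do not achieve.

```python
def wall_clock_months(accelerator_months, n_accelerators):
    """Idealized training time, assuming perfect scaling across accelerators."""
    if n_accelerators <= 0:
        raise ValueError("n_accelerators must be positive")
    return accelerator_months / n_accelerators


FINAL_TRAINING_BUDGET = 32  # accelerator-months, as quoted in the text

# Hypothetical accelerator counts, for illustration only.
for n_accelerators in (4, 8, 32):
    months = wall_clock_months(FINAL_TRAINING_BUDGET, n_accelerators)
    print(f"{n_accelerators:2d} accelerators -> about {months:.1f} months")
```

This covers a single final training cycle only; exploring architecture choices beforehand would multiply the requirement.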

In addition, as more projects take advantage of machine learning and GPU computing, it is expected that the utilization of our GPU resources will naturally increase. Currently, GPU resources are generally accessible for research and development work, but as the organization progresses towards HPC renewal (~2028) contention is likely to increase.

One opportunity to relieve this pressure might be to migrate early-stage development tasks to cloud computing infrastructure. During this stage, researchers and developers need rapid access to GPU resources to write, benchmark, and debug code, but they are not yet conducting large training cycles. Cloud computing might relieve contention for our in-house GPU resources, allowing us to use the supercomputers more effectively for operations and planned training cycles.

It is very difficult at this moment to evaluate the requirements for the next 3 to 5 years. A comprehensive analysis will be required in subsequent updates to this document, as projects ramp up. The requirements for the next HPC replacement are already being devised as this document is written, which implies that ECCC will have to be flexible to account for these quickly evolving needs.

Efforts are ongoing to migrate physics-based code to GPU architectures to benefit from their higher performance and energy efficiency. Despite this move toward GPUs, it is likely that future systems will be composed of heterogeneous nodes.

Software infrastructure

In addition to the ‘hard’ HPC infrastructure, the software ecosystem is very different for AI workflows than for traditional HPC. Academic and commercial machine learning projects are typically developed with an assumption of nearly unlimited software flexibility, where a project or its developers can install or update drivers, libraries, and packages as needed. Different machine learning projects might require incompatible software packages or versions.

In contrast, traditional HPC assumes a well-tested and stable environment, with software updates made rarely, applied to the entire system, and implemented by professional system administrators only rather than individual users or developers.

AI projects deal with this requirement for flexibility through containerized environments, which give users their own ‘virtual system’ that still has nearly direct access to CPU, GPU, and network resources. To remain current with the state of the art in AI, first-class support for containerized environments on our supercomputer systems will be required.

Work has already progressed toward the creation of these standard environments, with efforts from CCMEP development that include providing tools for creating and versioning Python environments, exploring containerized deployment of AI tools, and testing an internal cloud-like infrastructure being implemented by Shared Services Canada (SSC). Material is also being produced to provide clear documentation and training for our employees on these novel approaches.

Data infrastructure

AI training requires large volumes of data. ECCC has these data, but the current archive infrastructure does not provide the simple and efficient access required to feed AI algorithms. This archive is also only inward-facing and cannot currently provide access to partners to foster collaborations. In addition, the various data formats used in our production chain are either proprietary or unsuited for easy ingestion into an AI workflow.

A more agile, AI-adapted (DataCube) front end, with open access for external collaborations through standard interoperable formats and protocols, would be needed. It will be important to define the requirements for transitioning the current archive system into an externally accessible data platform. A possible model is the Meteorological Archival and Retrieval System (MARS) currently used by the European Centre for Medium-Range Weather Forecasts (ECMWF). The possibility of leveraging the MSC GeoMet platform, and/or the Information Pool projects, to provide a front end of standard data access application programming interfaces (APIs) should also be explored.

Our people

ECCC’s commitment to AI integration extends beyond technology and innovation to encompass the well-being and development of our employees. This includes investing in our people and bringing in new experts in the fields required to develop and integrate AI technology. The organization is dedicated to providing employees with the necessary training, mentoring, and resources to thrive in this rapidly evolving environment. Our commitment to transparency and ethics in decision-making remains a core value, and ECCC pledges to uphold the highest standards. By placing the employees at the centre of our AI integration efforts and fostering a culture of continuous learning and collaboration, the organization is confident in its ability to harness the power of AI to enhance services, advance scientific knowledge, better serve our society, and create a thriving workplace.

Human resources

The cornerstone of advancing our AI integration efforts lies in increasing our AI expertise. It should be noted that there is a competitive market, particularly within the private sector, for skilled AI personnel, with or without meteorology expertise. Consequently, recruitment and retention challenges are anticipated. However, overcoming these challenges is imperative for the organization’s progress. Creative solutions shall be explored to attract and hire AI experts who can contribute significantly to our mission. ECCC will continue to participate in university career events, leveraging the appeal and significance of the environmental cause among students. Additionally, expanding our visibility by organizing hackathons, such as ECCC’s 2019 METEOHACK, is part of the plan.

Resources reallocation

ECCC recognizes the importance of efficiently prioritizing and reallocating resources to support AI integration within our production chain. By optimizing the use of available resources, the organization can make substantial progress in AI integration. Significant decisions on these matters shall be made jointly by the parties involved during the implementation of this roadmap, to ensure that the prioritization and reallocation of resources enable a timely pace of innovation and continued delivery of our mandate, while balancing change management.

Training

Integration of AI into our production chain will require that AI expertise become an integral part of training for existing scientists and IT personnel. AI integration will therefore depend on the development of a comprehensive training program, with the medium-term goal that a significant fraction of workers acquire a basic level of competence in the development and analysis of AI systems. Advanced training opportunities could be offered to employees who are heavily involved in AI activities, ensuring they possess the specialized skills required for their roles. The organization also recognizes the importance of continuous feedback channels from employees to tailor the training program to their evolving needs.

Partnerships with academia and the private sector will need to be strengthened, as they should be seen as a training opportunity aligned with collaboration and R&D objectives.

Mentoring

In addition to our training program, ECCC understands the value of mentorship in fostering the growth and development of our team members in AI. The organization could consider creating a mentorship team of experienced AI practitioners who would provide guidance and support to employees as they navigate the complexities of AI integration. This mentorship would facilitate knowledge transfer, share best practices, and create a supportive environment for learning and professional development. By pairing experienced mentors with individuals seeking to expand their AI expertise, ECCC aims to accelerate skill acquisition and ensure that our AI integration efforts benefit from the collective wisdom and experience of our team. It should be noted that knowledge exchange goes in both directions between AI specialists and our weather and environment domain experts.

Collaborations and partnerships

Establishing robust collaborations and partnerships across various sectors is paramount to our success. These collaborations, spanning from intra-governmental bodies to international centres, academia, and private enterprises, are instrumental in pooling resources and sharing expertise and data. They offer a pathway to not only enhance predictive accuracy but also to innovate and push the boundaries of what is currently possible. Forging these partnerships will require careful navigation of logistical, strategic, and ethical considerations.

Intra-governmental collaboration

Intra-governmental collaborations involve various government partners working together, Shared Services Canada being one of them. These partnerships are crucial for streamlining our work and enhancing resource utilization, directly benefiting our efforts to improve our products and services. It will be important to address challenges in coordinating and aligning strategies across different departments, each with its own mandates and objectives.

National meteorological and hydrological services (NMHS)

Collaboration with other NMHSs would be vital, for example for sharing knowledge and experiences in this AI integration journey. Sharing AI training tasks and experimentation data would also be beneficial. These partnerships offer substantial benefits in terms of shared knowledge and resources, as well as in establishing effective data-sharing agreements and integrating diverse methodologies.

Academic collaboration

Partnerships with academic institutions offer access to cutting-edge research and opportunities for joint projects, driving innovation in AI application and bridging the gap between AI expertise and weather prediction. The main challenges here include aligning academic research goals with practical forecasting applications and coordinating joint efforts. A detailed and well-organized approach to academic collaboration would help to maximize efficiency with this important partner.

Private enterprise collaboration

Engaging with the private sector can introduce innovative technologies and new approaches to AI in forecasting. Some companies that make AI accelerators and are now involved in weather forecasting could help implement AI algorithms. This type of collaboration, however, often faces issues related to data privacy, alignment of organizational goals, and intellectual property rights.

Efforts to establish new partnerships, particularly with AI experts, are currently ongoing. Initiatives like meetings and collaborations between academic and private sector entities are crucial steps towards this.

Communication

ECCC’s goal as AI is integrated into our production chain is to foster an environment of open communication and collaboration, both internally and externally. The organization aims to build trust, encourage knowledge sharing, and promote active participation in our AI roadmap’s updates and implementation. Our strategy includes:

The plan involves diverse channels like internal communication tools to reach a broad audience. These channels will be used to announce events, share updates, and disseminate key takeaways. The effectiveness of these initiatives will be evaluated through feedback surveys, attendance tracking, and engagement analysis. Insights and learnings will be shared organization-wide, and the roadmap will be updated based on the feedback and new ideas generated from these interactions.

Milestones and timeline

Despite many unknowns and a rapidly changing landscape, the following timeline of high-level milestones is proposed to guide the organization through the upcoming transition. After the publication of this roadmap and the in-house evaluation of various publicly available data-driven models (milestone #1; M1), the next step will be to transform the vision proposed here into concrete R&D activities, with appropriate resources and governance. This exercise should be carried out together with a review of our historical activities and an estimate of the HPC infrastructure needed for the next computer upgrade (M2). Tailored entry-level AI training should be offered to interested staff members by mid-2024, and an AI practitioners’ group to support R&D activities should be assembled in parallel (M3). As mentioned earlier, the roadmap will require frequent updates. The first review is proposed after one year, in the fall of 2024 (M4). Given the rapid progress of AI technologies, milestones beyond 2025 are estimates based on current knowledge and will be adjusted in line with advances in the field.

To share and reflect on the progress made within and outside the organization, a new scientific forum will be set up for the second roadmap review in 2025 (M5). This will also be an opportunity to identify the near-maturity applications that could be part of the next major technology transfer exercise (innovation cycle 5; IC-5), anticipated for 2026. With the expected gradual rise in AI capacity and expertise in the organization, the number of applications with an AI component is expected to grow rapidly in the upcoming innovation cycles (M6 and M7). By 2030, AI should simply be a common tool used throughout our production chain (M8).

Appendix: Mapping AI activities across the production chain and scientific domains

This table shows, as of January 2024, the main thematic AI activities at each stage of ECCC’s R-D-O production chain, as categorized by each scientific group. Each activity has been assessed and categorized according to its level of criticality, its current status (already initiated or about to be), and the potential need for additional human resources. This analysis aims to provide an overview of ECCC’s commitment in the field of AI, highlighting key opportunities, identifying human resource needs, and outlining strategic next steps to maximize efficiency. It is important to note that this assessment is not final, and that the organization recognizes that, despite our current assets, some activities may require additional resources for optimal implementation. Finally, each application domain will provide more detailed information on its AI activities in the upcoming implementation plan, offering a deeper understanding of the goals, methodologies, and anticipated outcomes of each initiative.

Table legend

A: Atmosphere; O: Ocean; H: Hydrology; L: Land Surface; Q: Air Quality; S: Subseasonal to Seasonal

  1. Critical, already initiated or about to be, regardless of the addition of human resources
  2. Critical, already initiated or about to be, but need additional human resources
  3. Critical, but in need of additional human resources to initiate this activity
  4. Interesting, but less critical
  5. Not applicable

Mapping AI activities across the production chain and scientific domains

| Chain | Thematic activities | A | O | H | L | Q | S |
|---|---|---|---|---|---|---|---|
| Observations | Quality control and error estimation | 3 | 3 | 4 | 3 | 4 | 4 |
| Observations | Derived observations | 2 | 4 | 4 | 2 | 1 | 4 |
| Observations | Nontraditional data sources | 2 | 3 | 5 | 3 | 4 | 4 |
| Data assimilation | Make data assimilation algorithms compatible with hybrid CPU-GPU platforms | 3 | 3 | 5 | 2 | 4 | 4 |
| Data assimilation | Enhance observation operators | 4 | 3 | 5 | 2 | 4 | 4 |
| Data assimilation | Improve forecast error statistics estimation through ensemble size expansion | 3 | 3 | 4 | 4 | 1 | 4 |
| Data assimilation | Novel data assimilation approaches | 4 | 4 | 4 | 2 | 3 | 3 |
| Data assimilation | Observation-informed AI model/parametrization parameter estimation | 3 | 3 | 2 | 4 | 1 | 3 |
| Numerical predictions | Parameterization optimization and discovery | 4 | 3 | 4 | 3 | 1 | 4 |
| Numerical predictions | Parameterization or full model emulation | 3 | 3 | 4 | 3 | 3 | 3 |
| Numerical predictions | Installation and validation of new open-source data-driven models | 1 | 3 | 1 | 3 | 3 | 1 |
| Numerical predictions | Adaptation of existing data-driven models to our predictions systems | 1 | 3 | 1 | 3 | 3 | 3 |
| Numerical predictions | Development of a data-driven model | 2 | 3 | 4 | 4 | 3 | 1 |
| Numerical predictions | Combining data-driven ML approaches and physics-based model (hybrid approach) | 3 | 3 | 3 | 3 | 1 | 3 |
| Numerical predictions | Explore the use of AI to generate ensemble members at all scales | 3 | 3 | 4 | 3 | 1 | 3 |
| Numerical predictions | Development of explainable AI (XAI) models | 4 | 4 | 4 | 3 | 1 | 1 |
| Post-processing | Monitoring, diagnostic and prediction of high impact weather and environmental events | 2 | 4 | 4 | 3 | 4 | 4 |
| Post-processing | Point- and grid-based systematic error adjustments (calibration) | 2 | 4 | 1 | 4 | 4 | 1 |
| Post-processing | Downscaling from low to high resolution forecasts | 2 | 3 | 4 | 3 | 4 | 1 |
| Post-processing | Improving observations-extrapolation-based nowcasting | 3 | 5 | 4 | 5 | 4 | 5 |
| Post-processing | Optimization of the transition from obs-extrapolation-based to model-based forecasts | 1 | 5 | 5 | 5 | 5 | 5 |
| Expert products | Impact and vigilance maps | 4 | 4 | 4 | 4 | 1 | 4 |
| Expert products | Customized products | 3 | 4 | 4 | 3 | 4 | 4 |
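
One way to read the table is to tally, for each domain, how many activities fall into categories 2 or 3, the two legend categories that call for additional human resources. The sketch below does this for the three Observations rows only, with values copied from the table above. The variable and function names are illustrative assumptions and not part of any ECCC tool.

```python
# Categories copied from the "Observations" rows of the table above.
DOMAINS = ["A", "O", "H", "L", "Q", "S"]
OBSERVATIONS = {
    "Quality control and error estimation": [3, 3, 4, 3, 4, 4],
    "Derived observations": [2, 4, 4, 2, 1, 4],
    "Nontraditional data sources": [2, 3, 5, 3, 4, 4],
}
# Legend categories 2 and 3 both mention a need for additional human resources.
NEEDS_STAFF = {2, 3}


def count_needing_staff(rows, domains=DOMAINS, categories=NEEDS_STAFF):
    """Count, per domain, the activities whose category is in `categories`."""
    counts = {domain: 0 for domain in domains}
    for scores in rows.values():
        for domain, score in zip(domains, scores):
            if score in categories:
                counts[domain] += 1
    return counts


staff_gaps = count_needing_staff(OBSERVATIONS)
print(staff_gaps)
```

The same function could be applied to the other chain stages by adding their rows in the same form.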
