Written by: Monisha M, Sneha S (1st year MCA)
ABSTRACT
Weather prediction is dominated by high dimensionality, interactions across many spatial and temporal scales, and chaotic dynamics. This makes many problems in the field complex, and state-of-the-art numerical models, despite their immense computational cost, are not sufficient for many applications. It is therefore appealing to apply emerging technologies such as artificial intelligence to these problems. One such approach is ensemble methods. Ensemble forecasting is a method used within numerical weather prediction: instead of making a single forecast of the most likely weather, a set (or ensemble) of forecasts is produced. This set of forecasts aims to give an indication of the range of possible future states of the atmosphere. Ensemble forecasting is a form of Monte Carlo analysis. The multiple simulations are conducted to account for the two usual sources of uncertainty in forecast models:
1. Errors introduced by imperfect initial conditions, which are amplified by the chaotic nature of the evolution equations of the atmosphere. This is often referred to as sensitive dependence on initial conditions.
2. Errors introduced by imperfections in the model formulation, such as the approximate mathematical methods used to solve the equations.
Ideally, the verified future atmospheric state should fall within the predicted ensemble spread, and the amount of spread should be related to the uncertainty (error) of the forecast. In general, this approach can be used to make probabilistic forecasts of any dynamical system, not just the weather.
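As a minimal sketch of the first source of uncertainty, the Python example below runs a small ensemble of a toy chaotic system from slightly perturbed initial states and measures how the ensemble spread grows with time. The logistic map stands in for a real NWP model purely for illustration; the function names, the perturbation size and the parameter r = 3.9 are assumptions made for this example and are not taken from any operational system.

```python
import random
import statistics


def logistic_step(x, r=3.9):
    """One step of the logistic map, a toy chaotic system (not a weather model)."""
    return r * x * (1.0 - x)


def run_ensemble(x0, n_members=20, n_steps=30, perturbation=1e-4, seed=0):
    """Run n_members forecasts, each from a slightly perturbed initial state.

    Returns a list of trajectories, one list of states per ensemble member.
    """
    rng = random.Random(seed)
    trajectories = []
    for _ in range(n_members):
        # Monte Carlo step: sample an initial state near the best estimate x0.
        x = x0 + rng.uniform(-perturbation, perturbation)
        trajectory = [x]
        for _ in range(n_steps):
            x = logistic_step(x)
            trajectory.append(x)
        trajectories.append(trajectory)
    return trajectories


def ensemble_spread(trajectories):
    """Standard deviation across ensemble members at each time step."""
    n_times = len(trajectories[0])
    return [
        statistics.pstdev([member[k] for member in trajectories])
        for k in range(n_times)
    ]


members = run_ensemble(x0=0.4)
spread = ensemble_spread(members)
print(f"initial spread: {spread[0]:.2e}, final spread: {spread[-1]:.2e}")
```

The initial spread is tiny by construction, while the spread after 30 steps is many times larger. This is the sensitive dependence on initial conditions that an ensemble is designed to expose.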
KEYWORDS
Monte Carlo analysis: Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results.
Sensitive dependence on initial conditions: In chaos theory, the butterfly effect is the sensitive dependence on initial conditions in which a small change in one state of a deterministic nonlinear system can result in large differences in a later state.
Dynamical system: In mathematics, a dynamical system is a system in which a function describes the time dependence of a point in an ambient space, such as in a parametric curve. Examples include the mathematical models that describe the swinging of a clock pendulum.
Numerical weather prediction: NWP uses mathematical models of the atmosphere and oceans to predict the weather based on current weather conditions.
INTRODUCTION
Weather forecasting uses modern science and technology to predict the state of the Earth’s atmosphere at a given place and future time. Since prehistory, people have tried to predict the weather in order to plan their work and lives (for example agricultural production and military operations). Today’s weather forecasting mainly relies on collecting large amounts of data (temperature, humidity, wind direction and speed, air pressure, etc.) and then applying the current understanding of atmospheric processes (meteorology) to determine how the atmosphere will change. Because atmospheric processes are chaotic and not fully understood, weather forecasts always contain some error and are never perfect. In many contexts in which weather forecasts are used, a forecast alone is not sufficient; a measure of its uncertainty is also needed. The standard method developed to provide this is ensemble forecasting. Here, not a single forecast is made with an NWP model but a whole range of forecasts (often 10-50), which together form an "ensemble". The different runs are made with slightly different starting conditions, slightly different model formulations, random components in the model, or a combination of these. The differences between the individual forecasts can then be used to estimate the range (including probabilities) of future weather states. Although an ensemble is harder to interpret than a single forecast, for many applications it is much more valuable, and ensemble forecasting is now standard practice in weather services around the world. The uncertainty associated with every forecast means that different scenarios are possible, and the forecast should reflect that. Single ‘deterministic’ forecasts can be misleading because they fail to provide this information.
Take agriculture as an example: a farmer needs to know the range of possible conditions the crops may experience so that they can be protected. Ensemble forecasts show how big that range is at different forecast times. By generating a range of possible outcomes, the method can show how likely different scenarios are in the days ahead, and how far into the future the forecasts remain useful. The smaller the range of predicted outcomes, the ‘sharper’ the forecast is said to be. Good ensemble forecasts are not just sharp but also reliable. If a reliable forecast says that there is a 70% chance of top temperatures rising above a certain threshold, then in 70% of cases when such a forecast is made temperatures will indeed rise above that threshold. Lack of knowledge significantly increases uncertainty in the forecast. This is why much work goes into improving our knowledge of initial conditions and of the atmospheric processes that computer models need to mirror. In addition, the atmosphere is a chaotic system, meaning that it is sensitively dependent on initial conditions. In a chaotic system, a slight change in the input conditions can lead to a significant change in the output forecast. In a non-chaotic system, small differences in initial conditions give only small differences in output. Hence, it is important in weather forecasting to investigate how sensitive the atmosphere is at any stage to initial conditions. Ensemble forecasting does this by looking at a spread of possible outcomes.
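The notions of sharpness and reliability can be made concrete with a short sketch. The Python example below assumes that the probability of an event is estimated as the fraction of ensemble members that exceed a threshold, and that reliability is checked by comparing past forecast probabilities with how often the event was actually observed. All temperatures, the threshold and the forecast history are invented values used only for illustration.

```python
def event_probability(member_values, threshold):
    """Fraction of ensemble members whose value exceeds the threshold."""
    exceeding = sum(1 for value in member_values if value > threshold)
    return exceeding / len(member_values)


def observed_frequency(forecast_probs, outcomes, target_prob, tolerance=0.05):
    """Observed event frequency among past forecasts issued near target_prob.

    outcomes holds 1 where the event occurred and 0 where it did not.
    Returns None if no past forecast falls within the tolerance.
    """
    matched = [
        outcome
        for prob, outcome in zip(forecast_probs, outcomes)
        if abs(prob - target_prob) <= tolerance
    ]
    if not matched:
        return None
    return sum(matched) / len(matched)


# Hypothetical maximum temperatures (degrees C) from a 10-member ensemble.
members = [29.1, 30.4, 31.2, 30.8, 28.7, 31.5, 30.1, 29.9, 31.0, 30.6]
p_hot = event_probability(members, threshold=30.0)

# Sharpness: a narrower range of member values means a sharper forecast.
spread_range = max(members) - min(members)

# Hypothetical history: ten past forecasts of 70%, with the event seen 7 times.
past_probs = [0.7] * 10
past_outcomes = [1, 1, 0, 1, 1, 1, 0, 1, 0, 1]
freq = observed_frequency(past_probs, past_outcomes, target_prob=0.7)

print(p_hot, round(spread_range, 1), freq)
```

In this invented history the 70% forecasts verified in 70% of cases, which is what reliability means. A real reliability assessment would use a much longer record and several probability bins.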
METHODOLOGIES
K-Nearest Neighbour:
Data mining is a technology for extracting relevant items, and items that share common factors, from a set of data. It is the process of analysing data from different perspectives and discovering problems, patterns, and correlations in data sets that are useful for predicting outcomes and that support correct decisions. Weather prediction is a field of meteorology that works from dynamic data describing the current state of the weather, such as temperature, humidity, rainfall, and wind. In this paper, we designed a system that uses the k-Nearest Neighbours classification algorithm to predict the weather from previous data and to determine the expected temperature and humidity. The prediction results were compared with real results, and the agreement was good and acceptable.
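The idea behind the k-Nearest Neighbours approach can be sketched in a few lines of Python. This is not the system described above, whose details are not given here. It is a minimal illustration that assumes each past record pairs one day's (temperature, humidity) with the next day's values, and that the prediction is the average over the k most similar past days (the regression form of k-NN). The records and the query are invented.

```python
import math


def knn_predict(history, query, k=3):
    """Predict (temperature, humidity) as the mean over the k nearest records.

    history: list of (features, (next_temperature, next_humidity)) pairs.
    query:   feature tuple of the same length as the stored features.
    """
    # Sort past records by Euclidean distance to the query features.
    by_distance = sorted(history, key=lambda record: math.dist(record[0], query))
    nearest = by_distance[:k]
    temperature = sum(target[0] for _, target in nearest) / len(nearest)
    humidity = sum(target[1] for _, target in nearest) / len(nearest)
    return temperature, humidity


# Hypothetical records: (today's temp C, humidity %) -> (tomorrow's temp, humidity).
history = [
    ((30.0, 60.0), (31.0, 58.0)),
    ((31.0, 58.0), (32.0, 55.0)),
    ((25.0, 80.0), (24.0, 85.0)),
    ((24.0, 85.0), (23.0, 88.0)),
    ((29.0, 65.0), (30.0, 62.0)),
]

predicted_temp, predicted_hum = knn_predict(history, query=(30.5, 59.0), k=3)
print(round(predicted_temp, 2), round(predicted_hum, 2))
```

In practice the features would usually be rescaled before computing distances, because temperature and humidity are measured on different scales and the larger-valued feature would otherwise dominate.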
Support Vector Machine: The key research interest in weather prediction with support vector machines (SVM) is to analyse the accuracy of the forecast and to compare it with the forecast produced by a multilayer perceptron network. Compared to traditional methods, both techniques produce highly accurate results. The support vector machine is one of the most widely studied algorithms in machine learning and originates from statistical learning theory. In practice, SVM performs well on many kinds of problems. It is widely used in handwritten digit recognition and face recognition, and plays an important role in text and hypertext classification because it can greatly reduce the need for labelled training examples in both the standard inductive and the transductive settings. SVM is also used for image classification and image segmentation. Experimental results show that, after only three or four rounds of relevance feedback, SVM can achieve much higher search accuracy than traditional query refinement schemes. SVM is also popular in biology and many other sciences. It has been widely used in protein classification, where reported accuracy for compound classification can exceed 90%. In cutting-edge biological research, support vector machines are also used to identify the features used for model prediction, and thereby to find the factors that influence gene expression results. From an academic point of view, SVM is closely related to neural networks. A linear SVM can be regarded as a single neuron of a neural network (although its loss function differs from the one usually used in neural networks), while a nonlinear SVM is comparable to a two-layer neural network. If multiple kernel functions are added to a nonlinear SVM, a multilayer neural network can be imitated.
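The remark that a linear SVM resembles a single neuron with a different loss function can be illustrated with a small sketch. The Python code below trains a weight vector and bias by subgradient descent on the L2-regularised hinge loss, which is one common way to fit a linear SVM and is not necessarily the method used in the studies mentioned above. The two features (imagined as standardised humidity and pressure-drop anomalies), the rain / no-rain labels and all hyperparameters are assumptions chosen for illustration.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit weights w and bias b by subgradient descent on the hinge loss.

    X: list of feature lists; y: labels in {-1, +1}.
    lam is the L2 regularisation strength and lr the step size.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            margin = yi * score
            # Regularisation term: shrink the weights slightly at every step.
            w = [wj - lr * lam * wj for wj in w]
            # Hinge loss term: only points inside the margin push the boundary.
            if margin < 1:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical standardised anomalies: [humidity, pressure drop]; +1 = rain.
X = [[2.0, 1.5], [1.5, 2.0], [2.0, 2.0], [-2.0, -1.5], [-1.5, -2.0], [-2.0, -2.0]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print(predict(w, b, [1.8, 1.7]), predict(w, b, [-1.7, -1.8]))
```

The prediction step is the same weighted sum followed by a threshold that a single neuron computes. What makes it an SVM is the hinge loss, which ignores points that are already classified with a margin of at least 1.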
Gradient Boost: Gradient boosting is a machine learning technique used in regression and classification tasks, among others. It gives a prediction model in the form of an ensemble of weak prediction models, i.e., models that make very few assumptions about the data, which are typically simple decision trees. It builds predictive models by combining an ensemble of weak learners in a sequential manner. It aims to create a strong learner by iteratively minimizing the errors made by the previous models. The core idea is to fit subsequent models to the residuals of the previous models, gradually improving predictions with each iteration.
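The residual-fitting idea can be sketched as follows. This minimal Python example assumes squared-error loss, a single input feature and decision stumps (one-split trees) as the weak learners; production libraries use deeper trees and many refinements. The hourly temperatures, the number of rounds and the learning rate are invented for illustration.

```python
def fit_stump(x, residuals):
    """Find the single threshold split that minimises squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # keep both sides non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def stump_predict(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_gradient_boosting(x, y, n_rounds=50, learning_rate=0.1):
    """Sequentially fit stumps to the residuals of the current model."""
    base = sum(y) / len(y)  # initial model: the mean of the targets
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [
            pi + learning_rate * stump_predict(stump, xi)
            for xi, pi in zip(x, predictions)
        ]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def boosted_predict(model, xi):
    total = sum(stump_predict(s, xi) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * total


def mean_squared_error(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: hour of day -> temperature in degrees C.
hours = [0, 3, 6, 9, 12, 15, 18, 21]
temps = [18.0, 17.0, 19.0, 24.0, 29.0, 30.0, 26.0, 21.0]

model = fit_gradient_boosting(hours, temps)
baseline_mse = mean_squared_error(temps, [model["base"]] * len(temps))
boosted_mse = mean_squared_error(temps, [boosted_predict(model, h) for h in hours])
print(round(baseline_mse, 2), round(boosted_mse, 2))
```

Each round removes part of the remaining error, so the training error of the boosted model falls below that of the constant baseline. The learning rate controls how much of each stump's correction is applied.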
CASE STUDY
The agricultural sector's day-to-day operations, such as irrigation and sowing, are affected by the weather, and weather plays a key role in all regular human activities. Weather forecasting must be accurate and precise so that we can plan our activities and safeguard ourselves and our property from disasters. Rainfall, wind speed, humidity, wind direction, cloud cover, temperature, and other weather variables are used in this work for weather prediction. Many research works have been conducted on weather forecasting. The drawbacks of existing approaches are that they are less effective, inaccurate, and time-consuming. To overcome these issues, this paper proposes an enhanced and reliable weather forecasting technique, which is also intended to support weather forecasting in remote areas. Weather data analysis and machine learning techniques, such as Gradient Boosting Decision Tree, Random Forest, Bernoulli Naive Bayes, and the KNN algorithm, are deployed to anticipate weather conditions. A comparative analysis of the results aided in determining which ensemble methods may be used to improve the accuracy of weather forecasts. The aim of this study is to demonstrate that the technique can produce weather forecasts quickly. Experimental evaluation shows that our ensemble technique achieves 95% prediction accuracy. Prediction takes less than 10 s for 1000 nodes and less than 40 s for 5000 nodes.
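How the outputs of several base classifiers might be combined can be sketched with a simple majority vote. The case study does not state which combination rule was used, so the voting rule below is an assumption, and the model names and predicted labels are invented placeholders rather than results.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most base models.

    Ties are resolved in favour of the label that appears first.
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of four base models for one day.
base_predictions = {
    "gradient_boosting": "rain",
    "random_forest": "rain",
    "naive_bayes": "dry",
    "knn": "rain",
}

ensemble_label = majority_vote(list(base_predictions.values()))
print(ensemble_label)
```

Other combination rules, such as averaging predicted probabilities or weighting each model by its validation accuracy, follow the same pattern with a different aggregation step.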
FORECAST RANGES
The atmosphere exhibits variability over a large range of timescales. Since different physical mechanisms drive the changes on different timescales, and also for practical reasons, the field of weather prediction is usually split into several time horizons. While these time horizons do not have sharp boundaries, both the techniques and the forecast quantities differ between the regimes. The further ahead the forecast, the larger the spatial scales for which the forecast makes sense (from kilometres to continental scale), and the longer the temporal averaging of the forecast, from minutes through seasonal means up to multi-decade statistics.
Nowcasting: Nowcasting deals with forecasting the weather (mainly precipitation) over the next minutes, up to at most a couple of hours ahead. Many methods rely solely on precipitation radar images and extrapolate the radar fields into the near future, but integrated systems that include NWP models also exist. The essential difference from the other regimes is that the weather is influenced only very locally at this range.
Short-range: The short range extends to about 48 hours. On this time horizon, even the regional weather is already influenced globally, or at least near-globally, so a purely local approach is no longer feasible. In operational practice, forecasts for this range are usually made with regional high-resolution models that are nested in coarser global models.
Medium range: The medium range covers the evolution and lifetime of mid-latitude weather systems (high- and low-pressure systems), which govern the weather on scales from a couple of days up to two weeks. The main goal is to forecast trends (for example in temperature and in weather patterns) and the occurrence of strong cyclones. Forecasts for this range are made by global NWP models. This range is also where the theoretical predictability limit of the atmosphere is thought to lie. For current NWP systems, deterministic (single) forecasts are no longer useful from roughly one week onward. Therefore, the concept of ensemble forecasting was first developed for the medium range and is in widespread use there.
Sub-seasonal: The field of sub-seasonal prediction deals with forecasts up to 60 days ahead. This field is relatively young but has gained a lot of attention in recent years. For the tropics, the focus lies on predicting the Madden-Julian Oscillation, and for the mid-latitudes on identifying situations in which the predictability horizon is longer than usual due to specific states of the atmosphere. Sub-seasonal forecasting is done with the same (or similar) models as medium-range forecasting.
Seasonal: Seasonal prediction aims to predict the mean weather of the coming seasons, roughly up to 1 year ahead, although some progress has also been made on predictions more than 1 year ahead. Since this is far beyond the predictability limit of the atmosphere, the goal is not to predict the weather on a particular day but only its statistics, for example mean temperature or precipitation anomalies over the season (e.g. the season will be wetter than usual). Seasonal forecasts are almost exclusively probabilistic. For most regions of the world, the potential of seasonal forecasts is closely related to the El Niño-Southern Oscillation phenomenon. In contrast to the shorter time ranges, the exact initial state of the atmosphere is not as crucial for seasonal predictions. Seasonal prediction is done both with global numerical models and with statistical models. The global numerical models are very similar to NWP models, and the transition to climate models is fluid.
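The forecast ranges above can be summarised as a simple lookup from lead time to regime. The Python sketch below uses the approximate limits mentioned in the text (a couple of hours, roughly 48 hours, two weeks, 60 days and about one year). Taking "a couple of hours" as 2 hours is an assumption, and since the real boundaries are not sharp the cut-offs are illustrative only.

```python
# (upper limit in hours, regime name); limits are approximate, not sharp.
FORECAST_RANGES = [
    (2, "nowcasting"),
    (48, "short-range"),
    (14 * 24, "medium range"),
    (60 * 24, "sub-seasonal"),
    (365 * 24, "seasonal"),
]


def forecast_range(lead_time_hours):
    """Return the name of the forecast regime for a given lead time."""
    for upper_limit, name in FORECAST_RANGES:
        if lead_time_hours <= upper_limit:
            return name
    return "beyond seasonal"


for lead in (1, 24, 7 * 24, 30 * 24, 200 * 24):
    print(lead, forecast_range(lead))
```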
CONCLUSION
Instead of a single number, ensemble models output a range of possible outcomes from base models run in parallel. Real-world events can unfold in many ways, and any one model provides only an ‘estimate’ of what is likely to happen. Different models rely on different assumptions and therefore offer different perspectives for prediction. Model performance is measured as the probability of being correct, so even the lowest-performing model still has a small chance of being correct. The job of ensemble models is to incorporate these uncertainties into an ensemble forecast.