An IoT-Big Data Based Machine Learning Technique for Forecasting Water Requirement in Irrigation Field

Efficient water management is a major concern in rice cropping. Controlling excessive water use in irrigation fields is essential for protecting groundwater, and it is also part of climate change adaptation. The sustainable use of water resources is a priority task in Bangladesh. Imbalances between demand and supply are the main reason for the degradation of surface water and groundwater. Checking the water level in irrigation fields by manual reading is a considerable limitation in these circumstances. In this paper I discuss a procedure for monitoring the surface water level in an irrigation field, together with continuous monitoring of weather conditions such as temperature, air pressure, sunlight and rainfall, by using a sensor network. The aim is to create a machine learning mechanism that gives farmers a forecast of the water demand of the irrigation field from the collected IoT-based data. In turn, this will help farmers prepare for watering and support appropriate use of groundwater, and it can also be used to predict energy utilization. In this research a multiple linear regression algorithm is used for the prediction. Data from an irrigation field in the north-west part of Bangladesh are used to obtain the prediction results.


Introduction
Water management is important for adaptation to climate change. Shortages of water resources directly affect the vulnerability of ecosystems, socio-economic activities and human health. In addition, climate change is likely to lead to major changes in water availability across Bangladesh, with increasing water scarcity and droughts mainly in the north-west part of the country.
It is estimated that as much as 50 per cent of irrigation water is wasted due to evaporation or runoff. This happens because most irrigation systems still rely upon simple human reading. However, Internet of Things technologies can provide "smart" irrigation systems. They can be used for monitoring soil conditions and surface water level in real time with low-power wireless sensor networks. The wireless sensor networks send the data to a central network gateway, and the network gateway sends the data to the cloud platform. The gateways can connect via both wired and cellular data connections, so they can be placed anywhere. On the cloud platform, machine learning applications can be used to send results to the end user's mobile phone or personal computer.
This research mainly focuses on the utilization of groundwater in weather-based irrigation fields. Weather-based irrigation determines the amount of water needed by the landscape based on the current weather conditions, such as precipitation, solar radiation, temperature, relative humidity, and wind speed. Weather data are provided by an IoT-based land weather station. Data from the weather station are matched with water level measurements taken by a distance sensor, which mainly measures the water lost from the soil due to evaporation and plant transpiration. This water level, in millimetres, serves as the class (target) variable alongside the other weather parameters. A machine learning technique, the multiple linear regression algorithm, is used here to predict the water loss due to these weather conditions in the near future.

Related works
In Bangladesh, mainly in the north-west part of the country, groundwater is the main source of irrigation. Shahid & Hazarika (2010) investigated groundwater scarcity and drought in three northwestern districts of Bangladesh. They used a Cumulative Deficit approach from a threshold groundwater level to compute the severity of groundwater droughts. Their research shows that groundwater scarcity in 42% of the area is an every-year phenomenon in the region. The daily evapotranspiration from rice fields will increase by an average of 31.3 mm and 0.33 mm/day respectively by the year 2100 (Shahid, 2011). The main finding of this research is that climate change will increase the daily use of water for irrigation by 0.8 mm/day by the end of this century.
Meharg & Rahman (2003) showed that arsenic levels in paddy soils throughout Bangladesh were elevated in zones where arsenic in the groundwater used for irrigation was high, and where the tube-wells had been in operation for the longest period of time. Another study by Meharg (2004) found that "arsenic is sequestered in iron plaque on root surfaces in plants, regulated by phosphorus status, and that there is considerable varietal variation in arsenic sequestration and subsequently plant uptake", which "offers a hope for breeding rice for the new arsenic disaster in South-East Asia", namely the contamination of paddy soils with arsenic.
To reduce the wastage of groundwater, smart irrigation systems are now the highest-priority topic in agricultural research. Mathurkar & Chaudhari (2013) focused on optimizing water management for agriculture through the physical and socioeconomic conditions that inspired the success of an "indigenous technology" which exploited the potential for excess harvesting. Monda, Basu, & Bhadoria (2011) described how the Precision Agriculture (PA) concept was initiated for site-specific crop management as a grouping of locating systems. They argued that proper resource utilization and management in this way makes an environmentally friendly, sustainable agriculture possible. Nandurkar & Thool (2012) designed a sensing system based on a "feedback control mechanism" with an integrated control unit, which regulates the flow of water onto the field in real time based on the current temperature and moisture values. They also prepared a table that gives the amount of water needed by the crop. Roy & Ansari (2014) and Awasthi & Reddy (2013) developed irrigation control systems to avoid wastage of water and increase irrigation efficiency by using a PLC-based irrigation system with a soil moisture sensor, a water level sensor, and a GSM controller. Their system can send messages to the farmer's mobile phone through the GSM network for controlling actions.
For example, in the built-in data set stackloss from observations of a chemical plant operation, if we assign stackloss as the dependent variable, and assign Air.Flow (cooling air flow), Water.Temp (inlet water temperature) and Acid.Conc. (acid concentration) as independent variables, the multiple linear regression model is:
stack.loss = α + β1 * Air.Flow + β2 * Water.Temp + β3 * Acid.Conc. + ϵ

Hardware specifications
Reading real-time data is a typical task of a weather station, which uses different sensors and must be capable of communicating via LoRa. After a review of the hardware available on the market, the components strictly necessary for the solution, and which fulfil the requirements above, were defined. The hardware chosen was:

Core System controller
The core of the Weather Station solution is the Feather32u4, which takes a specialized role in the system: it performs control functions through software and provides the processing power that enables the sensory devices to gather data from the environment, using specific libraries. This system also has built-in communication capabilities.

Data Acquisition
The weather shield is an integrated module with several built-in sensors capable of collecting data such as temperature, humidity, luminosity, barometric pressure and altitude. Alongside these sensors, the weather shield also enables the integration of three more sensors to collect data on wind direction, wind speed and amount of rain. In the proposed model, the controller is connected to the Weather Shield. This connection is established with the I2C protocol, which allows this digital integrated circuit to communicate with one or more masters. This protocol is used because it is intended only for short-distance communication within a single device and requires only two signals to exchange information. The controller software uses the "Wire.h" library, which is dedicated to the I2C logic protocol. The embedded software requires the "SparkFunHTU21D.h" and "SparkFunMPL3115A2.h" libraries in order to call the functions responsible for activating and reading the sensor signals coupled to the weather shield.

Data Communication
As expected in the proposed system model, the controller sends data to the outstation, based on the information collected from the weather shield module. For this it uses the "SPI.h" library to run the communication with the RFM9x LoRa 868/915 radio module. The LoRa radio must communicate with the LoRa gateway specified by the system, and for that it interacts with the "featherLora.h" library. The data are collected according to the time windows already defined in the project. At the end of each time window, a package is sent with the message containing the information collected. For data sharing with the outstation, an acknowledgement packet is sent in response to the incoming data from the different sensors. The typical message sent from one gateway to another follows the format of this example:
\\!TC/18/HU/85/LU/0.56/WD/90/WC/5.55
The following table outlines the type and content of the information sent in each package:
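As an illustration of how the receiving side could decode the example message above, the following minimal Python sketch splits the payload into alternating field codes and numeric values. The function name parse_packet is hypothetical, and the assumptions that every value is numeric and that the leading backslashes and "!" are only a start marker are mine, not part of the original system description.

```python
def parse_packet(packet):
    """Decode a payload of alternating code/value fields into a dict.

    Assumption: the leading backslashes and '!' act only as a start marker,
    and every value can be read as a floating-point number.
    """
    body = packet.strip().lstrip("\\!")
    tokens = body.split("/")
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating code/value fields")
    readings = {}
    for code, value in zip(tokens[0::2], tokens[1::2]):
        readings[code] = float(value)
    return readings


# The example message quoted in the text (raw string keeps the backslashes).
sample = r"\\!TC/18/HU/85/LU/0.56/WD/90/WC/5.55"
readings = parse_packet(sample)
print(readings)
```

Here TC, HU, LU, WD and WC are the codes for temperature, humidity, luminosity, wind direction and wind speed used in the regression analysis.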

Result and analysis
The following result has been found from this testing dataset: The intercept shows the estimated mean Y value when all Xs are 0. Each slope gives the change in water level associated with a one-unit increase in one predictor, such as wind direction or wind speed, adjusting (controlling) for the other predictors, such as luminosity or humidity. The hypothesis test for each coefficient tests whether the slope for WD, or for any other predictor, is 0.
The collinearity between WD and WC means that we should not directly interpret the slope for WD as the effect of WD on DM adjusting for WC. The high correlation between these two variables suggests that their effects are somewhat bound together.
Here the confint values show the 95% confidence intervals for the slopes.
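One simple way to check for this kind of collinearity is to compute the Pearson correlation between the two predictors. The sketch below does this in plain Python; the wind direction (WD) and wind speed (WC) values are invented for illustration only and are not the field data used in this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical readings, for illustration only.
wd = [90, 120, 150, 180, 210, 240]   # wind direction
wc = [5.5, 6.1, 7.0, 7.4, 8.3, 9.0]  # wind speed
r = pearson(wd, wc)
print(f"correlation between WD and WC: {r:.3f}")
```

A correlation close to 1 or -1 indicates that the two slopes cannot be interpreted separately with much confidence.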

Conclusion
This IoT-based machine learning work was a sample of my hands-on experience with real-time data. Behind this task lay a great deal of preparation: it involved data understanding, sorting and reframing, which is beyond the scope of this research work. Only a small-scale sample of data is used here to show the results of the prediction model. It is definitely challenging to work with this type of big data. Finally, as I tried to understand the correlations between the parameters and the forecasts, I also gained, somewhat to my surprise, a better understanding of prediction from the point of view of weather parameters and IoT data collection.
Many machine learning techniques have been developed for learning rules and relationships automatically from various agricultural data sets. McQueen, Garner, Nevill-Manning & Witten (1995) and Ozdogan, Yang, Allez & Cervantes (2010) described projects that apply a range of machine learning strategies to problems in agriculture and horticulture. They experimented on real-world data sets and described some software requirements. They also explored the value of archived data that enable comparison of images through time. Ozdogan & Gutman (2008) presented a dryland irrigation mapping methodology that relies on remotely sensed inputs from the Moderate Resolution Imaging Spectroradiometer (MODIS) instrument. They proposed different steps for mapping expected patterns where the majority of irrigated areas are concentrated in the dry lowland valleys. Image processing is an effective tool for analysing agricultural data sets (Vibhute & Bodhe 2012). That paper surveyed applications of image processing in agriculture, such as imaging techniques, weed detection and fruit grading. Support Vector Machines (SVMs), a machine learning technique, were used to classify various crop types in a complex cropping system in the Phoenix Active Management Area (Zheng, Myint, Thenkabail & Aggarwal 2015). They used "Landsat timeseries Normalized Difference Vegetation Index (NDVI)" data with training datasets selected by two different approaches: a stratified random approach and an intelligent selection approach using local knowledge. For weather prediction (Radhika & Shashi 2009) and long-term prediction of lake water levels (Khan & Coulibaly, 2006), SVM is reported as the most promising technique. SVM can also be used for time series applications in many areas, from financial market prediction to electric utility load forecasting to medical and other scientific fields (Sapankevych & Sankar 2009).

A multiple linear regression (MLR) model that describes a dependent variable y by independent variables x1, x2, ..., xp (p > 1) is expressed by the equation as follows, where the numbers α and βk (k = 1, 2, ..., p) are the parameters, and ϵ is the error term.
y = α + β1 x1 + β2 x2 + ... + βp xp + ϵ
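To make the model concrete, the following minimal Python sketch estimates α and βk by ordinary least squares, solving the normal equations with Gaussian elimination. It is an illustration under stated assumptions rather than the procedure used in this study: the helper names solve_linear_system and fit_mlr are hypothetical, and the data are synthetic values generated from known coefficients so that the fit can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Return [alpha, beta_1, ..., beta_p] minimising the squared error."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data generated from y = 2 + 3*x1 - 1*x2 (no noise).
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 2)]
y = [2 + 3 * x1 - x2 for x1, x2 in rows]
coef = fit_mlr(rows, y)
print(coef)
```

With real sensor data, each row would hold the weather readings (for example TC, HU, LU, WD, WC) and y the measured water level DM.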
Multiple R-squared: approximately 79% of the variation in water level (Distance in Millimetre, DM) can be explained by this model, whose predictors are Wind Direction (WD), Wind Speed (WC), Humidity (HU), Luminous (LU) and Temperature (TC). F-statistic: this tests the null hypothesis that all the model coefficients are 0. Residual standard error: this gives an idea of how far the observed water levels DM (the Y values) are from the predicted or fitted DM (the Y-hats), that is, the typical size of a residual or error e = y - ŷ.
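The three summary quantities above can be computed directly from the observed and fitted values. The following Python sketch shows the standard formulas, assuming the fitted values come from a least-squares model with an intercept and p predictors. The function name fit_summary is hypothetical, and the numbers are made up to stand in for model output; they are not results from this study.

```python
import math


def fit_summary(y, y_hat, p):
    """R-squared, residual standard error and F-statistic for p predictors."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # residual sum of squares
    ss_tot = sum((a - mean_y) ** 2 for a in y)            # total sum of squares
    df_res = n - p - 1                                    # residual degrees of freedom
    return {
        "r_squared": 1 - ss_res / ss_tot,
        "residual_standard_error": math.sqrt(ss_res / df_res),
        "f_statistic": ((ss_tot - ss_res) / p) / (ss_res / df_res),
    }


# Made-up observed and fitted water levels (mm), for illustration only.
observed = [10, 12, 15, 19, 24, 30]
fitted = [11, 12, 14, 20, 23, 30]
summary = fit_summary(observed, fitted, p=2)
print(summary)
```

A large F-statistic relative to its reference distribution is evidence against the null hypothesis that all slopes are 0.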
Safiuddin & Karim (2001) ... poisoning-related diseases because the groundwater in these wells is contaminated with arsenic. Alam et al. (2002) reviewed the arsenic contamination of groundwater, hydrological systems, groundwater potential and utilization, and environmental pollution in Bangladesh. They discussed the main actions required to ensure the sustainable development of water resources in Bangladesh. Safiuddin & Karim (2001) also highlighted the causes and mechanism of arsenic contamination and presented several measures to remedy the arsenic contamination in groundwater. Another survey by Meharg & Rahman, M