Amartya Chakraborty
Kendriya Vidyalaya Gangtok, Sikkim, India 737102
SUMMARY
Nitrogen is an essential mineral nutrient that strongly affects crop growth and yield. However, nitrogen deficiency is a common problem across all types of arable soils, and it leads to heavy application of nitrogen fertilizer, which has detrimental effects on soil and human health. Accurate diagnosis of crop nitrogen status is therefore crucial to mitigate the overuse of nitrogen fertilizer. Current methods of diagnosing nitrogen levels in plants, such as the soil-nitrate test and tissue analysis, cannot provide the real-time status needed to match nitrogen application to the plant's demand. Hence, we have opted to use deep learning methods, which allow the extraction of multiple intricate features from paddy leaf images. This approach tests the hypothesis that paddy leaf features are positively associated with nitrogen deficiency, and those features are used to predict nitrogen deficiency status. We have used the ResNet50 CNN model with a 2D global average pooling layer, which drastically reduces the dimensionality of the feature maps by averaging over their height and width dimensions. The model has been trained on a publicly available dataset of 5,258 paddy leaf images labeled with four nitrogen deficiency categories ranging from severe deficiency (swap 1) to sufficient nitrogen (swap 4). The model's prediction accuracy exceeded 85% (99% on a held-out benchmark of 400 images). An Android app, "PaddyVision", has also been developed around the same model. There is large scope to improve the model further by including different datasets and other utilities to form a holistic real-time predictor of the plant's nutrient and health status.
INTRODUCTION
Rice is the primary staple food for more than half the world's population, with Asia, Sub-Saharan Africa, and South America as the largest rice-consuming regions. Global rice production has continued to increase, reaching about 520.5 million tons in the 2022/23 crop year [1]. Nitrogen (N) is one of the essential macronutrients for rice growth and one of the main factors to be considered in developing a high-yielding rice cultivar. Yet among the essential nutrients, nitrogen is the most universally deficient in rice cropping systems worldwide and the main limiting factor in rice production. This necessitates nitrogen monitoring as an effective method to combat both its deficiency and its overuse, which causes soil contamination and widespread health problems. Monitoring crop nitrogen content during the early vegetative growth stages is of major importance for planning fertilization measures, while assessment of N during mature growth stages provides a valuable indication of expected yield quality. Thus, continuous monitoring of crop nitrogen levels throughout the growing season improves yield quality and quantity, as well as economic returns [2]. Current methods of nitrogen monitoring, such as soil testing, leaf tissue analysis, and portable rapid analysis systems, have major drawbacks: they are invasive, not real-time, and impractical in most cases, and their accuracy and accessibility are also not up to par.
Convolutional Neural Networks (CNNs) offer a robust method for image recognition, making them ideal for analyzing the visual characteristics of rice plants [3,4]. These networks can automatically extract features like leaf color, size, and shape from images, allowing for efficient monitoring of rice crops. The key advantage here is the ability to process and interpret large volumes of visual data quickly and accurately, providing valuable insights into the health and nutrient status of rice plants. In this study, we have efficiently utilized one of these key strengths of CNNs. Besides, in many regions where rice is a staple food, smartphones are widely available. This means that a mobile application like ‘PaddyVision’ based on this CNN approach can be easily downloaded and used by farmers, regardless of their location. Access to the technology becomes democratized, enabling even small-scale farmers in remote areas to benefit from crop monitoring.
Traditional methods of nitrogen monitoring often come with substantial costs. They may require specialized equipment, chemicals, or expert labor, making them financially burdensome for many farmers. Our monitoring app "PaddyVision" addresses this issue as a free application toolkit: it eliminates the need for expensive equipment and reduces the financial barriers to entry, making advanced crop monitoring accessible to a broader range of farmers. To provide a more comprehensive understanding of crop health, we are also considering the integration of additional data sources. Weather data, for example, can help correlate environmental conditions with crop performance, while information on soil quality can offer insights into nutrient availability [5]. Integrating these data sources with PaddyVision can give farmers a more holistic view of the factors affecting their crop growth. In this regard, collaborating with agricultural experts, research institutions, and local agricultural organizations can be highly beneficial. These partnerships can provide valuable insights into the specific needs of rice farmers, help tailor the app's features to address those needs effectively, and support the development of accurate models and data-collection methods.
METHODOLOGY
ResNet-50, short for "Residual Network with 50 layers", is the deep convolutional neural network (CNN) architecture used here; it was designed for image classification and various other computer vision tasks [6]. It is a variant of the ResNet architecture renowned for its remarkable depth, comprising 50 weight layers. At the core of ResNet-50 are residual building blocks that introduce skip connections, allowing the network to learn residual functions. In ResNet-50 these blocks follow a bottleneck design of 1×1, 3×3, and 1×1 convolutional layers and come in two main types: identity blocks, whose skip connection adds the input directly to the output, and convolutional blocks, which pass the shortcut through an additional 1×1 convolutional layer to adjust the number of channels. The ResNet-50 architecture is built by stacking multiple residual blocks, grouped in sets of varying block counts. Notably, ResNet-50 forgoes large fully connected layers in favor of Global Average Pooling (GAP), which condenses the feature maps into a fixed-size vector for classification. Batch normalization is applied to stabilize training, and the Rectified Linear Unit (ReLU) activation function introduces non-linearity in each residual block.
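For illustration, a minimal Keras sketch of such a bottleneck identity block is given below. This is our own simplified code, not the internal implementation of tf.keras.applications.ResNet50, and it assumes the incoming tensor already carries 4 × filters channels so that the element-wise addition is shape-compatible.

```python
# Hedged sketch of a ResNet-50-style bottleneck identity block in Keras;
# assumes the input already has 4 * filters channels so the addition works.
import tensorflow as tf
from tensorflow.keras import layers

def bottleneck_identity_block(x, filters):
    shortcut = x                                      # skip connection
    y = layers.Conv2D(filters, 1)(x)                  # 1x1: reduce channels
    y = layers.BatchNormalization()(y)                # stabilize training
    y = layers.Activation("relu")(y)                  # non-linearity
    y = layers.Conv2D(filters, 3, padding="same")(y)  # 3x3 convolution
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(4 * filters, 1)(y)              # 1x1: expand channels
    y = layers.BatchNormalization()(y)
    y = layers.Add()([shortcut, y])                   # add input to output
    return layers.Activation("relu")(y)

# Example: one block of the 56x56 stage, where 256 = 4 * 64 channels
inputs = tf.keras.Input(shape=(56, 56, 256))
outputs = bottleneck_identity_block(inputs, 64)
```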
The dataset used for the CNN-model training was a public dataset obtained from Kaggle
(https://www.kaggle.com/datasets/myominhtet/nitrogen-deficiency-for-rice-crop), which consisted of 5,258 images. The images were categorized into 4 gradient levels of nitrogen deficiency, with swap 4 indicating one extreme of the gradient ('sufficient nitrogen') and swap 1 indicating the other ('severe nitrogen deficiency'). One hundred images from each category were set aside for evaluation of the model. The remaining images formed the training set, with 1,407 training images per category except swap 2, which had only 637 because of the smaller number of swap 2 images in the dataset.
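As an illustration of the data pipeline, such a split can be loaded with TensorFlow's directory-based utility as sketched below. The folder layout (data/train and data/test with one subfolder per swap) is our assumption for illustration, not the Kaggle dataset's native structure.

```python
# Hedged sketch: load the swap1..swap4 image folders for training and
# evaluation. Paths and folder layout are illustrative assumptions.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train",              # assumed layout: data/train/swap1 ... swap4
    image_size=(224, 224),     # resize to ResNet-50's usual input size
    batch_size=32,
    label_mode="categorical",  # one-hot labels for the softmax head
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "data/test",               # the 100 held-out images per category
    image_size=(224, 224),
    batch_size=32,
    label_mode="categorical",
)
```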
The images in the dataset were analyzed thoroughly by taking random samples from each category. Each leaf image's mean color, surface area, perimeter, and GLCM energy were noted. The GLCM energy index reveals the texture uniformity of a leaf, whereas the mean color is used to check whether leaves at the same nitrogen deficiency level have similar colors. The mean color indices of the samples were averaged to obtain the mean color index of each swap, and these were used to highlight color discrepancies between the swaps using the Euclidean distance measure in RGB color space. The images of rice leaves were captured by matching each leaf against the swaps of the Leaf Colour Chart (LCC), using a 13-megapixel smartphone camera in daylight as per the user instructions of the LCC. The LCC developed by the ICAR-National Rice Research Institute (ICAR-NRRI), Cuttack, India has four swaps categorizing nitrogen deficiency levels from highest to lowest/sufficient [7].
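The measurements above can be reproduced with standard image libraries. The sketch below, using NumPy and scikit-image (version 0.19 or later, where the functions are named graycomatrix/graycoprops), computes the mean color, the GLCM energy, and the Euclidean color distance; it is an illustrative reconstruction, not the exact analysis script, and it omits the segmentation step needed for surface area and perimeter.

```python
# Hedged sketch of the per-image measurements: mean RGB color, GLCM
# energy, and Euclidean distance between mean colors in RGB space.
import numpy as np
from skimage import io, color
from skimage.feature import graycomatrix, graycoprops

def leaf_features(path):
    rgb = io.imread(path)[:, :, :3]                   # ignore alpha channel
    mean_color = rgb.reshape(-1, 3).mean(axis=0)      # mean R, G, B values
    gray = (color.rgb2gray(rgb) * 255).astype(np.uint8)
    glcm = graycomatrix(gray, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    energy = graycoprops(glcm, "energy")[0, 0]        # texture uniformity
    return mean_color, energy

def color_distance(c1, c2):
    # Euclidean distance in RGB space, as used to compare the swaps
    return float(np.linalg.norm(np.asarray(c1) - np.asarray(c2)))
```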
Pre-trained models such as Inception V3, VGG16, VGG19, and ResNet50 have previously been tested for accuracy in the categorical classification of rice diseases. ResNet50 had the highest accuracy of 99.75% with a loss of 0.33; it also achieved a validation accuracy of 99.69%, a precision of 99.50%, an F1-score of 99.70%, and an AUC of 99.83% [5]. We have, therefore, adopted the ResNet50 architecture for this study.
ResNet50 was initialized with weights pre-trained on the large ImageNet dataset. The top fully connected layer of ResNet50 is trained to classify images into the ImageNet categories; since this layer is not required for nitrogen deficiency classification, it was excluded, yielding a model that is more efficient and less prone to overfitting.
Fig.1. Workflow chart underlying the CNN-based Android app "PaddyVision": the left panel describes the workflow starting from the input of paddy leaf images into the ResNet50 CNN model; the right panel describes the tasks carried out by each model step on the left; the middle panel defines the specific algorithmic steps through which the input is processed.
A Global Average Pooling layer reduces the spatial dimensions of the base model's output by averaging all values in each feature map. A dense layer with 256 neurons and a ReLU activation function then maps the feature vector to a lower-dimensional space. The final classification layer is a dense layer with a SoftMax activation function, which outputs a probability distribution over the different classes, in this case swap1, swap2, swap3, and swap4; SoftMax ensures that the outputs sum to 1. The compiled model was trained for 10 epochs in batches of 32 images.
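Assembled in Keras, the model described above might look like the sketch below; the optimizer and loss are our assumptions, since the text specifies only the architecture, the 10 epochs, and the batch size of 32.

```python
# Hedged sketch of the transfer-learning model: ImageNet-initialized
# ResNet-50 base without its top layer, GAP, a 256-unit ReLU layer,
# and a 4-way softmax head for swap1..swap4.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.ResNet50(
    weights="imagenet",          # pre-trained ImageNet weights
    include_top=False,           # exclude the fully connected top layer
    input_shape=(224, 224, 3),
)

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),       # average over height and width
    layers.Dense(256, activation="relu"),  # lower-dimensional mapping
    layers.Dense(4, activation="softmax"), # class probabilities summing to 1
])

model.compile(optimizer="adam",            # assumed; not stated in the text
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=test_ds, epochs=10)
```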
The model workflow (Fig.1) has been implemented exclusively in TensorFlow, which cannot run directly on Android devices; TensorFlow Lite, its lightweight and efficient counterpart, is supported. The trained model was therefore converted into the TensorFlow Lite format. The Android application takes paddy leaf images as user input, pre-processes them, and feeds them to the TensorFlow Lite model; the category with the maximum probability is returned to the user as output. The application was uploaded to the Google Play Store to make it easily accessible to a global audience.
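The conversion step uses TensorFlow's standard converter API; the sketch below shows the usual pattern, with the output filename being our own choice.

```python
# Hedged sketch: convert the trained Keras model to TensorFlow Lite so
# it can run on-device inside the Android app.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()            # serialized flatbuffer bytes

with open("paddyvision.tflite", "wb") as f:   # filename is illustrative
    f.write(tflite_model)
```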
RESULTS
The within-category mean color distances for samples of swap1, swap2, swap3, and swap4 are 62.97, 21.47, 60.74, and 44.47 respectively, indicating that samples within swap 2 and swap 4 have relatively consistent mean colors, whereas swap 1 and swap 3 samples show slight internal dissimilarities (Table 2). The color distance between the mean colors of swap 1 and swap 2 is 69.64, indicating that the colors are highly dissimilar. Likewise, the color distances between the mean colors of swap 2 and swap 3, and of swap 2 and swap 4, are also high, above 56, indicating significant dissimilarity. The mean colors of swap 1 and swap 3, and of swap 1 and swap 4, are distinguishable, with a color distance of about 20. However, the mean colors of swap 3 and swap 4 are very similar, with a color distance of only 5.2. In other words, swap 1 and swap 2 are the most different in terms of color, while swap 3 and swap 4 are the most similar; the remaining pairs lie somewhere in between. The mean GLCM (Grey Level Co-occurrence Matrix) energy of swap 1 (0.3764) is the highest, followed by swap 3 (0.3388), swap 4 (0.3154), and swap 2 (0.0566). This indicates that swap 1 has the most homogeneous texture, while swap 2 has the most heterogeneous; swaps 3 and 4 have similar GLCM energy indices, suggesting similar texture homogeneity. Furthermore, the AI model correctly classifies 99% of the images in the benchmark, and its performance remains consistently high across all metrics, suggesting that the model has the potential to be used for real-world, in-field nitrogen monitoring.
This AI model has been deployed in the Android mobile app "PaddyVision" (Fig.3). The AI runs entirely on the user's device, removing the need for an internet connection or for sharing data. A user can simply upload an image from the device's internal storage and get quick feedback.
Table 2: Paddy leaf image features positively associated with the nitrogen deficiency category

| Sr. No. | Input data ID (Kaggle) | N-deficiency category | Mean color (hex) | Surface area (square pixels) | Perimeter (pixels) | GLCM energy (texture) |
|---------|------------------------|-----------------------|------------------|------------------------------|--------------------|-----------------------|
| 1 | SWAPT1_001 | Swap 1 | — | 3523.5 | 269.9 | 0.297 |
| 2 | SWAPT1_025 | Swap 1 | — | 3523.5 | 269.9 | 0.263 |
| 3 | SWAPT1_050 | Swap 1 | — | 5298.0 | 309.7 | 0.467 |
| 4 | SWAPT1_075 | Swap 1 | — | 4748.0 | 306.9 | 0.313 |
| 5 | SWAPT1_100 | Swap 1 | — | 4575.0 | 299.8 | 0.542 |
| 6 | SWAPT2_001 | Swap 2 | — | 1292.0 | 227.0 | 0.044 |
| 7 | SWAPT2_025 | Swap 2 | — | 1263.0 | 226.7 | 0.053 |
| 8 | SWAPT2_050 | Swap 2 | — | 786.5 | 217.1 | 0.026 |
| 9 | SWAPT2_075 | Swap 2 | — | 1278.0 | 229.3 | 0.116 |
| 10 | SWAPT2_100 | Swap 2 | — | 1292.0 | 227.0 | 0.044 |
| 11 | SWAPT3_001 | Swap 3 | — | 5026.5 | 300.2 | 0.499 |
| 12 | SWAPT3_025 | Swap 3 | — | 2609.0 | 253.7 | 0.063 |
| 13 | SWAPT3_050 | Swap 3 | — | 4346.5 | 291.6 | 0.500 |
| 14 | SWAPT3_075 | Swap 3 | — | 4782.5 | 295.1 | 0.392 |
| 15 | SWAPT3_100 | Swap 3 | — | 4232.0 | 296.4 | 0.240 |
| 16 | SWAPT4_001 | Swap 4 | — | 4055.0 | 294.8 | 0.293 |
| 17 | SWAPT4_025 | Swap 4 | — | 4286.0 | 295.9 | 0.217 |
| 18 | SWAPT4_050 | Swap 4 | — | 4691.5 | 316.1 | 0.314 |
| 19 | SWAPT4_075 | Swap 4 | — | 4508.0 | 303.3 | 0.341 |
| 20 | SWAPT4_100 | Swap 4 | — | 4109.5 | 284.2 | 0.412 |
Fig.2 Confusion matrix summarizing the results of the model evaluation: 100 randomly selected actual images from each nitrogen deficiency category (swap1, swap2, swap3, swap4) evaluated against the model-predicted category. While swap4 indicates one extreme of the N-deficiency gradient ('sufficient nitrogen'), swap1 represents 'severe nitrogen deficiency'.
For evaluating the AI model, the confusion matrix shown in Figure 2 has been utilized. The first evaluation metric calculated is accuracy:
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (1)
In Equation (1), TP, TN, FP, and FN denote the counts of 'True Positive', 'True Negative', 'False Positive', and 'False Negative' cases respectively. The net accuracy of the model is calculated using Equation (1); the AI model has achieved an overall accuracy of 0.99 on a benchmark of 400 images. The second evaluation metric is precision, which is calculated using Equation (2):
Precision = TP / (TP + FP)    (2)
True cases are scenarios where the observed category matches the predicted one, and false cases are scenarios where it does not; positive denotes that an image is predicted to belong to the class in question, whereas negative denotes that it is predicted not to belong to it. The AI model scored the highest precision for swap 1 (1.00) and swap 3 (1.00), followed closely by swap 4 (0.99) and swap 2 (0.97). Although this metric accounts for 'False Positives', it fails to account for 'False Negative' scenarios.
The next evaluation metric considered is 'recall', which does account for 'False Negative' scenarios. Recall is calculated using Equation (3):
Recall = TP / (TP + FN)    (3)
The recall of the AI model is 1.00 for swap 1, swap 2, and swap 3, while it scored 0.04 lower (0.96) for the swap 4 class. This metric, too, has a major drawback: it fails to account for 'False Positives'.
To judge whether the model's performance is good, both measures, recall and precision, are needed. Therefore, the F1 score has been calculated:
F1 = 2 × (Precision × Recall) / (Precision + Recall)    (4)
An ideal situation is when the F1 score is 1. For this AI model, the F1 score is ideal for swap 1 and swap 3, followed closely by swap 2 (0.99) and swap 4 (0.98). Table 3 summarizes the model performance evaluated with these metrics (Equations 2, 3, and 4).
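All four metrics can be read off the confusion matrix directly. The sketch below computes Equations (1)-(4) with NumPy; the matrix entries are placeholders for illustration, not the study's actual confusion matrix.

```python
# Hedged sketch: per-class precision, recall, and F1 from a 4x4
# confusion matrix (rows = actual swap1..swap4, columns = predicted).
import numpy as np

cm = np.array([[100, 0, 0, 0],     # placeholder values, not the real matrix
               [0, 100, 0, 0],
               [0, 0, 100, 0],
               [0, 3, 0, 97]])

tp = np.diag(cm)                   # correctly predicted per class
fp = cm.sum(axis=0) - tp           # predicted as the class but wrong
fn = cm.sum(axis=1) - tp           # actually the class but missed

accuracy = tp.sum() / cm.sum()                       # Equation (1)
precision = tp / (tp + fp)                           # Equation (2)
recall = tp / (tp + fn)                              # Equation (3)
f1 = 2 * precision * recall / (precision + recall)   # Equation (4)
print(accuracy, precision, recall, f1)
```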
Fig.3 Snapshots of the "PaddyVision" Android mobile app. The app accepts a paddy leaf image captured by the mobile camera as input and predicts its nitrogen deficiency/sufficiency category (i.e., swap1, swap2, swap3, or swap4).
(PaddyVision app download link: https://play.google.com/store/apps/details?id=com.paddyvision.paddyvisionnpkv3
PaddyVision AI code repository: https://github.com/AmartyaChakraborty/PaddyVisionAI)
CONCLUSIONS
The use of Convolutional Neural Networks (CNNs) for rice crop monitoring represents a significant advancement in agriculture. It offers a practical solution to a critical issue: the deficiency of nitrogen in rice cropping systems. Traditional monitoring methods, such as soil analysis and leaf tissue examination, have several limitations, including invasiveness, lack of real-time data, and inaccessibility. The proposed CNN-based approach, exemplified by the PaddyVision mobile app, addresses these concerns by allowing farmers to monitor their rice crops' health and nutrient status non-invasively and efficiently. By analyzing visual characteristics such as leaf color, size, and shape, the CNN system can provide valuable insights. This democratizes advanced monitoring technology, making it accessible even to small-scale farmers in remote areas. Moreover, by integrating additional data sources, such as weather and soil quality information, this technology can offer a holistic view of crop health. Collaboration with agricultural experts and local organizations is essential for tailoring the app to specific needs and ensuring its effectiveness. In conclusion, the CNN-based rice crop monitoring approach is poised to improve agricultural practices by enhancing yield quality, reducing costs, and promoting sustainable practices.
ACKNOWLEDGEMENT
I thank all my teachers and classmates at the KV School Gangtok for their support while completing computational work in the computer lab. I highly appreciate Mr. Digbijay Mahto for assisting me with the manuscript preparation and pointing out some important corrections. I also thank our school principal for extending all kinds of support for executing such an exciting school project.
REFERENCES
[1] "Rice Outlook: May 2023." United States Department of Agriculture (USDA), Economic Research Service. Accessed October 13, 2023. https://www.ers.usda.gov/webdocs/outlooks/106554/rcs-23d.pdf?v=2560.5
[2] Good, A. G., and P. R. Sheard. "Nitrogen Loss from Agricultural Systems: Effects on the Environment." Agricultural and Environmental Science 5, no. 2 (2000): 231–56.
[3] Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. Cambridge, MA: MIT Press, 2016.
[4] Wang, Z., Y. Wang, W. Li, L. Gao, and X. Yang. "Leaf nutrient deficiency detection of rice plant based on convolutional neural network." Sensors 19, no. 10 (2019): 2389.
[5] Lobell, David B., Wolfram Schlenker, and Justin Costa-Roberts. "Climate change and crop yields: Are agricultural impacts emerging earlier in the 21st century?" Proceedings of the National Academy of Sciences 104, no. 32 (2007): 12369–74.
[6] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep Residual Learning for Image Recognition." arXiv preprint arXiv:1512.03385 (2015).
[7] Indian Council of Agricultural Research – National Rice Research Institute. "Leaf Colour Chart for Nitrogen Management in Rice." Cuttack, Odisha, India: ICAR-NRRI, 2019.