Study of Scalable Deep Neural Network for Wildlife Animal Recognition and Identification

Chapter One

Aim of the research

This thesis aims to provide a scalable, generic, and optimized network capable of processing large datasets in real time, even when the images are of imperfect quality or exhibit varied deformations, while preserving high test accuracy.

Objectives of the research

Given the differing views on what learning is and how it is attained, it is challenging to interpret deep learning and even to set out clear objectives. Although the concept of learning itself is now reasonably well understood, approaches to deep learning still differ from one practitioner to another. The objectives of this research are as follows:

Develop an artificial learning system capable of being adaptive and self-improving.
Develop a neural network with optimized parameters whose computational performance is unaffected by
Develop a neural network system architecture with reduced complexity for large-scale image classification or prediction.

Chapter Two

Literature review

Introduction

In this section, we present the theories that support the research concept. We begin with a brief introduction to deep learning and to learning in general, link it to the classification problem, and then give a brief account of the different digital image classification approaches, ranging from per-pixel classification to object-oriented classification. The two best classification approaches are then examined, followed by a brief account of similar work.

Basic Concept and Terminology

Machine learning is a branch of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. It explores the construction and study of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from example inputs to make data-driven predictions or decisions rather than following procedural program instructions. Machine learning often overlaps with computational statistics, a discipline that also specializes in prediction-making. It has strong ties to mathematical optimization, which delivers methods, theory, and application domains to the field. Machine learning is employed in a range of computing tasks where designing and programming explicit algorithms is infeasible. Example applications include spam filtering, optical character recognition (OCR), search engines, and computer vision. Machine learning is sometimes conflated with data mining, although the latter focuses more on exploratory data analysis. Machine learning and pattern recognition “can be viewed as two facets of the same field” (Machine Learning Wikipedia full guide, 2017).

Digital image classification

Image classification is the process of assigning all pixels in an image to particular classes or themes based on the spectral information represented by the digital numbers (DNs). The classified image comprises a mosaic of pixels, each of which belongs to a particular theme, and is a thematic map of the original image (Anupam Anand, 2018). The main steps of image classification, as shown in figure 2.2, may include image pre-processing, feature extraction, selection of training samples, selection of a suitable classification approach, post-classification processing, and accuracy assessment (黄正华, 2014). Classification is executed on the basis of spectral or spectrally defined features, such as density and texture, in the feature space. In other words, classification divides the feature space into several classes based on a decision rule (黄正华, 2014). There are two basic approaches to image classification: per-pixel classification and object-oriented classification. Per-pixel classification is the most commonly adopted method; the algorithm categorizes each input pixel into a spectral feature class based solely on its multispectral vector, with no context or neighborhood evaluation involved (Shrivastav & Singh, 2019). In object-oriented classification, by contrast, the input pixels are grouped into spectral features (object features) using image segmentation. These objects are characterized in both the raster and vector domains and are classified using both spectral and spatial cues (Shrivastav & Singh, 2019).
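As an illustration of the per-pixel approach, the following minimal Python sketch assigns each pixel to the class whose mean spectral vector is nearest, using only the pixel's own band values and no neighborhood information. The minimum-distance decision rule, the class names, and the band values are assumptions chosen for illustration; they are not taken from the cited works.

```python
import math


def classify_pixel(pixel, class_means):
    """Return the label whose mean spectral vector is closest to `pixel`."""
    return min(class_means, key=lambda label: math.dist(pixel, class_means[label]))


def classify_image(image, class_means):
    """Classify every pixel independently; no context or neighbors are used."""
    return [[classify_pixel(px, class_means) for px in row] for row in image]


# Hypothetical class means for a 3-band image (illustrative digital numbers).
CLASS_MEANS = {
    "water": (20.0, 30.0, 10.0),
    "vegetation": (40.0, 90.0, 150.0),
    "soil": (120.0, 110.0, 100.0),
}

# A tiny 2 x 2 image; each pixel is a multispectral vector of digital numbers.
IMAGE = [
    [(22, 28, 12), (45, 85, 140)],
    [(118, 112, 95), (25, 35, 15)],
]

# The result is a thematic map: one class label per input pixel.
THEMATIC_MAP = classify_image(IMAGE, CLASS_MEANS)
for row in THEMATIC_MAP:
    print(row)
```

The decision rule here partitions the feature space into regions around each class mean. An object-oriented classifier would instead segment the image first and classify whole segments.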

Chapter Three

Design and Methodology

Introduction

This chapter introduces the methodology used, the algorithm adopted, and the framework proposed to achieve the objectives. The algorithm was selected from a large list of available ones for several important reasons. The main one is that it has been used by many machine learning researchers and has contributed immensely to the field of computer vision, which has made it popular in the machine learning community. Another reason is that recognition using the algorithm is robust to distortions, such as changes in shape due to the camera lens and different lighting conditions.

Concept of Classification Technique

A classification technique is a systematic approach to building classification models from an input dataset. Examples include decision tree classifiers, neural networks, support vector machines, and naïve Bayes classifiers. Each technique employs a learning algorithm to identify a model that best fits the relationship between the attribute set and the class label of the input data. The model generated by a learning algorithm should both fit the input data well and correctly predict the class labels of records it has never seen before (Pang-Ning Tan et al.). The general approach to solving a classification problem first requires a training set consisting of records whose class labels are known. The classification model built from the training set is then applied to a test set, which consists of records whose class labels are unknown.
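The train-then-test workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch: the nearest-centroid learner stands in for any of the techniques listed, and the records, attribute values, and class labels are hypothetical.

```python
from collections import defaultdict


def fit_centroids(records, labels):
    """Learn one mean attribute vector (centroid) per class from labeled records."""
    grouped = defaultdict(list)
    for record, label in zip(records, labels):
        grouped[label].append(record)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, record):
    """Assign the class whose centroid has the smallest squared distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(record, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def accuracy(model, records, labels):
    """Fraction of records whose predicted label matches the true label."""
    hits = sum(predict(model, r) == y for r, y in zip(records, labels))
    return hits / len(records)


# Hypothetical training set: attribute vectors with known class labels.
TRAIN_X = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
TRAIN_Y = ["deer", "deer", "deer", "boar", "boar", "boar"]

# Hypothetical test set: labels are hidden from the model and used only for scoring.
TEST_X = [(0.9, 1.1), (4.2, 4.1), (1.3, 0.7)]
TEST_Y = ["deer", "boar", "deer"]

MODEL = fit_centroids(TRAIN_X, TRAIN_Y)
TEST_ACCURACY = accuracy(MODEL, TEST_X, TEST_Y)
print(f"test accuracy: {TEST_ACCURACY:.2f}")
```

Keeping the test records out of `fit_centroids` is what lets the test accuracy estimate how well the model predicts records it has never seen.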

Design and requirement phase

There are two ways of solving AI (artificial intelligence) related problems, namely hardware and software methods. Preference is usually given to software design because it is relatively cheap and involves less complexity in its architecture and requirements. Training and learning operations take place in the software design stage. Hardware, such as general-purpose processors and FPGAs, serves as an implementation platform for even more complex architectures. The proposed model was implemented in the Python programming language, using the Spyder environment and Keras with TensorFlow as the backend. TensorFlow is an open-source deep learning framework created by Google that gives developers granular control over each neuron so that weights can be adjusted to achieve optimal performance. Because the task is not computationally intensive, given the small number of images in the dataset, a GPU was not used for training. Instead, training ran on a Core i5 CPU clocked at 2.60 GHz, in a machine with a dedicated high-end graphics card, and processed an average of 94 samples per second.

Chapter Four

Results and discussions

Introduction

This chapter presents the results of our proposed design and implementation. Firstly, we examine the overall model accuracy and compare different architectural designs and different types of images. Secondly, we present the generated loss and accuracy graphs, together with a summary of the designed network, its number of parameters, and the computational time during training. Finally, we draw a conclusion based on the outcomes of our different design models.
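The parameter count reported in a network summary follows directly from the layer shapes. The short Python sketch below shows how such a count can be derived by hand for convolutional and fully connected layers. The layer sizes are hypothetical placeholders and do not reproduce the architecture evaluated in this thesis.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Trainable parameters of a 2D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def dense_params(in_features, out_features, bias=True):
    """Trainable parameters of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


# Hypothetical stack: four 3x3 convolutions on an RGB input (3 channels).
CONV_CHANNELS = [3, 32, 64, 128, 128]
CONV_TOTAL = sum(
    conv2d_params(3, 3, c_in, c_out)
    for c_in, c_out in zip(CONV_CHANNELS, CONV_CHANNELS[1:])
)

# Hypothetical classifier head: assumes global pooling leaves 128 features
# that feed a 10-class output layer.
DENSE_TOTAL = dense_params(128, 10)

TOTAL_PARAMS = CONV_TOTAL + DENSE_TOTAL
print("convolution parameters:", CONV_TOTAL)
print("dense parameters:", DENSE_TOTAL)
print("total parameters:", TOTAL_PARAMS)
```

Pooling and activation layers add no trainable parameters, which is why only the convolution and dense layers appear in the count.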

Chapter Five

Conclusion and future work

Introduction

This chapter summarises the results presented in the thesis and draws conclusions about their importance in the context of recognition and identification.

Conclusions

In this research, we proposed a neural network software architecture and evaluated it. We used deep learning techniques to identify wildlife animals while ensuring that the resulting system is robust and generalizes across the images in our datasets. A manually created dataset, built from raw camera images and preprocessed using Python libraries, was used as input to the designed network. Many scalable network architecture designs were explored, and the two most important ones that showed good results were selected for evaluation.

During the evaluation, the proposed architecture with four convolutional layers and two output layers achieved good results in recognizing wildlife animals. Our approach could be used in both remote and urban areas to help prevent or reduce animal-vehicle collisions, animal attacks on humans, and crop destruction by animals, by detecting the presence of an animal so that a safety warning can be issued. This research could also consolidate other findings on the recognition of wildlife animals and thus help show the adaptability of deep CNNs even to small datasets in their raw form as captured by cameras. This will, in turn, lead to the formation of standard design philosophies that make image recognition algorithms more practicable for solving real-life problems.

Future work

There are several areas this research has not been able to cover because of constraints, chiefly a lack of time and of resources (e.g., large datasets and a high-end GPU). The following issues are of particular interest.

Memory complexity optimization, which could be done by converting the floating-point values in the results obtained to fixed-point values. The goal is to reduce memory complexity and yield faster processing in the network while targeting animal prediction using a convolutional neural network.
Optimizing execution time (a function of the number of MAC operations) by reducing the number of multiply-and-accumulate (MAC) operations taking place in the convolution layers. This will also reduce the computational complexity of the network and hence even the power
The accuracy of the network is a function of the amount of data it is fed. It can be improved and maintained by collecting a large number of images that are properly preprocessed or discretized, as this will enhance the accuracy of both training and
Making the network more scalable will help it achieve a good balance between latency, precision (exactness of prediction), and hardware complexity, as these are the factors that define how efficiently a neural network architecture can perform in hardware.
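Two of the directions listed above, fixed-point conversion and MAC reduction, can be made concrete with a small Python sketch. The 16-bit format with 8 fractional bits, the example weights, and the layer dimensions are all assumptions chosen for illustration; they are not results or settings from this work.

```python
def to_fixed_point(value, frac_bits=8, total_bits=16):
    """Quantize a float to a signed fixed-point integer, saturating on overflow."""
    scaled = round(value * (1 << frac_bits))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def from_fixed_point(fixed, frac_bits=8):
    """Convert a fixed-point integer back to a float for inspection."""
    return fixed / (1 << frac_bits)


def conv2d_macs(out_h, out_w, kernel_h, kernel_w, in_channels, out_channels):
    """Multiply-accumulate operations of one convolution layer (bias ignored)."""
    return out_h * out_w * kernel_h * kernel_w * in_channels * out_channels


# Hypothetical floating-point weights and their fixed-point codes.
WEIGHTS = [0.5, -0.125, 0.3, 1.0]
FIXED = [to_fixed_point(w) for w in WEIGHTS]
RESTORED = [from_fixed_point(q) for q in FIXED]
print("fixed-point codes:", FIXED)
print("restored values:", RESTORED)

# Hypothetical 3x3 layer on a 32 x 32 feature map: halving the number of
# filters halves the MAC count of that layer.
MACS_FULL = conv2d_macs(32, 32, 3, 3, 64, 64)
MACS_HALF = conv2d_macs(32, 32, 3, 3, 64, 32)
print("MACs with 64 filters:", MACS_FULL)
print("MACs with 32 filters:", MACS_HALF)
```

With 8 fractional bits, each in-range value is stored in 16 bits instead of 32 at a rounding error of at most 1/512. Whether that error is acceptable for prediction accuracy would have to be measured on the trained network.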

REFERENCES

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2007). ImageNet Classification with Deep Convolutional Neural Networks. Handbook of Approximation Algorithms and Metaheuristics, 60-1-60–16. https://doi.org/10.1201/9781420010749
Anupam Anand. (2018). Unit 13 Image Classification. (May), 41–58.
Guignard, L., & Weinberger, N. (2016). Animal identification from remote camera images. 1–4.
Jacobs, S. A., Dryden, N., Pearce, R., & Van Essen, B. (2017). Towards Scalable Parallel Training of Deep Neural Networks. (Sc 17), 1–9. https://doi.org/10.1145/3146347.3146353
Koprinkova, P., & Petrova, M. (1999). Data-scaling problems in neural-network training. Engineering Applications of Artificial Intelligence, 12(3), 281–296. https://doi.org/10.1016/S0952-1976(99)00008-1
