Title: Machine Learning Approach for Cloud Data Analytics in IoT
Author: Group of authors
Publisher: John Wiley & Sons Limited
Genre: Software
isbn: 9781119785859
The dataset under consideration contains 51,290 observations. The retail store deals broadly in three types of products: office supplies, technology, and furniture.
First of all, the authors attempt to understand the correlation among the various features of the dataset. To this end, they employ Pearson’s correlation, which measures the linear correlation between two variables. Its value lies between −1 and +1: −1 indicates a perfect negative linear correlation, 0 signifies no linear correlation, and +1 indicates a perfect positive linear correlation. The Pearson’s correlation among the various attributes of the dataset is shown in Figure 3.4.
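As a minimal sketch of how such a coefficient is computed, the following Python function evaluates Pearson’s correlation for two numeric columns. The column names (`sales`, `profit`, `discount`) and their values are made up for illustration; they are not taken from the dataset or from Figure 3.4.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (numerator) and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy columns, chosen so that the relationships are exactly linear.
sales = [100.0, 200.0, 300.0, 400.0]
profit = [10.0, 20.0, 30.0, 40.0]
discount = [0.4, 0.3, 0.2, 0.1]

r_sales_profit = pearson(sales, profit)      # increasing together: close to +1
r_sales_discount = pearson(sales, discount)  # one rises as the other falls: close to -1
```

Applying the same function to every pair of numeric attributes would yield a correlation matrix of the kind such a figure typically displays.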
Further, the authors demonstrate how this dataset can be used to understand how the retailer’s customers are distributed across the country. This helps the retailer to see where its largest market share lies and thus enables it to focus on the weaker market sections. It can be performed by a region-wise analysis, as shown in Figure 3.5. The figure shows the histogram plot for the frequency of customers across the various states in India. From Figure 3.5, it is evident that Maharashtra has the highest number of customers in the country, followed by Uttar Pradesh. In contrast, places like Manipur, Tripura, Chandigarh, and Pondicherry have the lowest number of customers.
The analysis can be drilled down further to find the best and worst performing cities in a state, so as to identify the specific region or branch exactly. Such a drilled-down histogram is shown in Figure 3.6. For Maharashtra, it shows that the top performing cities in the state are Mumbai, Pune, Thane, and Nagpur.
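The region-wise count and its drill-down can be sketched in a few lines of Python. The record layout (customer ID, state, city) and the sample rows are assumptions made for illustration only; the text does not give the dataset’s actual column names.

```python
from collections import Counter

# Hypothetical records: (customer_id, state, city).
records = [
    ("C1", "Maharashtra", "Mumbai"),
    ("C2", "Maharashtra", "Mumbai"),
    ("C3", "Maharashtra", "Pune"),
    ("C4", "Uttar Pradesh", "Lucknow"),
    ("C5", "Uttar Pradesh", "Kanpur"),
    ("C6", "Manipur", "Imphal"),
]

# Country level: frequency of customers per state (the bars of the histogram).
state_counts = Counter(state for _, state, _ in records)
top_state, _ = state_counts.most_common(1)[0]

# Drill down: frequency of customers per city within the top state.
city_counts = Counter(city for _, state, city in records if state == top_state)
ranked_cities = [city for city, _ in city_counts.most_common()]
```

The same two-step pattern (count at the coarse level, then filter and count again at the finer level) applies to any other hierarchy in the data, such as category and sub-category.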
Figure 3.4 Pearson’s correlation among various attributes of the dataset.
Figure 3.5 Histogram plot for the frequency of customers at country level (India).
Further, it is evident from the above two graphs that Mumbai has the highest number of customers. Hence, the retailer is next interested in finding the best performing product in the city, which calls for a histogram along the product dimension, as shown in Figure 3.7. It is evident that the office supplies category is the most in demand in the city. Within the office supplies category, the sub-categories with the highest demand are storage and labels, followed by art supplies and other stationery products such as envelopes, binders, and paper.
Figure 3.6 Histogram plot for the customers’ frequency at city level in Maharashtra.
Additionally, a box plot represents the minimum, maximum, and median of sales for each product category within every customer segment. The highest median for the technology category comes from the consumer segment, as shown in Figure 3.8. Similarly, the highest median for the furniture category comes from the corporate segment. The home office segment has the maximum sales in office supplies, while the highest median for office supplies comes from the consumer segment.
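The statistics that a box plot summarizes can be computed directly, as in the following sketch. The rows and sales values are invented for illustration and do not reproduce the figures in the dataset; the helper name `top_segment_by_median` is likewise hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (category, segment, sales).
rows = [
    ("Technology", "Consumer", 500.0),
    ("Technology", "Consumer", 700.0),
    ("Technology", "Consumer", 900.0),
    ("Technology", "Corporate", 300.0),
    ("Technology", "Corporate", 400.0),
    ("Furniture", "Corporate", 800.0),
    ("Furniture", "Corporate", 600.0),
    ("Furniture", "Consumer", 200.0),
]

# Group sales values by (category, segment).
groups = defaultdict(list)
for category, segment, sales in rows:
    groups[(category, segment)].append(sales)

# Minimum, median, and maximum per group: the values a box plot summarizes.
summary = {
    key: {"min": min(values), "median": median(values), "max": max(values)}
    for key, values in groups.items()
}


def top_segment_by_median(category):
    """Return the segment with the highest median sales in a category."""
    medians = {seg: s["median"] for (cat, seg), s in summary.items() if cat == category}
    return max(medians, key=medians.get)
```

Comparing medians rather than maxima, as the text does, keeps a single unusually large order from deciding which segment leads a category.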
Figure 3.7 Histogram plot for Mumbai along the product dimension.
Figure 3.8 Box plot for products across consumer segment.
In order to identify the days that record the highest and lowest sales, the authors suggest the use of a pivot table, as shown in Figure 3.9. From Figure 3.9, it is evident that the Saturdays of August from 2011 to 2015 experience the maximum sales, whereas the minimum sales are recorded on the Mondays of November from 2011 to 2015. This gives the retailer a basis for forecasting its sales.
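A pivot table of this kind amounts to grouping sales by weekday and month and summing each cell. The sketch below shows the idea with the Python standard library; the order dates and amounts are hypothetical, and the authors’ actual tool and column names are not stated in the text.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")

# Hypothetical (order_date, sales) pairs.
orders = [
    (date(2014, 8, 2), 900.0),    # a Saturday in August
    (date(2014, 8, 9), 1100.0),   # a Saturday in August
    (date(2014, 8, 4), 300.0),    # a Monday in August
    (date(2014, 11, 3), 50.0),    # a Monday in November
    (date(2014, 11, 8), 400.0),   # a Saturday in November
]

# Pivot: one cell per (weekday, month), holding the summed sales.
pivot = defaultdict(float)
for order_date, sales in orders:
    cell = (WEEKDAYS[order_date.weekday()], MONTHS[order_date.month - 1])
    pivot[cell] += sales

best_cell = max(pivot, key=pivot.get)   # cell with the highest total sales
worst_cell = min(pivot, key=pivot.get)  # cell with the lowest total sales
```

Adding the year to the cell key would separate the totals by year, which is one way to check whether a weekday and month pattern holds across the whole period.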
Finally, the heatmap in Figure 3.10 shows the sales of various countries across the globe. From Figure 3.10, it is clear that the United States records higher sales than any other country, followed by France and Australia. This analysis helps the retail industry to understand that there is huge potential for increasing sales in the Southeast Asian region and also in Oceania.
Figure 3.9 Pivot table.
Figure 3.10 Heatmap of the world.
Thus, the above case study makes it clear that data analytics can be quite helpful to the retail industry and has huge potential there, in addition to various other promising fields.
3.5 Conclusion and Future Scope
This chapter has discussed the potential and capability of ML approaches for predictive data analytics in the retail industry. Various models have also been discussed briefly. A few use cases have been presented to give readers a clear idea of the spectrum of their application in the retail industry. Although ML has seen widespread application, some challenges remain. These challenges, as discussed above, must be addressed through further research.
First and foremost, researchers must work toward maintaining the security and privacy of data, as data is the most precious asset of any organization. Work should also be done on conceptualizing the usage of big data so as to benefit both retailers and customers. Research must further be directed at efficient customized promotions, which send promotional messages for a specific product to a specific customer at a specific time. Implementation of customized promotion will further enhance the revenue