An Efficient Clustering Technique for Big Data Mining

Satish S. Banait; Dr. S. S. SANE

doi:10.47164/ijngc.v13i3.842

Published Oct 31, 2022

Volume 13, Special Issue 3, October 2022

Satish S. Banait

Department of Computer Engineering, KKWIEER, Nashik, SPPU Pune

Dr. S. S. SANE

Abstract

Data mining and big data analytics are approaches for analyzing data and extracting hidden information. Because big data is complex and large in volume, traditional analysis and extraction techniques do not work effectively on it. Data clustering is a common data mining approach that divides data into groups, making it easier to extract information from them. Big data can include both structured and semi-structured information, and it is becoming increasingly valuable to companies. Examples include traditional structured databases of inventory levels, transactions, and customer information, as well as unstructured content from the internet, social media platforms, and embedded systems. Numerous schemes have been developed to achieve the required efficiency and effectiveness, and much research has been devoted to big data analytics. Nevertheless, some methodologies, such as clustering algorithms, require further research with regard to performance, usefulness, and other factors. This motivates the development of a model that supports proper assessment of big data analytics and the effective use of clustering to retrieve relevant knowledge. In our proposed work, we recorded and analyzed several big data sets and identified relevant existing approaches. In this paper, we propose a new clustering technique that uses a dimensionality reduction approach. For the implementation, we used real-time streaming data that is unstructured and sometimes noisy. The proposed hybrid clustering technique improves clustering accuracy as well as the time needed to generate effective clusters on large unstructured data. We confirm the findings by testing the proposed methodology on available data sets and by comparing the effectiveness of the developed system with that of existing systems.
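The abstract does not say which dimensionality reduction method or which clustering algorithms make up the hybrid technique. As a rough illustration of the general idea only (reduce the dimensionality first, then cluster in the smaller space), the following minimal Python sketch uses Gaussian random projection and Lloyd's k-means as stand-ins. Both choices, the synthetic toy data, and every function and parameter name here are assumptions made for illustration, not the authors' method.

```python
import math
import random


def random_projection(points, target_dim, seed=0):
    """Project each point onto target_dim random Gaussian directions.

    Stand-in for the (unspecified) dimensionality reduction step.
    """
    rng = random.Random(seed)
    source_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    # One random direction (row) per output dimension.
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
              for _ in range(target_dim)]
    return [[sum(w * x for w, x in zip(row, p)) for row in matrix]
            for p in points]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids).

    Stand-in for the (unspecified) clustering step.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def make_toy_data(n_per_cluster=30, dim=50, seed=1):
    """Hypothetical synthetic data: two noisy groups in `dim` dimensions."""
    rng = random.Random(seed)
    data = []
    for centre in (0.0, 10.0):
        for _ in range(n_per_cluster):
            data.append([centre + rng.gauss(0.0, 1.0) for _ in range(dim)])
    return data


data = make_toy_data()                            # 60 points, 50 dimensions
reduced = random_projection(data, target_dim=5)   # 60 points, 5 dimensions
labels, centroids = kmeans(reduced, k=2)          # cluster in the reduced space
```

Clustering in the reduced space means each distance computation touches 5 coordinates instead of 50, which is where a time saving of the kind the abstract mentions could come from. Whether accuracy is preserved depends on the data and on the reduction method actually used.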

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Banait, S. S., & SANE, S. S. (2022). An Efficient Clustering Technique for Big Data Mining. International Journal of Next-Generation Computing, 13(3). https://doi.org/10.47164/ijngc.v13i3.842
