COMPARING BIG DATA ANALYTIC TOOLS USING MUSIC DATASET

R. I. Bektemirov; U. T. Nurkey

Abstract

Modern information systems and digital technologies, such as scientific data analysis, social media data mining, recommendation systems, and web service log analysis, generate petabytes of data every day. This data has great potential to guide knowledge discovery, but big data requires new approaches and tools to handle it. Extracting knowledge for decision making from such massive data takes considerable effort, and the sheer volume and unstructured nature of the data raise new challenges for its management and processing. This paper covers some of the most popular tools for analyzing big data. Hadoop, Spark, and Pig are major modern tools in big data analytics, so they were chosen for comparison. The results show that different tasks require different tools and that there is no all-in-one solution. For any big data problem, developers need to choose the proper tool to get the job done better and faster.

Keywords

big data, Hadoop, Spark, Pig, comparison of big data platforms

About the Authors

R. I. Bektemirov

Suleyman Demirel University
Kazakhstan

U. T. Nurkey

Suleyman Demirel University
Kazakhstan

For citations:

Bektemirov R.I., Nurkey U.T. COMPARING BIG DATA ANALYTIC TOOLS USING MUSIC DATASET. Herald of the Kazakh-British Technical University. 2019;16(4):97-104.

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 1998-6688 (Print)
ISSN 2959-8109 (Online)
