Data Deduplication for Efficient Cloud Storage and Retrieval
        
Cloud services provide seamless service to clients by increasing the geographic availability of data.
Increased availability, however, introduces a high degree of redundancy and requires a large amount of space to store the data.
Data compression techniques can reduce the space required to store that data at various sites, and they ensure that there is no
loss of availability or consistency at any site. The huge demand for cloud services and storage also drives up the investment
required. Data compression reduces this investment, along with the physical space and the number of data centers needed to store
the data. Various security protocols can be incorporated to secure the compressed files at each site. We provide a reliable
technique for storing deduplicated data and managing it securely, achieving high consistency as well as availability.
Rishikesh Misal graduated from the University of Mumbai with a bachelor's degree in Computer Engineering in 2015. He completed his Master's in Computer Science and Engineering at VIT University, Vellore. He has been working at General Electric for the past year as a Software Engineering Specialist. His professional work focuses on building cloud applications for IoT-based scenarios. His research interests include Distributed Systems, Cloud Computing, System Programming and Compiler Construction.

Boominathan Perumal is an Associate Professor at VIT University, Vellore, India. He received his B.E. in Computer Science and Engineering from Bharathidasan University, Trichy, India, his M.E. in Computer Science and Engineering from Anna University, India, and his Ph.D. from VIT University, Vellore, India. He has 12 years of teaching experience and a good number of publications in reputed conference proceedings and journals. His research interests include cloud computing, network security, and evolutionary optimization.
