Cloud Computing and Big Data Technology Success Cases (A Summary of Cloud Computing and Big Data Technology)

What is big data, and what are its typical use cases?
"Big data" is a data set so large in volume and so varied in category that traditional database tools cannot capture, manage, and process its content. First, "big data" refers to large data volume (Volume): large data sets, usually around 10 TB in size. In practice, however, many enterprise users combine multiple data sets, producing PB-level data volumes. Second, it refers to the wide variety of data (Variety). Data comes from many kinds of sources, and data types and formats are increasingly rich, extending beyond the previously limited scope of structured data to include semi-structured and unstructured data. Third is data processing speed (Velocity): data can be processed in real time even when the volume is very large. The last feature is the high veracity of data (Veracity). With growing interest in new data sources such as social data, enterprise content, and transaction and application data, the limitations of traditional data sources are being broken, and enterprises increasingly need effective ways to ensure the authenticity and security of their information.
Data collection: ETL tools extract data from distributed, heterogeneous data sources (relational data, flat data files, and so on) into a temporary staging layer, where it is cleaned, transformed, and integrated, and finally load it into a data warehouse or data mart, where it becomes the basis for online analytical processing and data mining.
Data access: relational databases, NoSQL, SQL, etc.
Infrastructure: cloud storage, distributed file storage, etc.
Data processing: Natural language processing (NLP) is a discipline that studies language issues in the interaction between humans and computers. The key to processing natural language is to let the computer "understand" it, so natural language processing is also called natural language understanding (NLU) and is also known as computational linguistics. On the one hand, it is a branch of language information processing; on the other, it is one of the core topics of artificial intelligence (AI).
Statistical analysis: hypothesis testing, significance testing, difference analysis, correlation analysis, t-test, analysis of variance, chi-square analysis, partial correlation analysis, distance analysis, regression analysis, simple regression analysis, multiple regression analysis, stepwise regression, regression prediction and residual analysis, ridge regression, logistic regression analysis, curve estimation, factor analysis, cluster analysis, principal component analysis, fast clustering and other clustering methods, discriminant analysis, correspondence analysis, multivariate correspondence analysis (optimal scale analysis), bootstrapping, etc.
Data mining: classification, estimation, prediction, affinity grouping (association rules), clustering, description and visualization, and mining of complex data types (text, Web, graphics and images, video, audio, etc.).
Model prediction: predictive models, machine learning, modeling and simulation.
Result presentation: cloud computing, tag cloud, relationship diagram, etc.
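The data collection and data access steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV content, the table name orders, and the function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse or data mart.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source; a real pipeline would read files or database exports.
RAW_CSV = """order_id,region,amount
1,north,120.50
2,south,
3,north,80.00
4, South ,45.25
"""

def extract(text):
    """Extract: read rows from a flat data file."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop incomplete records and normalize field values."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # cleaning step: skip rows with a missing amount
        region = row["region"].strip().lower()
        cleaned.append((int(row["order_id"]), region, float(amount)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

def run_pipeline():
    """Run extract, transform, load, then a simple aggregate query."""
    conn = sqlite3.connect(":memory:")  # stand-in for a data mart
    load(transform(extract(RAW_CSV)), conn)
    query = (
        "SELECT region, SUM(amount) FROM orders "
        "GROUP BY region ORDER BY region"
    )
    return conn.execute(query).fetchall()

if __name__ == "__main__":
    print(run_pipeline())
```

The final aggregate query is the kind of per-region summary that online analytical processing builds on; the row with the missing amount is dropped during cleaning rather than loaded.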
To understand the concept of big data, start with "big", which refers to the scale of the data. Big data generally refers to data volumes above 10 TB (1 TB = 1024 GB). Big data differs from the massive data of the past; its basic characteristics can be summarized by four Vs (Volume, Variety, Value, and Velocity): large volume, diversity, low value density, and high speed.
First, the volume of data is huge, growing from the TB level to the PB level.
Second, there are many types of data, such as web logs, videos, pictures, and geographical location information.
Third, the value density is low. Taking video as an example, in continuous, uninterrupted surveillance footage the potentially useful data may last only a second or two.
Fourth, processing is fast, following the "1-second rule". This last point is also a fundamental difference from traditional data mining technology. The Internet of Things, cloud computing, the mobile Internet, the Internet of Vehicles, mobile phones, tablets, PCs, and sensors spread across every corner of the earth are all data sources or carriers.
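The scale and value-density figures above can be made concrete with a little arithmetic. The sketch below uses the 1 TB = 1024 GB convention from the text and assumes the same factor of 1024 between TB and PB; the 24-hour footage length in the usage example is an illustrative assumption, not a figure from the article.

```python
GB_PER_TB = 1024  # from the text: 1 TB = 1024 GB
TB_PER_PB = 1024  # assumed: same binary convention for PB

def tb_to_gb(terabytes):
    """Convert terabytes to gigabytes."""
    return terabytes * GB_PER_TB

def tb_to_pb(terabytes):
    """Convert terabytes to petabytes."""
    return terabytes / TB_PER_PB

def value_density(useful_seconds, total_seconds):
    """Fraction of a recording that carries useful information."""
    return useful_seconds / total_seconds

if __name__ == "__main__":
    print(tb_to_gb(10))    # the 10 TB threshold expressed in GB
    print(tb_to_pb(2048))  # 2048 TB expressed in PB
    # Assumed example: 2 useful seconds in 24 hours of footage
    print(value_density(2, 24 * 60 * 60))
```

Under that assumption, two useful seconds in a day of video is a value density of roughly two parts in a hundred thousand, which is why filtering matters more than raw storage.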
Big data technology refers to technology that quickly obtains valuable information from various types of huge amounts of data. The core of solving big data problems is big data technology. The current term "big data" refers not only to the scale of the data itself, but also to the tools, platforms and data analysis systems that collect the data. The purpose of big data research and development is to develop big data technology and apply it to related fields, and promote its breakthrough development by solving huge data processing problems. Therefore, the challenges brought by the big data era are not only reflected in how to handle huge amounts of data
How to combine cloud computing, the Internet of Things and big data
The Internet of Things, cloud computing, and big data have been hot topics in technology and industry circles over the past two years. How do they differ? How are they related? Many people are curious. After some study and reading, here is a simple summary to share.
Internet of Things
Put simply, the Internet of Things is an Internet in which things are connected to one another. The Internet of Things, also known internationally as the sensor network, is another wave of the information industry after computers, the Internet, and mobile communication networks. Everything in the world, from watches and keys to cars and buildings, can be made intelligent by embedding a micro-sensor chip, so that the object can "speak" on its own. With the help of wireless network technology, people can "talk" to objects, and objects can also "communicate" with each other. This is the Internet of Things. As information technology develops, the range of Internet of Things applications continues to grow.
The current Internet of Things industry is composed of five layers: application layer, support layer, perception layer, platform layer and transmission layer.
Cloud Computing
Cloud computing is a pay-per-use model that provides available, convenient, on-demand network access to a configurable shared pool of computing resources (including networks, servers, storage, application software, and services). These resources can be provisioned quickly, with little management effort and minimal interaction with the service provider.
Classic application case: Apple iCloud
Apple iCloud is more than a cloud drive. It lets you easily access your content on all your Apple devices and automatically synchronizes files, pictures, music, calendars, email, contacts, and more. More considerately, after you modify a file, it automatically synchronizes the changes to all your Apple devices and backs up the old version. You can use the free 5 GB of storage space, or purchase the iTunes Match service for $24.99 per year so that you can listen to music stored on Apple's cloud servers from any Apple device.
Big Data
Big data is equivalent to the massive amount of knowledge that the human brain memorizes and stores from elementary school to university. This knowledge can only create greater value through digestion, absorption, and reconstruction.
The definition given by the McKinsey Global Institute is: a data collection so large that its acquisition, storage, management, and analysis greatly exceed the capabilities of traditional database software tools. It has four major characteristics: massive data scale, rapid data circulation, diverse data types, and low value density. The strategic significance of big data technology lies not in holding huge amounts of data, but in the professional processing of the meaningful data. In other words, if big data is compared to an industry, the key to making that industry profitable is to improve the "processing capability" applied to data and to achieve "added value" through "processing".
The relationship between the Internet of Things and cloud computing
Cloud computing is equivalent to the human brain and the nerve center of the Internet of Things. Cloud computing is an Internet-based model for the addition, use, and delivery of related services that typically involves the provision of dynamically scalable and often virtualized resources over the Internet.
The relationship between big data and cloud computing
The relationship between big data and cloud computing is as inseparable as two sides of the same coin. Big data cannot be processed by a single computer and must use a distributed architecture. Its hallmark is distributed data mining over massive data, which in turn must rely on cloud computing's distributed processing, distributed databases, cloud storage, and virtualization technology.
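The idea of splitting work across machines and merging the partial results can be sketched as a small map-and-reduce word count. This is a toy under stated assumptions: threads stand in for cluster nodes, the partitions are hypothetical sample data, and the function names are invented for illustration rather than taken from any particular framework.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical data, already split into partitions as it would be across nodes.
PARTITIONS = [
    ["big data needs cloud storage", "cloud computing scales"],
    ["big data mining", "cloud storage"],
]

def map_partition(lines):
    """Map step: count words within a single partition."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts

def merge_counts(left, right):
    """Reduce step: combine two partial results."""
    return left + right

def distributed_word_count(partitions, workers=2):
    """Process partitions in parallel, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(map_partition, partitions))
    return reduce(merge_counts, partial_counts, Counter())

if __name__ == "__main__":
    totals = distributed_word_count(PARTITIONS)
    print(totals.most_common(3))
```

Each worker only ever sees its own partition, so the same structure carries over when the partitions live on different machines and the merge happens over the network.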
The relationship between big data, cloud computing and the Internet of Things
The Internet of Things corresponds to the sensory and motor nervous system of the Internet. Cloud computing is the collection of the Internet's core hardware and core software layers, and is also the germ of the Internet's central nervous system. Big data represents the information layer (the data ocean) of the Internet and is the basis for the emergence of Internet wisdom and consciousness. The Internet of Things, the traditional Internet, and the mobile Internet all continuously feed data into the Internet's big data layer. Cloud computing and the Internet of Things promote the development of big data.
What applications does the combination of big data and cloud computing have in our lives?

Applications of cloud computing:
Cloud music: Before cloud music, device storage was limited, so to download a new song you often had to delete some existing ones first. Cloud music lets us listen anytime and anywhere without being limited by capacity.
Cloud storage: Most people are familiar with this. Many apps and mobile phones now come with a certain amount of cloud space where you can back up your data, so you can keep using it without worry whether you change devices or move between regions.
Online office software: You may have noticed that since the rise of cloud computing, the concept of the office has gradually blurred. Video conferencing tools such as Tencent's video conferencing and Huawei WeLink, along with collaboration software such as Kingsoft's collaborative editing, Feishu, and DingTalk, have allowed offices to overcome geographical barriers and shortened the distance between workplaces.
Applications of big data:
Finance: In the financial industry, applications fall into two main areas. Big data marketing makes targeted recommendations based on customers' consumption habits, consumption frequency, and frequent consumption locations. Risk prevention and control comprehensively assesses a user's consumption habits and transaction history to determine creditworthiness, and also applies to equity financing and similar scenarios.
Business: E-commerce data is usually huge and complex. From this data, one can analyze popular trends, consumption trends, regional characteristics and habits, and so on.
Medical: The medical device industry has a lot of medical records, pathology reports, recovery plans, drug reports, etc. In the future, with the help of data management platforms, people can collect different medical records and treatment plans, as well as patient characteristics, and create a database of disease characteristics.
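Targeted recommendations and e-commerce trend analysis of the kind described above are often built on association rules, one of the data mining methods listed earlier. The sketch below computes the two standard measures, support and confidence, over a handful of made-up shopping baskets; the transactions and function names are hypothetical and chosen only for illustration.

```python
# Hypothetical shopping baskets, one set of items per transaction.
TRANSACTIONS = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"milk", "bread", "butter"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    hits = sum(1 for basket in transactions if items <= basket)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the rule never applies in this data
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base

if __name__ == "__main__":
    print(support({"milk", "bread"}, TRANSACTIONS))
    print(confidence({"milk"}, {"bread"}, TRANSACTIONS))
```

In this sample, milk and bread appear together in three of five baskets, and about three quarters of the baskets containing milk also contain bread, which is the kind of signal a recommendation rule would be built on.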