Cloud storage application practice based on the Hadoop platform
2012/0327/
Cloud computing is a supercomputing model based on the Internet. In a remote data center, thousands of computers and servers are connected into a computer cloud. Users access the data center from desktop computers, notebooks, mobile phones, and other devices, and run computations according to their own needs. There is still no universally agreed definition of cloud computing. Combining the definitions in circulation, some essential characteristics of cloud computing can be summarized: distributed computing and storage, high scalability, user-friendliness, and good manageability.
1 Cloud storage architecture diagram
In Figure 1, the orange nodes are storage nodes (Storage Node), which store the files, and the blue nodes are control nodes (Control Node), which maintain the file index and monitor the balance of capacity and load across the storage nodes. Together these two parts form the cloud storage. Both storage nodes and control nodes are ordinary servers; the storage nodes simply have more hard disks. A storage node server does not need RAID functionality as long as Linux can be installed on it, whereas a control node needs simple RAID level 01 functionality to protect its data.
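The division of labour between the two node types can be illustrated with a minimal sketch. It assumes a very simple balancing rule (always write to the storage node with the most free space); the class names `StorageNode` and `ControlNode`, the capacities, and the file names are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical storage node: holds file contents only."""
    name: str
    capacity: int                              # total capacity in bytes
    files: dict = field(default_factory=dict)  # file name -> bytes

    @property
    def free(self) -> int:
        used = sum(len(data) for data in self.files.values())
        return self.capacity - used


class ControlNode:
    """Hypothetical control node: keeps the file index and balances capacity."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.index = {}                        # file name -> storage node name

    def put(self, filename: str, data: bytes) -> str:
        # Assumed balancing rule: choose the node with the most free space.
        target = max(self.nodes, key=lambda node: node.free)
        if target.free < len(data):
            raise IOError("no storage node has enough free capacity")
        target.files[filename] = data
        self.index[filename] = target.name
        return target.name

    def get(self, filename: str) -> bytes:
        # The index tells the control node which storage node to ask.
        node_name = self.index[filename]
        node = next(n for n in self.nodes if n.name == node_name)
        return node.files[filename]


nodes = [StorageNode("s1", 100), StorageNode("s2", 100)]
control = ControlNode(nodes)
first = control.put("a.txt", b"x" * 60)    # both nodes empty, s1 is chosen
second = control.put("b.txt", b"y" * 30)   # s2 now has more free space
```

A real control node would also track load, handle overwrites, and survive its own failure; the sketch only shows the index and the capacity check.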
Cloud storage is not meant to replace existing disk arrays; it is a new form of storage system that arose to cope with rapidly growing data volumes and bandwidth. The following three points are therefore usually considered when designing cloud storage:
(1) Is it easy to expand capacity and bandwidth?
Expansion must not require a shutdown, and the capacity of a new storage node should be incorporated automatically into the existing storage pool without complicated configuration.
Figure 1 Cloud storage architecture diagram
(2) Does bandwidth grow linearly?
Many customers who adopt cloud storage do so with future bandwidth growth in mind, so the quality of a cloud storage product's design makes a big difference. Some products reach saturation at little more than a dozen nodes, which limits future bandwidth expansion. This must be clarified in advance; otherwise, by the time the system is found not to meet the requirements, hundreds of TB may already have been purchased, and it is too late for regret.
(3) Is it easy to manage?
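The difference raised in point (2) between linear growth and early saturation can be made concrete with a toy model. The function name `aggregate_bandwidth`, the per-node figure of 100 MB/s, and the saturation point of 12 nodes are illustrative assumptions, not measurements of any product.

```python
def aggregate_bandwidth(nodes, per_node_mbps, saturation_nodes=None):
    """Toy model of total bandwidth in MB/s for a cluster of `nodes` nodes.

    With saturation_nodes=None the bandwidth grows linearly with the node
    count. Otherwise it stops growing once `saturation_nodes` is reached,
    which is the behaviour to find out about before purchasing.
    """
    if saturation_nodes is None:
        effective = nodes
    else:
        effective = min(nodes, saturation_nodes)
    return effective * per_node_mbps


# Hypothetical comparison at 40 nodes, assuming 100 MB/s per node.
linear = aggregate_bandwidth(40, 100.0)
saturating = aggregate_bandwidth(40, 100.0, saturation_nodes=12)
```

Under these assumed numbers the linear design delivers 4000 MB/s at 40 nodes, while the design that saturates at 12 nodes stays at 1200 MB/s no matter how many nodes are added.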
2 Key technologies of cloud storage
Cloud storage must have nine major elements: ① performance; ② security; ③ automatic ILM storage; ④ storage access mode; ⑤ availability; ⑥ primary data protection; ⑦ secondary data protection; ⑧ storage flexibility; ⑨ storage reports.
The development of cloud computing is inseparable from the maturing of core technologies such as virtualization, parallel computing, and distributed computing. The key technologies are introduced below:
(1) Cluster technology, grid technology and distributed file system
A cloud storage system is a collection of multiple storage devices, multiple applications, and multiple services working together. No single-point storage system is cloud storage.
Because it is composed of multiple storage devices, the devices must use cluster technology, distributed file systems, grid computing, and similar technologies to work together, so that many storage devices can present the same service to the outside world and deliver larger, stronger, and better data access performance. Without these technologies, cloud storage cannot truly be realized; the so-called cloud storage would only be a set of independent systems that cannot form a cloud-like structure.
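How several devices can present one service is sketched below under stated assumptions: a file is split into fixed-size blocks and each block is copied to more than one node, loosely in the spirit of block-replicating distributed file systems such as HDFS. The `Cluster` class, the round-robin placement, the 4-byte block size, and the node names are hypothetical simplifications and do not reproduce the placement policy of any real system.

```python
class Cluster:
    """Hypothetical cluster that stores replicated blocks across nodes."""

    def __init__(self, node_names, block_size=4, replicas=2):
        if replicas > len(node_names):
            raise ValueError("cannot place more replicas than there are nodes")
        # node name -> {(file name, block index): block bytes}
        self.stores = {name: {} for name in node_names}
        self.block_size = block_size
        self.replicas = replicas
        self.block_counts = {}                 # file name -> number of blocks

    def write(self, filename, data):
        names = list(self.stores)
        blocks = [data[i:i + self.block_size]
                  for i in range(0, len(data), self.block_size)]
        for idx, block in enumerate(blocks):
            # Assumed placement: round-robin over the nodes, one copy per replica.
            for r in range(self.replicas):
                node = names[(idx + r) % len(names)]
                self.stores[node][(filename, idx)] = block
        self.block_counts[filename] = len(blocks)

    def read(self, filename, failed=()):
        """Reassemble a file from any surviving replica of each block."""
        parts = []
        for idx in range(self.block_counts[filename]):
            for node, store in self.stores.items():
                if node not in failed and (filename, idx) in store:
                    parts.append(store[(filename, idx)])
                    break
            else:
                raise IOError(f"block {idx} of {filename} is unavailable")
        return b"".join(parts)


cluster = Cluster(["n1", "n2", "n3"])
payload = b"hello cloud storage"
cluster.write("demo.bin", payload)
restored = cluster.read("demo.bin", failed={"n2"})   # one node is down
```

The caller sees a single `read` and `write` interface, and the loss of one node does not make the file unreadable because every block has a second copy elsewhere.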
(2) CDN content distribution, P2P technology, data compression technology, data deduplication technology, data encryption technology
The CDN content distribution system and data encryption technology ensure that data in cloud storage cannot be accessed by unauthorized users. At the same time, various data backup and disaster recovery technologies ensure that data in cloud storage is not lost, which secures the safety and stability of cloud storage itself. If the security of data in cloud storage cannot be guaranteed, no one will dare to use it.
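Two of the technologies named in heading (2), data deduplication and data compression, can be sketched with the Python standard library alone. The `DedupStore` class, the fixed 8-byte chunk size, and the sample data are hypothetical; real systems typically use much larger or variable-size chunks, and encryption is deliberately left out of this sketch.

```python
import hashlib
import zlib


class DedupStore:
    """Hypothetical store that keeps each distinct chunk once, compressed."""

    def __init__(self, chunk_size=8):
        self.chunk_size = chunk_size
        self.chunks = {}   # SHA-256 hex digest -> compressed chunk bytes
        self.files = {}    # file name -> ordered list of chunk digests

    def put(self, name, data):
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks not seen before are compressed and stored.
                self.chunks[digest] = zlib.compress(chunk)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        # Rebuild the file by decompressing its chunks in order.
        return b"".join(zlib.decompress(self.chunks[d])
                        for d in self.files[name])


store = DedupStore()
store.put("a", b"ABCDEFGH" * 4)                 # four identical chunks
store.put("b", b"ABCDEFGH" * 2 + b"12345678")   # shares a chunk with "a"
distinct_chunks = len(store.chunks)
```

The two files together contain seven chunks, but only two distinct chunks are stored, and both files can still be read back in full.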
(3) Storage virtualization technology, storage network management technology
Cloud storage contains a large number of storage devices, mostly distributed across different regions. Implementing logical volume management, storage virtualization management, and multi-link redundancy management across devices from different manufacturers, of different models, and even of different types (such as FC storage and IP storage) is a major problem. If it is not solved, the storage devices become the performance bottleneck of the entire cloud storage system, the system cannot form a structural whole, and later expansion of capacity and performance becomes difficult.
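The idea of managing logical volumes across heterogeneous devices can be illustrated with a minimal sketch. It assumes the simplest possible mapping, in which the devices are concatenated end to end and a logical block number is translated to a device and an offset on that device. The `Device` and `LogicalVolume` names, the device sizes, and the concatenation layout are hypothetical; real storage virtualization layers also handle striping, redundancy, and multipath access.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical physical device behind the virtualization layer."""
    name: str
    kind: str    # for example "FC" or "IP"
    size: int    # capacity in blocks


class LogicalVolume:
    """Hypothetical volume that concatenates devices of different types."""

    def __init__(self, devices):
        self.devices = list(devices)

    @property
    def size(self):
        return sum(device.size for device in self.devices)

    def locate(self, logical_block):
        """Translate a logical block number to (device name, device block)."""
        if not 0 <= logical_block < self.size:
            raise IndexError("logical block is outside the volume")
        offset = logical_block
        for device in self.devices:
            if offset < device.size:
                return device.name, offset
            offset -= device.size
        raise AssertionError("unreachable: range was checked above")


volume = LogicalVolume([Device("fc0", "FC", 100), Device("ip0", "IP", 50)])
total_blocks = volume.size
first_ip_block = volume.locate(100)   # first block that lands on the IP device
```

The user of the volume addresses blocks 0 to 149 without knowing that the first 100 live on an FC device and the rest on an IP device.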