1、 What is a CPU? What is a GPU?
Before clarifying the difference between GPU servers and CPU servers, let's first recall what a CPU and a GPU are.
1. The CPU (central processing unit) is the operational and control core of the entire computer system and the final execution unit for information processing and program execution. It is the core and most fundamental component of all data processing.
2. The GPU (graphics processing unit), also known as the display core, visual processor, or display chip, is a microprocessor specialized in image- and graphics-related operations on personal computers, workstations, game consoles, and some mobile devices (such as tablets and smartphones).
However, these literal definitions alone do not convey the important roles that the GPU and CPU play in data computing.
Differences between GPU and CPU:
A comparison of GPU and CPU architectures shows that the CPU has relatively few arithmetic logic units and devotes a large proportion of its die to control logic, while the GPU has many small arithmetic logic units, simple control logic, and less cache. The GPU's many logic units are arranged in a matrix and can process a large number of simple tasks in parallel; image processing can be decomposed in exactly this way. A single GPU computing unit is weaker than a CPU core, but a large number of units can work simultaneously, so for high-intensity parallel computing the GPU outperforms the CPU.
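The decomposition described above can be sketched in a few lines: in a typical image operation, each output pixel depends only on its own input pixel, so thousands of simple GPU lanes could each handle one pixel independently. The 4x4 "image" and the `brighten()` kernel below are illustrative assumptions, not real GPU code.

```python
def brighten(pixel, amount):
    """Per-pixel kernel: a simple, repetitive clamp-and-add operation."""
    return min(pixel + amount, 255)

image = [[10, 250, 30, 40],
         [50, 60, 70, 80],
         [90, 100, 110, 120],
         [130, 140, 150, 160]]

# On a CPU this list comprehension runs serially; a GPU would launch one
# thread per pixel, since no pixel's result depends on any other pixel.
result = [[brighten(p, 20) for p in row] for row in image]
print(result[0])  # [30, 255, 50, 60]
```

Because every per-pixel task is independent, the work scales across however many computing units the hardware provides.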
In short, the CPU excels at complex operations such as scheduling and command, while the GPU excels at simple, repetitive operations on large volumes of data. The CPU is like a professor handling complex mental work, while the GPU is like a large crew of manual workers handling massive parallel computation.
Deep learning is a mathematical network model built to simulate the human brain's nervous system. Its defining characteristic is that it requires big data for training, so the key demand it places on the processor is a large amount of parallel, repetitive computation, which is exactly the GPU's specialty. This is an important reason why GPU servers are now in such demand.
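The "parallel, repetitive computation" at the heart of deep learning can be made concrete with a minimal sketch: one neural-network layer is essentially a matrix-vector multiply, i.e. many independent rows of multiply-add operations that a GPU can compute simultaneously. The weights and input below are made-up illustrative values.

```python
def layer(weights, x):
    """Each output neuron is dot(row, x); every row is independent work
    that a GPU could assign to its own group of computing units."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

weights = [[0.5, -1.0, 2.0],
           [1.0, 0.0, -0.5]]
x = [2.0, 1.0, 3.0]

print(layer(weights, x))  # [6.0, 0.5]
```

Training repeats this pattern billions of times over large batches of data, which is why hardware built for massed multiply-add throughput fits the workload so well.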
2、 Differences between CPU Server and GPU Server
Strictly speaking, the distinction between "CPU server" and "GPU server" is not scientific. A server without a GPU can still compute and be used, but a server without a CPU cannot work at all. Simply put, the two terms merely emphasize different aspects of the same server.
3、 GPU server
A GPU server is a fast, stable, and flexible GPU-based computing service, applied in scenarios such as video encoding and decoding, deep learning, and scientific computing. It is managed in the same way as a standard cloud server. Its excellent graphics processing and high-performance computing capabilities deliver extreme computing performance, effectively relieve computing pressure, and improve a product's computing efficiency and competitiveness.
4、 How to select a GPU server: principles for selection
First of all, understand that GPUs are divided into three main interface types currently available on the market: the traditional bus interface, the PCIe interface, and the NVLink interface.
A typical NVLink-interface GPU is the NVIDIA V100, which uses the SXM2 interface; the DGX-2 uses the SXM3 interface. GPU servers built on the NVLink bus standard fall into two categories: the DGX supercomputers designed by NVIDIA itself, and servers with NVLink interfaces designed by partners. DGX supercomputers provide not only hardware but also related software and services.
GPUs with the traditional bus interface are currently the mainstream products, such as the PCIe versions of the V100, the P40 (P refers to the previous-generation Pascal architecture), the P4, and the latest Turing-architecture T4. The slim P4 and T4, which occupy only a single slot, are usually used for inference, and mature models now exist for inference and recognition.
Traditional PCIe-bus GPU servers can also be divided into two categories: OEM servers, from brands such as Dawning (Sugon), Inspur, and Huawei; and non-OEM servers, of which there are many kinds. Beyond this classification, performance indicators such as precision, GPU memory type, GPU memory capacity, and power consumption should also be considered when selecting a server. Some deployments also require special servers because of water cooling, noise reduction, or particular requirements for temperature, mobility, and so on.
When selecting a GPU server, first consider the business requirements in order to choose the appropriate GPU model. In HPC (high-performance computing), you also need to choose according to precision: for example, some workloads require double precision, in which case the P40 or P4 is not appropriate and only the V100 or P100 will do. There may also be requirements on GPU memory capacity; for example, computing applications in oil and petrochemical exploration demand large GPU memory. There are also requirements on the bus standard. In short, the choice of GPU model depends on the business requirements.
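The precision point above can be illustrated with a small sketch using only the standard library: rounding a value through single precision (FP32) shows the digits lost relative to double precision (FP64), which is why double-precision HPC workloads need FP64-capable cards. This is a generic numerical demonstration, not vendor-specific behavior.

```python
import struct

def to_float32(x):
    """Round a Python float (FP64) down to FP32 and back, exposing the
    precision lost when a workload runs in single precision."""
    return struct.unpack('f', struct.pack('f', x))[0]

value = 0.1
print(repr(value))              # full double-precision value
print(repr(to_float32(value)))  # the same value after FP32 rounding
```

If the rounding error shown here would accumulate unacceptably over a simulation, the workload needs double precision, and that constraint drives the GPU model choice.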
GPU servers are also widely used in the field of artificial intelligence. In teaching scenarios, the requirements for GPU virtualization are relatively high: depending on class size, a teacher may need to create 30 or even 60 virtual GPUs from one GPU server, so batch training places high demands on the GPU. The V100 is generally used for training. After a model is trained, inference is needed, for which the P4 or T4 is generally used, with the V100 in a few cases.
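A back-of-the-envelope sketch of the teaching scenario above: splitting one card's GPU memory among equal-sized virtual GPUs for a class. The 32 GB V100 and the 1 GB profile size are assumptions for illustration; in practice, vGPU profile sizes are fixed by the virtualization software rather than chosen freely.

```python
def vgpus_per_card(card_memory_gb, profile_gb):
    """How many equal-sized virtual GPUs fit on one physical card,
    assuming GPU memory is the only partitioned resource."""
    return card_memory_gb // profile_gb

card = 32  # e.g. a 32 GB V100 (assumed configuration)
print(vgpus_per_card(card, 1))  # 32 students per card with 1 GB profiles
# A 60-student class with 1 GB profiles would therefore need two such cards.
```

This kind of arithmetic is why class size feeds directly into how many GPU servers a teaching deployment needs.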
Once the GPU model is chosen, consider which GPU server to use. At this point, the following situations need to be considered:
First, for edge servers, select the corresponding T4 or P4 servers according to quantity, and also consider the usage scenario, such as checkpoints at railway stations, airports, or public security checkpoints. The central side may require V100 servers for training, and throughput, usage scenarios, quantity, and so on all need to be considered.
Second, consider the customer's own users and IT operation and maintenance capabilities. Large companies such as BAT have strong operational capabilities and will choose generic PCIe servers. Customers with weaker IT operation and maintenance capabilities pay more attention to the data itself and to data annotation; we call them data scientists, and their criteria for selecting GPU servers will differ.
Third, we need to consider the value of supporting software and services.
Fourth, consider the maturity and engineering efficiency of the overall GPU cluster system. Integrated GPU supercomputers like the DGX ship with a very mature software stack, from the underlying operating system through drivers and Docker images, all fixed and optimized, so their efficiency is relatively high.