Distribution-based task scheduling method and system

A distributed task scheduling technique, applied in the field of distribution-based task scheduling methods and systems, that addresses high power consumption, inconvenience, and rack space rental costs, and achieves controllable energy consumption and reduced cost.

Inactive Publication Date: 2016-02-17
ALIBABA GRP HLDG LTD
7 Cites, 24 Cited by

AI Technical Summary

Problems solved by technology

Distributed cluster solutions are most commonly based on the Hadoop distributed file system, in which directory management nodes usually use large-capacity nearline SATA hard disks, cloud disks, or archive disks. These storage media are in essence still traditional hard disks that combine a micro-precision, electronically controlled mechanical arm with perpendicular recording on a magnetic storage medium. The power of a single unit is consumed mainly by the motor that spins the disk, the seek operations of the electronically controlled mechanical arm, and the current drawn by the magnetic head during reads and writes. A common 3.5-inch 7200 rpm hard disk consumes about 7 W when idle and more than 10 W at full load; a 5400 rpm low-speed hard disk has a nominal power consumption of about 7 W and an idle power consumption of 4.5-5 W; 10000 rpm and 15000 rpm hard disks consume more.
[0009] Mechanical hard disks have a background power consumption: the disk must keep rotating even when idle, so electrical energy is continuously converted into mechanical energy, and the heat generated in the process requires system-level cooling to remove. A datanode (directory management node) solution that uses mechanical hard disks with magnetic media at large scale therefore requires careful calculation of its Capex (capital expenditure) and Opex (operating expense). Near-line clusters, however, are not accessed in real time 24*7; they see more reads than writes, and their usage is characterized by unplanned random reads and planned sequential writes. For this part of the overall solution, continuing to use mechanical hard disk media requires large equipment purchase costs at the Capex level and a large amount of rack space rent over the cluster life cycle, while consuming a large amount of electricity.
[0010] In summary, the existing technology clearly has inconveniences and defects in actual use, so a new solution is needed to meet the demand of new systems for low power consumption.
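The background power consumption described above can be illustrated with a rough calculation. The following is a minimal sketch, not part of the patent: it uses the idle and full-load figures quoted for a 3.5-inch 7200 rpm disk, takes "more than 10 W" as a lower bound of 10 W, and assumes a hypothetical rack of 100 disks. The function name `rack_disk_power` is likewise an illustrative assumption.

```python
# Figures quoted in the text for a common 3.5-inch 7200 rpm hard disk.
IDLE_WATTS = 7.0
# The text says "more than 10 W" at full load; 10.0 is used here as a lower bound.
FULL_LOAD_WATTS = 10.0


def rack_disk_power(total_disks: int, active_disks: int) -> float:
    """Return a lower-bound estimate, in watts, of disk power for one rack.

    Idle disks still spin, so they draw IDLE_WATTS each; disks under
    access are counted at FULL_LOAD_WATTS each.
    """
    if not 0 <= active_disks <= total_disks:
        raise ValueError("active_disks must be between 0 and total_disks")
    idle_disks = total_disks - active_disks
    return idle_disks * IDLE_WATTS + active_disks * FULL_LOAD_WATTS


# Hypothetical rack of 100 disks (the disk count is an assumption).
all_idle = rack_disk_power(100, 0)     # no disk is being accessed
ten_active = rack_disk_power(100, 10)  # 10 disks at full load, 90 idle
print(all_idle, ten_active)
```

Under these assumptions the rack draws 700 W with no access at all and only 730 W with ten disks busy, which shows why idle rotation dominates the energy cost of a rarely accessed near-line cluster.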



Embodiment Construction

[0027] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0028] In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

[0029] Memory may include non-permanent storage in computer-readable media, in the form of random access memory (RAM) and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.

[0030] Computer-readable media, including both permanent and non-permanent, removable and non-removable media, can be i...


Abstract

The invention belongs to the technical field of data storage and provides a distribution-based task scheduling method and system. The method comprises the following steps: setting, for each rack in a distributed cluster, a threshold on the number of datanodes being accessed; obtaining the number of datanodes currently being accessed in each rack and judging whether that number exceeds the threshold; and, if it does, distributing a new distributed task to other racks or placing the new task in a task queue to wait. With this control algorithm based on IO (input/output) access, the single-cabinet power of cold storage data access and the energy consumption of the whole cold data center become controllable, the relationship between flash memory media service and energy consumption is fully utilized, and the characteristics of distributed data storage and access are combined to lower the cost of cold data backup clusters.
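The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Rack`, `Scheduler`, and `submit` are hypothetical, each task is assumed to access exactly one datanode, and a rack is treated as unavailable when admitting one more task would push its accessed-datanode count above the threshold.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Optional


@dataclass
class Rack:
    name: str
    threshold: int     # maximum number of datanodes that may be accessed at once
    accessed: int = 0  # number of datanodes currently being accessed


@dataclass
class Scheduler:
    racks: Dict[str, Rack]
    waiting: Deque[str] = field(default_factory=deque)

    def submit(self, task: str, preferred: str) -> Optional[str]:
        """Place a task on a rack, or queue it if every rack is at its threshold.

        Returns the name of the rack that took the task, or None if the
        task was put in the waiting queue.
        """
        # Try the preferred rack first, then the other racks (assumed order).
        others = [r for n, r in self.racks.items() if n != preferred]
        for rack in [self.racks[preferred]] + others:
            # Admit only if one more accessed datanode stays within the threshold.
            if rack.accessed + 1 <= rack.threshold:
                rack.accessed += 1
                return rack.name
        # No rack has headroom: the task waits in the queue.
        self.waiting.append(task)
        return None


scheduler = Scheduler({"rack-a": Rack("rack-a", 1), "rack-b": Rack("rack-b", 1)})
first = scheduler.submit("task-1", "rack-a")   # fits on the preferred rack
second = scheduler.submit("task-2", "rack-a")  # rack-a is full, so another rack is used
third = scheduler.submit("task-3", "rack-a")   # all racks are full, so the task is queued
```

Capping the number of datanodes accessed per rack bounds how many devices are active in a cabinet at once, which is how the abstract ties IO access control to single-cabinet power.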

Description

Technical field

[0001] The invention relates to the technical field of data storage, in particular to a distributed task scheduling method and system.

Background technique

[0002] With the concept of "big data" and the evolution and commercialization of related technologies, data has become one of the most important assets of Internet companies. There are several important features in the concept of big data that are highly related to the design of storage and backup clusters, namely, the relatively low density of data value, the relatively high uncertainty of data value, and the large amount of data. This determines that data storage needs to provide targeted data service capabilities based on data importance, access performance, access frequency, data redundancy requirements and other characteristics. The backup cluster is the last guarantee to prevent all data loss, and needs to fully consider the actual requirements in terms of data content, application characteristics...


Application Information

IPC(8): G06F9/46, G06F3/06, G06F11/16
Inventor: 武鹏, 王森茂, 李世伟, 邹巍, 郑灏, 张颖杰, 张磊, 刘拴林
Owner ALIBABA GRP HLDG LTD