Reverbrain wiki

Mastermind

Mastermind is a Cocaine application that makes the life of an Elliptics administrator easier.

It:

  • collects information about all of your groups and their current state;
  • works as a balancer and points where the client data should be written;
  • provides administrators with all of the collected information;
  • performs some operations with group coupling (replica sets).

Two ways of storage expansion

Elliptics supports a DHT ring between the nodes of one group. Storing data this way is convenient because you only need to know how to connect to one node of the group for read/write operations; Elliptics performs all other operations itself. This method of data storage has a side effect: if the storage is expanded continuously and new nodes are added one by one, the system generates a high internal load by redistributing keys and data between nodes. Besides, hard drives start to fail under the high load.

This internal load is unacceptable in some cases. If we store user generated content (UGC) such as messages and photos, we need to understand that most of this content will be dead weight. There is no reason to redistribute this content over the storage network as the storage grows. In this case we expand the storage by adding new groups, not new nodes to one group. A filled-up group is closed for writing, and there is no need to move its data anywhere. As a result, the load in a rapidly expanding storage system is greatly reduced.

When we expand by groups, we have to remember where our data can be found (which group was used). We also need up-to-date information about which groups are corrupted or unavailable, how much free space we have and so on. All of these tasks can be solved with Mastermind.

How it works

In the next picture we can see 32 Elliptics groups in 4 datacenters. Groups (2, 12, 18), (6, 22, 29), (11, 19, 30) are coupled to store 3 portions of data in 3 replicas each. Let's call such “trinities” couples of groups or symmetric groups. Each couple can consist of an arbitrary number of groups.

The division into groups and replicas is logical. Using Elliptics interfaces, a user can implement any custom logic for this division and any rules for reading/writing. For example, you can rely on Elliptics eventual consistency and write only two or just one of the three copies of data, and the remaining copies will be recovered by Elliptics tools later (that is not a good idea, but you can do that).

A group can be created for a single hard disk. This is useful in case of disk corruption: you just need to insert a new disk and copy the data from another disk of the same couple, with rsync for example.

Coupling information is stored inside the group data. It cannot be changed by editing the configuration file. Each Elliptics node knows where it can find all other nodes. Mastermind gets this information and collects coupling information from all of the groups. Mastermind refreshes this information periodically.

Mastermind knows which groups are offline, and it can tell the client which couples are not fully functional (bad couples).
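
The idea of a bad couple can be sketched in a few lines of Python. This is only an illustration, assuming that a couple counts as bad as soon as at least one of its groups is offline. The function name `find_bad_couples` and the data layout are hypothetical and are not part of Mastermind.

```python
def find_bad_couples(couples, offline_groups):
    """Return the couples that contain at least one offline group.

    Hypothetical helper: `couples` is a list of tuples of group numbers,
    `offline_groups` is a collection of group numbers known to be unreachable.
    """
    offline = set(offline_groups)
    return [couple for couple in couples if offline.intersection(couple)]


# Couples from the example above; group 22 is assumed to be offline.
couples = [(2, 12, 18), (6, 22, 29), (11, 19, 30)]
bad = find_bad_couples(couples, offline_groups={22})
print(bad)  # [(6, 22, 29)]
```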

Mastermind also collects information about free space in groups and their read/write load. This data allows Mastermind to serve as a balancer. A client can ask Mastermind where it should write data, and Mastermind will suggest which couple of groups is currently best for this purpose. The best, in this context, is a fully functional couple with the minimum load, though the real formula is more complex.
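
A minimal sketch of this choice, assuming that the best couple is simply the fully functional one with the lowest load. The real Mastermind formula is more complex, and the `Couple` class, its fields and `pick_couple_for_write` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Couple:
    groups: Tuple[int, ...]  # group numbers that form the couple
    functional: bool         # False if any group is offline or corrupted
    load: float              # current read/write load, arbitrary units


def pick_couple_for_write(couples: List[Couple]) -> Optional[Couple]:
    """Pick the fully functional couple with the minimum load, if any."""
    candidates = [c for c in couples if c.functional]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.load)


couples = [
    Couple((2, 12, 18), functional=True, load=120.0),
    Couple((6, 22, 29), functional=False, load=10.0),  # bad couple, skipped
    Couple((11, 19, 30), functional=True, load=80.0),
]
best = pick_couple_for_write(couples)
print(best.groups)  # (11, 19, 30)
```

The bad couple is skipped even though it has the lowest load, because writing to it could not place all replicas.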

Interface

mastermind command

usage: mastermind <command> [options]

commands:

  • bad-groups - get list of broken symmetric groups from balancer;
  • balance - get group info;
  • break-couple - break the couple of groups, couple is an argument;
  • couple-groups - make a couple of groups, number of groups is an argument;
  • get-group-weights - get weights for symmetric groups;
  • group-info - get group info;
  • help - show help for a given help topic or a help overview;
  • next-group-number - get unused group numbers, number of groups is an argument;
  • repair-groups - repair broken symmetric groups;
  • symmetric-groups - get list of symmetric groups from balancer;
  • uncoupled-groups - get list of uncoupled groups from balancer.

Execute mastermind -h to get the list of commands. Execute mastermind <command> -h to get help for a specific command. For example, mastermind bad-groups -h.

Cocaine drivers

Mastermind is a Cocaine application, so it provides Cocaine drivers:

  • get_bad_groups - get list of broken symmetric groups from balancer;
  • balance - get group info;
  • break_couple - break the couple of groups, couple is an argument;
  • couple_groups - make a couple of groups, number of groups is an argument;
  • get_group_weights - get weights for symmetric groups;
  • get_group_info - get group info;
  • get_next_group_number - get unused group numbers, number of groups is an argument;
  • repair_groups - repair broken symmetric groups;
  • get_symmetric_groups - get list of symmetric groups from balancer;
  • get_empty_groups - get list of uncoupled groups from balancer.

The mastermind command is a Python program that uses this interface. So, if you want to create a custom tool, you can use the mastermind command as an example.

More information on Cocaine programming can be found in the project documentation.
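
As a starting point for such a tool, the correspondence between command names and driver names can be written down as a table. The mapping below is inferred by matching the descriptions in the two lists above; it is not taken from the Mastermind source code, and `driver_for` is a hypothetical helper. The help command has no driver counterpart and is left out.

```python
# Command name -> Cocaine driver name, inferred from the matching descriptions.
CLI_TO_DRIVER = {
    "bad-groups": "get_bad_groups",
    "balance": "balance",
    "break-couple": "break_couple",
    "couple-groups": "couple_groups",
    "get-group-weights": "get_group_weights",
    "group-info": "get_group_info",
    "next-group-number": "get_next_group_number",
    "repair-groups": "repair_groups",
    "symmetric-groups": "get_symmetric_groups",
    "uncoupled-groups": "get_empty_groups",
}


def driver_for(command: str) -> str:
    """Return the driver name for a mastermind command (hypothetical helper)."""
    try:
        return CLI_TO_DRIVER[command]
    except KeyError:
        raise ValueError(f"unknown mastermind command: {command}") from None


print(driver_for("uncoupled-groups"))  # get_empty_groups
```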

Configuration

Mastermind has a configuration file (/etc/elliptics/mastermind.conf by default).

Example:

{
    "dnet_log": "/var/log/mastermind/mastermind.log",
    "dnet_log_mask": 3,
    "symmetric_groups": true,
    "elliptics_nodes": [
        ["elliptics1.your.project.net", 1025],
        ["elliptics2.your.project.net", 1025]
    ],
    "metadata": {
        "nodes": [
            ["mastermind-metadata-1.your.project.net", 1025],
            ["mastermind-metadata-2.your.project.net", 1025]
        ],
        "groups": [1,2]
    },
    "balancer_config": {
        "add_units": 3,
        "min_free_space": 150000
    }
}

  • dnet_log - path to log file;
  • dnet_log_mask - level of logging;
  • symmetric_groups - always `true`;
  • elliptics_nodes - list of nodes which Mastermind will connect to. Typically you don't need more than 5 nodes in this list;
  • metadata - Elliptics nodes which will be used for Mastermind metadata storage. 2 groups are enough;
  • balancer_config - parameters for balancer configuration.
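
Since the file is plain JSON, a custom tool can read it with the Python standard library. The sketch below parses a shortened copy of the example and checks that the documented top-level keys are present. Treating these keys as required is an assumption made for illustration, and `load_config` is a hypothetical helper, not a Mastermind function.

```python
import json

# Top-level keys described above; assumed (not confirmed) to be required.
EXPECTED_KEYS = ("dnet_log", "dnet_log_mask", "symmetric_groups",
                 "elliptics_nodes", "metadata")

SAMPLE = """
{
    "dnet_log": "/var/log/mastermind/mastermind.log",
    "dnet_log_mask": 3,
    "symmetric_groups": true,
    "elliptics_nodes": [["elliptics1.your.project.net", 1025]],
    "metadata": {
        "nodes": [["mastermind-metadata-1.your.project.net", 1025]],
        "groups": [1, 2]
    },
    "balancer_config": {"add_units": 3, "min_free_space": 150000}
}
"""


def load_config(text):
    """Parse the configuration text and report missing top-level keys."""
    config = json.loads(text)
    missing = [key for key in EXPECTED_KEYS if key not in config]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    # balancer_config is optional here: every parameter has a default value.
    config.setdefault("balancer_config", {})
    return config


config = load_config(SAMPLE)
for host, port in config["elliptics_nodes"]:
    print(host, port)
```

To read the real file, pass the contents of /etc/elliptics/mastermind.conf (or your own path) to `load_config`.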

The Mastermind balancer uses a multiparameter formula that can be configured very flexibly. All of the parameters have default values, but you can change them in the balancer_config section of the configuration file. Parameters that can be changed:

Possibility for writing into the groups

  • min_free_space (default: 256 bytes) - minimum amount of free space required for writing into the group;
  • min_free_space_relative (default: 0.15, a fraction, i.e. 15%) - minimum share of free space required for writing into the group.
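
A rough sketch of how the two limits could be checked together. It assumes that both conditions must hold for a group to accept writes; the page does not say how Mastermind combines them, and `is_writable` is a hypothetical name.

```python
def is_writable(free_bytes, total_bytes,
                min_free_space=256, min_free_space_relative=0.15):
    """Check both free-space limits (assumed to be combined with AND)."""
    if total_bytes <= 0:
        return False
    enough_absolute = free_bytes >= min_free_space
    enough_relative = free_bytes / total_bytes >= min_free_space_relative
    return enough_absolute and enough_relative


print(is_writable(300, 1000))   # True: 300 >= 256 and 30% >= 15%
print(is_writable(200, 1000))   # False: less than 256 bytes free
print(is_writable(300, 10000))  # False: only 3% free
```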

Number of groups for writing

Mastermind automatically calculates the number of groups required to serve write requests. For example, suppose Mastermind calculated that each couple of groups can serve 200 rps, and the load on the cluster is 2000 rps. Mastermind will then use 10 couples of groups for writing to serve all the requests.

  • add_rps (default: 20) - amount of rps added to the automatically calculated load. In our example, Mastermind will use 2020 rps in its calculations by default.
  • add_rps_relative (default: 0.15, a fraction, i.e. 15%) - share of the load added to the automatically calculated value. In our example, Mastermind will use 2020 × 1.15 = 2323 rps in its calculations by default.
  • min_units (default: 1) - minimum number of groups that are used for writing regardless of the load.
  • add_units (default: 1) - number of groups added to the calculated amount. In our example we get 2323/200 (rounded up) + 1 = 13 groups (with the other parameters at their defaults).
  • add_units_relative (default: 0.10, a fraction, i.e. 10%) - share of the number of groups added to the calculated amount. 14 groups in our example.
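
The arithmetic of the running example can be reproduced with a short function. The order of the steps and the rounding (up for the division, down for the final result) are assumptions chosen so that the output matches the numbers given above; they are not taken from the Mastermind source, and `units_for_writing` is a hypothetical name.

```python
import math


def units_for_writing(cluster_rps, rps_per_couple,
                      add_rps=20, add_rps_relative=0.15,
                      min_units=1, add_units=1, add_units_relative=0.10):
    """Estimate how many couples are opened for writing (illustrative only)."""
    # Pad the measured load: (2000 + 20) * 1.15 = 2323 rps in the example.
    rps = (cluster_rps + add_rps) * (1 + add_rps_relative)
    # Couples needed to serve the padded load, rounded up: 12 in the example.
    units = math.ceil(rps / rps_per_couple)
    # Pad the number of couples: (12 + 1) * 1.10 = 14.3, rounded down to 14.
    units = math.floor((units + add_units) * (1 + add_units_relative))
    return max(min_units, units)


print(units_for_writing(2000, 200))                        # 14
print(units_for_writing(2000, 200, add_units_relative=0))  # 13
```

With every padding parameter set to zero, the function returns the plain 2000/200 = 10 couples from the beginning of the example.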
elliptics/mastermind.txt · Last modified: 2013/12/10 17:41 by masha