mfsscadmin
NAME
mfsscadmin - MooseFS storage class administration tool
SYNOPSIS
mfscreatesclass [-?] [-M mount_point] [-c description] [-p priority] [-g export_group] [-a admin_only] [-m labels_mode] [-o arch_mode] [-d archive_delay] [-s archive_min_file_length] [-t min_trashretention] [ [-l location] [-K keep_labels] [-C create_labels] [-A archive_labels] [-T trash_labels] ... ] sclass_name [sclass_name ...]
mfsmodifysclass [-?] [-M mount_point] [-c description] [-p priority] [-g export_group] [-a admin_only] [-m labels_mode] [-o arch_mode] [-d archive_delay] [-s archive_min_file_length] [-t min_trashretention] [ [-l location] [-K keep_labels] [-C create_labels] [-A archive_labels] [-T trash_labels] ... ] sclass_name [sclass_name ...]
mfsdeletesclass [-?] [-M mount_point] [ [-l location] ... ] sclass_name [sclass_name ...]
mfsclonesclass [-?] [-M mount_point] [-l src_location[:dst_location]] [-s src_state[:dst_state]] src_sclass_name dst_sclass_name [dst_sclass_name ...]
mfsrenamesclass [-?] [-M mount_point] src_sclass_name dst_sclass_name
mfslistsclass [-?] [-M mount_point] [-l] [-i] [sclass_name_glob_pattern]
mfsimportsclass [-?] [-M mount_point] [-r] [-n filename]
DESCRIPTION
This is a set of tools for managing MooseFS storage classes. Storage classes are used to define the redundancy policy within the cluster. They should be later applied to particular MooseFS objects (files or folders) with mfssclass tools (see mfssclass(1)). A storage class is a named set of labels expressions and some other options. Labels expression indicates the redundancy format, the desired redundancy level, and the desired storage policy. The redundancy format may be either copies or erasure-coding (EC). The redundancy level is either the number of copies (from 1 to 9) or the number of data plus parity parts (4+n or 8+n), depending on the indicated redundancy format. The storage policy specifies, through the use of labels, which chunkservers are to be used for the physical storage of data assigned to the particular storage class. If a MooseFS instance is using locations (Pro only, see mfslocadmin(1)), a separate redundancy policy can be defined for each location.
mfscreatesclass creates a new storage class with the given options, described below, and names it SCLASS_NAME; if more than one name is provided, a separate storage class with the same definition is created for each name
mfsmodifysclass changes the given options in a class or classes indicated by SCLASS_NAME parameter(s)
mfsdeletesclass removes the class or classes indicated by SCLASS_NAME parameter(s); if any of the classes is not empty (i.e. it is still used by some MooseFS objects), it will not be removed and the tool will return an error and print an error message; empty classes will be removed in any case; in Pro version this tool allows the user to delete a class definition for one location instead of the whole storage class, using the -l option
mfsclonesclass copies class indicated by SRC_SCLASS_NAME under a new name provided with DST_SCLASS_NAME or copies selected state (CREATE, KEEP, ARCHIVE, TRASH) definition from SRC_SCLASS_NAME to a specified state of DST_SCLASS_NAME, in Pro version it is also possible to indicate specific locations for the state copy operation
mfsrenamesclass changes the name of a class from SRC_SCLASS_NAME to DST_SCLASS_NAME
mfslistsclass lists classes with names matching the provided glob pattern, or all classes if no pattern is provided
mfsimportsclass imports storage classes definitions from stdin or a file and creates them; input format should be identical to mfslistsclass -l output.
OPTIONS
-C optional parameter, that tells the system to which chunkservers, defined by the create_labels expression, the chunk should be first written just after creation; if this parameter is not provided for a class, the keep_labels chunkservers will be used
-K optional parameter, that tells the system on which chunkservers, defined by the keep_labels expression, the chunk(s) should be kept always, except for special conditions like creating, archiving and deleting (moving to Trash), if defined; although this parameter is formally optional, in CE version it needs to be defined always and in Pro version it needs to be defined for at least one defined location
-A optional parameter, that tells the system on which chunkservers, defined by the archive_labels expression, the chunk(s) should be kept for archiving purposes; see ARCHIVE BEHAVIOUR section below for detailed explanation
-d optional parameter that defines after how much time from atime/mtime/ctime (as set by -o) a file (and its chunks) are treated as archive; minimum unit is hours, default is 24, for value formatting see TIME
-o optional parameter that defines archive flags: C - ctime, M - mtime, A - atime, R - reversible, F - fastmode, P - per chunk; default is C; see ARCHIVE BEHAVIOUR section below for details
-s optional parameter that defines minimum file length in bytes that can be archived; default is 0
-T optional parameter, that tells the system on which chunkservers, defined by the trash_labels expression, the chunk(s) of files in Trash should be kept; see also -t
-t optional parameter, that defines, how much time in Trash must be left for the system to actually use the schema defined in -T for a chunk; minimum unit is hours, default is 0, for value formatting see TIME
-c optional parameter, that defines a class description, for user's convenience (a string, maximum length is 255 bytes)
-p optional parameter, that defines a class priority; default is 0, see STORAGE CLASSES PRIORITY section
-g optional parameter, that defines a class export group; possible values are 0 to 15, default is 0; see mfsexport.cfg(5) for explanation
-a can be either 1 or 0 and indicates if the storage class is available to everyone (0) or admin only (1)
-m label mode used; possible values are l (or L, loose, Loose, LOOSE) for LOOSE mode, d (or D, std, Std, STD) for STANDARD mode and s (or S, strict, Strict, STRICT) for STRICT mode; if no mode is defined, STANDARD mode is assumed; behaviour of label modes is described below in LABEL MODES section
-l for mfslistsclass tool: list also definitions, not only the names of relevant storage classes; for all other tools in Pro version: specify the location
-i case insensitive storage class name matching
-r replace (overwrite) existing classes when importing storage classes
-n use provided filename as the source of storage classes definitions for importing, instead of stdin
-M MooseFS mount point, doesn't need to be specified if a tool is run inside MooseFS mounted directory or MooseFS is mounted in /mnt/mfs/
-? displays short usage message
NOTES
LABELS EXPRESSIONS
Labels are letters (A-Z - 26 letters) that can be assigned to chunkservers. Each chunkserver can have multiple (up to 26) labels. Labels are defined in mfschunkserver.cfg file, for more information refer to the appropriate manpage.
Labels expression is a set of subexpressions separated by commas. For full copies each subexpression specifies the storage schema of one copy of a file. Subexpression can be: an asterisk or a label schema. Label schema can be one label or an expression with sums, multiplications, negations and brackets. Sum means a file can be stored on any chunkserver matching any element of the sum (logical or). Multiplication means a file can be stored only on a chunkserver matching all elements (logical and). Asterisk means any chunkserver. Negation means any chunkserver but the one matching negated subexpression. Identical subexpressions can be shortened by adding a number in front of one instead of repeating it a number of times.
For EC labels expression starts with @ sign, followed by a number of data parts then + sign and a number that says how many parity parts the chunk should have. Possible numbers of data parts are 4 or 8. Possible numbers of parity parts are 1 (CE version) or 1 to 9 (PRO version). So, for example, @4+1 means EC with 4 data parts and 1 parity part, @8+3 means EC with 8 data parts and 3 parity parts. If number of data parts is omitted then the master uses the default value defined by DEFAULT_EC_DATA_PARTS - see mfsmaster.cfg (5). In this case @2 means @8+2 or @4+2. Then, maximum of two subexpressions can follow, separated by commas. If only one is present, it defines where all the parts should be kept. If both are present, the first subexpression defines where data parts should be kept, the second subexpression defines where parity parts should be kept.
Labels expression can be either a regular labels expression or EC labels expression (i.e. EC labels expression cannot be a subexpression). EC labels expression can only be used in place of ARCHIVE_LABELS or TRASH_LABELS in the storage class definition, regular labels expression can be used in any place.
At the end of each labels expression one or two extensions can be added, each introduced by its own separator. The first possible extension is the distinguish extension, introduced by the slash (/) sign. The second is the labels mode override, introduced by the colon (:) sign.
Distinguish extension can be a list of labels or one of the following special strings:
[IP] or [I] - distinguish by IP number
[RACK] or [R] - distinguish by RACK, as defined in topology, see mfstopology.cfg (5)
If present, the distinguish part lets the system know that it should try to distribute full copies so that each copy is either on a different label from the list or on a chunkserver with different IP address or from a different rack. For EC the distinguish part is currently ignored.
NOTICE! If CHUNKS_UNIQUE_MODE is defined in mfsmaster.cfg to a value other than 0, it will override any distinguish setting in storage classes. For more information about this parameter refer to mfsmaster.cfg (5) manual.
Labels mode override extension can be one of three characters: d (alternatively D or in string form std or Std or STD), s (alternatively S or in string form strict or Strict or STRICT) or l (alternatively L or in string form loose or Loose or LOOSE) and they mean that the STANDARD, STRICT or LOOSE label mode, respectively, should be applied only to this one labels expression. For explanation about label modes see the LABEL MODES section.
One or both extensions can be present for each labels expression, each has to start with their separator and if both are present, the order has to be kept, i.e. the distinguish extension has to be first and the label mode extension needs to be second.
For the purpose of creating or modifying a storage class in a specific location (Pro only) it is sometimes necessary to indicate a zero labels expression. A zero labels expression is denoted as "0" (zero).
Examples of labels expressions:
A,B - files will have two copies, one copy will be stored on chunkserver(s) with label A, the other on chunkserver(s) with label B
A,* - files will have two copies, one copy will be stored on chunkserver(s) with label A, the other on any chunkserver(s)
A,!A - files will have two copies, one copy will be stored on chunkserver(s) with label A, the other on any chunkserver(s) that doesn't have the label A
*,* - files will have two copies, stored on any chunkservers (different for each copy)
AB,C+D+E - files will have two copies, one copy will be stored on any chunkserver(s) that has both labels A and B (multiplication of labels), the other on any chunkserver(s) that has either the C label or the D label or the E label (sum of labels)
A,B[X+Y],C[X+Y] - files will have three copies, one copy will be stored on any chunkserver(s) with A label, the second on any chunkserver(s) that has the B label and either X or Y label, the third on any chunkserver(s), that has the C label and either X or Y label
2A expression is equivalent to A,A expression
A,3BC expression is equivalent to A,BC,BC,BC expression
2 expression is equivalent to 2* expression is equivalent to *,* expression
3*/[IP] - files will have 3 copies, each copy will be kept on a chunkserver with different IP address
A,B/[RACK] - files will have two copies, one copy will be stored on chunkserver(s) with label A, the other on chunkserver(s) with label B in a different rack than the other copy
S,H,H/ABX-Z - files will have 3 copies, one on server with label S, two on servers with label H, but each copy will be on a server with different label from the set of A, B, X, Y, Z
@4+1 - files will be kept in EC format, 4 data parts and 1 parity part
@8+3 - files will be kept in EC format, 8 data parts and 3 parity parts
@2 - files will be kept in EC format, default number of data parts, 2 parity parts
@4+3,Z - files will be kept in EC format, 4 data parts and 3 parity parts - all on chunkservers with label Z.
@2,A(X+Y) - files will be kept in EC format, default number of data parts, 2 parity parts, all parts will be kept on chunkservers with label A and either X or Y
@3,S,H - files will be kept in EC format, default number of data parts will be kept on chunkservers with label S, 3 parity parts will be kept on chunkservers with label H
AB,AC:l - files will be kept in copies format, one copy on a server with labels A and B, the second on a server with labels A and C and the behaviour of this should be LOOSE
@4+2,X,Y:s - files will be kept in EC format, 4 data parts will be kept on servers with label X, 2 parity (checksum) parts should be kept on servers with label Y and the behaviour of this should be STRICT
2A/[IP]:s - files should be kept in 2 copies, both copies on servers with label A, but each server should have different IP, behaviour of this when accounting for labels should be STRICT
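The count shorthand shown in the examples above (2A, A,3BC, bare 2) can be illustrated with a short Python sketch. This is not MooseFS code: the function name expand_copies is hypothetical, it handles only copies-format expressions, and it treats everything from the first "/" or ":" onward as an extension suffix that is passed through untouched.

```python
import re

def expand_copies(expr):
    """Expand count prefixes in a copies-format labels expression.

    Returns (subexpressions, extension_suffix). Hypothetical helper,
    written only to illustrate the shorthand described in the text.
    """
    if expr.startswith("@"):
        raise ValueError("EC labels expressions are not handled by this sketch")
    # Split off trailing extensions: '/' (distinguish) and ':' (labels mode).
    m = re.search(r"[/:]", expr)
    body, suffix = (expr[:m.start()], expr[m.start():]) if m else (expr, "")
    copies = []
    for sub in body.split(","):
        sub = sub.strip()
        counted = re.fullmatch(r"(\d+)(.*)", sub)
        if counted:
            # A leading number repeats the schema; a bare number means '*'.
            count = int(counted.group(1))
            schema = counted.group(2) or "*"
        else:
            count, schema = 1, sub
        copies.extend([schema] * count)
    return copies, suffix

print(expand_copies("A,3BC"))    # (['A', 'BC', 'BC', 'BC'], '')
print(expand_copies("3*/[IP]"))  # (['*', '*', '*'], '/[IP]')
```

The length of the returned list is the number of copies the expression asks for.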
LABEL MODES
It is important to specify what to do when it is not possible to meet the labels requirement of a storage class, i.e.: there is no space available on all servers with needed labels, there are not enough servers with needed labels or servers with needed labels are all busy. The question is whether the system should create chunks on other servers (with non-matching labels) or not. This decision must be made by the user.
There are 3 modes of operation: STANDARD, LOOSE and STRICT. The modes work slightly differently depending on whether a chunk is stored in copies or EC format, due to the different nature and algorithms that each of those formats uses.
For copies format the 3 modes behave as follows:
In STANDARD mode in case of overloaded servers the system will wait for them, but in case of no space available it will use other servers and will replicate data to correct servers when it becomes possible. This means if some servers are in busy state for a long time, it might not be possible to create new chunks with certain storage classes and endangered (undergoal) chunks from those classes are at higher risk of being completely lost due to delayed replications.
In STRICT mode, during writing a new file, the system will return error (ENOSPC) in case of no space available on servers marked with labels specified for chunk creation. It will still wait for overloaded servers. Undergoal replications will not be performed if there is no space on servers with labels matching the storage class. This means high risk of losing data if servers with some labels are permanently filled up with data!
In LOOSE mode the system will immediately use other servers in case of overloaded servers or no space on servers and will replicate data to correct servers when it becomes possible. There is no delay or error on file creation and undergoal replications are always done as soon as possible.
This table sums up the modes behaviour for chunks stored in copy format:
| | STANDARD | STRICT | LOOSE |
| CREATE - OVERLOADED | WAIT | WAIT | WRITE ANY |
| CREATE - NO SPACE | WRITE ANY | ENOSPC | WRITE ANY |
| REPLICATE - OVERLOADED | WAIT | WAIT | WRITE ANY |
| REPLICATE - NO SPACE | WRITE ANY | NO COPY | WRITE ANY |
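The table above can also be read as a plain lookup keyed by operation and condition. The following Python sketch only restates the table; the names COPY_MODE_ACTIONS and copy_mode_action are hypothetical and do not correspond to any MooseFS interface.

```python
# (operation, condition) -> action per label mode, copied from the table.
COPY_MODE_ACTIONS = {
    ("CREATE", "OVERLOADED"): {"STANDARD": "WAIT", "STRICT": "WAIT", "LOOSE": "WRITE ANY"},
    ("CREATE", "NO SPACE"): {"STANDARD": "WRITE ANY", "STRICT": "ENOSPC", "LOOSE": "WRITE ANY"},
    ("REPLICATE", "OVERLOADED"): {"STANDARD": "WAIT", "STRICT": "WAIT", "LOOSE": "WRITE ANY"},
    ("REPLICATE", "NO SPACE"): {"STANDARD": "WRITE ANY", "STRICT": "NO COPY", "LOOSE": "WRITE ANY"},
}

def copy_mode_action(mode, operation, condition):
    """Return what the table says happens for a chunk kept in copy format."""
    return COPY_MODE_ACTIONS[(operation.upper(), condition.upper())][mode.upper()]

print(copy_mode_action("strict", "create", "no space"))  # ENOSPC
```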
For chunks stored in EC format the 3 modes behave as follows:
In general, chunks will only be converted from copy
format to EC format if there are enough servers in the
system to safely store all the parts of the EC format. For
EC @N+X format, where N is number of data parts and can be
either 4 or 8 and X is number of parity/checksum parts and
can be equal to 1 (CE version) or any number from 1 to 9
(PRO version), the general requirements are:
- at least N+2X chunk servers to convert new chunks from
copy format to EC format
- at least N+X chunk servers to keep chunks that are already
in EC format still in this format
- if there are fewer than N+X servers, all chunks will revert
to copy (KEEP definition) format.
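The arithmetic behind these requirements can be sketched in Python as follows. The function names are hypothetical, and the servers argument stands for however many chunk servers count as usable; which servers count depends on the label mode.

```python
def ec_server_thresholds(data_parts, parity_parts):
    """Return the server counts needed for an @N+X storage format."""
    if data_parts not in (4, 8):
        raise ValueError("data parts must be 4 or 8")
    if not 1 <= parity_parts <= 9:
        raise ValueError("parity parts must be between 1 and 9")
    return {
        "convert": data_parts + 2 * parity_parts,  # N+2X: copy -> EC
        "keep": data_parts + parity_parts,         # N+X: stay in EC
    }

def ec_decision(data_parts, parity_parts, servers, already_ec):
    """Describe what happens to a chunk given the usable server count."""
    limits = ec_server_thresholds(data_parts, parity_parts)
    if already_ec:
        return "keep EC" if servers >= limits["keep"] else "revert to copies"
    return "convert to EC" if servers >= limits["convert"] else "stay in copies"

print(ec_server_thresholds(8, 3))       # {'convert': 14, 'keep': 11}
print(ec_decision(4, 1, 5, False))      # stay in copies
print(ec_decision(4, 1, 5, True))       # keep EC
```

With @4+1, five usable servers are enough to keep existing EC chunks in EC format, but six are needed before new chunks are converted.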
In LOOSE mode the system will try to use first the servers matching the label expression defined in the used storage class, but if not enough servers with "correct" labels are available (because they are busy or have no space or are just not defined), it will use any available chunk servers regardless of label; so the N+2X and N+X are calculated from all available chunk servers when the system decides what format to use to keep a chunk. Also, when one part of a chunk in EC format becomes unavailable or corrupted, restoration of such part will also be done to any available server, if a server with "correct" labels cannot currently be used.
It's important to remember that if not enough servers with "correct" labels are available for a chunk in LOOSE mode, the system may use however many it wants of the "other" chunk servers, not just the minimal amount that is missing from the "correct" number of servers.
In STRICT mode the system will only use the servers matching the label expression defined in the used storage class, so only available or short-term busy servers matching defined label expression will be used for calculation of N+2X and N+X when the system decides what format to use to keep a chunk. When one part of a chunk in EC format becomes unavailable or corrupted, restoration of such part can only be done to a server with "correct" label; if such a server is unavailable long term (i.e. it is not available at all, as opposed to being only temporarily busy), this will automatically mean that the chunk needs to be reverted to copy (KEEP definition) format anyway (if the missing part is a parity/checksum part, the chunk will just revert to copy format using all available data parts, if a data part is missing, it will be restored to a chunk server hosting another part of the same chunk - which is not allowed under normal circumstances - and then the conversion to copy format will follow immediately).
In STANDARD mode the system will behave like in STRICT mode when it needs to make a decision whether it will convert a new chunk from copy format to EC format, that is the N+2X in this step is calculated only from "correctly" labeled servers. But to make a decision whether existing chunks need to be converted back from EC format to copy format it will look at all available servers, regardless of labels, so the N+X in this step is calculated from all available servers, like in LOOSE mode. In case of missing parts, if it's not possible to restore them to chunk servers with "correct" labels, the system will also adopt the LOOSE mode behaviour and try to use any available servers.
Notice! When a chunk is converted from copy format to EC format, the system first performs a "local split" operation, that is it picks one copy of the chunk and calculates all EC parts necessary on the server occupied by this selected copy. Then these parts are moved to separate chunkservers, matching the labels in the storage class definition for used EC mode. But temporarily, between the split and the "moving out" of the parts, they can be recorded on a "wrong" chunk server even in STRICT mode. This is because of the mechanics of the "local split" operation.
TIME
For time variables their value can be defined as a number of seconds or hours (integer), depending on minimum unit of the variable, or as a time period in one of two possible formats:
first format: #.#T where T is one of: s-seconds, m-minutes, h-hours, d-days or w-weeks; fractions of minimum unit will be rounded to integer value
second format: #w#d#h#m#s, any number of definitions can be omitted, but the remaining definitions must be in order (so #d#m is still a valid definition, but #m#d is not); ranges: s,m: 0 to 59, h: 0 to 23, d: 0 to 6, w is unlimited and the first definition is also always unlimited (i.e. for #d#h#m d will be unlimited)
If a minimum unit of a variable is larger than seconds, units below the minimum one will not be accepted. For example, a variable that has hours as a minimum unit will not accept s and m units.
Examples:
1.5d is the same as 1d12h, is the same as 36h
2.5w is the same as 2w3d12h, is the same as 420h; 2w84h is not a valid time period (h is not the first definition, so it is bound by range 0 to 23)
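The formatting rules above can be sketched as a small Python parser. This is an illustration, not the parser used by the tools: the name parse_time is hypothetical, and the rounding of fractional values uses Python's round(), which is an assumption because the text does not say how rounding is done.

```python
import re

UNIT_SECONDS = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}
UNIT_LIMIT = {"d": 6, "h": 23, "m": 59, "s": 59}  # 'w' is unlimited

def parse_time(text, min_unit="h"):
    """Return the period as an integer number of min_unit ('s' or 'h')."""
    base = UNIT_SECONDS[min_unit]
    if re.fullmatch(r"\d+", text):
        return int(text)  # bare integer: already in the minimum unit
    # First format: #.#T, e.g. 1.5d
    m = re.fullmatch(r"(\d+(?:\.\d+)?)([smhdw])", text)
    if m:
        value, unit = float(m.group(1)), m.group(2)
        if UNIT_SECONDS[unit] < base:
            raise ValueError("unit %r is below the minimum unit" % unit)
        return round(value * UNIT_SECONDS[unit] / base)  # rounding: assumption
    # Second format: #w#d#h#m#s, parts optional but in this order.
    m = re.fullmatch(r"(?:(\d+)w)?(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    if not text or not m:
        raise ValueError("not a valid time period: %r" % text)
    total, first = 0, True
    for unit, digits in zip("wdhms", m.groups()):
        if digits is None:
            continue
        number = int(digits)
        if UNIT_SECONDS[unit] < base:
            raise ValueError("unit %r is below the minimum unit" % unit)
        # Only the first definition present is exempt from its range.
        if not first and number > UNIT_LIMIT[unit]:
            raise ValueError("%s%s is out of range" % (digits, unit))
        first = False
        total += number * UNIT_SECONDS[unit]
    return total // base

print(parse_time("1.5d"), parse_time("1d12h"), parse_time("36h"))  # 36 36 36
print(parse_time("2.5w"), parse_time("2w3d12h"))                   # 420 420
```

In this sketch, parse_time("2w84h") raises ValueError, matching the invalid example above, and so does parse_time("30m") when the minimum unit is hours.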
ARCHIVE BEHAVIOUR
Chunks have archive flag set during file maintenance loop, which means that the time to archiving defined by -d option is the minimum time that has to pass before the flag is set, not the exact time.
Default behaviour of the system is that once a chunk has the archive bit set on, it IS NOT switched off even if atime/ctime/mtime changes, unless R flag is set by option -o. Writing to a chunk will always switch its archive flag off.
Archive flags:
C - use file's ctime to determine if archive flag should be set on - this is the default flag
M - use file's mtime to determine if archive flag should be set on
A - use file's atime to determine if archive flag should be set on
R - reversible, if atime/mtime/ctime changes for a file, system verifies if archive flag should be turned off for its chunks
F - fastmode, chunk has archive flag set to on as soon as possible, whatever is defined with -d option is disregarded
P - "per chunk" mode, use chunk's mtime to determine if archive flag should be set on
Archive flag can be modified manually. See mfsarchive (1)
Note: the ARCHIVE state does not mean that chunks or files are somehow unavailable, blocked for writing, etc. The ARCHIVE designation is only used to indicate that such data is already intended for long-term storage. This contrasts with the KEEP state, which means that the data is 'hot' and can be changed frequently. The distinction between the KEEP and ARCHIVE states is therefore only to allow for an appropriate definition of the physical way of storing this data. From the user's point of view, the file is equally accessible: it has identical permissions, name, and is in the same directory. For example, due to the fact that writing in EC mode may be slower, we suggest storing 'hot' data (KEEP) using a copy and changing it to EC format (by assigning ARCHIVE mode) after some time. Another example is automatic transfer of 'cold' data to older/slower/cheaper disks and chunkservers - using different labels.
STORAGE CLASSES PRIORITY
Storage classes are assigned to files, but one chunk (one fragment of a file) can belong to many files, courtesy of the snapshot mechanism (see mfssnapshots (1)). If one chunk belongs to many files with different storage classes, one storage class must be picked to specify, how this chunk's copies should be kept in the system. Up to MooseFS version 4.56.0 a predefined class was artificially assigned to such chunk. Currently one of the files' classes will be used, according to priorities assigned by the user, to be exact: the system will pick the class with highest priority out of all the files' classes.
Example 1: there are 3 classes defined:
ClassA, with priority 100,
ClassB, with priority 206,
ClassC, with priority 1001.
A chunk, that belongs to 2 files, one in ClassA, the
other in ClassB, will be stored according to the definition
provided by ClassB (higher priority than ClassA).
A chunk, that belongs to 3 files, one in ClassA, one in
ClassB, one in ClassC, will be stored according to the
definition provided by ClassC (higher priority than both
ClassA and ClassB).
If two or more classes have the same priority, then the
following factors will be considered, in order of
importance, to determine, which class will be picked:
- a class with higher redundancy level (RL) will be picked
(maximum from each class's KEEP and ARCHIVE redundancy
levels will be considered as this class's redundancy
level),
- a class that has EC format in ARCHIVE state will be picked
over a class without EC,
- a class that uses labels for KEEP or ARCHIVE state will be
picked over a class without labels,
- a class that has EC format in TRASH state will be picked
over a class without EC,
- a class that uses labels for TRASH state will be picked
over a class without labels,
- if none of the above conditions are used, a class with
higher class id will be used.
Example 2: there are 5 classes defined, all with the same
priority (e.g. the default priority 0):
ClassA (id=1) has 3 copies in KEEP state (RL=2),
ClassB (id=2) has 2 copies in KEEP state and EC4+1 in
ARCHIVE state (RL=1, has EC in ARCHIVE),
ClassC (id=3) has 2 copies in KEEP state, stored on labels X
(RL=1, no EC in ARCHIVE, has labels in KEEP),
ClassD (id=4) has 2 copies in KEEP state, stored on labels Y
(RL=1, no EC in ARCHIVE, has labels in KEEP),
ClassE (id=5) has 2 copies in KEEP state (RL=1).
There is also a class ClassF defined, which has a priority
of 77 and 2 copies in KEEP state (RL=1).
A chunk, that belongs to files in classes: ClassA, ClassC
and ClassE will be stored according to definition of ClassA
(highest RL).
A chunk, that belongs to files in classes: ClassB, ClassC
will be stored according to definition of ClassB (same RL,
but ClassB has EC).
A chunk, that belongs to files in classes: ClassC and ClassE
will be stored according to definition of ClassC (same RL,
but ClassC has labels).
A chunk, that belongs to files in classes: ClassC and ClassD
will be stored according to definition of ClassD (same RL,
no EC, both have labels, so higher class ID is
picked).
A chunk, that belongs to files in 6 classes, from ClassA to
ClassF, will be stored according to definition of ClassF,
because this one has higher priority than all the other
classes (77>0).
In a system with these 6 storage classes ClassE will never
be used for a chunk belonging to multiple files, it has the
lowest possible priority (0) and no extra conditions to
justify its choice (lowest existing RL, no EC and no
labels).
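The selection order described in this section amounts to comparing classes by a tuple of criteria. The Python sketch below models that comparison; the SClass type and pick_class function are hypothetical, and the id of ClassF (6) is an assumption because the text does not state it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SClass:
    name: str
    class_id: int
    priority: int = 0
    redundancy_level: int = 1   # max of KEEP and ARCHIVE redundancy levels
    ec_in_archive: bool = False
    labels_in_keep_or_archive: bool = False
    ec_in_trash: bool = False
    labels_in_trash: bool = False

def pick_class(classes):
    """Pick the class that decides how a shared chunk is stored."""
    return max(classes, key=lambda c: (
        c.priority,
        c.redundancy_level,
        c.ec_in_archive,
        c.labels_in_keep_or_archive,
        c.ec_in_trash,
        c.labels_in_trash,
        c.class_id,
    ))

# The classes from Example 2 (ClassF's id is assumed).
class_a = SClass("ClassA", 1, redundancy_level=2)
class_b = SClass("ClassB", 2, ec_in_archive=True)
class_c = SClass("ClassC", 3, labels_in_keep_or_archive=True)
class_d = SClass("ClassD", 4, labels_in_keep_or_archive=True)
class_e = SClass("ClassE", 5)
class_f = SClass("ClassF", 6, priority=77)

print(pick_class([class_a, class_c, class_e]).name)  # ClassA
print(pick_class([class_c, class_d]).name)           # ClassD
```

Tuples compare element by element, so a later criterion is consulted only when all earlier ones are equal, which mirrors the "in order of importance" rule.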
LOCATIONS (Pro only)
In Pro version of MooseFS the user can divide all modules in an instance into locations, using IP mapping. For details on how to define, activate and deactivate locations, see mfslocadmin(1).
In an instance with more than one defined location, a storage policy for chunks is defined separately for each location. That means that definitions for CREATE, KEEP, ARCHIVE and TRASH states can be defined for each location separately. When using the mfscreatesclass or mfsmodifysclass tools, a user can specify a location using -l option, and all subsequent -C, -K, -A and -T options are interpreted as definitions for the indicated location. Option -l may be used more than once, but each occurrence should be followed by at least one of the -C, -K, -A and -T options with appropriate labels expressions. If a location is not specified before any of the -C, -K, -A or -T options, the definition is assumed to be for the default location.
It is important to remember, that lack of a definition for a certain state from CREATE, ARCHIVE and TRASH states does not mean an empty (zero) definition - it means that the definition for KEEP will be used in case of CREATE and ARCHIVE states and KEEP or ARCHIVE definition will be used for TRASH state. But for an instance with multiple locations a user may want to specify a zero definition for a combination of state and location, and to achieve that, a specific option must be used, with the zero ("0") labels expression.
Any storage class in a multi location MooseFS instance has to be defined in such a way, that for each possible chunk state (CREATE, KEEP, ARCHIVE and TRASH) at least one location has either a specific definition for this state or has a definition for another state, that is a fallback state for this state. The following rules must be observed:
CREATE - at least one location has an explicit definition
for this state, if not, then at least one location has an
explicit definition for KEEP state and in this location
CREATE is not set to a zero definition
KEEP - at least one location has an explicit definition for
this state
ARCHIVE - at least one location has an explicit definition
for this state, if not, then at least one location has an
explicit definition for KEEP state and in this location
ARCHIVE is not set to a zero definition
TRASH - at least one location has an explicit definition for
this state, if not, then two conditions must be true: first
- at least one location has an explicit definition for KEEP
state and in this location TRASH is not set to a zero
definition, second - at least one location has either an
explicit definition of ARCHIVE state or an explicit
definition for KEEP state together with other than zero
definition for ARCHIVE state and in this location TRASH is
not set to a zero definition
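The rules above can be expressed as a small checker. The Python sketch below is one reading of those rules, not MooseFS code: the data layout (a dict of locations, each mapping state names to labels expressions), the location names and the sample labels are all hypothetical. A missing key means the state is not set, "0" means a zero definition, and "explicit" is taken to mean a non-zero definition.

```python
ZERO = "0"

def _explicit(loc, state):
    """True if the location has a non-zero definition for the state."""
    return state in loc and loc[state] != ZERO

def _not_zero(loc, state):
    """True if the state is unset or non-zero, so a fallback may apply."""
    return loc.get(state) != ZERO

def check_sclass(locations):
    """Return the list of states that no location can serve."""
    locs = list(locations.values())
    problems = []
    if not any(_explicit(l, "KEEP") for l in locs):
        problems.append("KEEP")
    for state in ("CREATE", "ARCHIVE"):
        # Explicit definition, or fallback to KEEP in the same location.
        if not any(_explicit(l, state)
                   or (_explicit(l, "KEEP") and _not_zero(l, state))
                   for l in locs):
            problems.append(state)
    if not any(_explicit(l, "TRASH") for l in locs):
        from_keep = any(_explicit(l, "KEEP") and _not_zero(l, "TRASH")
                        for l in locs)
        from_archive = any(
            (_explicit(l, "ARCHIVE")
             or (_explicit(l, "KEEP") and _not_zero(l, "ARCHIVE")))
            and _not_zero(l, "TRASH")
            for l in locs)
        if not (from_keep and from_archive):
            problems.append("TRASH")
    return problems

ok = {"loc1": {"KEEP": "2A", "ARCHIVE": "0"},
      "loc2": {"KEEP": "0", "ARCHIVE": "@4+1"}}
bad = {"loc1": {"CREATE": "0", "KEEP": "2A"},
       "loc2": {"CREATE": "0", "KEEP": "0", "ARCHIVE": "@4+1", "TRASH": "A"}}
print(check_sclass(ok))   # []
print(check_sclass(bad))  # ['CREATE']
```

An empty result means every state has at least one location able to hold its chunks.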
Examples of correct combinations of definitions for a specific number of locations:
Example 1) 2 locations: first location has a specific
KEEP definition and a zero ARCHIVE definition, second
location has a zero KEEP definition and a specific ARCHIVE
definition. Chunks will be:
created (CREATE state) according to KEEP definition (KEEP is
default fallback for CREATE), only in the first
location
kept (KEEP status) according to KEEP definition, only in the
first location
archived (ARCHIVE status) according to ARCHIVE definition,
only in the second location
kept in trash (TRASH status) depending on their state in the
moment of deletion either according to KEEP definition in
the first location (chunks that were in KEEP state when
files were deleted) or according to ARCHIVE definition in
the second location (chunks that were in ARCHIVE state when
files were deleted)
Example 2) 2 locations: first location has a specific
CREATE definition and a specific KEEP definition, zero
ARCHIVE and TRASH definitions, second location has a zero
CREATE definition, specific KEEP, ARCHIVE and TRASH
definitions. Chunks will be:
created (CREATE state) according to CREATE definition, only
in the first location
kept (KEEP status) in both locations, in the first location
according to KEEP definition from the first location, in the
second location according to KEEP definition from the second
location
archived (ARCHIVE status) according to ARCHIVE definition,
only in the second location
kept in trash (TRASH status) according to TRASH definition,
only in the second location
Example 3) 3 locations: first location has a specific
CREATE definition, all 3 locations have specific KEEP
definitions, second and third locations have specific
ARCHIVE definitions. Chunks will be:
created (CREATE state) according to CREATE definition in the
first location and according to KEEP definitions in the
second and third location
kept (KEEP status) in all three locations according to KEEP
definition of each location
archived (ARCHIVE status) according to KEEP definition in
the first location and according to ARCHIVE definitions in
the second and third location
kept in trash (TRASH status) how they were kept in KEEP or
ARCHIVE status the moment they were deleted in all 3
locations
Note, that while the storage class definition from this
example is formally correct for 3 locations, it is a typical
example of what might happen if a user forgets to specify
that some state definitions in some locations should be zero
definitions: if CREATE definition is not set to zero for a
location, chunk creation will fall back to KEEP definition,
same with ARCHIVE state.
Example 4) 2 locations: the first location has an
explicit definition for KEEP. Chunks will be:
created (CREATE state) only in the first location according
to KEEP definition
kept (KEEP state) only in the first location according to
KEEP definition
archived (ARCHIVE state) only in the first location
according to KEEP definition
kept in trash (TRASH state) only in the first location
according to KEEP definition
Note that while the storage class definition from this
example is formally correct for 2 locations, the second
location is never used.
Examples of incorrect combinations of definitions for specific number of locations:
Example 1) 2 locations: first location has a zero
definition for CREATE, specific definition for KEEP, second
location has zero definitions for CREATE and KEEP, a
specific definitions for ARCHIVE and TRASH.
This is incorrect: there is no location with a non-zero
(specific or fallback from KEEP) definition for CREATE
state.
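The condition violated in this example can be expressed as a simple check. The following is a sketch under stated assumptions, not the validation performed by MooseFS itself: the function name has_create_location is hypothetical, definitions are passed as CREATE/KEEP pairs (one pair per location), an empty string stands for "not specified" and 0 for a zero definition.

```shell
# has_create_location CREATE1 KEEP1 [CREATE2 KEEP2 ...]
# Succeeds if at least one location has a non-zero CREATE definition,
# either specific or obtained by fallback from KEEP.
has_create_location() {
    while [ "$#" -ge 2 ]; do
        create=$1 keep=$2
        shift 2
        effective=${create:-$keep}
        if [ -n "$effective" ] && [ "$effective" != 0 ]; then
            return 0
        fi
    done
    return 1
}

# Incorrect Example 1: CREATE is zero in both locations.
if has_create_location 0 '2*' 0 0; then
    echo "valid"
else
    echo "no location accepts new chunks"
fi
```

Leaving CREATE unspecified in the first location (has_create_location '' '2*' 0 0) would make the check succeed, because CREATE would then fall back to KEEP.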
Example 2) 3 locations: first location has a specific
KEEP definition and a zero ARCHIVE definition, the other two
locations have zero CREATE, KEEP and TRASH definitions and
specific ARCHIVE definitions.
This is incorrect. CREATE, KEEP and ARCHIVE state chunks
have their correct definitions (fallback for CREATE,
explicit for KEEP and ARCHIVE), but TRASH state chunks do
not: only chunks deleted while in KEEP state can be kept in
the first location. Chunks deleted while in ARCHIVE state
have no fallback definition there and are explicitly
forbidden (zero definition) in the other locations.
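One way this definition might be repaired is to leave TRASH unspecified in the second and third location instead of setting it to zero, so that chunks deleted while in ARCHIVE state can stay where they were archived. The shape follows the "WorkData" class from the EXAMPLES section. The location names loc1, loc2 and loc3, the class name and the labels expressions are hypothetical, and run is a helper that is not part of MooseFS; it only prints the command unless APPLY=1 is set.

```shell
# Dry run by default: print the command; set APPLY=1 to execute it.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

# loc1: KEEP only (CREATE falls back to KEEP), nothing archived here.
# loc2, loc3: nothing kept (so nothing created), archived chunks in EC
# format; TRASH is left unspecified so it follows the state at deletion.
run mfscreatesclass \
    -l loc1 -K '2*' -A 0 \
    -l loc2 -K 0 -A '@4+1' \
    -l loc3 -K 0 -A '@4+1' \
    Example2Fixed
```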
PREDEFINED STORAGE CLASSES
A new MooseFS instance will have the following classes predefined:
2CP - only KEEP state defined, keep 2 copies on any labels (default class for / directory)
3CP - only KEEP state defined, keep 3 copies on any labels
EC4+1 - in KEEP state, keep 2 copies on any labels; in ARCHIVE state, keep chunks in EC4+1 format on any labels; archive delay is 1 day and is calculated using the file's ctime; files smaller than 512 KiB will not be converted to EC format
EC4+2 - (Pro only) in KEEP state, keep 3 copies on any labels; in ARCHIVE state, keep chunks in EC4+2 format on any labels; archive delay is 1 day and is calculated using the file's ctime; files smaller than 512 KiB will not be converted to EC format
EC8+1 - in KEEP state, keep 2 copies on any labels; in ARCHIVE state, keep chunks in EC8+1 format on any labels; archive delay is 1 day and is calculated using the file's ctime; files smaller than 512 KiB will not be converted to EC format
EC8+2 - (Pro only) in KEEP state, keep 3 copies on any labels; in ARCHIVE state, keep chunks in EC8+2 format on any labels; archive delay is 1 day and is calculated using the file's ctime; files smaller than 512 KiB will not be converted to EC format
These classes are fully modifiable and deletable and can be replaced with user's choice of classes.
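Before modifying or deleting a predefined class it may be useful to check which classes currently exist. A minimal sketch, using only the glob pattern argument shown in the SYNOPSIS; the run helper is not part of MooseFS and only prints the command unless APPLY=1 is set:

```shell
# Dry run by default: print the command; set APPLY=1 to execute it.
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

# List the predefined erasure-coding classes by glob pattern.
run mfslistsclass 'EC*'
# List every storage class currently defined.
run mfslistsclass
```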
Up to version 4.56.0 of MooseFS the predefined classes were different. The following information pertains to the old MooseFS behaviour. In newer versions of MooseFS the classes mentioned below might exist as a result of an upgrade, but will behave exactly like any user-defined classes. This information is kept here purely for reference and will be removed from this manual page at some point:
(Behaviour up to version 4.56.0) "For compatibility reasons, every fresh or freshly upgraded instance of MooseFS has 9 predefined storage classes. Their names are single digits, from 1 to 9, and their definitions are * to 9*. They are equivalents of simple numeric goals from previous versions of the system. In case of an upgrade, all files that had goal N before upgrade, will now have N storage class. These classes can be modified only when option -f is specified. It is advised to create new storage classes in an upgraded system and migrate files with mfsxchgsclass tool, rather than modify the predefined classes. The predefined classes CANNOT be deleted."
EXAMPLES
Create a new storage class named "Class2Copies" with 2 copies. The first copy on one of chunkservers with label "A" and the second copy on one of chunkservers with label "B":
mfscreatesclass -K A,B Class2Copies
Create a new storage class named "Class3Copies". Freshly created chunks (-C) will be kept anywhere with 2 copies (2*). Soon after creation they will be kept (-K) in 3 copies: the first copy on one of chunkservers with label "A", the second copy on one of chunkservers with label "B" and the third on any chunkserver:
mfscreatesclass -C '2*' -K 'A,B,*' Class3Copies
Create a new storage class named "ClassEC4_1". Chunks will be created and kept (-K) anywhere in 2 copies (2*) for 24 hours from the file creation. Later they will be archived (-A) in EC4+1 format on any chunkserver:
mfscreatesclass -K '2*' -A '@4+1' ClassEC4_1
Modify the "ClassEC4_1" storage class to archive chunks after 48 hours from the file modification (not creation):
mfsmodifysclass -d 2d -o m ClassEC4_1
Modify the "Class3Copies" storage class to keep chunks on other chunkservers (LOOSE mode) if the ones with labels "A" and "B" are too busy (overloaded):
mfsmodifysclass -m l Class3Copies
In a Pro instance with 2 locations ('foo' and 'bar'), create a storage class named "WorkData" that keeps fresh chunks in 2 copies in location 'foo' and archived chunks in EC8+1 format in location 'bar':
mfscreatesclass -l foo -K '2*' -A 0 -l bar -K 0 -A '@8+1' WorkData
In a Pro instance with 3 locations ('here', 'there', 'far away'), create a storage class named "ImportantData" that keeps chunks in KEEP state in 2 copies in each location and archived chunks in EC8+2 format only in locations 'there' and 'far away':
mfscreatesclass -l 'here' -K '2*' -A 0 -l 'there' -K '2*' -A '@8+2' -l 'far away' -K '2*' -A '@8+2' ImportantData
REPORTING BUGS
Report bugs to bugs@moosefs.com
COPYRIGHT
Copyright Jakub Kruszona-Zawadzki, Saglabs SA
This file is part of MooseFS.
READ THIS BEFORE INSTALLING THE SOFTWARE. BY INSTALLING, ACTIVATING OR USING THE SOFTWARE, YOU ARE AGREEING TO BE BOUND BY THE TERMS AND CONDITIONS OF MooseFS LICENSE AGREEMENT FOR VERSION 1.7 AND HIGHER IN A SEPARATE FILE. THIS SOFTWARE IS LICENSED AS PROPRIETARY SOFTWARE. YOU DO NOT ACQUIRE ANY OWNERSHIP RIGHT, TITLE OR INTEREST IN OR TO ANY INTELLECTUAL PROPERTY OR OTHER PROPRIETARY RIGHTS.
SEE ALSO
mfsmount(8), mfstools(1), mfssclass(1), mfsarchive(1), mfsmaster.cfg(5), mfschunkserver.cfg(5), mfstopology.cfg(5)