split creates output files containing consecutive or interleaved sections of input (standard input if none is given or input is ‘-’). Synopsis:
split [option]... [input [prefix]]
By default, split puts 1000 lines of input (or whatever is left over for the last section) into each output file.
The output files' names consist of prefix (‘x’ by default) followed by a group of characters (‘aa’, ‘ab’, ... by default), such that concatenating the output files in traditional sorted order by file name produces the original input file (except -nr/n). By default split will initially create files with two generated suffix characters, and will increase this width by two when the next most significant position reaches the last character (‘yz’, ‘zaaa’, ‘zaab’, ...). In this way an arbitrary number of output files is supported, and the names sort as described above, even in the presence of an --additional-suffix option. If the -a option is specified and the output file names are exhausted, split reports an error without deleting the output files that it did create.
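The defaults described above can be checked with a short sketch. It assumes GNU split and a scratch directory from mktemp; input.txt is a made-up file name.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 2500 input lines: with the defaults, expect pieces of 1000, 1000 and 500 lines.
seq 2500 > input.txt
split input.txt

# Default prefix 'x' plus two-letter suffixes.
ls x*
wc -l xaa xab xac

# Concatenating the pieces in sorted name order reproduces the input.
cat x* | cmp - input.txt && echo "round trip OK"
```

The shell expands x* in sorted order, which is what makes the final cat a faithful reassembly.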
The program accepts the following options. Also see Common options.
For compatibility split also supports an obsolete option syntax -lines. New scripts should use -l lines instead.
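As an illustration of the two spellings, the sketch below splits the same file both ways and compares the results. It assumes GNU split; input.txt and the prefixes new- and old- are hypothetical names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 10 > input.txt

# Preferred form: four lines per output file.
split -l 4 input.txt new-

# Obsolete form, shown only for comparison.
split -4 input.txt old-

wc -l new-*
# Both forms should yield identical pieces.
cmp new-aa old-aa && cmp new-ab old-ab && cmp new-ac old-ac && echo "same pieces"
```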
‘b’  =>            512 ("blocks")
‘KB’ =>           1000 (KiloBytes)
‘K’  =>           1024 (KibiBytes)
‘MB’ =>      1000*1000 (MegaBytes)
‘M’  =>      1024*1024 (MebiBytes)
‘GB’ => 1000*1000*1000 (GigaBytes)
‘G’  => 1024*1024*1024 (GibiBytes)
and so on for ‘T’, ‘P’, ‘E’, ‘Z’, and ‘Y’.
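To make the difference between the binary and decimal suffixes concrete, here is a small sketch. It assumes GNU split with the -b option and a head that supports -c; input.bin and the prefixes bin- and dec- are made-up names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A 3000-byte input file.
head -c 3000 /dev/zero > input.bin

split -b 1K  input.bin bin-   # 'K'  means 1024 bytes per piece
split -b 1KB input.bin dec-   # 'KB' means 1000 bytes per piece

# bin-* should be 1024, 1024 and 952 bytes; dec-* should be three pieces of 1000.
wc -c bin-* dec-*
```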
xz -dc BIG.xz | split -b200G --filter='xz > $FILE.xz' - big-
Assuming a 10:1 compression ratio, that would create about fifty 20GiB files
with names big-aa.xz, big-ab.xz, big-ac.xz, etc.
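The same pattern can be tried at a much smaller scale. The sketch below is an assumption-laden variant of the command above: it swaps xz for gzip (only because gzip is more commonly installed), splits by lines rather than bytes, and uses the hypothetical names input.txt and small-. It requires GNU split for --filter.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1000 > input.txt

# Single quotes matter: $FILE must reach the filter command unexpanded,
# so that split can set it to each output file name in turn.
split -l 400 --filter='gzip > $FILE.gz' input.txt small-

ls small-*

# Decompressing the pieces in name order restores the input.
gzip -dc small-*.gz | cmp - input.txt && echo "round trip OK"
```

Only the compressed pieces are written; no uncompressed small-aa file is left behind, because the filter receives each piece on its standard input.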
n      generate n files based on current size of input
k/n    only output kth of n to stdout
l/n    generate n files without splitting lines
l/k/n  likewise but only output kth of n to stdout
r/n    like ‘l’ but use round robin distribution
r/k/n  likewise but only output kth of n to stdout
Any excess bytes remaining after dividing the input into n chunks are assigned to the last chunk. Any excess bytes appearing after the initial calculation are discarded (except when using ‘r’ mode).
All n files are created even if there are fewer than n lines, or the input is truncated.
For ‘l’ mode, chunks are approximately input size / n. The input is partitioned into n equal sized portions, with the last assigned any excess. If a line starts within a partition it is written completely to the corresponding file. Since lines are not split even if they overlap a partition, the files written can be larger or smaller than the partition size, and even empty if a line is so long as to completely overlap the partition.
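The claim that all n files are created, some possibly empty, can be observed with a tiny input. This is a sketch assuming GNU split; input.txt and the prefix few- are hypothetical, and which of the pieces end up empty may depend on the split version.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two lines, but four chunks requested in 'l' mode.
seq 2 > input.txt
split -n l/4 input.txt few-

# Four files exist even though there are only two lines.
ls few-*

# With two lines and four files, at least two pieces must be empty.
find . -name 'few-*' -size 0

# The pieces still concatenate back to the input.
cat few-* | cmp - input.txt && echo "round trip OK"
```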
For ‘r’ mode, the size of the input is irrelevant, so the input can be a pipe, for example.
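A short sketch of round-robin distribution reading from a pipe, assuming GNU split; the prefix rr- is a made-up name.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Input arrives on a pipe; '-' names standard input.
seq 10 | split -n r/3 - rr-

# Lines are dealt out in turn: rr-aa gets 1 4 7 10, rr-ab gets 2 5 8, rr-ac gets 3 6 9.
head rr-*

# r/k/n prints only the kth share to standard output and creates no files.
second=$(seq 10 | split -n r/2/3 | tr '\n' ' ')
echo "second share: $second"
```

Because lines are interleaved, concatenating rr-aa, rr-ab and rr-ac does not reproduce the original order.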
An exit status of zero indicates success, and a nonzero value indicates failure.
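One way a script might act on the exit status is sketched below. To force a failure it limits the suffix to a single character with -a 1, so only 26 names are available. It assumes GNU split; the prefix ex- is hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 30 one-line pieces cannot fit into 26 single-letter suffixes.
if seq 30 | split -l 1 -a 1 - ex- 2>/dev/null; then
    status=0
else
    status=$?
fi
echo "split exit status: $status"

# The files created before the failure are left in place.
created=$(ls ex-* | wc -l)
echo "files left behind: $created"
```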
Here are a few examples to illustrate how the --number (-n) option works:
Notice how, by default, one line may be split across two or more output files:
$ seq -w 6 10 > k; split -n3 k; head xa?
==> xaa <==
06
07
==> xab <==

08
0
==> xac <==
9
10
Use the "l/" modifier to suppress that:
$ seq -w 6 10 > k; split -nl/3 k; head xa?
==> xaa <==
06
07

==> xab <==
08
09

==> xac <==
10
Use the "r/" modifier to distribute lines in a round-robin fashion:
$ seq -w 6 10 > k; split -nr/3 k; head xa?
==> xaa <==
06
09

==> xab <==
07
10

==> xac <==
08
You can also extract just the Kth chunk. This extracts and prints just the 7th "chunk" of 33:
$ seq 100 > k; split -nl/7/33 k
20
21
22