Unix provides a 'split' utility that breaks a large file into smaller
chunks of equal size (the last chunk holds whatever remains). The man
page for split lists further options that can be used as needed; the
simple example below splits a text file of 1000 lines into 10 chunks
of 100 lines each.
The split files are then merged with the 'cat' command in a 'for' loop. Take extreme care when splitting binary/dmp files: the split itself succeeds, but the merge can show misleading results, and the file size before the split and after the merge may not match.
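One way to test the concern about binary files, rather than assume either outcome, is to split by byte count with -b and compare the merged file against the original with 'cmp'. The following is only a sketch: the scratch directory and the 'sample.*' file names are hypothetical, and random data from /dev/urandom stands in for a real dump file.

```shell
#!/bin/sh
# Sketch: byte-based split of a binary file, then merge and compare.
# All names below are hypothetical; everything happens in a scratch directory.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# 1 MiB of random bytes as a stand-in for a binary dump file.
dd if=/dev/urandom of=sample.bin bs=1024 count=1024 2>/dev/null

# -b splits on byte counts instead of line counts (256k = 256 KiB per piece).
split -b 256k -a 2 sample.bin sample.part_

# The glob expands in sorted order (aa, ab, ...), the order split wrote them.
cat sample.part_* > sample.merged

# cmp -s is silent and exits 0 only when the files are byte-for-byte identical.
if cmp -s sample.bin sample.merged; then
    result="identical"
else
    result="different"
fi
echo "merge check: $result"
```

If 'cmp' reports a difference on a real dump file, the usual suspects are a piece that was missed, a piece merged out of order, or an output file that was appended to twice.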
First, we create a file named 'un_split_file.out' containing 1000 lines; part of it is shown below.
UNIX:/prd/u01/acme> export i=0
UNIX:/prd/u01/acme> echo $i
0
UNIX:/prd/u01/acme> while [ "$i" -ne 1000 ]
> do
> echo "This is line $i" >> un_split_file.out
> i=`expr $i \+ 1`
> done
UNIX:/prd/u01/acme> wc -l un_split_file.out
1000 un_split_file.out
UNIX:/prd/u01/acme> head -10 un_split_file.out
This is line 0
This is line 1
This is line 2
This is line 3
This is line 4
This is line 5
This is line 6
This is line 7
This is line 8
This is line 9
UNIX:/prd/u01/acme> tail -10 un_split_file.out
This is line 990
This is line 991
This is line 992
This is line 993
This is line 994
This is line 995
This is line 996
This is line 997
This is line 998
This is line 999
UNIX:/prd/u01/acme> ls -ltr un_split_file.out
-rw-r--r-- 1 oracle dba 16890 Aug 25 05:19 un_split_file.out
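The same test file can be produced with shell arithmetic and a single redirection, instead of calling 'expr' and appending one line at a time. This is a minimal sketch assuming a POSIX shell; the scratch directory is an assumption so that reruns do not append to an existing file.

```shell
#!/bin/sh
# Sketch: build the 1000-line test file in one pass.
set -eu
workdir=$(mktemp -d)   # hypothetical scratch directory
cd "$workdir"

i=0
while [ "$i" -lt 1000 ]; do
    echo "This is line $i"
    i=$((i + 1))        # shell arithmetic, no external expr call
done > un_split_file.out   # one redirection for the whole loop

# tr strips the padding some wc implementations put before the number.
lines=$(wc -l < un_split_file.out | tr -d ' ')
bytes=$(wc -c < un_split_file.out | tr -d ' ')
echo "lines=$lines bytes=$bytes"
```

The byte count comes to 16890, the size shown in the listing above: 10 lines of 15 bytes, 90 lines of 16 bytes, and 900 lines of 17 bytes.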
Next comes the 'split' command, which is passed 4 arguments here:
-l 100 -> Line count: a new file is started after every 100 lines, counting from the beginning of the input file
-a 2 -> Suffix length: as many split files as the line count requires are created, each named with a 2-character suffix. By default the suffixes run aa, ab, ac and so on.
Third argument is the name of the file to be split
Fourth argument is the prefix used to name the split files
UNIX:/prd/u01/acme> split -l 100 -a 2 un_split_file.out split_file.part_
UNIX:/prd/u01/acme> ls -ltr
total 544
-rw-r--r-- 1 oracle dba 16890 Aug 25 05:19 un_split_file.out
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_aj
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ai
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ah
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ag
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_af
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ae
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ad
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ac
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ab
-rw-r--r-- 1 oracle dba 1590 Aug 25 05:22 split_file.part_aa
UNIX:/prd/u01/acme>
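To see what the -l and -a arguments actually produce, the pieces can be counted and inspected from a script. This sketch rebuilds the input with 'awk' in a hypothetical scratch directory, runs the same 'split' command, and reports the line count of each piece.

```shell
#!/bin/sh
# Sketch: split by line count and inspect the resulting pieces.
set -eu
workdir=$(mktemp -d)   # hypothetical scratch directory
cd "$workdir"

# Rebuild the 1000-line input so the sketch is self-contained.
awk 'BEGIN { for (i = 0; i < 1000; i++) print "This is line " i }' > un_split_file.out

# Same command as in the walkthrough: 100 lines per piece, 2-character suffix.
split -l 100 -a 2 un_split_file.out split_file.part_

pieces=0
for f in split_file.part_*; do
    n=$(wc -l < "$f" | tr -d ' ')
    echo "$f: $n lines"
    pieces=$((pieces + 1))
done
echo "pieces=$pieces"

# Piece aa holds lines 0-99, so piece ab starts at line 100.
first_line=$(head -n 1 split_file.part_ab)
echo "first line of split_file.part_ab: $first_line"
```

A 2-character suffix allows up to 26 x 26 = 676 pieces, so -a 2 is ample for the 10 pieces created here.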
The old file 'un_split_file.out' will be moved to 'un_split_file.out.deleted' so it
does not conflict with the new file that will be created by merging the split files
using the 'cat' command.
UNIX:/prd/u01/acme> mv un_split_file.out un_split_file.out.deleted
UNIX:/prd/u01/acme> for i in `ls split_file.part*`
> do
> cat $i >> un_split_file.out
> done
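The loop above can also be written as a single 'cat' over a glob, since the shell expands 'split_file.part_*' in sorted order, which is the order in which split wrote the pieces. This is an alternative sketch, not the method used in this post; it recreates the files in a hypothetical scratch directory so it can run on its own.

```shell
#!/bin/sh
# Sketch: merge all pieces with one cat instead of a for loop.
set -eu
workdir=$(mktemp -d)   # hypothetical scratch directory
cd "$workdir"

# Recreate the input and the pieces, then move the original out of the way.
awk 'BEGIN { for (i = 0; i < 1000; i++) print "This is line " i }' > un_split_file.out
split -l 100 -a 2 un_split_file.out split_file.part_
mv un_split_file.out un_split_file.out.deleted

# The glob expands to part_aa ... part_aj in order. Using > rather than >>
# means a rerun overwrites the output instead of appending a second copy.
cat split_file.part_* > un_split_file.out

merged_lines=$(wc -l < un_split_file.out | tr -d ' ')
echo "merged lines: $merged_lines"
```

Avoiding 'ls' inside backticks also keeps the merge safe for file names that contain spaces.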
The result is a new file named 'un_split_file.out' with exactly the same
contents as before the split. The split/merge operation does not remove
the original file or the split files, as seen below.
UNIX:/prd/u01/acme> ls -ltr un_split*
-rw-r--r-- 1 oracle dba 16890 Aug 25 05:19 un_split_file.out.deleted
-rw-r--r-- 1 oracle dba 16890 Aug 25 05:25 un_split_file.out
UNIX:/prd/u01/acme> wc -l un_split_file.out
1000 un_split_file.out
UNIX:/prd/u01/acme> ls -ltr
total 578
-rw-r--r-- 1 oracle dba 16890 Aug 25 05:19 un_split_file.out.deleted
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_aj
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ai
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ah
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ag
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_af
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ae
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ad
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ac
-rw-r--r-- 1 oracle dba 1700 Aug 25 05:22 split_file.part_ab
-rw-r--r-- 1 oracle dba 1590 Aug 25 05:22 split_file.part_aa
-rw-r--r-- 1 oracle dba 16890 Aug 25 05:25 un_split_file.out
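Matching sizes and line counts are a good sign, but a checksum is a stronger check before any of the files are deleted. The sketch below compares 'cksum' output for the original and the merged file, and removes the pieces only when they agree. It rebuilds the files in a hypothetical scratch directory; the cleanup step is an assumption about what one might do next and is not part of the walkthrough above.

```shell
#!/bin/sh
# Sketch: verify the merged file with cksum before removing the pieces.
set -eu
workdir=$(mktemp -d)   # hypothetical scratch directory
cd "$workdir"

# un_split_file.out.deleted plays the role of the preserved original.
awk 'BEGIN { for (i = 0; i < 1000; i++) print "This is line " i }' > un_split_file.out.deleted
split -l 100 -a 2 un_split_file.out.deleted split_file.part_
cat split_file.part_* > un_split_file.out

# Reading from stdin makes cksum print only "<checksum> <byte count>",
# so the two results can be compared as plain strings.
sum_before=$(cksum < un_split_file.out.deleted)
sum_after=$(cksum < un_split_file.out)

if [ "$sum_before" = "$sum_after" ]; then
    status="match"
    rm -f split_file.part_*   # pieces are no longer needed once verified
else
    status="mismatch"         # keep everything for investigation
fi
echo "checksum $status"
```

Because 'cksum' output is just a short string, the same check works when the original and the merged file sit on different machines, which is a common reason for splitting a file in the first place.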