The HPC team is updating this page. Check back for new information.
Compressibility by File Type
Avoid compressing file types with an average ratio ≥0.80, unless the file is part of a directory that is being compressed as a whole.
Type | Extension | Avg Ratio (compressed size/original size) |
---|---|---|
binary | <null>/.bin/.exe | 0.25-1+ |
java binary | .jar | 0.75-0.90 |
docx | .docx | 0.80-0.85 |
gif | .gif | 0.80-0.95 |
compressed files | .gz/.zip/.bz2 | >1 |
image files | .jpg/.jpeg/.png | 0.93-1+ |
data files | .json/.xml | 0.30-0.60 |
audio files | .mp3/.ogg/.mp4 | 0.80-0.95 |
 | | 0.50-0.95 |
svg | .svg | 0.30-0.57 |
fonts | .ttf | 0.46-0.71 |
txt | <null>/.txt | 0.32-0.55 |
wav | .wav | 0.45-0.95 |
source files | .c/.cpp/.h/.java/.js/.py/.html/.css/.hpp/.lua | 0.10-0.45 |
library files | .so | 0.25-0.45 |
log files | .log | 0.05-0.25 |
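
The ratios in the table are averages, so it can be worth measuring a file of your own before deciding whether to compress it. The following is a minimal sketch, assuming `gzip` at its default level; the helper name `compression_ratio` and the generated sample file are hypothetical and used only for illustration.

```shell
# Hypothetical helper: print compressed size / original size for one file.
# gzip -c writes to stdout, so the original file is left untouched.
compression_ratio() {
    orig=$(wc -c < "$1")
    comp=$(gzip -c "$1" | wc -c)
    # awk does the floating-point division; empty files are reported as n/a
    awk -v c="$comp" -v o="$orig" \
        'BEGIN { if (o > 0) printf "%.2f\n", c / o; else print "n/a" }'
}

# Illustration only: a made-up, highly repetitive log-like file
sample=$(mktemp)
awk 'BEGIN { for (i = 0; i < 5000; i++) print "Jun 15 10:35:01 node01 kernel: link is up" }' > "$sample"

ratio=$(compression_ratio "$sample")
echo "ratio: $ratio"
rm -f "$sample"
```

A result well below 0.80 suggests the file is worth compressing; a result at or above 0.80 suggests it is not.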
To determine file type:
Code Block |
---|
% ls -l |
-rw-r--r-- 1 root root 2625604 Jun 15 2022 mstflint-4.16.0-1.53100.x86_64.rpm |
-rwx------ 1 root root 1415 Mar 16 10:35 weka_install.sh |
% file mstflint-4.16.0-1.53100.x86_64.rpm |
mstflint-4.16.0-1.53100.x86_64.rpm: RPM v3.0 bin i386/x86_64 mstflint-4.16.0-1.53100 |
% file weka_install.sh |
weka_install.sh: POSIX shell script, ASCII text executable |
...
Code Block |
---|
# List directory |
% ls -l |
drwxr-xr-x 5 aaa0000 Domain_Users 4096 Jun 22 2018 data1 |
drwxr-xr-x 5 aaa0000 Domain_Users 4096 Jun 22 2018 data2 |
drwxr-xr-x 5 aaa0000 Domain_Users 4096 Jun 22 2018 data3 |
# Option 1: make compressed tarballs directly |
% tar czf data1.tgz data1 |
% tar czf data2.tgz data2 |
% tar czf data3.tgz data3 |
# Option 2: make uncompressed tarballs first (cf, without the z flag) |
% tar cf data1.tar data1 |
% tar cf data2.tar data2 |
% tar cf data3.tar data3 |
# ...then compress the tarballs (gzip replaces each .tar with a .tar.gz) |
% gzip data1.tar |
% gzip data2.tar |
% gzip data3.tar |
It is VERY IMPORTANT that you remove the original copy after tarring/compressing the file or directory; otherwise the archive only adds to your disk usage.
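
Before deleting anything, it may be worth confirming that the archive is readable. Note that `gzip` already replaces its input file with the compressed version, whereas `tar` leaves the source directory in place. The following is a minimal sketch, assuming a scratch directory and a sample `data1` directory created only for illustration; the original is removed only if listing the archive succeeds.

```shell
# Work in a scratch directory so this sketch does not touch real data
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir data1
echo "sample contents" > data1/file.txt

# Create the compressed tarball
tar czf data1.tgz data1

# List the archive to confirm it is readable; remove the original only on success
if tar tzf data1.tgz > /dev/null; then
    rm -rf data1
else
    echo "archive check failed; keeping data1" >&2
fi
```

Listing the archive checks that it can be read end to end; it does not compare file contents against the originals.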
...