The HPC team is updating this page. Check back for new information.

...

Avoid compressing file types whose average ratio is ≥0.80, unless the file is part of a directory that is being compressed.

Category             Type              Extension                                     Avg Ratio (compressed size/original size)

Programs             binary            <null>/.bin/.exe                              0.25-1+
                     java binary       .jar                                          0.75-0.90
                     compressed files  .gzip/.zip/.bz2                               >1

Text related files   fonts             .ttf                                          0.46-0.71
                     txt               <null>/.txt                                   0.32-0.55
                     docx              .docx                                         0.80-0.85
                     source files      .c/.cpp/.h/.java/.js/.py/.html/.css/.hpp/.lua 0.10-0.45
                     log files         .log                                          0.05-0.25
                     pdf               .pdf                                          0.50-0.95
                     library files     .so                                           0.25-0.45
                     data files        .json/.xml                                    0.30-0.60
                     audio files       .mp3/.ogg/.mp4/.wav                           0.80-0.95*

Image related files  image files       .jpg/.jpeg/.png                               0.93-1+
                     svg               .svg                                          0.30-0.57
                     gif               .gif                                          0.80-0.95

* certain types of .wav files can compress very well
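
The ratios in the table are averages, so it can help to measure a specific file before deciding whether to compress it. The following is a minimal sketch, assuming `gzip` is the compressor (the page does not say which tool produced the table's numbers); the function name `compression_ratio` is hypothetical.

```shell
#!/bin/sh
# compression_ratio FILE
# Prints (gzip-compressed size / original size) for FILE, to two decimals.
# Assumption: gzip at its default level; the original file is left untouched.
compression_ratio() {
    # Size in bytes of the original file (tr strips padding some wc versions add)
    orig=$(wc -c < "$1" | tr -d ' ')
    # Size in bytes of the gzip stream, written to stdout rather than to disk
    comp=$(gzip -c "$1" | wc -c | tr -d ' ')
    # awk performs the floating-point division; guard against empty files
    awk -v c="$comp" -v o="$orig" 'BEGIN {
        if (o == 0) { print "n/a"; exit }
        printf "%.2f\n", c / o
    }'
}
```

For example, `compression_ratio results.log` prints a single number; a value at or above 0.80 suggests the file is not worth compressing on its own.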

Source

To determine file type:

...

Code Block
# List directory
% ls -l
drwxr-xr-x  5 aaa0000 Domain_Users       4096 Jun 22  2018 data1
drwxr-xr-x  5 aaa0000 Domain_Users       4096 Jun 22  2018 data2
drwxr-xr-x  5 aaa0000 Domain_Users       4096 Jun 22  2018 data3

# Option 1: make compressed tarballs directly
% tar czf data1.tgz  data1
% tar czf data2.tgz  data2
% tar czf data3.tgz  data3

# Option 2: make uncompressed tarballs first (no "z" flag)
% tar cf data1.tar  data1
% tar cf data2.tar  data2
% tar cf data3.tar  data3

# ... then compress the tarballs (produces data1.tar.gz, etc.)
% gzip data1.tar
% gzip data2.tar
% gzip data3.tar

It is VERY IMPORTANT that you remove the original copy after tarring/compressing the file or directory.
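
One way to combine archiving and cleanup is sketched below. The verification step (listing the archive with `tar tzf` before deleting anything) is an added precaution and an assumption, not part of the original instructions; the function name `archive_and_remove` is hypothetical. It is meant to be run from the parent directory of the directory being archived.

```shell
#!/bin/sh
# archive_and_remove DIR
# Creates DIR.tgz, checks that the archive can be read end to end,
# and only then deletes the original directory.
archive_and_remove() {
    dir=${1%/}                          # drop a trailing slash, if any
    tar czf "$dir.tgz" "$dir" &&        # create the compressed tarball
    tar tzf "$dir.tgz" > /dev/null &&   # list it; fails if the archive is unreadable
    rm -rf "$dir"                       # remove the original only after both steps succeed
}
```

Usage: `archive_and_remove data1` leaves `data1.tgz` in place of the `data1` directory. If either `tar` step fails, the original directory is kept.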

...