Compressed data files
Latest revision as of 14:30, 29 October 2014
Many computer simulation packages offer the option to write data files in compressed form. This is strongly recommended.
Advantages
- Save disk/tape space
- Faster writing of data (the CPU time spent on compression is negligible compared to the time saved by writing smaller files)
- File corruption is detected more easily
- Easier transfer to other computer systems
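As a rough illustration of the space saving, the following sketch writes the same data once as plain text and once compressed on the fly through a pipe. Here <tt>seq</tt> is only a stand-in for a simulation program, and the file names <tt>plain.dat</tt> and <tt>compressed.dat.gz</tt> are hypothetical. The actual ratio depends strongly on the data.

```shell
# Stand-in for a simulation program writing text output (assumption: real
# packages usually have their own option for compressed output).
seq 1 50000 > plain.dat

# The same data, compressed while it is written; no uncompressed
# intermediate file is created.
seq 1 50000 | gzip -c > compressed.dat.gz

# Compare the two file sizes in bytes.
plain_size=$(wc -c < plain.dat | tr -d ' ')
compressed_size=$(wc -c < compressed.dat.gz | tr -d ' ')
echo "plain: $plain_size bytes, compressed: $compressed_size bytes"
```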
Dealing with truncated compressed files
A situation sometimes encountered is that a compressed file is truncated (this most frequently occurs when a program aborts and a file buffer was not flushed). Naive application of gzip -d (or gunzip) will lead to an "unexpected end of file" error message. However, it is easy to access most of the data in this file by decompressing "on the fly" and redirecting the data stream to a file. For example, if output.dat.gz is truncated, use
gzip -dc output.dat.gz > output.dat
This yields the same error message as gzip -d output.dat.gz, but the decompressed file contents (up to a point close to the truncation) are now saved to output.dat.
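The recovery can be tried safely on a test file. The sketch below builds a small compressed file, cuts it in half to simulate a truncation, checks it with <tt>gzip -t</tt>, and then salvages what it can. All file names (<tt>demo.dat.gz</tt>, <tt>truncated.dat.gz</tt>, <tt>recovered.dat</tt>) are hypothetical, and how much is recovered depends on where the file was cut.

```shell
# Build a small compressed test file.
seq 1 100000 | gzip -c > demo.dat.gz

# Simulate a truncation: keep only the first half of the compressed bytes.
size=$(wc -c < demo.dat.gz | tr -d ' ')
head -c "$((size / 2))" demo.dat.gz > truncated.dat.gz

# gzip -t tests integrity and exits with a non-zero status for a damaged file.
if gzip -t truncated.dat.gz 2>/dev/null; then
    status=intact
else
    status=damaged
fi

# Salvage whatever can be decompressed. The "unexpected end of file" error
# is expected here, so it is silenced and the exit status is ignored.
gzip -dc truncated.dat.gz > recovered.dat 2>/dev/null || true

recovered_lines=$(wc -l < recovered.dat | tr -d ' ')
echo "$status: recovered $recovered_lines of 100000 lines"
```

The last recovered line may be incomplete, so it is worth inspecting the end of the salvaged file before analysing it.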
Avoiding decompression of compressed files
Ideally, compressed data files are not decompressed for analysis. Decompressing them requires additional disk space, as well as time to compress the data again after the analysis. Instead, analysis tools such as the generic analyzer and autocorrelation can handle compressed data directly. Even gnuplot can plot compressed data files; see the gnuplot usage notes.
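For tools that cannot read compressed files themselves, the data can usually be streamed through a pipe instead. The following is a minimal sketch using <tt>awk</tt> as a stand-in for an analysis tool; the file <tt>sample.dat.gz</tt> and its two-column layout are assumptions made for the example.

```shell
# Hypothetical two-column data file, compressed as it is created.
printf '1 2.0\n2 4.0\n3 6.0\n' | gzip -c > sample.dat.gz

# Stream the decompressed data straight into the analysis step; no
# decompressed copy is written to disk. LC_ALL=C fixes the decimal point.
mean=$(gzip -dc sample.dat.gz | LC_ALL=C awk '{ sum += $2 } END { printf "%.1f", sum / NR }')
echo "mean of column 2: $mean"
```

The compressed file is left untouched, so nothing needs to be recompressed afterwards.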