Fast multi-source storage method

4 replies [Last post]
byhisdeeds
Offline
Joined: 2006-01-06
Points: 0

I'm trying to determine the fastest method to store and retrieve compressed data from a file.

I have 256 blocks of data, with each block holding 16 tiles, and each tile being 10K uncompressed. I want to store all of this in one file, with each tile stored compressed and accessible as quickly as possible.

Instead of writing my own file handler with pointers to the blocks and tiles, I was thinking that I would use the Zip classes to store the tiles, indexing them by their block/tile index (according to my scheme).

My question is: how fast are the Zip classes at this kind of repeated access, both for reading and for writing the Zip archive? Would I be better off writing my own class?

tmarble
Offline
Joined: 2003-08-22
Points: 0

We have really optimized the Zip classes a fair amount.

To the extent you try alternative methods, I strongly encourage you to explore the NIO APIs.

Regards,

--Tom
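For anyone who does go the hand-rolled route with NIO, here is a rough sketch of positioned reads and writes through java.nio.channels.FileChannel. The fixed-size slot layout (a 4-byte length followed by the payload), the 12K slot size, and the class name SlotFile are all assumptions made for illustration; nothing in this thread prescribes them.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical layout: the file is an array of fixed-size slots,
// each holding a 4-byte length followed by that many payload bytes.
class SlotFile implements AutoCloseable {
    // Assumed slot size: a little above 10K so an incompressible tile still fits.
    static final int SLOT_SIZE = 12 * 1024;
    private final FileChannel channel;

    SlotFile(Path path) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
    }

    // Write the payload into the given slot without moving any shared file pointer.
    void write(int slot, byte[] payload) throws IOException {
        if (payload.length > SLOT_SIZE - 4) {
            throw new IllegalArgumentException("payload too large for slot");
        }
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.flip();
        long pos = (long) slot * SLOT_SIZE;
        while (buf.hasRemaining()) {
            pos += channel.write(buf, pos);
        }
    }

    // Read back whatever payload was last written into the given slot.
    byte[] read(int slot) throws IOException {
        long pos = (long) slot * SLOT_SIZE;
        ByteBuffer header = ByteBuffer.allocate(4);
        readFully(header, pos);
        int len = header.getInt(0);
        if (len < 0 || len > SLOT_SIZE - 4) {
            throw new IOException("corrupt slot header: " + len);
        }
        ByteBuffer body = ByteBuffer.allocate(len);
        readFully(body, pos + 4);
        return body.array();
    }

    private void readFully(ByteBuffer buf, long pos) throws IOException {
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) {
                throw new EOFException("slot lies beyond end of file");
            }
            pos += n;
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

With 256 blocks of 16 tiles, a tile's slot number could simply be block * 16 + tile. Fixed slots waste some space but make in-place updates trivial, since a rewritten tile can never outgrow its slot.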

alexlamsl
Offline
Joined: 2004-09-02
Points: 0

What will be the access pattern on block and/or tile level?

The point being that a better compression ratio would be achieved if more data is compressed in a single run.

By the nature of DEFLATE (the algorithm ZIP normally employs: LZ77 plus Huffman coding), you cannot randomly access a data point in the compressed stream without decoding from the start of that stream.

So, for instance, compressing by blocks (that is, in groups of 16 tiles) gives you a worst-case penalty of decompressing 16 tiles to get the one you want, if you need to randomly access tiles in any block.
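To illustrate the other end of that trade-off, here is a minimal sketch that compresses each tile as its own independent stream with java.util.zip.Deflater, so any single tile can be inflated without touching its neighbours, at some cost in compression ratio. The class name TileCodec and the BEST_SPEED level are assumptions for the example, not something settled in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class TileCodec {
    // 10K uncompressed per tile, as stated in the original post.
    static final int TILE_SIZE = 10 * 1024;

    // Compress one tile as a self-contained DEFLATE stream.
    static byte[] compress(byte[] tile) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED); // assumed: favour speed
        try {
            deflater.setInput(tile);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib memory
        }
    }

    // Inflate one tile; the caller must know the uncompressed size.
    static byte[] decompress(byte[] packed, int originalSize) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(packed);
            byte[] tile = new byte[originalSize];
            int off = 0;
            while (off < originalSize) {
                int n = inflater.inflate(tile, off, originalSize - off);
                if (n == 0) {
                    // No progress means the stream ended early or is unusable.
                    throw new DataFormatException("tile data shorter than expected");
                }
                off += n;
            }
            return tile;
        } finally {
            inflater.end();
        }
    }
}
```

Each compressed tile then needs its offset and length recorded somewhere (an index at the head of the file, say) so it can be located directly.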

byhisdeeds
Offline
Joined: 2006-01-06
Points: 0

I need to randomly access any tile. I also need to be able to update the tile data. The blocking is just to make efficient use of file access or compression schemes.

So if I'm understanding you, I could just store multiple tiles in compressed blocks, and then when I need a tile I just jump to the point in the stream where my tile is and decompress from there.

However, to manage the updating of tiles I would have to read the whole compressed block, decompress it, replace the relevant portion, then recompress the whole block and store it. Right?
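That read-modify-write cycle could look roughly like the sketch below, which works on an in-memory compressed block of 16 tiles using DeflaterOutputStream and InflaterInputStream. The class and method names are hypothetical, and reading or writing the block bytes from the actual file is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class BlockUpdate {
    static final int TILE_SIZE = 10 * 1024;      // 10K per tile
    static final int TILES_PER_BLOCK = 16;       // 16 tiles per block
    static final int BLOCK_SIZE = TILE_SIZE * TILES_PER_BLOCK; // 160K

    // Compress a whole uncompressed block in one run.
    static byte[] deflateBlock(byte[] block) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(block);
        }
        return out.toByteArray();
    }

    // Decompress a whole block; fails with EOFException if it is too short.
    static byte[] inflateBlock(byte[] packed) throws IOException {
        byte[] block = new byte[BLOCK_SIZE];
        try (DataInputStream in = new DataInputStream(
                new InflaterInputStream(new ByteArrayInputStream(packed)))) {
            in.readFully(block);
        }
        return block;
    }

    // Decompress the block, overwrite one tile, and recompress everything.
    static byte[] replaceTile(byte[] packedBlock, int tileIndex, byte[] newTile)
            throws IOException {
        if (tileIndex < 0 || tileIndex >= TILES_PER_BLOCK || newTile.length != TILE_SIZE) {
            throw new IllegalArgumentException("bad tile index or tile size");
        }
        byte[] block = inflateBlock(packedBlock);
        System.arraycopy(newTile, 0, block, tileIndex * TILE_SIZE, TILE_SIZE);
        return deflateBlock(block);
    }
}
```

One thing to watch: the recompressed block will generally not have the same length as the old one, so the file layout has to cope with blocks that grow or shrink.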

I was hoping that Zip would handle a lot of the issues related to reading, writing, and updating of my data.

alexlamsl
Offline
Joined: 2004-09-02
Points: 0

Yes - inflating / deflating 160KB of data shouldn't be too bad performance-wise....

Alternatively, if you are going for ease of coding you can just use java.util.zip.ZipFile and ZipEntry and put your blocks / tiles in there ;)
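As a rough sketch of that approach: write one entry per tile with ZipOutputStream, then fetch a single tile by name through ZipFile. The entry naming scheme ("b012/t05") and the class name ZipTileStore are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class ZipTileStore {
    // Hypothetical naming scheme: one entry per tile, e.g. "b012/t05".
    static String entryName(int block, int tile) {
        return String.format(Locale.ROOT, "b%03d/t%02d", block, tile);
    }

    // Write every tile as its own entry; tiles[block][tile] holds the raw bytes.
    static void writeAll(Path zipPath, byte[][][] tiles) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipPath))) {
            for (int b = 0; b < tiles.length; b++) {
                for (int t = 0; t < tiles[b].length; t++) {
                    zos.putNextEntry(new ZipEntry(entryName(b, t)));
                    zos.write(tiles[b][t]);
                    zos.closeEntry();
                }
            }
        }
    }

    // Look up a single tile by name and inflate only that entry.
    static byte[] readTile(Path zipPath, int block, int tile) throws IOException {
        try (ZipFile zf = new ZipFile(zipPath.toFile())) {
            ZipEntry entry = zf.getEntry(entryName(block, tile));
            if (entry == null) {
                throw new FileNotFoundException(entryName(block, tile));
            }
            try (InputStream in = zf.getInputStream(entry)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            }
        }
    }
}
```

In real use you would keep the ZipFile open across reads rather than reopening it per tile. Note also that ZipFile is read-only: with these classes, changing one tile means writing out a new archive.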

Finally, going for a faster codec such as RLE could be a solution as well (though that depends heavily on the nature of your data).
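For reference, a very basic byte-oriented RLE might look like the sketch below. The (count, value) pair format with runs capped at 255 is just one of many possible encodings, chosen here for brevity.

```java
import java.io.ByteArrayOutputStream;

class Rle {
    // Encode as (count, value) pairs, with each run capped at 255 bytes.
    static byte[] encode(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            byte value = in[i];
            int run = 1;
            while (i + run < in.length && in[i + run] == value && run < 255) {
                run++;
            }
            out.write(run);   // count, stored as an unsigned byte
            out.write(value);
            i += run;
        }
        return out.toByteArray();
    }

    // Expand each (count, value) pair back into a run of bytes.
    static byte[] decode(byte[] packed) {
        if (packed.length % 2 != 0) {
            throw new IllegalArgumentException("RLE data must come in pairs");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < packed.length; i += 2) {
            int run = packed[i] & 0xFF;
            for (int k = 0; k < run; k++) {
                out.write(packed[i + 1]);
            }
        }
        return out.toByteArray();
    }
}
```

With this format the bytes {1, 1, 1, 2} encode to {3, 1, 1, 2}. Data with no repeated bytes doubles in size, which is why the nature of the data matters so much here.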