algorowara
algorowara

Reputation: 1720

The Efficiency of Hard-Coding vs. File Input

I'm working on a machine learning project in Java which will involve a very large model (the output of a Support Vector Machine, for those of you familiar with that) that will need to be retrieved fairly frequently for use by the end user. The bulk of the model consists of large two-dimensional array of fairly small objects.

Unfortunately, I do not know exactly how large the model is going to be (I've been working with benchmark data so far, and the data I'm actually going to be using isn't ready yet), nor do I know the specifications of the machine it will run on, as that is also up in the air.

I already have a method to write the model to a file as a string, but the write process takes a great deal of time and the read process takes the better part of a minute. I'd like to cut down on that time, so I had the either bright or insanely convoluted idea of writing the model to a .java file in such a way that it could be compiled and then run to produce a fully formed model.

My questions to you are, will storing and compiling the model in Java be significantly faster than reading it from the file, under the assumption that the model is about 1 MB in size? And is there some reason I haven't seen yet that this could be a fantastically stupid idea that I should not pursue under any circumstances?

Thank you for any ideas you can give me.

EDIT: apparently trying to automatically write several thousand values into code makes a method that is roughly two orders of magnitude larger than the compiler can handle. Ah well, live and learn.

Upvotes: 2

Views: 262

Answers (4)

ping
ping

Reputation: 1239

Will storing and compiling the model in Java be significantly faster than reading it from the file ?

That depends on the way you fashion your custom datastructure to contain your model.

Upvotes: 1

Karsten Müller
Karsten Müller

Reputation: 81

The question IMHO is if the reading of the file takes long because of IO or because of computing time (=> CPU). If the later is the case then tough luck. If your IO (e.g. hard disc) is the cause then you can compress the file and extract it after/while reading. There is (of course) ZIP-support in Java (even for Streams).

Upvotes: 1

Jaco Van Niekerk
Jaco Van Niekerk

Reputation: 4192

I agree with the answer given above to use a binary input format. Let's try optimising that first. Can you provide some information? ...or have you googled working with binary data? ...buffering it? etc.?

Writing a .java file and compiling it will be quiet interesting... but it is bound to give your issues at some point. However, I think you will find that it will be slightly slower than an optimised binary format, but faster than text-based input.

Also, be very careful for early optimisation. Usually, "highly-configurable" and "blinding fast" is mutual exclusive. Rather, get everything to work first and then use a profiler to optimise the really slow sections of the application.

Upvotes: 0

Colin D
Colin D

Reputation: 5661

Instead of writing to a string or to a java file, you might consider creating a compact binary format for you data.

Upvotes: 2

Related Questions